Interface HFBTHO/HFODD and Comments on Parallelization
UTK-ORNL DFT group
Interface HFBTHO/HFODD
Principle
Unitary transformation Cylindrical to Cartesian
Phase transformation
Tweak HFODD to restart from HFB matrix elements instead of density fields on Gauss-Hermite mesh
New HFODD and MPI_HFODD versions with HFBTHO as a module called (upon request) in initial stage
Automatic restart of HFODD
I/O required: HFB matrix + basis quantum numbers written/read on disk
Open Issues:
too much memory required for large (N ≥ 18 shells) deformed bases
more tests for odd nuclei
HFBTHO
Cylindrical HO Basis
Axial symmetry and time-reversal symmetry
HFODD
Cartesian HO basis
Symmetry unrestricted
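The memory issue listed under Open Issues can be made concrete with a rough estimate. The sketch below is an illustration only, under assumptions that are not in the original slides: a spherical-like truncation that keeps all oscillator shells 0 to N, a factor of 2 for spin, a single dense HFB matrix of dimension twice the number of single-particle states, and complex double-precision elements (16 bytes). The function names are hypothetical. The actual codes use deformed-basis truncations and symmetry blocks, so real numbers will differ.

```python
def n_states(n_shells):
    """Number of 3D harmonic-oscillator spatial states in shells 0..n_shells.

    Shell n holds (n + 1)(n + 2) / 2 states; the sum over shells has the
    closed form used here.
    """
    n = n_shells
    return (n + 1) * (n + 2) * (n + 3) // 6


def hfb_matrix_bytes(n_shells, bytes_per_element=16):
    """Bytes for one dense HFB matrix (assumption: no symmetry blocking)."""
    single_particle = 2 * n_states(n_shells)  # assumed factor 2 for spin
    dim = 2 * single_particle                 # HFB doubles the dimension
    return dim * dim * bytes_per_element


if __name__ == "__main__":
    for shells in (14, 16, 18, 20):
        gib = hfb_matrix_bytes(shells) / 2**30
        print(f"N = {shells:2d}: {n_states(shells):5d} spatial states, "
              f"one dense HFB matrix ~ {gib:.2f} GiB")
```

Since a calculation typically holds several arrays of this size at once (matrix, eigenvectors, work space), the estimate is a lower bound on the footprint under the stated assumptions.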
MPI_HFODD
Master-slave architecture: master defines a list of tasks, distributes the tasks to the available slaves (until the list is empty) and collects the results
Compiles/runs with Intel Fortran, GNU Fortran, Portland and PathScale compilers
Can run on your laptop…! (most modern laptops are dual- or quad-core)
Super Computers: Increase number of cores at fixed memory
Available memory per core is decreasing!
90% of CPU-time taken by only two subroutines:
DENSHF (calculation of fields on Gauss-Hermite mesh)
DIAMAT (diagonalization of HFB matrix)
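The master-slave scheme described for MPI_HFODD can be sketched in a few lines. This is a minimal illustration, not the MPI_HFODD implementation: Python threads stand in for MPI ranks, a shared queue plays the role of the master's task list, and `run_hfodd_task` is a hypothetical placeholder that returns a made-up number instead of running a calculation.

```python
import queue
import threading


def run_hfodd_task(task):
    """Hypothetical stand-in for one HFODD run; returns a placeholder value."""
    z, n = task
    return -8.0 * (z + n)  # not a physical result


def worker(tasks, results):
    """Slave: keep pulling tasks until the task list is empty."""
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return  # nothing left to do
        results.put((task, run_hfodd_task(task)))


def master(task_list, n_workers=4):
    """Master: define the task list, start the slaves, collect the results."""
    tasks = queue.Queue()
    results = queue.Queue()
    for task in task_list:
        tasks.put(task)

    slaves = [threading.Thread(target=worker, args=(tasks, results))
              for _ in range(n_workers)]
    for s in slaves:
        s.start()
    for s in slaves:
        s.join()

    collected = {}
    while not results.empty():
        task, value = results.get()
        collected[task] = value
    return collected


if __name__ == "__main__":
    nuclei = [(50, n) for n in range(50, 60)]  # hypothetical (Z, N) list
    energies = master(nuclei)
    print(len(energies), "tasks completed")
```

Because each slave takes a new task as soon as it finishes the previous one, slow tasks do not hold up the others, which is the main appeal of this layout when run times vary from nucleus to nucleus.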
Future of HFODD Applications on Leadership Class computers
Future work:
Diagonalization of the HFB matrix can be parallelized “relatively” simply by the use of threading and ARPACK or ScaLAPACK specialized routines
Parallelization of density fields is more tricky
Include HFODD - MPI_HFODD in optimization codes
Asynchronous Dynamic Load Balancing (ADLB – UNEDF project): dynamic stack. List of tasks is updated on the fly based on results
Good practice in programming
Remember: memory is expensive, CPU-time is fast and cheap
Use Fortran 90 for dynamic memory allocation
Avoid vectorization and think parallelization instead
Example: Takagi factorization should reduce the memory needed by a factor of ~4
Comments