
Conventional scheme

Using the conventional diagonalization method, OpenMX Ver. 3.7 can perform geometry optimization for systems of about 1000 atoms if several hundred processor cores are available. To demonstrate this capability, run 'runtestL2' as follows:

     % mpirun -np 128 openmx -runtestL2 -nt 4
  
OpenMX then runs 7 test inputs and compares the calculated results with the reference results stored in 'work/large2_example'. The following is the result of 'runtestL2' run with 128 MPI processes and 4 OpenMP threads per process on a Cray XC30.

1 large2_example/C1000.dat Elapsed time(s)= 1731.83 diff Utot= 0.000000002838 diff Force= 0.000000007504
2 large2_example/Fe1000.dat Elapsed time(s)=21731.24 diff Utot= 0.000000010856 diff Force= 0.000000000580
3 large2_example/GRA1024.dat Elapsed time(s)= 2245.67 diff Utot= 0.000000002291 diff Force= 0.000000015333
4 large2_example/Ih-Ice1200.dat Elapsed time(s)= 952.84 diff Utot= 0.000000000031 diff Force= 0.000000000213
5 large2_example/Pt500.dat Elapsed time(s)= 6831.16 diff Utot= 0.000000002285 diff Force= 0.000000004010
6 large2_example/R-TiO2-1050.dat Elapsed time(s)= 2259.97 diff Utot= 0.000000000106 diff Force= 0.000000001249
7 large2_example/Si1000.dat Elapsed time(s)= 1655.25 diff Utot= 0.000000001615 diff Force= 0.000000005764
Total elapsed time (s) 37407.95
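The listing above can also be checked programmatically. The following is a minimal Python sketch, not part of OpenMX, that parses the result lines, sums the elapsed times, and flags any test whose difference exceeds a threshold. The names `parse_runtest` and `TOLERANCE` and the threshold value of 1.0e-7 are assumptions made for illustration; OpenMX itself only prints the differences.

```python
import re

# Result lines copied verbatim from the 'runtestL2' listing above.
RUNTEST_OUTPUT = """\
1 large2_example/C1000.dat Elapsed time(s)= 1731.83 diff Utot= 0.000000002838 diff Force= 0.000000007504
2 large2_example/Fe1000.dat Elapsed time(s)=21731.24 diff Utot= 0.000000010856 diff Force= 0.000000000580
3 large2_example/GRA1024.dat Elapsed time(s)= 2245.67 diff Utot= 0.000000002291 diff Force= 0.000000015333
4 large2_example/Ih-Ice1200.dat Elapsed time(s)= 952.84 diff Utot= 0.000000000031 diff Force= 0.000000000213
5 large2_example/Pt500.dat Elapsed time(s)= 6831.16 diff Utot= 0.000000002285 diff Force= 0.000000004010
6 large2_example/R-TiO2-1050.dat Elapsed time(s)= 2259.97 diff Utot= 0.000000000106 diff Force= 0.000000001249
7 large2_example/Si1000.dat Elapsed time(s)= 1655.25 diff Utot= 0.000000001615 diff Force= 0.000000005764
Total elapsed time (s) 37407.95
"""

# Hypothetical acceptance threshold chosen for this sketch only.
TOLERANCE = 1.0e-7

# The elapsed time may follow '=' with or without a space, hence '=\s*'.
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+Elapsed time\(s\)=\s*([0-9.]+)"
    r"\s+diff Utot=\s*([0-9.]+)\s+diff Force=\s*([0-9.]+)\s*$"
)


def parse_runtest(text):
    """Return one dict per test line; lines that do not match are skipped."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue
        number, name, elapsed, d_utot, d_force = match.groups()
        records.append({
            "no": int(number),
            "name": name,
            "elapsed": float(elapsed),
            "diff_utot": float(d_utot),
            "diff_force": float(d_force),
        })
    return records


records = parse_runtest(RUNTEST_OUTPUT)
total_elapsed = sum(r["elapsed"] for r in records)
failed = [r["name"] for r in records
          if max(r["diff_utot"], r["diff_force"]) > TOLERANCE]

for r in records:
    print(f'{r["name"]:32s} {r["elapsed"]:10.2f} s  '
          f'dUtot={r["diff_utot"]:.3e}  dForce={r["diff_force"]:.3e}')
print(f"sum of elapsed times: {total_elapsed:.2f} s")
print("tests above tolerance:", failed if failed else "none")
```

The summed time agrees with the reported total to within rounding of the individual entries.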

All the calculations are of production quality: double valence plus single polarization functions are assigned to each atom as basis functions. Except for 'Pt500.dat', every system contains at least 1000 atoms; the number at the end of each file name gives the number of atoms in the system. The elapsed times indicate that geometry optimization for systems of about 1000 atoms is feasible if several hundred processor cores are available. The input files used for the calculations and the corresponding output files are found in the directory 'work/large2_example'. The following information is compiled from the output files.

No.  Input file                      SCF steps  Elapsed time (s/SCF/spin)  Dimension
1    large2_example/C1000.dat               44                         35      13000
2    large2_example/Fe1000.dat             384                         30      13000
3    large2_example/GRA1024.dat             54                         35      13312
4    large2_example/Ih-Ice1200.dat          41                         18       9200
5    large2_example/Pt500.dat              171                         35      12500
6    large2_example/R-TiO2-1050.dat         35                         57      15750
7    large2_example/Si1000.dat              48                         34      13000

The dimension of the Kohn-Sham Hamiltonian is of the order of 10000, and the elapsed time per SCF step is around 40 seconds for all the systems. The differences in the total elapsed time therefore come mainly from the number of SCF iterations needed to reach the SCF convergence criterion of 10e-10 (Hartree) for the band energy.
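The argument can be illustrated with a short calculation. The Python sketch below divides each total elapsed time by the number of SCF steps, using the numbers from the two listings above. The quotient is only a rough per-step cost, since it also includes work outside the SCF loop; the variable names are chosen for this example.

```python
# (system, total elapsed time in s, number of SCF steps),
# copied from the 'runtestL2' output and the summary table above.
RUNS = [
    ("C1000",        1731.83,  44),
    ("Fe1000",      21731.24, 384),
    ("GRA1024",      2245.67,  54),
    ("Ih-Ice1200",    952.84,  41),
    ("Pt500",        6831.16, 171),
    ("R-TiO2-1050",  2259.97,  35),
    ("Si1000",       1655.25,  48),
]

# Rough cost per SCF step: total elapsed time divided by the step count.
per_step = {name: elapsed / steps for name, elapsed, steps in RUNS}
mean_per_step = sum(per_step.values()) / len(per_step)

# Compare how widely the step counts and the per-step costs vary.
step_counts = [steps for _, _, steps in RUNS]
step_spread = max(step_counts) / min(step_counts)
time_spread = max(per_step.values()) / min(per_step.values())

for name, seconds in per_step.items():
    print(f"{name:12s} {seconds:6.1f} s per SCF step")
print(f"mean cost per SCF step : {mean_per_step:.1f} s")
print(f"spread in SCF steps    : factor {step_spread:.1f}")
print(f"spread in cost per step: factor {time_spread:.1f}")
```

The number of SCF steps varies by roughly a factor of eleven between the systems, whereas the cost per step varies by less than a factor of three, which is consistent with the total time being governed mainly by the number of SCF iterations.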


t-ozaki 2013-05-22