In some cases, one may want to measure machine performance on more time-consuming calculations. For this purpose, an automatic running test with relatively large-scale systems can be performed as follows.
For serial execution:
% ./openmx -runtestL
For MPI parallel execution:
% mpirun -np 4 openmx -runtestL
For OpenMP/MPI hybrid parallel execution:
% mpirun -np 4 openmx -runtestL -nt 1
OpenMX then runs 16 test files and compares the calculated results with the reference results stored in 'work/large_example'. The comparison (absolute difference in the total energy and force) is written to the file 'runtestL.result' in the directory 'work'. The reference results were calculated using 64 MPI processes on a 2.4 GHz Opteron cluster machine. If the differences are confined to the last seven digits, the installation can be considered successful. As an example, 'runtestL.result' generated by the automatic running test is shown below:
1 | large_example/5_5_13COb2.dat | Elapsed time(s)= 625.96 | diff Utot= 0.000000000001 | diff Force= 0.000000000000 |
2 | large_example/B2C62_Band.dat | Elapsed time(s)= 4936.72 | diff Utot= 0.000000000020 | diff Force= 0.000000001359 |
3 | large_example/CG15c-Kry.dat | Elapsed time(s)= 818.90 | diff Utot= 0.000000000213 | diff Force= 0.000000000170 |
4 | large_example/DIA512-1.dat | Elapsed time(s)= 724.37 | diff Utot= 0.000000022703 | diff Force= 0.000000045168 |
5 | large_example/FeBCC.dat | Elapsed time(s)= 845.39 | diff Utot= 0.000000000029 | diff Force= 0.000000000001 |
6 | large_example/GEL.dat | Elapsed time(s)= 376.48 | diff Utot= 0.000000000000 | diff Force= 0.000000000001 |
7 | large_example/GFRAG.dat | Elapsed time(s)= 689.25 | diff Utot= 0.000000000001 | diff Force= 0.000000000001 |
8 | large_example/GGFF.dat | Elapsed time(s)=10650.43 | diff Utot= 0.000000000089 | diff Force= 0.000000000272 |
9 | large_example/MCCN.dat | Elapsed time(s)= 991.92 | diff Utot= 0.000000000507 | diff Force= 0.000000000643 |
10 | large_example/Mn12_148_F.dat | Elapsed time(s)= 841.40 | diff Utot= 0.000000000686 | diff Force= 0.000000000128 |
11 | large_example/N1C999.dat | Elapsed time(s)= 5312.34 | diff Utot= 0.000000047093 | diff Force= 0.000001883044 |
12 | large_example/Ni63-O64.dat | Elapsed time(s)= 900.03 | diff Utot= 0.000000011907 | diff Force= 0.000000000097 |
13 | large_example/Pt63.dat | Elapsed time(s)= 585.73 | diff Utot= 0.000000006067 | diff Force= 0.000000000139 |
14 | large_example/SialicAcid.dat | Elapsed time(s)= 162.20 | diff Utot= 0.000000000001 | diff Force= 0.000000000000 |
15 | large_example/ZrB2_2x2.dat | Elapsed time(s)= 1993.03 | diff Utot= 0.000000000009 | diff Force= 0.000000000001 |
16 | large_example/nsV4Bz5.dat | Elapsed time(s)= 2193.63 | diff Utot= 0.000000000140 | diff Force= 0.000000000003 |
The comparison above was made using 32 MPI processes and 4 OpenMP
threads per process (128 cores in total) on the same machine. Since the
automatic running test requires a considerable amount of memory, you may
encounter a segmentation fault in a computational environment with limited
memory. Note also that, as the above example shows, the total elapsed time
exceeds 9 hours even when using 128 cores.
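
The total elapsed time and the largest differences can also be read off 'runtestL.result' with a short script. The following Python sketch is not part of OpenMX; it assumes the file keeps exactly the line layout shown above, and the names parse_result, summarize, and report are hypothetical helpers introduced only for illustration. The tolerance of 1.0e-5 is one possible reading of "within last seven digits" for values printed with 12 decimal places, so adjust it as needed.

```python
import re

# One line of 'runtestL.result', assuming the layout shown in the listing above.
LINE_RE = re.compile(
    r"^\s*(\d+)\s*\|\s*(\S+)\s*\|\s*Elapsed time\(s\)=\s*([\d.]+)\s*\|"
    r"\s*diff Utot=\s*([\d.]+)\s*\|\s*diff Force=\s*([\d.]+)"
)


def parse_result(text):
    """Return a list of (name, elapsed_s, diff_utot, diff_force) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:  # lines that do not match the assumed layout are skipped
            rows.append(
                (m.group(2), float(m.group(3)), float(m.group(4)), float(m.group(5)))
            )
    return rows


def summarize(rows, tol=1.0e-5):
    """Return (total elapsed seconds, names of tests whose diff exceeds tol).

    tol is an assumed threshold, not a value defined by OpenMX.
    """
    total_s = sum(r[1] for r in rows)
    flagged = [r[0] for r in rows if r[2] > tol or r[3] > tol]
    return total_s, flagged


def report(path):
    """Print a short summary for the result file at the given path."""
    with open(path) as f:
        rows = parse_result(f.read())
    total_s, flagged = summarize(rows)
    print("tests parsed       : %d" % len(rows))
    print("total elapsed time : %.2f s (%.2f h)" % (total_s, total_s / 3600.0))
    print("tests above tol    : %s" % (", ".join(flagged) if flagged else "none"))
```

Calling report("runtestL.result") from the directory 'work' prints the number of parsed tests, the summed elapsed time, and any test whose difference in the total energy or force exceeds the chosen tolerance.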