In some cases, one may want to know the machine performance for more time-consuming calculations. For this purpose, an automatic running test with relatively large-scale systems can be performed as follows:
For the MPI parallel running
% mpirun -np 112 openmx -runtestL
For the MPI/OpenMP parallel running
% mpirun -np 112 openmx -runtestL -nt 2
Then, OpenMX will run with 16 test files and compare the calculated results with the reference results, which are stored in 'work/large_example'. The comparison (absolute difference in the total energy and force) is stored in a file 'runtestL.result' in the directory 'work'. The reference results were calculated using 28 MPI processes of a 2.6 GHz Xeon cluster machine. If the differences appear only within the last seven digits, we may consider that the installation is successful. As an example, 'runtestL.result' generated by the automatic running test is shown below:
1 | large_example/5_5_13COb2.dat | Elapsed time(s)= 52.78 | diff Utot= 0.000000000020 | diff Force= 0.000000000004 |
2 | large_example/B2C62_Band.dat | Elapsed time(s)= 403.51 | diff Utot= 0.000000000001 | diff Force= 0.000000063810 |
3 | large_example/CG15c-DC-LNO.dat | Elapsed time(s)= 103.31 | diff Utot= 0.000000000269 | diff Force= 0.000000000551 |
4 | large_example/DIA512-1.dat | Elapsed time(s)= 49.35 | diff Utot= 0.000000027379 | diff Force= 0.000000031436 |
5 | large_example/FeBCC.dat | Elapsed time(s)= 80.54 | diff Utot= 0.000000000016 | diff Force= 0.000000000001 |
6 | large_example/GEL.dat | Elapsed time(s)= 44.95 | diff Utot= 0.000000000009 | diff Force= 0.000000000004 |
7 | large_example/GFRAG.dat | Elapsed time(s)= 27.68 | diff Utot= 0.000000000001 | diff Force= 0.000000000001 |
8 | large_example/GGFF.dat | Elapsed time(s)= 643.36 | diff Utot= 0.000000000037 | diff Force= 0.000000000809 |
9 | large_example/MCCN.dat | Elapsed time(s)= 82.04 | diff Utot= 0.000000005885 | diff Force= 0.000000003486 |
10 | large_example/Mn12_148_F.dat | Elapsed time(s)= 74.25 | diff Utot= 0.000000000015 | diff Force= 0.000000000010 |
11 | large_example/N1C999.dat | Elapsed time(s)= 1212.42 | diff Utot= 0.000000000035 | diff Force= 0.000000000390 |
12 | large_example/Ni63-O64.dat | Elapsed time(s)= 70.90 | diff Utot= 0.000000000211 | diff Force= 0.000000000008 |
13 | large_example/Pt63.dat | Elapsed time(s)= 58.76 | diff Utot= 0.000000001297 | diff Force= 0.000000000242 |
14 | large_example/SialicAcid.dat | Elapsed time(s)= 16.75 | diff Utot= 0.000000000001 | diff Force= 0.000000000001 |
15 | large_example/ZrB2_2x2.dat | Elapsed time(s)= 133.10 | diff Utot= 0.000000000044 | diff Force= 0.000000000020 |
16 | large_example/nsV4Bz5.dat | Elapsed time(s)= 99.37 | diff Utot= 0.000000004771 | diff Force= 0.000000003167 |
The comparison was made using 112 MPI processes on the same Xeon cluster machine. Since the automatic running test requires a large amount of memory, you may encounter a segmentation fault if only a small number of cores is used. Also, the elapsed times in the above example add up to about 53 minutes in total, even with 112 cores. See also the Section 'Large-scale calculation' for another large-scale benchmark calculation.
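The total elapsed time and the largest difference can also be extracted from 'runtestL.result' with a short script. The following Python code is only a sketch and is not part of OpenMX: the function names 'parse_result' and 'summarize' are hypothetical, the line pattern is inferred from the example output shown above, and the tolerance of 1e-5 is an assumed reading of the "last seven digits" criterion for values printed with twelve decimal places.

```python
import re

# Pattern for one line of 'runtestL.result', inferred from the example
# output in this section (an assumption, not an official format definition).
LINE_RE = re.compile(
    r"^\s*(\d+)\s*\|\s*(\S+)\s*\|\s*Elapsed time\(s\)=\s*([\d.]+)\s*\|"
    r"\s*diff Utot=\s*([\d.eE+-]+)\s*\|\s*diff Force=\s*([\d.eE+-]+)\s*\|?\s*$"
)


def parse_result(text):
    """Return a list of (input_file, elapsed_s, diff_utot, diff_force)."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:  # lines that do not match the pattern are skipped
            rows.append((m.group(2), float(m.group(3)),
                         float(m.group(4)), float(m.group(5))))
    return rows


def summarize(rows, tol=1e-5):
    """Return (total elapsed seconds, largest difference, flagged files).

    'tol' is an assumed threshold; adjust it to your own criterion.
    """
    total = sum(r[1] for r in rows)
    worst = max(max(r[2], r[3]) for r in rows)
    flagged = [r[0] for r in rows if max(r[2], r[3]) > tol]
    return total, worst, flagged


# Two lines copied from the example above; in practice, read the whole
# file, e.g. text = open("work/runtestL.result").read()
sample = """\
1 | large_example/5_5_13COb2.dat | Elapsed time(s)= 52.78 | diff Utot= 0.000000000020 | diff Force= 0.000000000004 |
2 | large_example/B2C62_Band.dat | Elapsed time(s)= 403.51 | diff Utot= 0.000000000001 | diff Force= 0.000000063810 |
"""

total, worst, flagged = summarize(parse_result(sample))
print(f"total elapsed time: {total / 60.0:.1f} min")
print(f"largest difference: {worst:.12f}")
print("files above tolerance:", flagged)
```

Applied to all 16 lines of the example, the same sum gives the total of about 53 minutes quoted above. An empty list of flagged files means that every difference is below the assumed tolerance.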