Crys-MnO example from work
- Date: 2020/07/08 20:02
- Name: Sergey
<slisenk@gmail.com>
- Hello,
I have a question regarding the Crys-MnO.dat example from the "work" directory, which is part of the "runtest" calculation.
When running OpenMX-3.9.2 with the "-runtest" option, I noticed a small difference between my output and the reference. Where the other "runtest.result" outputs show about 2 nonzero trailing digits in the differences, I believe mine shows 5 or 6. I understand this is still small.
What surprises me is that the reference file shows convergence achieved in 45 SCF steps, while in my case 80 steps were not enough. My computational environment is a Cray XC40. I also could not get timings close to the reference.
For example, "runtest.result_xc40" shows:
OpenMX Ver.3.9 icc version 17.0.7, compiler option -Dxt3 -O3 -axCOMMON-AVX512,CORE-AVX512,CORE-AVX2,CORE-AVX-I,AVX,SSE4.2,SSE4.1,SSE3,SSSE3,SSE2 -qopenmp
Cray-XC40 (Intel Xeon E5-2695v4 2.1GHz) 18 processes (MPI) x 2 thread (OpenMP)
 1 input_example/Benzene.dat Elapsed time(s)= 4.23 diff Utot= 0.000000000040 diff Force= 0.000000000002
 2 input_example/C60.dat Elapsed time(s)= 12.40 diff Utot= 0.000000000001 diff Force= 0.000000000001
 3 input_example/CO.dat Elapsed time(s)= 9.09 diff Utot= 0.000000000150 diff Force= 0.000000009551
 4 input_example/Cr2.dat Elapsed time(s)= 8.56 diff Utot= 0.000000000462 diff Force= 0.000000000004
 5 input_example/Crys-MnO.dat Elapsed time(s)= 20.81 diff Utot= 0.000000000001 diff Force= 0.000000000014
 6 input_example/GaAs.dat Elapsed time(s)= 31.99 diff Utot= 0.000000000001 diff Force= 0.000000000001
 7 input_example/Glycine.dat Elapsed time(s)= 4.71 diff Utot= 0.000000000001 diff Force= 0.000000000002
 8 input_example/Graphite4.dat Elapsed time(s)= 4.89 diff Utot= 0.000000000032 diff Force= 0.000000000004
 9 input_example/H2O-EF.dat Elapsed time(s)= 4.03 diff Utot= 0.000000000001 diff Force= 0.000000000002
10 input_example/H2O.dat Elapsed time(s)= 3.83 diff Utot= 0.000000000001 diff Force= 0.000000001042
11 input_example/HMn.dat Elapsed time(s)= 12.73 diff Utot= 0.000000000064 diff Force= 0.000000000029
12 input_example/Methane.dat Elapsed time(s)= 3.24 diff Utot= 0.000000000004 diff Force= 0.000000000001
13 input_example/Mol_MnO.dat Elapsed time(s)= 8.32 diff Utot= 0.000000000576 diff Force= 0.000000000032
14 input_example/Ndia2.dat Elapsed time(s)= 6.12 diff Utot= 0.000000000000 diff Force= 0.000000000001
Total elapsed time (s) 134.96
In my case (32 cores per node), with 16 MPI processes x 2 OpenMP threads:
 1 input_example/Benzene.dat Elapsed time(s)= 14.09 diff Utot= 0.000000000045 diff Force= 0.000000000007
 2 input_example/C60.dat Elapsed time(s)= 30.48 diff Utot= 0.000000000016 diff Force= 0.000000000006
 3 input_example/CO.dat Elapsed time(s)= 59.03 diff Utot= 0.000000000132 diff Force= 0.000000000827
 4 input_example/Cr2.dat Elapsed time(s)= 38.19 diff Utot= 0.000000000410 diff Force= 0.000000000051
 5 input_example/Crys-MnO.dat Elapsed time(s)= 131.87 diff Utot= 0.000000017210 diff Force= 0.000000088999
 6 input_example/GaAs.dat Elapsed time(s)= 116.48 diff Utot= 0.000000002764 diff Force= 0.000000000016
 7 input_example/Glycine.dat Elapsed time(s)= 12.06 diff Utot= 0.000000000001 diff Force= 0.000000000001
 8 input_example/Graphite4.dat Elapsed time(s)= 19.55 diff Utot= 0.000000000018 diff Force= 0.000000000061
 9 input_example/H2O-EF.dat Elapsed time(s)= 12.08 diff Utot= 0.000000000001 diff Force= 0.000000000002
10 input_example/H2O.dat Elapsed time(s)= 10.45 diff Utot= 0.000000000000 diff Force= 0.000000000020
11 input_example/HMn.dat Elapsed time(s)= 25.32 diff Utot= 0.000000000190 diff Force= 0.000000000000
12 input_example/Methane.dat Elapsed time(s)= 9.42 diff Utot= 0.000000000001 diff Force= 0.000000000001
13 input_example/Mol_MnO.dat Elapsed time(s)= 20.89 diff Utot= 0.000000000389 diff Force= 0.000000000237
14 input_example/Ndia2.dat Elapsed time(s)= 14.11 diff Utot= 0.000000000088 diff Force= 0.000000000068
Total elapsed time (s) 514.02
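To compare the two listings entry by entry, here is a minimal Python sketch. It is not part of OpenMX, and the names `parse_runtest` and `slowdown` are made up for illustration. It assumes every entry follows the "N file Elapsed time(s)= ... diff Utot= ... diff Force= ..." pattern seen in the listings, and it uses only two entries from each listing as sample data.

```python
import re

# One entry of a runtest.result listing, as it appears in the outputs above.
ENTRY_RE = re.compile(
    r"(\d+)\s+(\S+\.dat)\s+Elapsed time\(s\)=\s*([\d.]+)\s+"
    r"diff Utot=\s*([\d.]+)\s+diff Force=\s*([\d.]+)"
)


def parse_runtest(text):
    """Map each input file to (elapsed seconds, diff Utot, diff Force)."""
    return {
        m.group(2): (float(m.group(3)), float(m.group(4)), float(m.group(5)))
        for m in ENTRY_RE.finditer(text)
    }


def slowdown(reference, mine):
    """Ratio of my elapsed time to the reference time, per input file."""
    return {
        name: mine[name][0] / reference[name][0]
        for name in reference
        if name in mine
    }


# Sample data copied from the two listings (entries 5 and 6 only).
reference = parse_runtest("""
 5 input_example/Crys-MnO.dat Elapsed time(s)= 20.81 diff Utot= 0.000000000001 diff Force= 0.000000000014
 6 input_example/GaAs.dat Elapsed time(s)= 31.99 diff Utot= 0.000000000001 diff Force= 0.000000000001
""")
mine = parse_runtest("""
 5 input_example/Crys-MnO.dat Elapsed time(s)= 131.87 diff Utot= 0.000000017210 diff Force= 0.000000088999
 6 input_example/GaAs.dat Elapsed time(s)= 116.48 diff Utot= 0.000000002764 diff Force= 0.000000000016
""")

ratios = slowdown(reference, mine)
# Print the slowest cases first.
for name, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{name:30s} {ratio:5.2f}x slower")
```

With the full listings pasted in (or read from the two result files), the same ratios can be computed for all 14 inputs. For the totals quoted here, 514.02 s against 134.96 s is a factor of roughly 3.8.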
We have different Intel compilers (16, 17, 18) and MKL or LibSci, but the execution time is still quite different.
Any ideas what the cause could be?
Thanks, Sergey