Re: Another possible erratic problem in the Forcetest for F2_GGA.dat and GaAs_LDA.dat ( No.1 ) |
- Date: 2017/01/03 22:38
- Name: Artem Pulkin
- Just for my curiosity, do you use OpenMPI, the Intel one or something else?
|
Re: Another possible erratic problem in the Forcetest for F2_GGA.dat and GaAs_LDA.dat ( No.2 ) |
- Date: 2017/01/04 22:21
- Name: Kylin
- To Artem Pulkin
On my MBP, MPICH2 was used for the test with the GNU compiler, while on the cluster and the workstation different versions of Intel MPI with the Intel compiler were used. The problem is the same in every case: the results do not agree between runs.
|
Re: Another possible erratic problem in the Forcetest for F2_GGA.dat and GaAs_LDA.dat ( No.3 ) |
- Date: 2017/01/04 23:18
- Name: Artem Pulkin
- Can you try compiling and running with OpenMPI as well?
|
Re: Another possible erratic problem in the Forcetest for F2_GGA.dat and GaAs_LDA.dat ( No.4 ) |
- Date: 2017/01/09 18:34
- Name: Kylin
- On my MBP with OpenMPI + gcc-6 + OpenBLAS, the force test also failed.
Cheers Kylin
|
Re: Another possible erratic problem in the Forcetest for F2_GGA.dat and GaAs_LDA.dat ( No.5 ) |
- Date: 2017/03/02 09:23
- Name: T. Ozaki
- Hi,
I think that GaAs_LDA.dat has no problem; a difference of 1e-8 to 1e-9 can be considered small enough. In parallel calculations it is not guaranteed that the sequence of summation is always the same, even when using the same number of cores. This means that the rounding error varies from trial to trial. In addition, some arrays in OpenMX are allocated in single precision to reduce memory consumption, which makes the rounding error depend even more strongly on the trial.
Did you check the case of F2_GGA carefully? I guess the variation comes from differences in SCF convergence: it is already known that the history of the SCF convergence depends on the number of MPI processes and also varies between runs. If you obtained sufficient convergence for F2_GGA and still see a large difference between the analytic and numerical forces, it might be attributed to the way of compilation, or to a more serious program bug.
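The effect of single-precision accumulation mentioned above can be reproduced with a few lines of standalone Python (an illustration only, not OpenMX code; the `to_float32` helper is a hypothetical stand-in that simply rounds a double to the nearest IEEE-754 single):

```python
import struct

def to_float32(x):
    """Round a Python double to the nearest IEEE-754 single."""
    return struct.unpack('f', struct.pack('f', x))[0]

# Accumulate the harmonic series in double and in simulated single precision.
total64 = 0.0
total32 = 0.0
for i in range(1, 100001):
    term = 1.0 / i
    total64 += term
    total32 = to_float32(total32 + to_float32(term))

print(total64)                  # ~12.09 (H_100000) in double precision
print(abs(total64 - total32))   # single-precision drift, well above 1e-7
```

With single-precision storage the accumulated rounding error is orders of magnitude larger than in double precision, so any change in summation order is correspondingly more visible.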
Regards,
TO
|
Re: Another possible erratic problem in the Forcetest for F2_GGA.dat and GaAs_LDA.dat ( No.6 ) |
- Date: 2017/03/09 19:03
- Name: Kylin
- Thanks for your help TO.
For F2_GGA, I think the problem can be attributed to the old version of the Intel compiler on the workstation. It is really out of date, and I cannot update it because the machine belongs to someone else.
BTW, in my view, repeatability is really important. Ideally, on the same machine with the same code, you should always obtain the same result. At least in my experience with the LAMMPS code, if we set the same random seed, the output is always identical. But it seems that the round-off error cannot be fixed and varies from run to run? So we cannot obtain a constant result?
Cheers Kylin
|
Re: Another possible erratic problem in the Forcetest for F2_GGA.dat and GaAs_LDA.dat ( No.7 ) |
- Date: 2017/03/10 00:15
- Name: T. Ozaki
- Hi,
> BTW, in my view, repeatability is really important.
> Ideally, on the same machine with the same code, you should always obtain the same result.
> At least in my experience with the LAMMPS code, if we set the same random seed, the output is always identical.
> But it seems that the round-off error cannot be fixed and varies from run to run? So we cannot obtain a constant result?
Yes, I agree with you that repeatability is really important. However, it is known that the result can vary from run to run because floating-point addition is not associative and the summation order depends on the implementation of the MPI package. The result would only be identical run to run if we could control the sequence of summation in MPI, as discussed in https://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler
This is also discussed for another MD code, GROMACS, at http://www.gromacs.org/Documentation/Terminology/Reproducibility
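The non-associativity, and its dependence on how a parallel reduction groups the partial sums, can be demonstrated with a short standalone Python sketch (not OpenMX code; the `tree_reduce` helper is a hypothetical stand-in for the reduction tree of an MPI_Allreduce):

```python
import random

# Floating-point addition is not associative:
# (0.1 + 0.2) + 0.3 and 0.1 + (0.2 + 0.3) round differently.
assert (0.1 + 0.2) + 0.3 != 0.1 + (0.2 + 0.3)

def tree_reduce(chunks):
    """Sum each chunk, then combine the partial sums pairwise,
    mimicking the reduction tree of an MPI_Allreduce."""
    sums = [sum(c) for c in chunks]
    while len(sums) > 1:
        sums = [sums[i] + sums[i + 1] if i + 1 < len(sums) else sums[i]
                for i in range(0, len(sums), 2)]
    return sums[0]

def split(data, nranks):
    """Distribute the data evenly over a given number of 'ranks'."""
    k = len(data) // nranks
    return [data[i * k:(i + 1) * k] for i in range(nranks)]

random.seed(42)
data = [random.uniform(-1.0, 1.0) for _ in range(1 << 16)]

t4 = tree_reduce(split(data, 4))   # same data, "4 ranks"
t8 = tree_reduce(split(data, 8))   # same data, "8 ranks"

# Mathematically identical totals; in floating point the two
# reduction orders typically disagree in the last digits.
print(t4, t8, t4 - t8)
```

The two totals agree to roughly machine precision but are generally not bit-identical, which is the same mechanism behind last-digit variation between parallel runs.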
I ran 'runtest' twice on the same machine with 8 MPI processes and 2 OpenMP threads, and obtained the following runtest.result files:
* the first run:
 1 input_example/Benzene.dat    Elapsed time(s)=  5.24  diff Utot= 0.000000000003  diff Force= 0.000000000002
 2 input_example/C60.dat        Elapsed time(s)= 15.11  diff Utot= 0.000000000003  diff Force= 0.000000000004
 3 input_example/CO.dat         Elapsed time(s)=  8.13  diff Utot= 0.000000000000  diff Force= 0.000000000003
 4 input_example/Cr2.dat        Elapsed time(s)=  8.80  diff Utot= 0.000000002143  diff Force= 0.000000000029
 5 input_example/Crys-MnO.dat   Elapsed time(s)= 19.95  diff Utot= 0.000000000006  diff Force= 0.000000000000
 6 input_example/GaAs.dat       Elapsed time(s)= 24.58  diff Utot= 0.000000000010  diff Force= 0.000000000000
 7 input_example/Glycine.dat    Elapsed time(s)=  5.04  diff Utot= 0.000000000054  diff Force= 0.000000000003
 8 input_example/Graphite4.dat  Elapsed time(s)=  4.41  diff Utot= 0.000000000000  diff Force= 0.000000000001
 9 input_example/H2O-EF.dat     Elapsed time(s)=  4.00  diff Utot= 0.000000000000  diff Force= 0.000000000001
10 input_example/H2O.dat        Elapsed time(s)=  3.81  diff Utot= 0.000000000000  diff Force= 0.000000000001
11 input_example/HMn.dat        Elapsed time(s)= 13.57  diff Utot= 0.000000000000  diff Force= 0.000000000000
12 input_example/Methane.dat    Elapsed time(s)=  3.37  diff Utot= 0.000000000000  diff Force= 0.000000000000
13 input_example/Mol_MnO.dat    Elapsed time(s)=  8.85  diff Utot= 0.000000000539  diff Force= 0.000000000201
14 input_example/Ndia2.dat      Elapsed time(s)=  5.29  diff Utot= 0.000000000000  diff Force= 0.000000000001
* the second run:
 1 input_example/Benzene.dat    Elapsed time(s)=  5.05  diff Utot= 0.000000000003  diff Force= 0.000000000000
 2 input_example/C60.dat        Elapsed time(s)= 15.04  diff Utot= 0.000000000003  diff Force= 0.000000000003
 3 input_example/CO.dat         Elapsed time(s)=  8.09  diff Utot= 0.000000000000  diff Force= 0.000000000003
 4 input_example/Cr2.dat        Elapsed time(s)=  9.16  diff Utot= 0.000000002143  diff Force= 0.000000000029
 5 input_example/Crys-MnO.dat   Elapsed time(s)= 19.96  diff Utot= 0.000000000005  diff Force= 0.000000000001
 6 input_example/GaAs.dat       Elapsed time(s)= 24.56  diff Utot= 0.000000000010  diff Force= 0.000000000000
 7 input_example/Glycine.dat    Elapsed time(s)=  4.92  diff Utot= 0.000000000054  diff Force= 0.000000000003
 8 input_example/Graphite4.dat  Elapsed time(s)=  4.57  diff Utot= 0.000000000000  diff Force= 0.000000000001
 9 input_example/H2O-EF.dat     Elapsed time(s)=  3.98  diff Utot= 0.000000000000  diff Force= 0.000000000001
10 input_example/H2O.dat        Elapsed time(s)=  3.78  diff Utot= 0.000000000000  diff Force= 0.000000000001
11 input_example/HMn.dat        Elapsed time(s)= 14.67  diff Utot= 0.000000000000  diff Force= 0.000000000000
12 input_example/Methane.dat    Elapsed time(s)=  3.35  diff Utot= 0.000000000000  diff Force= 0.000000000000
13 input_example/Mol_MnO.dat    Elapsed time(s)=  9.62  diff Utot= 0.000000000539  diff Force= 0.000000000201
14 input_example/Ndia2.dat      Elapsed time(s)=  5.27  diff Utot= 0.000000000000  diff Force= 0.000000000001
A close look shows that the last digit differs in some cases. A difference of this order is to be expected for systems of this size; if we treat large-scale systems and perform more molecular dynamics steps, the run-to-run difference becomes magnified.
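The magnification over many steps can be illustrated with a toy iteration (a Python sketch; the chaotic logistic map is used here only as a minimal stand-in for the sensitive dynamics of a long MD run, not as anything OpenMX actually computes):

```python
# Two trajectories differing only in the last double-precision digit.
x, y = 0.4, 0.4 + 1e-15
max_gap = 0.0
for step in range(100):
    # Chaotic logistic map: amplifies small differences exponentially.
    x = 3.9 * x * (1.0 - x)
    y = 3.9 * y * (1.0 - y)
    max_gap = max(max_gap, abs(x - y))

print(max_gap)  # the 1e-15 seed difference has grown by many orders of magnitude
```

This is why a last-digit rounding difference between two parallel runs, harmless in a single-point calculation, produces visibly different trajectories after enough MD steps.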
Of course, the statement above does not exclude the possibility that OpenMX has program bugs; we have been continually trying to enhance the reliability of OpenMX.
Regards,
TO
|
Re: Another possible erratic problem in the Forcetest for F2_GGA.dat and GaAs_LDA.dat ( No.8 ) |
- Date: 2017/03/10 16:11
- Name: T. Ozaki
- Hi,
I also performed the forcetest for F2_GGA.dat and GaAs_LDA.dat with 8 MPI processes repeatedly in my computational environment, and obtained exactly the same result within double precision.
Based on my trial, I would suspect the math library you used (I guess that you used MKL). Could you try ACML instead of MKL, and report what happens?
Regards,
TO
|