Geometry optimization and restart files in parallel (on many nodes)

Date: 2011/06/14 23:03
Name: Mauro Sgroi <maurosgroi@yahoo.it>

Dear all,
I'm looking for advice on setting up OpenMX on a compute cluster.
I managed to compile and install the code, but I still have the following problem.
When I launch OpenMX on more than one node, the code writes different _rst files on each node. I can use those files and the .dat# file to restart a simple calculation.

When I launch a geometry optimization, the code cannot automatically read the _rst files at each MD step and fails with the following error:

Failed (2) in reading the restart file ./c60_rst/c60.rst33

If I launch the same calculation on one node, OpenMX reads the restart files from the input_rst directory without problems.

Can you help me understand what I'm doing wrong?

The behaviour is the same with pure MPI and with hybrid MPI/OpenMP parallelization.

Thanks a lot and best regards,
Mauro Sgroi.

Re: Geometry optimization and restart files in parallel (on many nodes) ( No.1 )

Date: 2011/06/16 21:05
Name: T.Ozaki

Hi,

The directory where the restart files are stored has to be shared by NFS.
It can be specified with the keyword:

System.CurrrentDirectory

whose default is ./

I guess that the directory you used is a local working directory, so the files
in it can be accessed only from the node that wrote them.
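
One way to check whether a directory is really shared is to let every node write a small marker file into it and then look at which markers each node can see. The following is only a minimal sketch in Python, not part of OpenMX: the function names and the marker file naming are hypothetical, and it assumes the script is run once on each node against the same directory path.

```python
import os
import socket

# Hypothetical marker naming, chosen for this sketch only (not an OpenMX convention).
PREFIX = "shared_check_"
SUFFIX = ".marker"


def write_marker(directory, host=None):
    """Write a marker file named after this host.

    Raises OSError if the directory does not exist or is not writable
    from the node where this runs.
    """
    host = host or socket.gethostname()
    path = os.path.join(directory, PREFIX + host + SUFFIX)
    with open(path, "w") as f:
        f.write(host + "\n")
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out to the file system
    return path


def visible_hosts(directory):
    """Return the sorted host names whose marker files are visible from here."""
    return sorted(
        name[len(PREFIX):-len(SUFFIX)]
        for name in os.listdir(directory)
        if name.startswith(PREFIX) and name.endswith(SUFFIX)
    )
```

After calling write_marker on every node, visible_hosts should list all node names on every node if the directory is shared. If each node sees only its own name, the directory is most likely local to each node.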

Regards,

TO

Re: Geometry optimization and restart files in parallel (on many nodes) ( No.2 )

Date: 2011/06/16 23:10
Name: Mauro Sgroi <maurosgroi@yahoo.it>

Dear Prof. Ozaki,
thanks a lot for the reply.
I will configure my input to save the files on a shared NFS directory. My only concern is the speed of this type of filesystem and its effect on the performance of the calculation.
Best regards,
Mauro.

Re: Geometry optimization and restart files in parallel (on many nodes) ( No.3 )

Date: 2011/06/17 10:41
Name: T.Ozaki

Hi,

The reduction in performance depends on the performance of the file system you use.
In our experience, no significant reduction in performance has been observed
in parallel calculations using fewer than 100 cores, although this part could become
a bottleneck when using many cores, say 1000 cores, as you mentioned.

Regards,

TO

Re: Geometry optimization and restart files in parallel (on many nodes) ( No.4 )

Date: 2011/06/17 16:43
Name: Mauro Sgroi <maurosgroi@yahoo.it>

Dear T.Ozaki,
thanks a lot for the information. I will install the code on our calculation cluster following your suggestion.
Best regards,
Mauro Sgroi.

Re: Geometry optimization and restart files in parallel (on many nodes) ( No.5 )

Date: 2011/06/17 18:59
Name: Mauro Sgroi <maurosgroi@yahoo.it>

Dear T.Ozaki,
I'm running a geometry optimization (Valorphin_DC.dat) using an NFS directory shared by all processors, and I get the same error message:

******************* MD= 2 SCF= 1 *******************
Failed (2) in reading the restart file /tmp/openmx/mount/val_dc_rst/val_dc.rst24
Failed (2) in reading the restart file /tmp/openmx/mount/val_dc_rst/val_dc.rst29
Failed (2) in reading the restart file /tmp/openmx/mount/val_dc_rst/val_dc.rst30

Is this normal behavior for a geometry optimization?

Best regards,
Mauro.

Re: Geometry optimization and restart files in parallel (on many nodes) ( No.6 )

Date: 2011/06/20 18:05
Name: T.Ozaki

Hi,

I do not see this problem when using the input file, slightly modified
so that the geometry optimization can be performed.

If you are really sure that the directory is shared by NFS,
I guess that the synchronization time in NFS is too long.
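
A rough way to estimate such a delay is to create a file on one node and measure on another node how long it takes to become visible. The sketch below is only an illustration in Python under that assumption. The function name wait_for_file and its parameters are hypothetical, and client-side caching can affect what is measured.

```python
import os
import time


def wait_for_file(path, timeout=30.0, interval=0.1):
    """Poll until `path` exists.

    Returns the waiting time in seconds, or None if the file did not
    appear within `timeout` seconds.
    """
    start = time.monotonic()
    while True:
        if os.path.exists(path):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)
```

Start wait_for_file on one node just before another node writes the file. A result of several seconds, or None, would suggest that the file system propagates new files too slowly.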

Regards,

TO

Re: Geometry optimization and restart files in parallel (on many nodes) ( No.7 )

Date: 2011/06/21 19:25
Name: Mauro Sgroi <maurosgroi@yahoo.it>

Hi,
you are right. It was a problem with the write permissions on the NFS directory from one of the nodes. Now the code works smoothly.
I'm checking the performance because I suspect that our network is too slow to get good scaling. Maybe we need to install a high-performance NFS.
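
For a first impression of the file system speed, one can time how long it takes to write a file of known size into the shared directory. This is only a minimal sketch in Python: the function name time_write, the default sizes and the probe file prefix are hypothetical, and a single timing like this is not a proper benchmark.

```python
import os
import tempfile
import time


def time_write(directory, size_mb=16, chunk_mb=1):
    """Write about `size_mb` MB into a temporary file in `directory`.

    Returns the elapsed time in seconds. The temporary file is removed
    afterwards. Raises OSError if the directory is not writable.
    """
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    fd, path = tempfile.mkstemp(dir=directory, prefix="io_probe_")
    try:
        start = time.monotonic()
        with os.fdopen(fd, "wb") as f:
            for _ in range(size_mb // chunk_mb):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # include the time needed to flush to the file system
        return time.monotonic() - start
    finally:
        os.remove(path)
```

Comparing the result for the shared directory with the result for a local directory such as /tmp on the same node gives a rough idea of the overhead of the network file system.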

Best regards,
Mauro Sgroi.
