|  Re: Geometry optimization and restart files in parallel (on many nodes) ( No.1 ) | 
|  Date: 2011/06/16 21:05 Name: T.Ozaki
 
Hi, 
 The directory where the restart files are stored has to be shared via NFS. It
 can be specified with the keyword:
 
 System.CurrrentDirectory
 
 whose default is ./
 
 I guess that the directory you used is a local work directory that is
 accessible only from the node it resides on.
 
 Regards,
 
 TO
 | 
|  Re: Geometry optimization and restart files in parallel (on many nodes) ( No.2 ) | 
|  Date: 2011/06/16 23:10 Name: Mauro Sgroi  <maurosgroi@yahoo.it>
 
Dear Prof. Ozaki, thanks a lot for the reply.
 I will configure my input to save the files on a shared NFS directory. My only concern is the speed of this type of filesystem and its effect on the performance of the calculation.
 Best regards,
 Mauro.
 | 
|  Re: Geometry optimization and restart files in parallel (on many nodes) ( No.3 ) | 
|  Date: 2011/06/17 10:41 Name: T.Ozaki
 
Hi, 
 The loss of performance depends on the performance of the file system you use.
 In our experience, no significant loss of performance has been observed
 in parallel calculations using fewer than 100 cores, although this part could become
 a bottleneck when using many cores, say 1000 cores, as you mentioned.
 
 Regards,
 
 TO
 | 
|  Re: Geometry optimization and restart files in parallel (on many nodes) ( No.4 ) | 
|  Date: 2011/06/17 16:43 Name: Mauro Sgroi  <maurosgroi@yahoo.it>
 
Dear T.Ozaki, thanks a lot for the information. I will install the code on our computing cluster following your suggestion.
 Best regards,
 Mauro Sgroi.
 
 | 
|  Re: Geometry optimization and restart files in parallel (on many nodes) ( No.5 ) | 
|  Date: 2011/06/17 18:59 Name: Mauro Sgroi  <maurosgroi@yahoo.it>
 
Dear T.Ozaki, I'm running a geometry optimization (Valorphin_DC.dat) using an NFS directory shared by all processors, and I get the same error message:
 
 ******************* MD= 2 SCF= 1 *******************
 Failed (2) in reading the restart file /tmp/openmx/mount/val_dc_rst/val_dc.rst24
 Failed (2) in reading the restart file /tmp/openmx/mount/val_dc_rst/val_dc.rst29
 Failed (2) in reading the restart file /tmp/openmx/mount/val_dc_rst/val_dc.rst30
 
 Is this normal behavior for a geometry optimization?
 
 Best regards,
 Mauro.
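
 When many processes report failures like the ones above, it can help to pull the affected file names out of the log. The following is only a minimal sketch in Python, not part of OpenMX: the function names are hypothetical, and the pattern is simply copied from the message text quoted in this post. Whether the number after ".rst" corresponds to an MPI process index is an assumption, not something stated in this thread.

```python
import re

# Pattern copied from the message text quoted in the post above.
FAILED_RE = re.compile(r"Failed \((\d+)\) in reading the restart file (\S+)")


def failed_restart_files(log_text):
    """Return (code, path) pairs for every failed restart-file read in the log."""
    return [(int(code), path) for code, path in FAILED_RE.findall(log_text)]


def trailing_index(path):
    """Return the integer suffix after '.rst', or None if there is none.

    Assumption: the suffix may identify the process that owns the file.
    """
    match = re.search(r"\.rst(\d+)$", path)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    sample = (
        " ******************* MD= 2 SCF= 1 *******************\n"
        " Failed (2) in reading the restart file /tmp/openmx/mount/val_dc_rst/val_dc.rst24\n"
        " Failed (2) in reading the restart file /tmp/openmx/mount/val_dc_rst/val_dc.rst29\n"
        " Failed (2) in reading the restart file /tmp/openmx/mount/val_dc_rst/val_dc.rst30\n"
    )
    for code, path in failed_restart_files(sample):
        print(code, trailing_index(path), path)
```

 If only a few indices show up, as here, the problem may be limited to the node(s) hosting those processes rather than to the whole file system.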
 
 | 
|  Re: Geometry optimization and restart files in parallel (on many nodes) ( No.6 ) | 
|  Date: 2011/06/20 18:05 Name: T.Ozaki
 
Hi, 
 I could not reproduce the problem using your input file, slightly modified
 so that the geometry optimization can be performed.
 
 If you are really sure that the directory is shared by NFS,
 I guess that the synchronization time in NFS is too long.
 
 Regards,
 
 TO
 | 
|  Re: Geometry optimization and restart files in parallel (on many nodes) ( No.7 ) | 
|  Date: 2011/06/21 19:25 Name: Mauro Sgroi  <maurosgroi@yahoo.it>
 
Hi, you are right. It was a problem with write permissions on the NFS from one of the nodes. Now the code works smoothly.
 I'm checking the performance because I suspect that our network is too slow to get good scaling. Maybe we need to install a high-performance NFS.
 
 Best regards,
 Mauro Sgroi.
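
 For anyone hitting the same problem, a quick way to catch a permission issue like this is to test, on each node, whether the restart directory can be written and read back. The script below is only a sketch under stated assumptions: it is not part of OpenMX, the function name check_writable is hypothetical, and it only tests the node it runs on, so it has to be started on every node (for example through ssh or your job launcher).

```python
import os
import socket
import sys
import uuid


def check_writable(directory):
    """Try to create, read back, and delete a small file in `directory`.

    Returns (True, "") on success or (False, reason) on failure.
    """
    # Unique name per host and call, so concurrent checks do not collide.
    name = os.path.join(
        directory,
        "write_test_%s_%s.tmp" % (socket.gethostname(), uuid.uuid4().hex),
    )
    try:
        with open(name, "w") as handle:
            handle.write("ok")
        with open(name) as handle:
            content = handle.read()
        os.remove(name)
    except OSError as err:
        return False, str(err)
    if content != "ok":
        return False, "read back unexpected content"
    return True, ""


if __name__ == "__main__":
    # Directory to test; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    ok, reason = check_writable(target)
    print(socket.gethostname(), target, "OK" if ok else "FAILED: " + reason)
    sys.exit(0 if ok else 1)
```

 A node that prints FAILED with a "Permission denied" reason would point to the kind of problem described above. Note that a successful write on every node does not by itself prove that all nodes see the same shared directory.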
 
 |