ScaLAPACK version

The performance of OpenMX can be enhanced by using ScaLAPACK, PBLAS, and BLACS. The improvement is obtained in both computational speed and memory usage when 'Cluster' or 'Band' is specified for the keyword 'scf.EigenvalueSolver' in large-scale systems containing more than 500 atoms. To compile the ScaLAPACK version, you need to add the option '-Dscalapack' to CC. For example, you may specify it as follows:

  CC = mpicc -O3 -Dscalapack -xHOST -ip -no-prec-div -openmp -I/opt/intel/mkl/include/fftw
  FC = mpif90 -O3 -xHOST -ip -no-prec-div -openmp
  LIB= -L/opt/intel/mkl/lib -mkl=parallel -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 \
  -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -lpthread -lifcore -lmpi -lmpi_f90 -lmpi_f77

Note that, in addition to '-Dscalapack', the libraries related to ScaLAPACK have to be properly linked. The ScaLAPACK version is compatible with OpenMP, so hybrid ScaLAPACK/OpenMP calculations are possible. The performance comparison is discussed in the section 'Large-scale calculations'.
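
As a minimal sketch of a hybrid ScaLAPACK/OpenMP run, assume that the executable built with the settings above is named 'openmx' and that the input file is a hypothetical 'input.dat'. The eigenvalue solver is first selected in the input file, here 'Band' as one of the two solvers mentioned above:

  scf.EigenvalueSolver    Band

The job may then be started with both MPI processes and OpenMP threads, where the option '-nt' gives the number of threads per process. The process count (16), the thread count (2), and the file names are illustrative assumptions and should be adapted to your machine and system size:

  mpirun -np 16 openmx input.dat -nt 2 > input.std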