In the cluster calculation, two loops are parallelized: the loop over spin multiplicity and the loop over eigenstates. The spin multiplicity is one for spin-unpolarized and non-collinear calculations, and two for spin-polarized calculations. Parallelization is applied first to the spin multiplicity and then to the eigenstates. OpenMX Ver. 3.8 employs ELPA, a highly parallelized eigenvalue solver, to solve the eigenvalue problem in the cluster calculation. Figure 21 (b) shows the speed-up ratio in elapsed time as a function of the number of processes for a spin-polarized calculation of a single molecular magnet consisting of 148 atoms. The input file 'Mn12.dat' is found in the directory 'work'. The speed-up ratio is 11 and 17 using 32 and 64 processes, respectively.
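
The speed-up ratios quoted above can be converted into a parallel efficiency, defined as the speed-up divided by the number of processes. The short Python sketch below does this arithmetic. It assumes the ratios are measured relative to a single-process run, which the text does not state explicitly. The function name `parallel_efficiency` and the dictionary `reported` are illustrative and are not part of OpenMX.

```python
def parallel_efficiency(speedup, n_procs):
    """Return speed-up divided by process count (1.0 means ideal scaling)."""
    return speedup / n_procs


# Speed-up ratios quoted in the text for the Mn12 cluster calculation,
# keyed by the number of processes. The single-process baseline is an assumption.
reported = {32: 11.0, 64: 17.0}

for n_procs, speedup in reported.items():
    eff = parallel_efficiency(speedup, n_procs)
    print(f"{n_procs} processes: speed-up {speedup:.0f}, efficiency {eff:.2f}")
```

Under that assumption the efficiency is about 0.34 with 32 processes and about 0.27 with 64 processes, so doubling the process count from 32 to 64 raises the speed-up by a factor of roughly 1.5 rather than 2.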