Geometry Optimization: Ver. 1.0
1 Newton method
The total energy of a system can be expanded by the Taylor series with respect to atomic coordinates around with as follows:
(1) |
where the derivatives mean the partial derivatives at , and is the number of atoms. By differentiating Eq. (1) with respect to , to the second order we have
(2) |
In case the coordinates give a local minimum, assuming , we have the following matrix equation:
(3) |
The short notation is
(4) |
where the matrix consisting of the second derivatives in the left-hand side is called Hessian . Using Eq. (4), can be updated by
(5) |
This is the well known Newton method.
2 RMM-DIIS
In the OpenMX, and in Eq. (5) are replaced by and given by the residual minimization method in the direct inversion of iterative subspace (RMM-DIIS) [1, 2] as follows:
(6) |
where is a tuning parameter for acceleration of the convergence, which can be small (large) for a large (small) . in the RMM-DIIS can be found by a linear combination of previous upto p-th gradients as
(7) |
where is found by minimizing with a constraint . According to Lagrange’s multiplier method, is defined by
(8) | |||||
Considering and , an optimum set of can be found by solving the following linear equation:
(9) |
An optimum choice of may be obtained by the set of coefficients as
(10) |
If the Hessian is approximated by the unity , Eq. (6) becomes
(11) |
This scheme in the Cartesian coordinate has been implemented as ’DIIS’ in OpenMX.
3 Broyden-Fletcher-Goldfarb-Shanno (BFGS) method
Define
(12) | |||||
(13) |
Then, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method [3] gives the following rank-2 update formula for :
(14) | |||||
where . An optimization scheme using Eq. (6) and the BFGS update formula for the inverse of an approximate Hessian matrix in the Cartesian coordinate has been implemented as ’BFGS’ in OpenMX.
4 Rational function (RF) method
The BFGS update by Eq. (14) without any care gives an ill-conditioned approximate inverse of Hessian having negative eigenvalues in many cases. This leads to the optimization to saddle points rather than the optimization to a minimum. The rational function (RF) method [4] can avoid the situation in principle. Instead of Eq. (1), we may consider the following expression:
(15) |
Then, the equation corresponding to Eq. (4) becomes
(16) |
Therefore, a large assures that is positive definite. If is given by
(17) |
With Eq. (17), Eq. (16) may be equivalent to
(18) |
where the size of the matrix in the left-hand side is , and called the augmented Hessian. The lowest eigenvalue of the eigenvalue problem defined by Eq. (18) may give an optimum choice for , and the corresponding eigenvector, the last component is scaled to 1, gives an optimization step . In Eq. (18), the approximate Hessian can be estimated by the following BFGS formula:
(19) |
where . An optimization scheme using Eq. (18) and the BFGS update formula Eq. (19) in the Cartesian coordinate has been implemented as ’RF’ in OpenMX.
5 Eigenvector following (EF) method
By diagonalizing the approximate Hessian given by Eq. (19), the ill-conditioned situation can be largely reduced [5]. The approximate Hessian is diagonalized as
(20) |
where is a diagonal matrix of which diagonal parts are eigenvalues of . If the eigenvalue of the approximate Hessian is smaller than a threshold (0.02 a.u. in OpenMX3.3), the eigenvalue is set to the threshold. The modification of eigenvalues gives a corrected matrix instead of . Then, we have the inverse of a corrected Hessian matrix being a positive definite as
(21) |
A optimization scheme using the inverse Eq. (21) in Eq. (6) in the Cartesian coordinate has been implemented as ’EF’ in OpenMX.
In addition, there are two important prescriptions for the stable optimization: (i) If is positive in the update of Hessian by Eq. (19), it is assured that the updated Hessian is positive definite. Therefore, if is negative, the update should not be performed. (ii) The maximum step should be always monitored, so that an erratic movement of atomic position can be avoided.
References
- [1] P. Csaszar and P. Pulay, J. Mol. Struc. 114, 31 (1984).
- [2] F. Eckert, P. Pulay, and H.-J. Werner, J. Comp. Chem. 18, 1473 (1997).
- [3] C. G. Broyden, J. Inst. Math. Appl. 6, 76 (1970); R. Fletcher, Comput. J. 13, 317 (1970); D. Goldrarb, Math. Comp. 24, 23 (1970); D. F. Shanno, Math. Comp. 24, 647 (1970).
- [4] A. Banerjee et al., J. Phys. Chem. 89, 52 (1986).
- [5] J. Baker, J. Comput. Chem. 7, 385 (1986).