Taisuke Ozaki (RCIS-JAIST), Kengo Nishio (RICS-AIST), Hiori Kino (NIMS)
An efficient implementation of the nonequilibrium Green function (NEGF) method combined with the density functional theory (DFT) using localized pseudo-atomic orbitals (PAOs) is presented for electronic transport calculations of a system connected with two leads under a finite bias voltage. In the implementation, accurate and efficient methods are developed especially for evaluation of the density matrix and treatment of boundaries between the scattering region and the leads. Equilibrium and nonequilibrium contributions in the density matrix are evaluated with very high precision by a contour integration with a continued fraction representation of the Fermi-Dirac function and by a simple quadrature on the real axis with a small imaginary part, respectively. The Hartree potential is computed efficiently by a combination of the two dimensional fast Fourier transform (FFT) and a finite difference method, and the charge density near the boundaries is constructed with a careful treatment to avoid the spurious scattering at the boundaries. The efficiency of the implementation is demonstrated by rapid convergence properties of the density matrix. In addition, as an illustration, our method is applied for zigzag graphene nanoribbons, a Fe/MgO/Fe tunneling junction, and a LaMnOSrMnO superlattice, demonstrating its applicability to a wide variety of systems.
The non-equilibrium Green function (NEGF) method[1,2,3,4,5,6] potentially has several advantages to investigate electronic transport properties of nano-scale materials such as single molecules,[7,8] atomic wires,[9,10] carbon based materials,[11,12] and thin layers.[13,14] The potential advantages are summarized by the following features of the NEGF method: (i) the source and drain contacts are treated based on the same theoretical framework as for the scattering region. [3,4,6] (ii) the electronic structure of the scattering region under a finite source-drain bias voltage is self-consistently determined by combining with first principle electronic structure calculation methods such as the density functional theory (DFT) and the Hartree-Fock (HF) method. [17,18,19,20,21,22,23,24,25,26,27] (iii) many body effects in the transport properties, e.g., electron-phonon[28,29,30,31,32,33] and electron-electron interactions,[34,35,36,37] might be included through self-energies without largely deviating the theoretical framework. (iv) its applicability to large-scale systems can be anticipated, since the NEGF method relies practically on the locality of basis functions in real space, resulting in computations for sparse matrices.[15] Due to those potential advantages, recently several groups have implemented the NEGF method coupled with the DFT or HF method using atomic-type or the other local basis functions with successful applications for calculations of the electronic transport properties. [15,16,17,18,19,20,21,22,23,24,25,26,27]
However, a highly accurate and efficient implementation method must be still developed from the following two reasons: The first obvious reason is to extend the applicability of the NEGF method to large-scale systems. The efficient implementation might lead to more challenging applications of the NEGF method to very large-scale complicated systems. The majority part in the computational effort of the NEGF method mainly comes from the evaluation of the density matrix which is decomposed into the evaluation of Green functions and numerical integrations. Thus, the efficient calculation of the part is a key factor for extending the applicability to large-scale systems. Nevertheless, accurate and efficient methods for evaluating density matrix within the NEGF method have not been fully developed, although several methods have been already proposed. [17,18,38] To extend the applicability of the NEGF method to large-scale systems, a remarkably efficient method that we have recently developed[45] will be applied for the problem in this study, and we will show that the new method is much faster than the other method.[18] The second reason is that spurious scattering should be negligible when the NEGF method is extended to include the many body effects beyond the one particle picture.[28,29,30,31,32,33,34,35,36,37] The spurious scattering accompanied by the inaccurate implementation might make the many body effects indistinct in the electronic transport properties. One can imagine that the spurious scattering can be easily produced in the NEGF method, since unlike the conventional band structure calculations NEGF has to be evaluated by a patch work that the self consistent field (SCF) calculations of the source and drain leads are performed beforehand, and the calculated results are incorporated in the NEGF calculations through the self energy and the boundary conditions between the scattering region and leads. Therefore, a careful treatment to handle the boundary conditions should be developed to avoid the spurious scattering.
In this paper, to address the above two issues we present an accurate and efficient implementation of the NEGF method, in combination with DFT using pseudo-atomic orbitals (PAOs) and pseudopotentials, using a contour integration method which is based on a continued fraction representation of the Fermi-Dirac function. For the accurate treatment of the boundary conditions between the scattering region and leads, we also develop a method for calculating the Hartree potential by a combination of the two dimensional fast Fourier transform (FFT) and a finite difference method so that the boundary condition can be correctly reproduced. In addition, we discuss a careful treatment to construct the charge density near the boundaries. The efficiency and accuracy of our implementation are demonstrated by several numerical test calculations on convergence of the density matrix.
This paper is organized as follows: In Sec. II the details of our implementation for treating the equilibrium state of the scattering region are discussed by focusing on the evaluation of the equilibrium density matrix, the treatment for constructing the charge density near the boundaries, and an efficient method for calculating the Hartree potential. In Sec. III our implementation for the nonequilibrium state is described. In Sec. IV we demonstrate the accuracy and efficiency of the implementation by a series of numerical calculations and several applications. In Sec. V our implementation of the NEGF method is summarized.
Since most of practical aspects in the implementation of the NEGF method coupled with DFT using the localized PAOs[42,43,44] appear in the ground state calculation of the system at equilibrium, we start our discussion from the electronic structure calculation of the equilibrium ground state by using the Green function method with DFT.
Let us consider a system where one dimensional infinite cells
are arranged with two dimensional periodicity as shown in Fig. 1.
Throughout the paper, we assume that the electronic transport
along the a-axis is of interest, and that the two dimensional
periodicity spreads over the bc-plane.
The one dimensional infinite cell consists of the central region
denoted by and the cells denoted by and ,
where . All the cells and ,
arranged semi-infinitely, contain the same number of atoms
with the same structural configuration, respectively,
but the cells and can be different from each other.
In the equilibrium state with a common chemical potential
everywhere in the system, the electronic structure of the
system may be determined by DFT.[40,41,65]
Due to the periodicity of the bc-plane the one-particle Kohn-Sham (KS)
wave function in the system is expressed by the Bloch function
on the bc-plane using PAOs
located on
site as:
It is worth pointing out that there is an energy functional which can be variationally minimized with respect to charge density if Eq. (6) is self-consistently solved. The details of derivation for the functional is given in Appendix A.
In general, the surface Green function
is defined by
,
where and are the Hamiltonian and overlap matrices
for the lead regions, and the suffix, , is or .
It is noted that due to the two conditions (i) and (ii) mentioned above
the Hamiltonian and overlap matrices for the lead
regions can be written by a block tridiagonal form as follows:
One of practical difficulties in the implementation of
the Green function method is how the equilibrium density
matrix is evaluated efficiently and accurately.[17,18,38,47,48,49,50,51]
In our implementation, the equilibrium density matrix is
highly efficiently computed using the contour integration method
with a special treatment of the Fermi-Dirac function .[45]
If the Hamiltonian and overlap matrices associated with Eq. (6)
are -dependent, it turns out that the spectrum function
in the Lehmann representation of the central Green function
is complex number in general as discussed in Appendix B
of Ref. [45].
Then, the density matrix
,
where one of the associated basis orbitals is in the central cell
and the other is in the cell denoted by
,
is given by making use of
both the retarded and advanced Green functions
and
as
For the efficient integration in Eq. (23), in our implementation the Fermi-Dirac function is expressed by a continued fraction representation derived from a hypergeometric function[45] so that the structure of poles can be suitable for the integration associated with the Green function as follows:
If forces on atoms are calculated based on the conventional DFT scheme using the non-orthogonal basis orbitals,[18] the evaluation of the energy density matrix is needed. In Appendix B, we derive the calculation scheme of the equilibrium energy density matrix based on the contour integration method.
Even though the basis functions we used are strictly localized
in real space, there is the nonnegligible contribution for
the charge density near the boundary between the central and
lead regions from the basis functions located in the lead regions.
Note that any treatment for the contribution to the charge density
has not been clarified
in the other implementations.[17,18,19,20,21,22,23]
Thus, we carefully calculate the charge density in the central
region by considering three contributions:
We add a note that the same consideration has to be applied even for the
calculation of the density of states (DOS). In this case,
the contribution from off-diagonal block Green functions connecting
the central and lead regions should be added to the DOS of the central
region calculated by
.
The off-diagonal block Green functions can be calculated
from
and the surface Green functions,
and
, by making use of the identity
as follows:
The Hartree potential in the central region is calculated
under the boundary condition that the Hartree potential
at the boundary between the central and
regions is same as that of the lead,
where the Hartree potential in both the lead regions
is calculated using the conventional band structure calculation before
the calculation of the infinite in Fig. 1(b) using the
Green function. In our implementation,
the Hartree potential for the central
region with the boundary condition is efficiently
evaluated by a combination of the two dimensional FFT
and a finite difference method, while other schemes are used
in the other implementations.[17,18]
The majority part of the Hartree potential in our treatment
is accurately calculated by considering the neutral atom
potential which is the sum of the local potential of the
pseudopotential and the Hartree potential by the confined
charge for the neutralization.[52]
The neutral atom potential depends on only the atomic
structure and atomic species,
and has no relation with the boundary condition.
The effect of the relaxation of charge distribution on the
Hartree potential is taken into account by the remaining
minority part of the Hartree potential
given by
Our implementation of the NEGF method with DFT is based on the strictly localized PAOs[42,43,44] and a norm-conserving pseudopotential method.[59] Within the scheme, the calculation of the matrix elements such as the overlap and kinetic energy integrals, consisting of two center integrals, is performed using a Fourier transform method,[60] while the other matrix elements for and , which cannot be decomposed into two center integrals, are evaluated by the numerical integration on the regular mesh in real space.[63] The further details on how the elements of the Hamiltonian and overlap matrices are calculated can be found in Ref. [52].
In addition to the above evaluation of the Hamiltonian and overlap matrix elements, the Hamiltonian matrix elements associated with the basis orbitals situated at near the boundary are treated in a special way as explained below. If the tails of two basis orbitals located on atoms in the central region go beyond the boundary between the central and the lead regions, the associated Hamiltonian matrix element is replaced by the corresponding element in the lead region calculated by the conventional band structure calculation. The case can happen only if the two basis orbitals are located in the region () because of the condition (i) so that the replacement of the Hamiltonian matrix element can be always possible. The replacement is made by assuming that the potential profile near the boundary is similar to that near the boundary between the regions and , and can be justified if the size of the region is large enough.
Compared to conventional band structure calculations, it seems that
the NEGF method tends to suffer from difficulty in obtaining the SCF
convergence. Our observation in several cases suggests that the difficulty
may come from charge sloshing along the a-axis during the SCF iteration.
The difference Hartree potential
change largely
by imposition of the boundary condition even for a small variation
in the charge density distribution, resulting in a serious charge sloshing
along the a-axis. Thus, we consider suppression of the charge sloshing
along the a-axis by introducing the following weight factor :
A technical remark should also be added to avoid a local trap problem in the SCF calculation. In systems having a long a-axis, a unphysical charge distribution, corresponding to a large charge separation in real space, tends to be obtained even after achieving the self consistency. In this case the Hamiltonian of the central region at the first SCF iteration, which are calculated via superposition of atomic charge density, is far from the self-consistently converged one, while the Hamiltonian matrix used in the calculation for the self-energy is determined in a self-consistent manner beforehand. The inconsistency between the two matrices tends to produce a unphysical charge distribution at the first SCF iteration. Once the situation happens at the first SCF iteration, in many cases the electronic structure keeps trapped during the subsequent SCF iteration, which is a serious problem in practical applications. However, the local trap problem can be overcome by a simple scheme that the first few SCF iterations are performed by using the conventional band scheme and then onward the solver is switched from the band scheme to the NEGF method. In the band structure calculation for the first few iterations, such a unphysical charge distribution does not appear due to no self-energy involved. In most cases we find that the simple scheme works well to avoid the local trap problem in the SCF convergence.
Based on the NEGF theory mainly developed
by Schwinger[1] and Keldysh[2],
the density matrix in the nonequilibrium state of the central
region is evaluated by[17,18,21,22]
The integrand in Eq. (44) is not analytic apart from the real axis, since the integrand is a function of both and . Thus, one cannot apply the contour integration method that we use for the equilibrium density matrix. Instead, a simple rectangular quadrature scheme is applied to the integration of Eq. (44) on the real axis with a small imaginary part . Since the integrand contains the difference between two Fermi-Dirac functions, the energy range for the integration can be effectively reduced to a narrow range that the difference is larger than a threshold, where the threshold of is used in this study. With the threshold and the step width of 0.01 (eV), the number of meshes on the real axis is 152 for (eV) at . The convergence speed depends on the shape of the integrand and how large is employed for smearing the integrand, which will be discussed later.
The source-drain bias voltage applied to the left and right leads is easily incorporated
by adding a constant electric potential to the Hartree
potential in the right lead region. The effect of the bias voltage
appears at three places. The first effect is that the Hamiltonian matrix in the
right region given by Eq. (9) is replaced using
Eq. (10) as
In our implementation, the gate voltage is treated by adding
an electric potential defined by
The spin resolved transmission is evaluated by the Landauer formula
for the non-interacting central region connected with two leads:
All the calculations in this study were performed by the DFT code, OpenMX.[64] The PAOs centered on atomic sites are used as basis functions.[42,43,44] The PAO basis functions we used, generated by a confinement scheme,[42,43] are specified by H5.5-s2, C4.5-s2p2, O5.0-s2p2d1, Fe5.0-s2p2d1, Mg5.5-s2p2, La7.0-s3p2d1f1, Sr7.0-s3p2d1f1, and Mn6.0-s3p2d2, where the abbreviation of basis functions, such as C4.5-s2p2, represents that C stands for the atomic symbol, 4.5 the cutoff radius (bohr) in the generation by the confinement scheme, and s2p2 means the employment of two primitive orbital for each of s- and p-orbitals. Norm-conserving pseudopotentials are used in a separable form with multiple projectors to replace the deep core potential into a shallow potential.[59] Also, a local density approximation (LDA) to the exchange-correlation potential is employed.[61], while a generalized gradient approximation (GGA)[62] is used only for calculations of the LaMnOSrMnO superlattice. The real space grid techniques are used with the cutoff energies of 120-200 Ry in numerical integrations and the solution of Poisson equation using FFT.[63] In addition, the projector expansion method is employed in the calculation of three-center integrals for the deep neutral atom potentials.[52]
The accuracy and efficiency of the implementation are mainly determined by the evaluation of density matrix given by Eq. (42) which consists of two contributions: the equilibrium and non-equilibrium terms given by Eqs. (22) and (43), respectively. In this subsection, we discuss the convergence properties of the equilibrium and non-equilibrium terms in the density matrix as a function of numerical parameters.
Since the majority part of the density matrix given by Eq. (42) is the equilibrium contribution, let us first discuss convergency of the equilibrium density matrix as a function of poles. The absolute error in of a carbon linear chain is shown in Fig. 2(a) as a function of the number of poles in order to illustrate the convergence property for the equilibrium density matrix under zero bias voltage, where can be regarded as a conventional expression of the total energy in DFT, and the definitions of the two energy terms and with the Fermi-Dirac function are given in Appendix A. For comparison the result calculated by a closed contour method is also shown.[18,67] One can see that the accuracy of eV per atom is obtained using 140, 100, and 70 poles for the electronic temperature of 300, 600, and 1200 K, respectively, while about 400 energy points are needed to obtain the same accuracy using the closed contour method at 600 K.[18] For most cases, we find that the convergence rate is similar to the case shown in Fig. 2(a). In general, the number of poles to achieve the accuracy of eV per atom must be proportional to the inverse of , since the interval between neighboring poles of the continued fraction given by Eq. (25) is scaled by . This fact implies that the computational effort increases as the electronic temperature decreases. However, we generally use electronic temperature from 300 to 1000 K for practical calculations, which means that the use of 100 poles is enough for practical purposes. Therefore, it can be concluded that the most contribution of the density matrix can be very accurately evaluated with a small number of poles, i.e., 100.
To demonstrate the proper treatment of the boundary between the lead and the central regions in our implementation, in Fig. 2(b) we show a comparison between the conventional band structure and the EGF calculations with respect to DOS of the carbon linear chain. The comparison provides a severe test to check whether the EGF method is properly implemented or not. It can be confirmed that DOS calculated by the EGF method is nearly equivalent to that by the conventional band structure calculation, which clearly shows the proper treatment of the boundary between the lead and the central regions in our scheme.
As explained before, the integration of Eq. (44) required for the evaluation of the nonequilibrium term in the density matrix has to be performed on the real axis with a small imaginary part because contour integration schemes may not be applied due to the non-analytic nature of the integrand. The treatment might suffer from numerical instabilities in the SCF iteration, since the integrand can rapidly vary due to the existence of poles of Green function located on the real axis. A remedy to avoid the numerical problem is to smear the Green function by introducing a relatively large imaginary part.[17,18,21,22]
To investigate convergency of the nonequilibrium term in the density matrix, in Fig. 3(a) we show the absolute error in of the same infinite carbon chain as in Fig. 2 but under a finite bias voltage of 0.5 eV as a function of the number of regular grid points used for the evaluation of the nonequilibrium term in the density matrix given by Eqs. (43) and (44). We also tested the Gauss-Legendre quadrature for the integration of the nonequilibrium term besides the integration using the regular grid, but found that the convergence rate of the Gauss-Legendre quadrature is rather slower than the simple scheme possibly due to the spiky structure of the integrand. Thus, we have decided to use the simple scheme using the regular grid. As expected, it turns out that the number of grid points to achieve the accuracy of eV per atom increases as the imaginary part becomes smaller. However, the accuracy of eV is attainable using about 100 grid points in case of the imaginary part of 0.01 eV, while a few thousands grid points have to be used to achieve the same accuracy for the imaginary part of 0.0001 eV.
Although the accuracy of eV can be achieved by introducing the smearing scheme, however, one may consider that results can be affected by the introduction of an imaginary part. In order to find a compromise between the accuracy and efficiency, the Mulliken population of the carbon linear chain under the finite bias voltage of 0.5 eV is shown in Fig. 3(b). We see that the use of the imaginary part of 0.01 eV gives a result comparable to that obtained by the use of 0.0001 eV. Thus, the imaginary part of 0.01 eV can be a compromise between the accuracy and efficiency in this case. As a result, one can find that the energy points of 200 (100 and 100 for the equilibrium and nonequilibrium density matrices, respectively) are enough to achieve the accuracy of eV per atom in at 600 K. However, it should be mentioned that a proper choice of the imaginary part must depend on the electronic structure of systems, and the careful consideration must be taken into account especially for a case that the spiky DOS appears in between two chemical potentials of the leads, while for the equilibrium part of the density matrix the convergence property is insensitive to the electronic structure.
We also note that the accurate evaluation of the density matrix makes the SCF calculation stable even under a finite bias voltage. In fact, for the case with the bias voltage of 0.5 V the number of the SCF iterations to achieve the residual norm of for the charge density difference is 29 which is nearly equivalent to that, 30, for the zero bias case.
Since for large-scale systems it is very time-consuming to perform
the SCF calculation at each bias voltage, here we propose
an interpolation scheme to reduce the computational cost in the
calculations by the NEGF method.
The interpolation scheme is performed in the following way:
(i) the SCF calculations are performed
for a few bias voltages which are selected in the regime of the bias
voltage of interest.
(ii) when the transmission and current are calculated,
a linear interpolation is made for the Hamiltonian block elements,
and
, of
the central scattering region and the right lead, and the chemical
potential, , of the right lead by
As an illustration of our implementation, we investigate transport properties of zigzag graphene nanoribbons (ZGNRs) with different magnetic configurations. A characteristic feature in the band structure of ZGNR is the appearance of flat bands around X-point near the Fermi level, resulting in spin-polarization of associated states located at the zigzag edges.[68,69] Thus, so far several intriguing transport properties have been theoretically predicted especially for ZGNRs among GNRs by focusing on the spin polarized edge states.[70,11,12,71,72,73,74,75,76,77] For instance, it is found that ZGNRs might exhibit an extraordinary large magnetoresistance (MR) effect and a spin polarized current.[12]
Here we focus on the current-bias voltage (-) characteristic of - and -ZGNRs with two magnetic configurations: ferromagnetic (FM) and antiferromagnetic (AFM) junctions as shown in Figs. 5(a) and (b), respectively, where the number, or , is the number of carbon atom in the sublattice being across ZGNR along the lateral direction. The odd and even cases will also be referred to as asymmetric and symmetric, respectively. The extended central region consists of one sublattice and four unit cells, and contains 72 and 82 atoms for 7- and 8-ZGNRs, respectively. The poles of 100 is used for the evaluation of the equilibrium density matrix with the electronic temperature of 300 K, while the nonequilibrium term in the density matrix is evaluated using the simple quadrature method with the imaginary part of 0.01 eV and the grid spacing of 0.02 eV. The geometric structures used are optimized under the periodic boundary condition until the maximum force is less than hartree/bohr. At each bias voltage the electronic structure of ZGNR is self-consistently determined.
Figures 6(a) and (b) show the current-voltage (-) curves for the up- and down-spin states in the FM junctions of 7- and 8-ZGNRs. It is found that the current for 7-ZGNR linearly depends on the bias voltage, while the current for 8-ZGNR is saturated at the bias voltage of about . The distinct behavior of 8-ZGNR from 7-ZGNR can be more definitely seen in the AFM junction as shown in Figs. 6(c) and (d). Interestingly, 8-ZGNR with the AFM junction exhibits a diode behavior for the spin resolved current. Only the up-spin state contributes substantially to the current in the negative bias regime. In contrary in the positive bias regime only the down-spin state contributes to the current. It is worth mentioning that the effect can also be regarded as a spin filter effect. On the other hand, the - characteristics for 7-ZGNR is nearly equivalent to that for the FM junction. The considerable difference between 7- and 8-ZGNRs in the - characteristics can be attributed to the symmetry of two wave functions, and states, around the Fermi level. For 8-ZGNR the wave functions of the and states are antisymmetric and symmetric with respect to the mirror plane which is the mid plane between two edges, respectively, while those wave functions for 7-ZGNR are neither symmetric nor antisymmetric. From a detailed analysis[80], it can be concluded that the unique distinction in the - characteristics arises from an interplay between the symmetry of wave functions and band structures of ZGNRs. In addition, we find that spin moments are reduced by applying the finite bias voltage as shown in Fig. 7. Since the flat bands around X-point of the minority spin state are located about 0.25 eV above the Fermi level, the spin moments at the zigzag edges are largely reduced by increasing the occupation of the flat bands for the minority spin states when the bias voltage exceeds the threshold as shown in Fig. 7.[78,79] The details of the analysis on the unique spin diode and filter effect of ZGNRs is discussed elsewhere.[80]
The applicability of our implementation to bulk systems is demonstrated by an application to a tunneling junction consisting of MgO(100) layers sandwiched by iron. The magneto-tunneling junction was theoretically predicted to exhibit a large tunneling-magnetoresistance (TMR),[81] and subsequently the TMR effect has been experimentally confirmed.[82] We consider four MgO(100) layers sandwiched by iron with the (100) surface of which atomic configuration is shown in Fig. 8(a), where the lattice constant of the bc-plane used is 2.866 Å, and they are 2.866 and 4.054 Å in iron and MgO regions along the a-axis, and the distance between the MgO and iron layers is assumed to be 2.160 Å. The four MgO(100) layers corresponds to the region in Fig. 1(a), and four Fe layers of the left and right hand correspond the region and , respectively. The SCF calculations were performed at 1000 K under zero bias voltage using k-points of and 130 poles for the integration of the equilibrium density matrix. It is found that obtaining the SCF is much harder than the case of ZGNR discussed before, and that a careful and modest treatment for the charge mixing is required.
As shown in Fig. 8(b), the net charge of iron atoms in the interfacial layer is positive due to the coordination to an oxygen atom, and the reduction of electrons leads to the increase of the magnetic moment of iron atoms at the interfacial layer. The increase of the magnetic moment at the interfacial layer makes the distinction of the majority and minority spin states at the Fermi energy clear, i.e., the nature of the majority and minority spin states at the Fermi energy can be assigned to - and -states, respectively. The k-resolved transmissions at the chemical potential for the majority and minority spin states for the parallel magnetic configuration between the left and right leads are shown in Figs. 9(a) and (b), respectively. The large peak at the point in the majority spin state can be attributed to the -state, while sharp peaks around four pillars come from the -state as discussed in Ref.[81]. Note that the position of the sharp peaks is rotated by 45 degrees because of the unit cell rotated by 45 degrees compared to that in Ref.[81]. The conductances, and , for the majority and minority spin states, calculated from the average transmission integrated over the first Brillouin zone, are 11.99 and 2.82 , respectively, which implies that the tunneling junction may behave as a spin filter. The distinction in the conductance should be attributed to decay properties of states in the insulating MgO region coupled with the two states.[81] For the antiparallel magnetic configuration the k-resolved transmission at the chemical potential is understood as a multiplication of the transmissions for the majority and minority spin states in the parallel configuration as shown in Fig. 9(c). The conductance, , of the antiparallel magnetic configuration is 0.34 ( ) which is smaller than those of the parallel case. By defining TMR , we obtain TMR of 2082 %, which is compared to 3700 % for a 5 layer MgO case reported in Ref.[83].
When the transmission of a system with the periodicity along the a-axis as well as the periodicity of the bc-plane is evaluated under zero bias voltage, it can be easily obtained by making use of the Hamiltonian calculated by the conventional band structure calculation without employing the Green function method described in the paper. This scheme enables us to explore transport properties for a wide variety of possible geometric and magnetic structures with a low computational cost, and thereby can be very useful for many materials such as supperlattice structures. Once the Hamiltonian and overlap matrices are obtained from the conventional band structure calculation for the periodic structure, the transmission is evaluated by Eq. (51), where all the necessary information to evaluate Eq. (51) can be reconstructed by the result of the band structure calculation. As an example of the scheme, we calculate the conductance of a (LaMnO)/(SrMnO) superlattice with four different magnetic structures, i.e., ferromagnetic, A-type, G-type, and C-type antiferromagnetic configurations of Mn sites.[84,85]
In recent years, it has been found experimentally that the superlattice structures consisting of LaMnO and SrMnO layers exhibit a metal-insulator transition in terms of the layer thickness.[84] Bhattacharya et al. fabricated (LaMnO)/(SrMnO) superlattices on a SrTiO (001) substrate, and measured the in-plane resistivity as a function of temperature.[84] The resistivity measurement indicates that the superlattices are metallic and insulating for and , respectively. On the other hand, the (LaMnO)/(SrMnO) superlattices fabricated on the same substrate by Nakano et al. exhibit a sample dependence in the resistivity measurement, i.e., one of three samples is metallic and the others are insulating.[85] They argued that the metallic behavior observed in the one sample may be attributed to a certain structural incompleteness in the superlattice structure, and that the ideal superlattice should become insulating based on their experimental results. A theoretical model calculation for the (LaMnO)/(SrMnO) superlattices suggests that the metal-insulator transition at can be explained by existence of the G-type antiferromagnetic barrier in the SrMnO layers sandwiched by the LaMnO layers with the ferromagnetic configuration.[86] Since the analysis by Nakano et al. suggests that the charge transfer between the LaMnO and SrMnO layers is rather localized in the vicinity of the interface,[85] the model should be applicable to the case of (LaMnO)/(SrMnO) without significantly depending on the ratio between the thicknesses of (LaMnO) and (SrMnO) layers, indicating that (LaMnO)/(SrMnO) is metallic. However, the naive consideration evidently contradicts the experimental result.[85]
As a first step towards comprehensive understanding of transport properties of the superlattice structures by the first principle calculations, we consider the simplest superlattice, i.e., (LaMnOSrMnO). In the calculations, the in-plane lattice constant is fixed to be 3.905 Å which is equivalent to that of the SrTiO substrate. The out-of-plane lattice constant is assumed to be 7.735 Å, since those are experimentally determined to be 3.959 Å and 3.776 Å for the LaMnO and the SrMnO layers, respectively, grown on the SrTiO substrate and the average out-of-plane lattice constant for the superlattices is nearly equivalent to the average of the two values.[85] With those lattice constants internal structural parameters are optimized for each magnetic configuration without any constraint until the maximum force is less than hartree/bohr. The optimized structure for the ferromagnetic configuration is shown in Fig. 10(a). It is found that the position of oxygen atoms is largely distorted due to the different ionic radii between La and Sr atoms, showing that Mn atoms are located in the center of each distorted octahedron. Also, bond angles of Mn-O-Mn are found to be 167.4 and 161.6 (degree) for the in-plane and out-of-plane, respectively. The total energies relative to the ferromagnetic configuration are listed in Table 1. The calculated ground state is the ferromagnetic configuration, and the A-type antiferromagnetic configuration lies just above 5 meV per formula unit. It may be considered that the nearly degeneracy between the two configurations corresponds to the neighborhood of the boundary at and in the phase diagram for the tetragonal LaSrMnO.[87] The two configurations have both a metallic DOS, while the ferromagnetic configuration is half-metallic as shown in Figs. 10(b) and (c), reflecting the large in-plane and out-of-plane conductances as shown in Table 1. From a detailed analysis (not shown) of DOSs, we see that the electronic states at the Fermi level are composed of orbitals of Mn atoms and orbitals of oxygen atoms. Also, the Mulliken population analysis (not shown) suggests that the charge state of Mn atoms in the superlattice is in between those in the LaMnO and SrMnO bulks. This implies that the metallic band structures are induced by partial filling of the bands. Since the bond angle of Mn-O-Mn for the out-of-plane is slightly acute, therefore, this may be attributed to reduction of the conductance of the ferromagnetic state in the out-of-plane as shown in Table 1. A systematic study for the thicker cases and the effect of Coulomb interaction[88] in the orbitals is highly desirable, and the details will be discussed elsewhere.
F | A | C | G | |
Energy | 0 | 5.0 | 163.8 | 248.2 |
2262 | 1433 | 1169 | 1646 | |
1425 | 1105 | 1646 | ||
1741 | 664 | 1127 | 678 | |
655 | 1128 | 677 |
The computation in the the NEGF method can be parallelized in many aspects such as k-points, energies in the complex plane at which the Green functions are evaluated, spin index, matrix multiplications, and calculation of the inverse of the matrix. Here we demonstrate a good scalability of the NEGF in the parallel computation by a hybrid scheme using the message passing interface (MPI) and OpenMP which are used for inter-nodes and intra-node parallelization, respectively. The Green function defined by Eq. (6) is specified by the k-point, energy , and spin index . Since the calculation of the Green function specified by each set of three indices can be independently performed without any communication among the nodes, we parallelize triple loops corresponding to the three indices using MPI. Each node only has to calculate the Green functions for an allocated domain of the set of indices, and partly sum up Eq. (26) or (44) in a discretized form. After all the calculations finish, a global summation among the nodes is required to complete the calculation of Eq. (26) or (44), which, in most cases, is a very small fraction of the computational time even including the MPI communication among the nodes. Thus, the reduction of scalability for the parallelization of the three indices is mainly due to imbalance in the allocation of the domain of the set of indices. The imbalance can happen in the case that the number of combination for the three indices and the number of processes in the MPI parallelization are relatively small and large, respectively. In addition to the three indices, one may notice that the matrix multiplications and the calculation of the matrix inverse can be parallelized, which are situated in the inner loops of the three indices. The evaluation of the central Green function given by Eq. (6) and the surface Green functions given by Eq. (21), and the self energies given by Eqs. (7) and (8) are mainly performed by the matrix multiplications and the calculation of the matrix inverse. We parallelize these two computations using OpenMP in one node. Since the memory is shared by threads in the node, the communication of the data is not required unlike the MPI parallelization. However, the conflict in the data access to the memory can reduce the scalability in the OpenMP parallelization. As a whole, in our implementation the k-point, energy , and spin index are parallelized by MPI, and the matrix multiplications and the calculation of the matrix inverse are parallelized by OpenMP.
In Fig. 11 we show the speed-up ratio in the elapsed time for the evaluation of the density matrix of 8-ZGNR under a finite bias voltage of 0.5 eV. The geometric and magnetic structures and computational conditions for 8-ZGNR are same as before. The energy points of 197 (101 and 96 for the equilibrium and nonequilibrium terms, respectively) are used for the evaluation of the density matrix. Only the point is employed for the k-point sampling, and the spin polarized calculation is performed. Thus, the combination of 394 for the three indices are parallelized by MPI. It is found that the speed-up ratio of the flat MPI parallelization, corresponding to 1 thread, reasonably scales up to 64 processes. Furthermore, it can be seen that the hybrid parallelization, corresponding to 2 and 4 threads, largely improves the speed-up ratio. By fully using 64 quad core processors, corresponding to 64 processes and 4 threads, the speed-up ratio is about 140, demonstrating the good scalability of the NEGF method.
We have presented an efficient and accurate implementation of the NEGF method for electronic transport calculations in combination with DFT using pseudo-atomic orbitals and pseudopotentials. In the implementation, we have developed accurate methods for the evaluation of the density matrix and the treatment of the boundary between the scattering region and the leads.
A contour integration method with a continued fraction representation of the Fermi-Dirac function has been successfully applied for the evaluation of the equilibrium term in the density matrix, which evidently outperforms the previous method,[18] while a simple quadrature scheme on the real axis with a small imaginary part is employed for that for the nonequilibrium term in the density matrix. It has been demonstrated by numerical calculations that the accuracy of eV per atom in is attainable using the energy points of 200 in the complex plane even under a finite bias voltage of 0.5 V at 600 K. However, the evaluation of the nonequilibrium density matrix still requires a careful treatment where the pole structure of the Green functions has to be smeared by introducing a finite imaginary part. The numerical calculations suggest that the number of energy points required for the convergence can be largely reduced by introducing the imaginary part of 0.01 eV without largely changing the calculated results in a practical sense. We also note that the accurate evaluation of the density matrix provides another advantage that the SCF calculations even under a finite bias voltage smoothly converge in a similar fashion as the conventional band structure calculation does.
We have also developed an efficient method for calculating the Hartree potential by a combination of the two dimensional FFT and a finite difference method without any ambiguity in reproducing the boundary conditions. In addition, a careful evaluation of the charge density near the boundary between the scattering region and the leads is presented in order to avoid the spurious scattering accompanied by the inaccurate construction of the charge density. The proper treatment for the charge evaluation in our implementation can definitely be verified by a comparison between the conventional band structure calculation and the EGF method with respect to the DOS of the carbon chain.
Finally, we have demonstrated the applicability of our implementation by calculations of spin resolved - characteristics of ZGNRs, showing that the - characteristics depend on the symmetry of ZGNR, and that the symmetric ZGNR exhibits a unique spin diode and filter effect. Also, the applicability of our implementation to bulk systems is demonstrated by applications to a Fe/MgO/Fe tunneling junction and a LaMnOSrMnO superlattice. Based on the above discussions and the good parallel efficiency in the hybrid parallelization shown in the study, it is concluded that our implementation of the NEGF method can be applicable to challenging problems related to large-scale systems, and can be a starting point, apart from numerical spurious effects, to include many body effects beyond the one particle picture in the electronic transport.
Acknowledgments
The authors would like to thank H. Kondo for providing a prototype FORTRAN code of the NEGF method. This work is partly supported by CREST-JST, the Next Generation Supercomputing Project, Nanoscience Program, and NEDO (as part of the Nanoelectronics project).
Let us introduce the following density functional which may define
the total energy of the central region being a part of
the extended system at equilibrium
with a common chemical potential :
We now consider the variation of the energy with respect
to .
The variations of and are simply given by
The generalization of the functional to the metallic case with a finite temperature and the nonequilibrium state might be an important direction in the future study so that forces on atoms can be variationally calculated from the functional, since the existence of a variational functional has been recently suggested for the nonequilibrium steady state.[89,90,91]
The equilibrium energy density matrix
,
where one of the associated basis orbitals is in the central cell
and the other is in the cell denoted by
,
is calculated using the contour integration method applied to
the equilibrium density matrix as follow:
For the nonequilibrium Green function, the nonequilibrium contribution in the the energy density matrix can be calculated using the simple quadrature scheme in the same way as for the nonequilibrium term in the density matrix.