• Nem Talált Eredményt

4.2 ADC(2) excitation energies and transition moments

5.1.1 Theory and implementation

Chapter 5

Reduced-scaling approximations for excited states

5.1 Local density fitting approach for TDDFT and CIS methods

In this section an efficient scheme is presented for calculating excitation energies and transition moments at the CIS and TDDFT levels, as well as for the related TD-HF and TDA-TDDFT methods. We have successfully extended the local DF approximation to excited states, which was originally developed for ground-state self-consistent field (SCF) calculations. The construction of the subsets of auxiliary functions is discussed, and the details of the implementation are presented. To assess the efficiency of the algorithm, an extensive test set was compiled incorporating well-known examples relevant in the field of photochemistry [265].

addition, if an MO-based algorithm is employed, one has to read the entire JabP list from the hard disk in each iteration resulting in considerable I/O load.

It is well-known that an existing integral-direct SCF [32] program, after slight mod-ifications, is also applicable to calculate CIS excitation energies. In this case the con-tributions to the sigma vector are computed in the AO basis, and any significant I/O operation is avoided. The transformation between the AO and MO integrals can be easily performed with the

IpqQ =X

µν

CµpCνqIµνQ (5.2)

expression, where Cµp is the corresponding MO coefficient. The most efficient way to carry out the Ac matrix-vector multiplication and the working equations are presented in Table 5.1. As it can be seen, the rate-determining operations of the Coulomb term

Table 5.1: Algorithm and working equations for the sigma vector calculation

Step Scaling

1 Compute the P

bj(ia|jb)cbj Coulomb contribution a XQ =P

µνIµνQ P

bjCµjCνbcbj noccnvirtnAO+n2AOnocc+n2AOnaux b YP =P

QVP Q−1XQ n2aux

c Zλσ =P

P IλσP YP n2AOnaux

2 Compute the P

bj(ij|ab)cbj exchange contribution a XijQ =P

P VP Q−1P

µνCµiCνjIµνP noccn2AOnaux+n2occnAOnaux+n2occn2aux b YλjQ =P

σIλσQ P

bcbjCσb noccnvirtnAO+noccn2AOnaux c Wλi =P

QjXijQYλjQ n2occnAOnaux

d σia ←P

λCλa(2P

σCσiZλσ−αXWλi) noccn2AO+noccnAO+noccnvirtnAO 3 Compute further contributions

a σia ←P

bfabcbi noccn2virt

b σia ← −P

jfijcaj n2occnvirt

calculation (step 1) are steps 1.a and 1.c, which scale as n2AOnaux, where nAO stands for the number of AOs. The bottleneck of the entire procedure is the evaluation of the exchange term (step 2). Its most demanding step (2.b) needs n2AOnauxnocc operations.

For both the Coulomb and the exchange term, the intermediates of Table 5.1 must be evaluated for each excited state separately. The advantage of this algorithm is that the I/O operation count is sufficiently low, and the IµνQ integrals can be calculated once in an iteration step, independently of the number of excited states. Additionally, the calculation of the exchange term can easily be blocked according to occupied index j

depending on the available main memory. The above algorithm can be applied to TDA-TDDFT calculations without any changes, just the exchange-correlation contribution, the last two term terms of Eq. (3.27), must be added to the sigma vector. The evaluation of the P

bj(ib|ja)cbj term for the TDHF and TDDFT methods can be carried out by an algorithm similar to step 2: theCνj MO coefficient in step 2.a should be interchanged by the half-transformed trial vector, P

bcbjCσb, in step 2.b.

The HF exchange contribution (see step 2 in Table5.1) in the AO basis can be recast as

σia← −X

jb

(ij|ab)cbj =−X

µνλσ

X

P Q

X

jb

CµiCνjIµνP VP Q−1IλσQCλaCσbcbj =−X

µλ

CµiWµλCλa . (5.3) Since the above expression of intermediate W is invariant to the unitary transformation of the occupied MOs among themselves, the canonical occupied dummy index j can be replaced by a localized one, hereafter denoted by j0. Introducing the local density fitting approximations [235–238], this term can then be obtained as a sum of the individual contributions of each occupied localized MO j0. An auxiliary basis function domain, P(j0), can be defined for LMOj0 in which the generalized densities in the (ij0|ab) integrals can be properly expanded. The density fitting can be more efficiently performed with the auxiliary basis set assembled as the number of the functions, due to the locality of the MOs, is lower than for the entire molecule. The construction of appropriate fitting domains is a relatively straightforward task for ground-state SCF calculations, but it is not unequivocal for excited states, and to the best of our knowledge, no attempt had been made previously to employ local fitting domains in the latter case.

Similarly to the ground-state problem, the goal is to accurately calculate the inter-mediate

Wµλ ≈X

νσ

X

j0b

X

P,Q∈P(j0)

Cνj0IµνP VP Q−1IλσQCσbcbj0. (5.4) For this purpose, we have developed a procedure to construct the suitable local fitting domainsP(j0). In the first step, after the SCF procedure the canonical MOs are localized using the Boys algorithm. Obviously the P(j0) set should contain the auxiliary functions which are located on the atoms on which LMO j0 is localized. To determine these atoms [domain PL(j0)], L¨owdin atomic charges are calculated for each LMO, and the atoms whose charge is greater than 0.1 are selected. The PL(j0) set, which typically contains 2 to 3 atoms in average, is suitable for a rough fitting ofµ(r)j0(r)-type generalized densities appearing in the P

νCνj0IµνP =IµjP0 contribution of Eq. (5.4). In the case of ground-state SCF calculations the PL(j0) domains are supplemented with the neighboring atoms to increase the quality of the fitting. We also tested this option, but our numerical experience

showed that no additional atoms are needed for excited states because the other atoms added to the local fitting domain as detailed below will do the job.

Subsequently, we assemble a fitting domain, Pex(j0), taking into account the infor-mation carried by the excitation vector to ensure that the λ(r)σ(r) AO product densities in Eq. (5.4) are also properly approximated. For this purpose, we carry out the summa-tion for index b and analyze the half-transformed solution vector, cσj0 = P

bCσbcbj0. The elements of the normalized vector are sorted into descending order, and starting from the largest one the squares of the elements are summed up until the sum becomes larger than thresholdTLDF. If thecσj0 element of the vector is included in the sum, we add those auxiliary functions to Pex(j0) which reside on the atom of AO σ. In this case, of course, the fitting of thoseλ(r)σ(r) generalized densities is not perfect where the atom belonging to AO σ is not selected or λ is far from σ. However, for the former contributions, the integrals are multiplied by a very smallcσj0 coefficient, while for the latter, the integralIλσQ is a small number, and thus the error of the approximation will not be significant. Again, domain Pex(j0) could, in principle, be extended by adding the nearby atoms to improve the fitting of the λ(r)σ(r) products, but our experience suggests that it is not necessary.

Finally, the fitting domain of LMO j0 is defined as P(j0) = PL(j0)∪Pex(j0). Following similar arguments, it can be easily shown that the P(j0) domains are also appropriate for the evaluation of the P

bj(ib|ja)cbj term for the TDHF and the full TDDFT methods without any changes or additions.

To minimize the computational resources for the calculations, we designed the fol-lowing algorithm:

1. Solve the SCF equations and perform the localization

2. Construct the PL(j0) domains utilizing L¨owdin atomic charges

3. Perform a preoptimization for all the roots simultaneously using a multi-root David-son algorithm and the PL(j0) domains as local fitting domains

4. Loop over excited states

Loop over Davidson iterations

(a) Construct thePex(j0) domains based on the solution vector,P(j0) =PL(j0)∪ Pex(j0)

(b) Calculate the sigma vector using the P(j0) domains End loop

End loop

This algorithm has numerous advantages. At the preoptimization, since the PL(j0) sets used instead of P(j0) do not contain any state-specific information, a multi-root David-son solver can be employed, which requires fewer sigma vector evaluations, and the IµνQ integrals can be calculated once in a single iteration step, independently of the number of excited states. In addition, the fitting domains, the PL(j0) sets contain in average 2 to 3 atoms regardless of the size of the system. Accordingly, the preoptimization can be performed in negligible time but provides suitable starting vectors for the final calcula-tions. Subsequently, we solve the equations in a state-specific manner using a method developed by H¨attig and co-workers [219,221]. Since the constructed reduced auxiliary sets contain information about the excitation, each P(j0) set will be state-dependent, and we have to calculate the sigma vector and the IµνQ integrals for all the states one by one. In practice, the eigenvalue equations are solved in the canonical basis, only the HF exchange contribution (step 2 in Table 5.1) is evaluated in the localized basis, and the corresponding contribution to the sigma vector is transformed back to the canonical basis in each iteration step. This usually results in fewer iteration steps as if the iteration was performed in the localized MO basis. The size of the P(j0) domains and the accuracy of the LDF approximation can be arbitrarily controlled with threshold TLDF.

As the extent of the fitting domains is independent of the system size, it is easily seen that the LDF approximation reduces the formal quartic scaling of the considered methods to cubic. We note that, at the calculation of the three-center Coulomb integrals and their transformation by the MO coefficients or excitation amplitudes, we employ the well-known integral prescreening techniques. Consequently, these steps scale asymptot-ically quadratasymptot-ically with the systems size. The fitting of the transformed two-electron integrals and the assembly of intermediate Wµλ from the resulting quantities are still cubic-scaling operations in the limit of large molecules. In principle, the sparsity of the intermediates could also be exploited in the later operations to further reduce the scaling.

However, these are relatively fast operations, and our experience with the ground-state SCF exchange suggests that the sparsity of the intermediates is rather low for the tractable systems with basis sets including diffuse functions.

It is important to mention that the eigenvalue equation defined by Eq. (3.7) is Hermitian in the case of the conventional CIS and TDA-TDDFT methods. However, if the P(j0) fitting domains do not contain the same functions for each j0, the matrix will no longer be strictly Hermitian. Consequently, it would be necessary to determine the corresponding left-hand eigenvectors to calculate properties other than excitation energies. Note, however, that our LDF approach converges to the exact case as TLDF goes to 1.0, and thus one expects that with a sufficiently tight TLDF the left- and right-hand side solutions are appropriately close to each other. Therefore, we omit the solution of the left-hand equations and calculate the transition moments using the conventional

expressions with solely the right-hand eigenvectors. Another unfavorable consequence of our approximation is that it may slightly break the spatial symmetry, hence non-zero oscillator strengths may show up for forbidden transitions, or achiral molecules may acquire rotator strengths. We again expect that this effect can also be restrained by a properly chosen truncation parameter. These problems will be addressed in the following subsection.