
4. Perturbation theory#

The hydrogen atom Hamiltonian with the Coulomb potential term between the nucleus and the electron is the most complicated example for which we know the exact solutions. It is however not complete, as it ignores some physical terms, including the interaction between the spin and the orbital angular momentum and a relativistic correction necessary because the electrons can move quite fast. Fortunately, the additional terms that account for these physical effects are relatively small, and can be treated as perturbations to the zeroth-order Coulomb interaction result. They do however have an important consequence: they cause degenerate energy levels to split up, thus (as it is usually stated) ‘lifting the degeneracy’. Indeed, we find multiple wavelengths for, say, the transition from the \(n=2\) to the \(n=1\) state of hydrogen. To see how this works, we will start with the simplest case, a perturbation to a non-degenerate state, and then build up to degenerate cases.

4.1. Non-degenerate perturbation theory#

Suppose that we have a Hamiltonian that can be written as the sum of two terms, with one much larger than the other, i.e.

(4.1)#\[ \hat{H} = \hat{H}^0 + \varepsilon \hat{H}', \]

where \(\varepsilon \ll 1\). We’ll say that \(\hat{H}'\) is a perturbation of the unperturbed Hamiltonian \(\hat{H}^0\). Suppose moreover that we have a complete set of orthonormal eigenfunctions \(\psi_n^0(x)\) of \(\hat{H}^0\), with eigenvalues \(E_n^0\):

(4.2)#\[ \hat{H}^0 \psi_n^0 = E_n^0 \psi_n^0, \qquad \text{with} \quad \Braket{\psi_m^0 | \psi_n^0} = \delta_{mn}. \]

The Schrödinger equation for the full system reads

(4.3)#\[ \hat{H} \psi_n = E_n \psi_n, \]

which we’d like to solve for the eigenvalues \(E_n\) and eigenfunctions \(\psi_n\). If doing so exactly is not possible (as will usually be the case), we can approximate the solutions by series expansions in the parameter \(\varepsilon\). We then write

(4.4)#\[\begin{split}\begin{align*} E_n &= E_n^0 + \varepsilon E_n^1 + \varepsilon^2 E_n^2 + \ldots, \\ \psi_n &= \psi_n^0 + \varepsilon \psi_n^1 + \varepsilon^2 \psi_n^2 + \ldots. \end{align*}\end{split}\]

We call \(E_n^1\) and \(\psi_n^1\) the first-order corrections to the \(n\)th energy and wavefunction, respectively. It turns out that finding an expression for \(E_n^1\) is pretty easy. We simply substitute our expansion back in the Schrödinger equation (4.3) and collect terms that are of the same power in \(\varepsilon\):

(4.5)#\[\begin{split}\begin{align*} \hat{H} \psi_n = \left( \hat{H}^0 + \varepsilon \hat{H}' \right) \left(\psi_n^0 + \varepsilon \psi_n^1 + \varepsilon^2 \psi_n^2 + \ldots \right) &= E_n \psi_n = \left( E_n^0 + \varepsilon E_n^1 + \varepsilon^2 E_n^2 + \ldots \right) \left(\psi_n^0 + \varepsilon \psi_n^1 + \varepsilon^2 \psi_n^2 + \ldots \right) \\ \hat{H}^0 \psi_n^0 + \varepsilon \left( \hat{H}^0 \psi_n^1 + \hat{H}' \psi_n^0 \right) + \varepsilon^2 \left( \hat{H}^0 \psi_n^2 + \hat{H}' \psi_n^1 \right) + \ldots &= E_n^0 \psi_n^0 + \varepsilon \left( E_n^0 \psi_n^1 + E_n^1 \psi_n^0 \right) \\ & \qquad + \varepsilon^2 \left( E_n^0 \psi_n^2 + E_n^1 \psi_n^1 + E_n^2 \psi_n^0 \right) + \ldots \end{align*}\end{split}\]

To zeroth order, we retrieve the unperturbed equation (4.2). To first order, we get

(4.6)#\[ \hat{H}^0 \psi_n^1 + \hat{H}' \psi_n^0 = E_n^0 \psi_n^1 + E_n^1 \psi_n^0. \]

We now take the inner product of (4.6) with \(\psi_n^0\), which gives:

(4.7)#\[ \Braket{\psi_n^0 | \hat{H}^0 \psi_n^1} + \Braket{\psi_n^0 | \hat{H}' \psi_n^0} = E_n^0 \Braket{\psi_n^0 | \psi_n^1} + E_n^1 \Braket{\psi_n^0 | \psi_n^0}. \]

Because \(\hat{H}^0\) is Hermitian, the first term on the left-hand side of equation (4.7) can be rewritten as

(4.8)#\[ \Braket{\psi_n^0 | \hat{H}^0 \psi_n^1} = \Braket{\hat{H}^0 \psi_n^0 | \psi_n^1} = E_n^0 \braket{\psi_n^0 | \psi_n^1} \]

and it thus equals the first term on the right-hand side. Moreover, as \(\psi_n^0\) is normalized, the inner product in the second term on the right-hand side of (4.7) equals \(1\). We thus find the simple result that the first-order correction to the energy, \(E_n^1\), is given by the expectation value of the perturbation \(\hat{H}'\) in the unperturbed eigenstate \(\psi_n^0\):

(4.9)#\[ E_n^1 = \Braket{\psi_n^0 | \hat{H}' | \psi_n^0}. \]

Equation (4.9) is the central result of this section, and will be used when calculating corrections to the hydrogen energy levels in Section 4.3. We can also calculate the corrections to the wavefunction itself. These are not nearly as useful, unless you want to get the second-order corrections to the energies. The results (see Exercise 4.1) are

(4.10)#\[\begin{align*} \psi_n^1 &= \sum_{m \neq n} \frac{\Braket{\psi_m^0|\hat{H}'|\psi_n^0}}{E_n^0 - E_m^0} \psi_m^0, \end{align*}\]

(4.11)#\[\begin{align*} E_n^2 &= \sum_{m \neq n} \frac{\left|\Braket{\psi_m^0|\hat{H}'|\psi_n^0}\right|^2}{E_n^0 - E_m^0}. \end{align*}\]

One can iterate further and find higher-order corrections, though their usefulness (as values that you could actually measure) quickly declines.
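The formulas (4.9) and (4.11) can be checked numerically on a small matrix model. The sketch below assumes a hypothetical three-level system in which \(\hat{H}^0\) is diagonal with non-degenerate energies `E0`, and the perturbation is a Hermitian matrix `Hp` of elements \(\Braket{\psi_m^0|\hat{H}'|\psi_n^0}\); the numbers are invented for illustration only. It compares the perturbative energies up to order \(\varepsilon^2\) with exact diagonalization.

```python
import numpy as np

def perturbation_energies(E0, Hp, n):
    """First- and second-order energy corrections for state n.

    E0: 1D array of non-degenerate unperturbed energies E_n^0.
    Hp: Hermitian matrix with elements <psi_m^0|H'|psi_n^0>.
    """
    E1 = Hp[n, n].real                        # equation (4.9)
    others = np.arange(len(E0)) != n          # mask selecting all m != n
    E2 = np.sum(np.abs(Hp[others, n])**2 / (E0[n] - E0[others]))  # equation (4.11)
    return E1, E2

# Hypothetical three-level system (values chosen for illustration only)
E0 = np.array([0.0, 1.0, 2.5])
Hp = np.array([[0.2, 0.5, 0.1],
               [0.5, -0.3, 0.4],
               [0.1, 0.4, 0.6]])
eps = 0.01

# Exact eigenvalues of H = H^0 + eps H' (returned in ascending order)
exact = np.linalg.eigvalsh(np.diag(E0) + eps * Hp)

# Perturbative estimate E_n^0 + eps E_n^1 + eps^2 E_n^2
approx = []
for n in range(len(E0)):
    E1, E2 = perturbation_energies(E0, Hp, n)
    approx.append(E0[n] + eps * E1 + eps**2 * E2)
approx = np.array(approx)

print("largest deviation from exact:", np.max(np.abs(exact - approx)))
```

The remaining deviation should be of order \(\varepsilon^3\), so making `eps` ten times larger should increase it by roughly a factor of a thousand.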

4.1.1. Application: Van der Waals interactions#

Suppose we have two hydrogen atoms, located a distance \(R\) from each other, with \(R\) significantly larger than the Bohr radius \(a_0\). Although both atoms are electrically neutral, the positively charged nucleus of one atom still exerts a Coulomb force on the negatively charged electron of the other, and vice versa. Likewise, the electrons and nuclei exert forces on each other. These forces are much smaller than the force between each nucleus and its corresponding electron. We can therefore split the Hamiltonian of the joint system into a term \(\hat{H}^0\) containing the two atoms as isolated parts, and a perturbation term \(\hat{H}'\) for their interaction:

(4.12)#\[\begin{split}\begin{align*} \hat{H} &= \hat{H}^0 + \hat{H}' = \hat{H}_\mathrm{H}^{(1)} + \hat{H}_\mathrm{H}^{(2)} + \hat{H}' \\ &= \left[ - \frac{\hbar^2}{2 m_\mathrm{e}} \nabla_1^2 - \frac{\hbar^2}{2 m_\mathrm{e}} \nabla_2^2 - \frac{e^2}{r_1} - \frac{e^2}{r_2} \right] + e^2 \left[ \frac{1}{R} + \frac{1}{r_{12}} - \frac{1}{r_{1\mathrm{B}}} - \frac{1}{r_{2\mathrm{A}}} \right], \end{align*}\end{split}\]

where A and B label the two nuclei and 1 and 2 the two electrons, see Fig. 4.1(a). The solution to the unperturbed Hamiltonian \(\hat{H}^0\) is simply the product of the eigenstates of the two hydrogen Hamiltonians. As we’re only interested in the energy here, we’ll consider wavefunctions \(\psi_n(\bm{r})\) with energies \(E_n = E_1 / n^2\) (thus ignoring the \(l\) and \(m\) quantum numbers) and write, with a little abuse of notation

(4.13)#\[\begin{split}\begin{align*} \psi_{nm}(\bm{r}_1, \bm{r}_2) &= \psi_n(\bm{r}_1) \psi_m(\bm{r}_2), \\ \hat{H}^0 \psi_{nm}(\bm{r}_1, \bm{r}_2) &= \left(E_n + E_m\right) \psi_{nm}(\bm{r}_1, \bm{r}_2). \end{align*}\end{split}\]

As we’re in the limit that \(R \gg a_0\), the two wavefunctions hardly overlap, and we don’t have to worry about exchange terms. Moreover, in this large-separation limit, the perturbation term to the Hamiltonian becomes a dipole-dipole interaction^[1]

(4.14)#\[ \hat{H}' = \frac{e^2}{R^3} \left(x_1 x_2 + y_1 y_2 - 2 z_1 z_2 \right). \]
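As a numerical sanity check of the large-separation limit, the sketch below compares the full interaction term from (4.12) with the dipole-dipole form (4.14). It assumes a specific geometry that may differ in detail from Fig. 4.1(a): nucleus A at the origin, nucleus B at \((0, 0, R)\), with `r1` the position of electron 1 relative to A and `r2` that of electron 2 relative to B. Units are chosen such that \(e^2 = 1\), and the electron positions are arbitrary illustrative values.

```python
import numpy as np

def exact_interaction(r1, r2, R):
    """[1/R + 1/r_12 - 1/r_1B - 1/r_2A] from (4.12), in units with e^2 = 1.

    Assumed geometry: nucleus A at the origin, nucleus B at (0, 0, R);
    r1 is measured from A, r2 is measured from B.
    """
    R_vec = np.array([0.0, 0.0, R])
    r_12 = np.linalg.norm(R_vec + r2 - r1)   # distance electron 1 to electron 2
    r_1B = np.linalg.norm(r1 - R_vec)        # distance electron 1 to nucleus B
    r_2A = np.linalg.norm(R_vec + r2)        # distance electron 2 to nucleus A
    return 1 / R + 1 / r_12 - 1 / r_1B - 1 / r_2A

def dipole_interaction(r1, r2, R):
    """Dipole-dipole approximation, equation (4.14), in units with e^2 = 1."""
    return (r1[0] * r2[0] + r1[1] * r2[1] - 2 * r1[2] * r2[2]) / R**3

# Arbitrary electron positions of order one (think: in units of the Bohr radius)
r1 = np.array([0.3, -0.5, 0.7])
r2 = np.array([-0.4, 0.6, 0.2])

for R in (20.0, 200.0):
    exact = exact_interaction(r1, r2, R)
    dipole = dipole_interaction(r1, r2, R)
    print(R, exact, dipole, abs(exact - dipole) / abs(dipole))
```

The relative difference should shrink roughly as \(1/R\), reflecting the neglected higher multipole terms.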

If both atoms are in the ground state, the expectation value of this perturbation term is zero, and therefore the first-order correction to the energy vanishes. However, the matrix elements of the perturbation between the ground state and the excited states are not all zero, and we thus get a second-order correction to the ground state energy from equation (4.11), by summing over the excited states:

(4.15)#\[\begin{split}\begin{align*} E_1^2 &= \sum_{nm \neq 11} \frac{\left|\braket{\psi_{nm}(\bm{r}_1, \bm{r}_2)|\hat{H}'|\psi_{11}(\bm{r}_1, \bm{r}_2)}\right|^2}{E_{11} - E_{nm}} \\ &= \frac{e^4}{R^6} \sum_{nm \neq 11} \frac{1}{E_{11}-E_{nm}} \left[ \braket{\psi_n(\bm{r}_1)|x_1|\psi_1(\bm{r}_1)} \braket{\psi_m(\bm{r}_2)|x_2|\psi_1(\bm{r}_2)} + \ldots \right]^2 \\ &\sim - \frac{\alpha_1 \alpha_2}{R^6}, \end{align*}\end{split}\]

where

(4.16)#\[ \alpha_i = e^2 \sum_{n>1} \frac{\left|\braket{\psi_n(\bm{r})|x|\psi_1(\bm{r})}\right|^2}{E_n-E_1} \]

is the atomic polarizability of the atom. The atomic polarizability is a measure of the strength of the induced dipole in an otherwise symmetric atom^[2]. We thus understand the resulting attractive force as the result of a mutually induced dipole in the two hydrogen atoms. It is known as the van der Waals force. Although our calculation here is for two hydrogen atoms, attractive van der Waals forces are present between any two neutral atoms or molecules.

Van der Waals forces are quite weak, as their energy decays as \(1/R^6\). Nonetheless, they are key factors in many chemical, biological and (condensed matter) physical systems. One particularly striking example often attributed to van der Waals forces is the ability of geckos to climb walls and run upside-down on a ceiling. Specifically, the adhesion is credited to van der Waals interactions involving the \(\beta\)-keratin proteins in the lamellar, spatula-shaped setae on the gecko’s feet (see Fig. 4.1(b)) [Autumn et al., 2002]. However, other studies suggest that gecko adhesion could be due to induced electrification of surfaces (and thus electrostatic interactions) [Izadi et al., 2014], or to the surface chemistry of the gecko feet, specifically the presence of lipids [Rasmussen et al., 2022]. Although the definitive explanation for the gecko’s climbing ability is thus still open, the possibility that van der Waals forces are at play has inspired various applications [Northen and Turner, 2005], including biologically inspired glue and adhesive tape.

../_images/vanderWaalsforces.svg — Fig. 4.1 Van der Waals forces. (a) Coordinates for the long-distance interaction between two hydrogen atoms. (b) Close-up of the foot of a gecko as it climbs a glass wall ^[3]. The lamellar structure of the skin is clearly visible.#

4.2. Degenerate perturbation theory#

In Section 4.1 we assumed that all the eigenstates of the unperturbed Hamiltonian were non-degenerate. If the states are degenerate, we cannot use equations (4.10) and (4.11), as the denominator vanishes (and the expression diverges) if \(E_m^0 = E_n^0\) for any pair \(m \neq n\). Fortunately, the effect of the perturbation will typically be to lift the degeneracy. Based on the perturbed energies, we will even be able to construct ‘good’ eigenstates of the unperturbed Hamiltonian (limit cases of the perturbed state when taking the perturbation parameter \(\varepsilon\) to zero) for which we can use equation (4.9), as we’ll see below.

4.2.1. Eigenstates of the perturbation matrix#

To illustrate how to deal with degenerate eigenstates of an unperturbed Hamiltonian, we start with a system that has two-fold degeneracy (the procedure for higher degeneracies will be analogous): suppose \(\hat{H}^0\) has two orthogonal eigenstates \(\psi_a^0\) and \(\psi_b^0\) that satisfy

(4.17)#\[ \hat{H}^0 \psi_a^0 = E^0 \psi_a^0, \quad \hat{H}^0 \psi_b^0 = E^0 \psi_b^0, \quad \Braket{\psi_a^0 | \psi_b^0} = 0. \]

Note that any linear combination

(4.18)#\[ \psi^0 = \alpha \psi_a^0 + \beta \psi_b^0 \]

is also an eigenstate of \(\hat{H}^0\) with energy \(E^0\). Our result will not only be the corrections to the energy, but also which linear combinations correspond to the ‘good’ eigenstates, i.e. which values of \(\alpha\) and \(\beta\) we should choose for the perturbed system to consist of proper eigenstates.

We calculate the corrections to the energy in much the same way as in non-degenerate perturbation theory. We again write \(\hat{H} = \hat{H}^0 + \varepsilon \hat{H}'\) and expand both the energy and the wavefunction of \(\hat{H}\) in \(\varepsilon\):

(4.19)#\[\begin{split}\begin{align*} E &= E^0 + \varepsilon E^1 + \varepsilon^2 E^2 + \ldots, \\ \psi &= \psi^0 + \varepsilon \psi^1 + \varepsilon^2 \psi^2 + \ldots. \end{align*}\end{split}\]

The zeroth-order term of the Schrödinger equation with the perturbed Hamiltonian \(\hat{H}\) again gives us the unperturbed system, and the first-order term the first variation:

(4.20)#\[ \hat{H}^0 \psi^1 + \hat{H}' \psi^0 = E^0 \psi^1 + E^1 \psi^0. \]

If we now take the inner product of equation (4.20) with the original eigenstate \(\psi_a^0\), the first terms on the left and right-hand side are again equal, but the remaining terms are different than in the non-degenerate case. We get

(4.21)#\[\begin{split}\begin{align*} \Braket{\psi_a^0 | \hat{H}' \psi^0} &= E^1 \Braket{\psi_a^0 | \psi^0} \\ \alpha \Braket{\psi_a^0 | \hat{H}' \psi_a^0} + \beta \Braket{\psi_a^0 | \hat{H}' \psi_b^0} &= \alpha E^1. \end{align*}\end{split}\]

Taking the inner product of equation (4.20) with \(\psi_b^0\) instead gives a second, analogous equation. Defining \(Q_{ij} = \Braket{\psi_i^0 | \hat{H}' \psi_j^0}\) as the ‘matrix elements’ of the perturbation \(\hat{H}'\) in the basis of the unperturbed eigenstates \(\left\lbrace\psi_a^0, \psi_b^0\right\rbrace\), we can thus write

(4.22)#\[\begin{split}\begin{align*} \alpha E^1 &= \alpha Q_{aa} + \beta Q_{ab}, \\ \beta E^1 &= \alpha Q_{ba} + \beta Q_{bb}, \end{align*}\end{split}\]

or, in matrix form

(4.23)#\[\begin{split} Q \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = E^1 \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \end{split}\]

In other words, the first-order corrections \(E^1\) to the energy are the eigenvalues of the ‘perturbation matrix’ \(Q\) (the matrix whose elements are obtained from \(\hat{H}'\) in the unperturbed state). For a \(2 \times 2\) matrix \(Q\), these eigenvalues are given by

(4.24)#\[\begin{split}\begin{align*} E^1_\pm &= \frac12 \mathrm{Tr}(Q) \pm \frac12 \sqrt{\left(\mathrm{Tr}(Q)\right)^2-4\det(Q)} \\ &= \frac12 \left(Q_{aa} + Q_{bb}\right) \pm \frac12 \sqrt{\left(Q_{aa}-Q_{bb}\right)^2 + 4 Q_{ab}Q_{ba}}. \end{align*}\end{split}\]

Equation (4.24) gives us the first order corrections \(E^1_\pm\) to the energy. From equation (4.23) we can then solve for the coefficients \(\alpha\) and \(\beta\), which give us the corresponding linear combinations of the ‘good’ unperturbed states, i.e., the states which are the limit cases of the perturbed ones if we take the magnitude of the perturbation back to zero. Note that for the ‘good’ states, the matrix \(Q\) becomes diagonal, and we retrieve equation (4.9) for the energies of the perturbations. Extending to higher-order degeneracies is straightforward: for an \(n\)-fold degenerate energy, we get an \(n \times n\) Hermitian matrix \(Q\), with \(n\) real eigenvalues and corresponding eigenstates.
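The procedure of equations (4.23) and (4.24) can be sketched numerically. The example below assumes a hypothetical Hermitian perturbation matrix `Q` in the basis \(\left\lbrace\psi_a^0, \psi_b^0\right\rbrace\) (the entries are invented for illustration). It compares the eigenvalues from a standard Hermitian eigensolver with the closed form (4.24), and checks that \(Q\) becomes diagonal in the basis of ‘good’ states.

```python
import numpy as np

def first_order_degenerate(Q):
    """Return the first-order corrections E^1 and the 'good' coefficient vectors.

    Q: Hermitian perturbation matrix in the degenerate subspace.
    The columns of the returned matrix are the (alpha, beta) vectors of (4.23).
    """
    E1, vecs = np.linalg.eigh(Q)   # eigenvalues in ascending order
    return E1, vecs

# Hypothetical perturbation matrix (illustrative values only)
Q = np.array([[1.0, 0.5 - 0.5j],
              [0.5 + 0.5j, 2.0]])

E1, vecs = first_order_degenerate(Q)

# Closed form, equation (4.24), ordered as (E^1_-, E^1_+)
tr = np.trace(Q).real
det = np.linalg.det(Q).real
root = np.sqrt(tr**2 - 4 * det)
E1_formula = np.array([0.5 * (tr - root), 0.5 * (tr + root)])

# Perturbation matrix expressed in the basis of 'good' states
Q_good = vecs.conj().T @ Q @ vecs

print(E1, E1_formula)
print(np.round(Q_good, 12))
```

The same function works unchanged for an \(n\)-fold degenerate level, as `np.linalg.eigh` accepts any Hermitian \(n \times n\) matrix.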

In practice, \(\hat{H}^0\) will usually have a higher degree of symmetry than \(\hat{H}'\). The ‘lifting of the degeneracy’ of the eigenstates of \(\hat{H}^0\) then corresponds to a ‘breaking of the symmetry’. A good example is what happens if you put a hydrogen atom in a magnetic or electric field. While at zero field all directions are equivalent (full rotational symmetry), in the presence of a constant field, the direction along which the field points is distinguished from the directions perpendicular to it. In three dimensions, for a homogeneous field in the \(z\) direction, there is still rotational symmetry in the \(xy\) plane, but we have broken the full spherical symmetry. A further reduction of symmetry could lead to a further lifting of a remaining degeneracy. This correspondence between symmetry and degeneracy is a key concept in quantum mechanics; it also suggests a way of picking ‘good’ eigenstates from the start.

4.2.2. Joint eigenstates#

While we can find the ‘good’ eigenstates for any perturbation with equation (4.23), there is an easier way if we have a second physical quantity we can measure. For example, for any but the ground state of hydrogen we have degenerate eigenstates of the Hamiltonian, but we can distinguish between states with different values of the orbital quantum number \(l\) by measuring the magnitude of the state’s orbital angular momentum. The corresponding states with different eigenvalues will then correspond to ‘good’ eigenstates, i.e. states for which the perturbation matrix \(Q\) is diagonal.

We can prove the above statement in a more general setting. We’ll work out the details for a doubly degenerate state again, as higher degeneracies follow easily. Suppose therefore that we have a perturbed Hamiltonian of the form (4.1), where the unperturbed part has two degenerate eigenstates as in (4.17). Suppose moreover that we have another Hermitian operator \(\hat{A}\) which commutes with both \(\hat{H}^0\) and \(\hat{H}'\). Because \(\hat{A}\) commutes with \(\hat{H}^0\), they have a shared set of eigenfunctions. These eigenfunctions need not be the \(\psi_a^0\) and \(\psi_b^0\) of equation (4.17), but they can be written as linear combinations of them, as in equation (4.18). Moreover, suppose that the joint eigenfunctions are not degenerate as eigenfunctions of \(\hat{A}\), i.e., we have \(\lambda \neq \mu\) in

(4.25)#\[\begin{split}\begin{alignat*}{3} \hat{H}^0 \psi_\lambda^0 &= E^0 \psi_\lambda^0 &\qquad & \hat{H}^0 \psi_\mu^0 &= E^0 \psi_\mu^0 \\ \hat{A} \psi_\lambda^0 &= \lambda \psi_\lambda^0 &\qquad & \hat{A} \psi_\mu^0 &= \mu \psi_\mu^0 \end{alignat*}\end{split}\]

It is now easy to show that the perturbation matrix \(Q\) is diagonal in the basis \(\left\lbrace \psi_\lambda^0, \, \psi_\mu^0\right\rbrace\). We simply calculate one of the elements of this matrix, times an eigenvalue of \(\hat{A}\):

(4.26)#\[\begin{split}\begin{align*} \lambda Q_{\lambda \mu} &= \lambda \Braket{\psi_\lambda^0 | \hat{H}' \psi_\mu^0} = \Braket{\lambda \psi_\lambda^0 | \hat{H}' \psi_\mu^0} = \Braket{\hat{A} \psi_\lambda^0 | \hat{H}' \psi_\mu^0} = \Braket{\psi_\lambda^0 | \hat{A} \hat{H}' \psi_\mu^0} \\ &= \Braket{\psi_\lambda^0 | \hat{H}' \hat{A} \psi_\mu^0} = \mu \Braket{\psi_\lambda^0 | \hat{H}' \psi_\mu^0} = \mu Q_{\lambda \mu}, \end{align*}\end{split}\]

where we used that \(\lambda\) (as the eigenvalue of a Hermitian operator) is real in the second equality, that \(\hat{A}\) is Hermitian in the fourth, and that \(\hat{A}\) commutes with \(\hat{H}'\) in the fifth. From equation (4.26) we find that either \(\lambda = \mu\), or \(Q_{\lambda \mu} = 0\). Since we assumed that the two eigenvalues of \(\hat{A}\) differ, the off-diagonal element must vanish, and thus the perturbation matrix is diagonal.
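The statement can be illustrated with a minimal two-state sketch. The matrices below are hypothetical stand-ins, not taken from a physical system: `A` plays the role of \(\hat{A}\) restricted to the degenerate subspace (with non-degenerate eigenvalues \(\pm 1\)), and `Hp` is a perturbation constructed to commute with it. In the original basis `Hp` has off-diagonal elements, but in the eigenbasis of `A` it is diagonal.

```python
import numpy as np

# Hypothetical Hermitian operator with non-degenerate eigenvalues -1 and +1
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

# Hypothetical perturbation, built so that it commutes with A
Hp = 0.3 * np.eye(2) + 0.2 * A

commutator = A @ Hp - Hp @ A          # should vanish

# Eigenbasis of A: the columns of vecs are the joint eigenstates
lam, vecs = np.linalg.eigh(A)

# Perturbation matrix Q in the eigenbasis of A
Q = vecs.conj().T @ Hp @ vecs

print(lam)
print(np.round(Q, 12))
```

The diagonal entries of `Q` are then directly the first-order energy corrections, without solving the eigenvalue problem (4.23).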

4.3. The fine structure of hydrogen#

When writing down the Hamiltonian of an electron in a hydrogen atom, we accounted for two effects: the kinetic energy of the electron, and the Coulomb interaction potential. To lowest order, these two terms are indeed dominant, but there are other physical effects that come into play as corrections. The two largest ones, together known as the fine structure, are a relativistic correction to the kinetic energy, and an extra potential term due to the coupling between the spin and orbital angular momentum of the electron (known as spin-orbit coupling). To assess the magnitude of these effects, we should compare them to the Bohr energy levels, which are proportional to^[4] \(\alpha^2 m_\mathrm{e} c^2\), where

(4.27)#\[ \alpha = \frac{e^2}{4 \pi \varepsilon_0 \hbar c} \approx \frac{1}{137} \]

is known as the fine structure constant. Both the relativistic and spin-orbit correction have energies of the order \(\alpha^4 m_\mathrm{e} c^2\), and are thus an order \(\alpha^2\) smaller than the Bohr energies, so perturbation theory will certainly apply.
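To get a feeling for the numbers, the sketch below evaluates equation (4.27) and the two energy scales mentioned above. The values of the constants are rounded SI values entered by hand (an assumption of this sketch; use a maintained table of constants for precise work).

```python
import math

# Rounded SI values of the constants (assumed here for illustration)
e = 1.602176634e-19        # elementary charge, C
eps0 = 8.8541878128e-12    # vacuum permittivity, F/m
hbar = 1.054571817e-34     # reduced Planck constant, J s
c = 2.99792458e8           # speed of light, m/s
m_e = 9.1093837015e-31     # electron mass, kg

alpha = e**2 / (4 * math.pi * eps0 * hbar * c)    # equation (4.27)

rest_energy_eV = m_e * c**2 / e                   # m_e c^2 in eV
bohr_scale_eV = alpha**2 * rest_energy_eV         # scale of the Bohr energies
fine_scale_eV = alpha**4 * rest_energy_eV         # scale of the fine structure

print("1/alpha         =", 1 / alpha)
print("alpha^2 m_e c^2 =", bohr_scale_eV, "eV")
print("alpha^4 m_e c^2 =", fine_scale_eV, "eV")
```

The ratio of the last two numbers is \(\alpha^2\), i.e., the fine-structure scale is smaller than the Bohr scale by a factor of order \(10^{-4}\).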

4.3.1. Relativistic correction to Bohr energies#

Thus far, we’ve been using quantum-mechanical analogs of classical mechanics expressions in our Hamiltonians. In particular, we’ve used that the kinetic energy operator is given by \(\hat{K} = \hat{p}^2 / 2m_\mathrm{e}\), both in one and in three dimensions. However, when the speed of a particle approaches that of light, this expression for the kinetic energy is no longer correct (it is merely the limit at low velocities). Even if you’ve never seen special relativity, you will no doubt have seen the most famous equation in physics, Einstein’s \(E = m_\mathrm{e}c^2\). That equation is also a special case though: it holds only for stationary particles. For moving particles, we instead have a more general version^[5]:

(4.28)#\[ E^2 = m_\mathrm{e}^2 c^4 + p^2 c^2. \]

The energy in equation (4.28) is the sum of the kinetic and the rest energy (the \(mc^2\) part) of the particle. For the kinetic energy we can then write, in terms of the momentum:

(4.29)#\[\begin{split}\begin{align*} K &= E - m_\mathrm{e} c^2 = \sqrt{m_\mathrm{e}^2 c^4 + p^2 c^2} - m_\mathrm{e}c^2 = m_\mathrm{e}c^2 \left[ \sqrt{1+\left(\frac{p}{m_\mathrm{e}c}\right)^2} - 1 \right] \\ &= m_\mathrm{e} c^2 \left[1 + \frac12 \left(\frac{p}{m_\mathrm{e}c}\right)^2 - \frac18 \left(\frac{p}{m_\mathrm{e}c}\right)^4 + \cdots - 1\right] = \frac{p^2}{2m_\mathrm{e}} - \frac{p^4}{8 m_\mathrm{e}^3 c^2} + \cdots, \end{align*}\end{split}\]

where we can make the Taylor expansion in the second line if \(p \ll m_\mathrm{e}c\), i.e., if the speed of the particle is small compared to that of light. As we see in equation (4.29), the lowest-order term is indeed the classical expression for the kinetic energy, but there are also higher-order corrections. As you can check easily, the rest energy^[6] of an electron (about \(0.5\;\mathrm{MeV}\)) far exceeds the typical energies of the orbitals (which are of the order of \(10\;\mathrm{eV}\)), so the expansion holds, and it suffices to keep the lowest-order correction. The Hamiltonian then reads:

(4.30)#\[ \hat{H} = \hat{H}_0 + \hat{H}_\mathrm{r}' = \frac{\hat{p}^2}{2m_\mathrm{e}} - \frac{\hat{p}^4}{8 m_\mathrm{e}^3 c^2} + \hat{V}, \]

where the relativistic correction (indicated by the subscript \(\mathrm{r}\)) is given by \(\hat{H}_\mathrm{r}' = - \hat{p}^4 / 8 m_\mathrm{e}^3 c^2\). For the unperturbed state, the Schrödinger equation gives

(4.31)#\[ \hat{H}_0 \psi^0 = \left(\frac{\hat{p}^2}{2m_\mathrm{e}} + \hat{V} \right)\psi^0 = E^0 \psi^0. \]
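The quality of the truncated expansion (4.29) that underlies this split can be checked with a few lines of code. The sketch works with the dimensionless ratio \(x = p / m_\mathrm{e} c\) and takes \(x \approx 1/137\) as a representative value for the hydrogen ground state (an assumption of this sketch, based on the typical electron speed being of order \(\alpha c\)).

```python
import math

def kinetic_exact(x):
    """K / (m_e c^2) as a function of x = p / (m_e c), from equation (4.28)."""
    return math.sqrt(1 + x**2) - 1

def kinetic_classical(x):
    """Lowest-order term of (4.29): p^2 / 2 m_e, in units of m_e c^2."""
    return x**2 / 2

def kinetic_corrected(x):
    """Classical term plus the first relativistic correction of (4.29)."""
    return x**2 / 2 - x**4 / 8

x = 1 / 137   # assumed representative value of p / (m_e c)

error_classical = abs(kinetic_exact(x) - kinetic_classical(x))
error_corrected = abs(kinetic_exact(x) - kinetic_corrected(x))
print(error_classical, error_corrected)
```

Including the \(p^4\) term reduces the error by several orders of magnitude, which is why the next terms in the series can safely be dropped here.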

Now we’re in luck: although (as we know) the eigenstates of the unperturbed Hamiltonian of hydrogen are highly degenerate, our perturbation \(\hat{H}_\mathrm{r}'\) is radially symmetric, and hence commutes with both \(\hat{L}^2\) and \(\hat{L}_z\). Therefore, the only quantum number we have to consider here is \(n\), and for different values of \(n\) we have different energies, also in the unperturbed case. We can therefore find the relativistic correction to the energies through application of non-degenerate perturbation theory. Applying the expression for the first-order correction to the energy, equation (4.9), and the fact that \(\hat{p}^2\) is Hermitian, we readily find

(4.32)#\[\begin{split}\begin{align*} E_\mathrm{r}^1 &= \Braket{\psi^0 | \hat{H}_\mathrm{r}' | \psi^0} = - \frac{1}{8 m_\mathrm{e}^3 c^2} \Braket{\psi^0 | \hat{p}^4 \psi^0} = - \frac{1}{8 m_\mathrm{e}^3 c^2} \Braket{\hat{p}^2 \psi^0 | \hat{p}^2 \psi^0} \\ &= - \frac{1}{8 m_\mathrm{e}^3 c^2} \Braket{2m_\mathrm{e}(E^0-V)\psi^0 | 2m_\mathrm{e}(E^0-V)\psi^0} \\ &= - \frac{1}{2m_\mathrm{e}c^2} \left[(E^0)^2 - 2 E^0 \Braket{V} + \Braket{V^2} \right], \end{align*}\end{split}\]

where we used the unperturbed Schrödinger equation (4.31) to rewrite \(\hat{p}^2 \psi^0\) in the second line. Note that equation (4.32) holds for an arbitrary pair (\(\psi_n, E_n\)) of eigenfunctions and associated energies of the unperturbed Hamiltonian. Also, equation (4.32) holds for an arbitrary radially symmetric potential. For the hydrogen atom, we have \(V = -e^2/(4\pi\varepsilon_0 r)\), so we have to evaluate the expectation value of \(1/r\) and \(1/r^2\) in an arbitrary eigenstate \(\psi_{nlm}\) of the unperturbed hydrogen Hamiltonian. The calculations are a bit involved (see Exercise 4.3); the results are:

(4.33)#\[ \Braket{\frac{1}{r}} = \frac{1}{n^2 a}, \qquad \Braket{\frac{1}{r^2}} = \frac{1}{n^3 \left(l+\frac12\right) a^2}, \]

where \(a\) is the Bohr radius. Substituting these in equation (4.32), we find for the relativistic correction to the energy \(E_n\) of the state \(\psi_{nlm}\)

(4.34)#\[ E_\mathrm{r}^1 = - \frac{(E_n^0)^2}{2 m_\mathrm{e} c^2} \left( \frac{4n}{l + \frac12} - 3 \right). \]

Note that \(E_\mathrm{r}^1\) depends on the value of the orbital quantum number \(l\) as well as the principal quantum number \(n\), so the relativistic correction does indeed (partially) lift the degeneracy of the eigenstates of the hydrogen atom.
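To see the size of the effect, equation (4.34) can be evaluated directly. The sketch below uses rounded values for the ground-state Bohr energy and the electron rest energy (both assumptions entered by hand) and works in electronvolts.

```python
def relativistic_correction_eV(n, l, E1_eV=-13.6057, mc2_eV=510998.95):
    """First-order relativistic correction, equation (4.34), in eV.

    E1_eV: ground-state Bohr energy (rounded value, assumed).
    mc2_eV: electron rest energy (rounded value, assumed).
    """
    E_n = E1_eV / n**2                     # unperturbed Bohr energy E_n^0
    return -(E_n**2) / (2 * mc2_eV) * (4 * n / (l + 0.5) - 3)

for n in (1, 2, 3):
    for l in range(n):
        print(n, l, relativistic_correction_eV(n, l))
```

For \(n = 2\), the \(l = 0\) and \(l = 1\) states receive different corrections, which is the partial lifting of the degeneracy described above; all corrections are negative and several orders of magnitude smaller than the Bohr energies.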

4.3.2. Spin-orbit coupling#

../_images/spinorbitcoupling.svg — Fig. 4.2 Spin-orbit coupling in a hydrogen atom. (a) The classical point of view: an electron orbits a single-proton nucleus. (b) From the electron’s point of view, the proton is orbiting it. As the proton is a moving electrical charge, it generates a magnetic field \(\bm{B}\), which couples to the magnetic dipole \(\bm{\mu}\) originating from the electron’s spin.#

As we’ve seen in Section 3.4.2, because an electron has a nonzero spin, it has a nonzero magnetic dipole moment \(\bm{\mu}\) (equation (3.56)), which can couple to an external magnetic field. As you hopefully remember from electromagnetism, moving electric charges generate magnetic fields. We typically consider electrons in atoms to be moving, and thus they will generate such magnetic fields. That field however cannot interact with the electron’s magnetic moment; if it did, the electron would be generating a force on itself from nothing. However, from the point of view of the electron, it’s the nucleus that’s moving, and as the nucleus also has an electric charge, it does generate a magnetic field that can interact with the electron (cf. Fig. 4.2).

Before we calculate the effect of the interaction between the magnetic field \(\bm{B}\) due to the moving proton and the magnetic dipole of the electron, we first consider a caveat: because the electron’s motion around the proton is an orbit, there’s a force acting on the electron, keeping it in orbit, which it can only do by accelerating the electron (i.e., changing its velocity). The same is true for the moon’s orbit around the Earth: Earth’s gravitational pull exerts a force on the moon, which changes the direction of the moon’s velocity such that it stays in its closed orbit. Now while (by Einstein’s principle of relativity) the laws of physics are the same in all inertial frames of reference, we should expect extra terms when transforming to an accelerating frame of reference. Again, the same happens in classical mechanics, where going to a rotating frame causes the emergence of fictitious forces like the centrifugal and Coriolis forces (which, among many other things, strongly influence the weather on the spinning sphere on which we live). As it turns out, we’re somewhat in luck: there will be no additional fictitious forces here, only a numerical correction factor.

Given the magnetic dipole moment \(\bm{\mu} = \gamma \bm{S}\), the Hamiltonian for the interaction between the dipole and the magnetic field is simply \(\hat{H} = - \bm{\mu} \cdot \bm{B}\). For the magnitude of the magnetic field, we invoke the Biot-Savart law from electromagnetism, which states that a current \(I\) in a closed loop of radius \(r\) generates a field of magnitude \(B = \mu_0 I / 2 r\) at the center of the loop, with \(\mu_0\) the permeability of free space. If the electron takes time \(T\) to complete a loop, we have \(I = e/T\). The magnitude of the angular momentum of the electron in its circular orbit is \(L = r m_\mathrm{e} v = 2 \pi m_\mathrm{e} r^2 / T\). As the magnetic field and the angular momentum moreover point in the same direction, we can thus write

(4.35)#\[ \bm{B} = \frac{1}{4 \pi \varepsilon_0} \frac{e}{m_\mathrm{e} c^2 r^3} \bm{L}, \]

where we used \(c = 1/\sqrt{\varepsilon_0 \mu_0}\). The Hamiltonian will thus contain a dot product between the spin and the orbital angular momentum; the resulting perturbation to the hydrogen Hamiltonian is therefore known as spin-orbit coupling. As I’ve stated above, we should expect the actual correction to include an additional factor due to the transition to a non-inertial frame^[7]; this additional factor turns out to be \(\frac12\). Finally, we use again that for a relativistic electron, \(\gamma = - e / m_\mathrm{e}\). Putting everything together, the spin-orbit correction Hamiltonian reads:

(4.36)#\[ \hat{H}'_\mathrm{SO} = \frac{e^2}{8 \pi \varepsilon_0} \frac{1}{m_\mathrm{e}^2 c^2 r^3} \hat{\bm{S}} \cdot \hat{\bm{L}}. \]

Unlike the relativistic correction, the spin-orbit Hamiltonian (4.36) is not radially symmetric, and therefore does not commute with either \(\bm{L}\) or \(\bm{S}\) (and specifically, not with either \(\hat{L}_z\) or \(\hat{S}_z\)). However, it does still commute with both \(\hat{L}^2\) and \(\hat{S}^2\), and it commutes with the sum of the orbital and spin angular momentum, \(\bm{J} = \bm{L} + \bm{S}\). Therefore, eigenstates of \(\hat{L}_z\) and \(\hat{S}_z\) are no longer ‘good’ eigenstates, but eigenstates of \(\hat{L}^2\), \(\hat{S}^2\), \(\hat{J}^2\) and \(\hat{J}_z\) are. Moreover, we can relate the first three to \(\bm{L} \cdot \bm{S}\), making the calculation of the eigenvalues of our perturbed Hamiltonian in these eigenstates easy. We have:

(4.37)#\[\begin{split}\begin{align*} \hat{J}^2 &= \left(\hat{\bm{L}} + \hat{\bm{S}}\right) \cdot \left(\hat{\bm{L}} + \hat{\bm{S}}\right) = \hat{L}^2 + \hat{S}^2 + 2 \hat{\bm{L}} \cdot \hat{\bm{S}} \\ \hat{\bm{L}} \cdot \hat{\bm{S}} &= \frac12 \left( \hat{J}^2 - \hat{L}^2 - \hat{S}^2 \right), \end{align*}\end{split}\]

and therefore the eigenvalues of \(\hat{\bm{L}} \cdot \hat{\bm{S}}\) are

(4.38)#\[ \frac12 \hbar^2 \left[ j(j+1) - l(l+1) - s(s+1) \right]. \]

Calculating the expectation value of the perturbation (4.36) in terms of the unperturbed eigenstates takes some work, but is in principle straightforward. From it, we get for the energies (setting \(s=\frac12\) for our electron):

(4.39)#\[\begin{split}\begin{align*} E_\mathrm{SO}^1 &= \Braket{\hat{H}'_\mathrm{SO}} = \frac{e^2}{8 \pi \varepsilon_0} \frac{1}{m_\mathrm{e}^2 c^2} \Braket{\frac{1}{r^3} \hat{\bm{S}} \cdot \hat{\bm{L}}} \\ &= \frac{e^2}{8 \pi \varepsilon_0} \frac{1}{m_\mathrm{e}^2 c^2} \Braket{\frac{1}{r^3}} \frac12 \hbar^2 \left[ j(j+1) - l(l+1) - s(s+1) \right] \\ &= \frac{e^2}{8 \pi \varepsilon_0} \frac{1}{(m_\mathrm{e}c)^2} \frac{\hbar^2}{2} \frac{j(j+1) - l(l+1) - \frac34}{l(l+\frac12)(l+1) n^3 a_0^3} \\ &= \frac{\left(E_n^0\right)^2}{m_\mathrm{e} c^2} \frac{j(j+1) - l(l+1) - \frac34}{l(l+\frac12)(l+1)} n. \end{align*}\end{split}\]

Remarkably, the relativistic correction and the spin-orbit coupling effect are thus of the same order of magnitude: both scale as \((E_n^0)^2/m_\mathrm{e}c^2\), which is of order \(\alpha^4 m_\mathrm{e} c^2\). We therefore need to consider them together, which actually simplifies the final correction energies to

(4.40)#\[ E_\mathrm{FS}^1 = \frac{\left(E_n^0\right)^2}{2 m_\mathrm{e} c^2} \left( 3 - \frac{4n}{j + \frac12}\right). \]

Including fine structure, the energy levels of hydrogen are no longer just functions of the principal quantum number \(n\); they now also contain contributions from the combined angular momentum quantum number \(j\):

(4.41)#\[ E_{nj} = - \frac{R_\mathrm{E}}{n^2} \left[1 + \frac{\alpha^2}{n^2} \left( \frac{n}{j + \frac12} - \frac34 \right) \right]. \]

Including fine structure thus breaks some of the symmetry of the excited states of the hydrogen atom: the degeneracy of the \(l\) orbitals for \(n>1\) is lifted. Some symmetry is preserved though, as states with the same value of \(j\) still have the same energy. Moreover, \(m_l\) and \(m_s\) are no longer ‘good’ quantum numbers; the eigenstates of hydrogen including fine structure are characterized by their values of \(n\), \(l\), \(s\), \(j\) and \(m_j\). The corrections to energy levels for the first three states are plotted in Fig. 4.3.

../_images/a02f36dcaefc5ee6008364f0fd3202f472411dfdbb4555e850883b49e9e38290.svg — Fig. 4.3 Fine-structure corrections (equation (4.40)) to the first three energy levels of the hydrogen atom, as a function of the value of the fine-structure constant \(\alpha\). Note that the actual value of \(\alpha\) is very small (approximately \(1/137\)), so the corrections are also small. The fine-structure correction lifts the degeneracy in the quantum number \(l\), but not in \(j\).#
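The combination of the two corrections can be verified numerically. The sketch below implements equations (4.34), (4.39), (4.40) and (4.41) with rounded values of \(\alpha\) and the electron rest energy (assumptions entered by hand), and identifies \(R_\mathrm{E}\) with \(\frac12 \alpha^2 m_\mathrm{e} c^2\) (also an assumption of this sketch). Because the final expression in (4.39) has \(l\) in the denominator, the comparison is restricted to \(l \geq 1\).

```python
import math

ALPHA = 1 / 137.035999                   # fine-structure constant (rounded, assumed)
MC2_EV = 510998.95                       # electron rest energy in eV (rounded, assumed)
RYDBERG_EV = 0.5 * ALPHA**2 * MC2_EV     # assumed identification of R_E

def bohr_energy(n):
    return -RYDBERG_EV / n**2

def e_rel(n, l):
    """Relativistic correction, equation (4.34)."""
    return -bohr_energy(n)**2 / (2 * MC2_EV) * (4 * n / (l + 0.5) - 3)

def e_so(n, l, j):
    """Spin-orbit correction, last line of (4.39); only evaluated for l >= 1."""
    numerator = n * (j * (j + 1) - l * (l + 1) - 0.75)
    return bohr_energy(n)**2 / MC2_EV * numerator / (l * (l + 0.5) * (l + 1))

def e_fs(n, j):
    """Combined fine-structure correction, equation (4.40)."""
    return bohr_energy(n)**2 / (2 * MC2_EV) * (3 - 4 * n / (j + 0.5))

def energy_nj(n, j):
    """Energy level including fine structure, equation (4.41)."""
    return -RYDBERG_EV / n**2 * (1 + ALPHA**2 / n**2 * (n / (j + 0.5) - 0.75))

for n in (2, 3):
    for l in range(1, n):
        for j in (l - 0.5, l + 0.5):
            print(n, l, j, e_rel(n, l) + e_so(n, l, j), e_fs(n, j))

# Splitting between the j = 3/2 and j = 1/2 levels for n = 2, in eV
splitting_n2 = energy_nj(2, 1.5) - energy_nj(2, 0.5)
print("n = 2 fine-structure splitting:", splitting_n2, "eV")
```

The two printed columns should agree for every allowed combination of \(n\), \(l\) and \(j\), illustrating that the sum no longer depends on \(l\).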

4.4. Problems#

Exercise 4.1 (Corrections to the wave function and energy in non-degenerate perturbation theory)

In this problem, we’ll derive the expressions (4.10) and (4.11) for the first-order correction to the wave function and the second-order correction to the energy of a non-degenerate system. First, we re-write equation (4.6) as

(4.42)#\[\left( \hat{H}^0 - E_n^0 \right) \psi_n^1 = - \left( \hat{H}' - E_n^1 \right) \psi_n^0.\]

We already have an expression for \(E_n^1\), and we assumed we knew \(\psi_n^0\), so the right-hand side of equation (4.42) is a known function, and we’re left with a differential equation for the unknown function \(\psi_n^1\). As the eigenfunctions \(\psi_k^0\) of the unperturbed Hamiltonian \(\hat{H}^0\), which is a Hermitian operator, form a complete set, we can express \(\psi_n^1\) as a linear combination of the \(\psi_k^0\):

(4.43)#\[\psi_n^1 = \sum_{k=0}^\infty c_k \psi_k^0.\]

(a) Argue why we can leave out the term with \(k=n\) in the sum in equation (4.43).
(b) Substitute (4.43), without the \(k=n\) term, into equation (4.42), then take the inner product with \(\psi_m^0\), to show that both sides vanish if \(m=n\) and that, for \(m \neq n\),

(4.44)#\[\left( E_m^0 - E_n^0 \right) c_m = - \Braket{\psi_m^0 | \hat{H}' | \psi_n^0}.\]

(c) From equations (4.44) and (4.43), derive equation (4.10).
(d) Repeat the calculation in Section 4.1, now retaining terms to order \(\varepsilon^2\), to derive equation (4.11) for the second-order correction to the energy. You’ll need to use the fact that (by part (a)) \(\psi_n^0\) and \(\psi_n^1\) are orthogonal.
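
Once you have your results, you can test them numerically on a system that can be solved exactly. The sketch below does so for a two-level system with \(\hat{H}^0 = \mathrm{diag}(E_1^0, E_2^0)\) and a real symmetric perturbation with matrix elements \(a\), \(b\) and \(d\). It assumes that the first- and second-order energy corrections take their standard forms, \(E_n^1 = \Braket{\psi_n^0 | \hat{H}' | \psi_n^0}\) and \(E_n^2 = \sum_{m \neq n} |\Braket{\psi_m^0 | \hat{H}' | \psi_n^0}|^2 / (E_n^0 - E_m^0)\); the numerical values and function names are arbitrary choices for this illustration.

```python
import math

def exact_levels(e1, e2, a, b, d, eps):
    """Exact eigenvalues of [[e1 + eps*a, eps*b], [eps*b, e2 + eps*d]], lowest first."""
    h11, h22, h12 = e1 + eps * a, e2 + eps * d, eps * b
    mean = 0.5 * (h11 + h22)
    radius = math.hypot(0.5 * (h11 - h22), h12)
    return mean - radius, mean + radius

def perturbative_levels(e1, e2, a, b, d, eps):
    """Energies to second order in eps (assumes e1 < e2, so the order is preserved)."""
    first = (a, d)                                  # diagonal elements of H'
    second = (b**2 / (e1 - e2), b**2 / (e2 - e1))   # |<m|H'|n>|^2 / (E_n^0 - E_m^0)
    return tuple(e0 + eps * f + eps**2 * s
                 for e0, f, s in zip((e1, e2), first, second))

def max_error(eps, e1=1.0, e2=2.0, a=0.3, b=0.5, d=-0.2):
    """Largest difference between the exact and second-order energies."""
    exact = exact_levels(e1, e2, a, b, d, eps)
    approx = perturbative_levels(e1, e2, a, b, d, eps)
    return max(abs(x - y) for x, y in zip(exact, approx))

for eps in (0.1, 0.01):
    print(f"eps = {eps}: max error = {max_error(eps):.2e}")
```

If the second-order expression is correct, the remaining error should be of order \(\varepsilon^3\), so reducing \(\varepsilon\) by a factor of ten should reduce the error by roughly a factor of a thousand.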

Exercise 4.2 (The induced dipole perturbation)

In Section 4.1.1 we studied the perturbation to a hydrogen atom due to the presence of another hydrogen atom nearby, and found that this perturbation leads to an attractive van der Waals force. In this problem we’ll derive the dipole-dipole interaction energy, equation (4.14), in the limit that the separation \(R\) between the atoms is much larger than the Bohr radius \(a_0\).

(a) We choose coordinates such that the two atomic nuclei are on the \(z\) axis, separated by a distance \(R\). Let \(\bm{r}_1 = (x_1, y_1, z_1)\) be the position of the electron in atom A, in a coordinate system where the nucleus of atom A is at the origin. Likewise, let \(\bm{r}_2 = (x_2, y_2, z_2)\) be the position of the electron in atom B, in a (shifted) coordinate system where the nucleus of atom B is at the origin. Express the distance \(r_{12}\) between the two electrons and the distances \(r_{1\mathrm{B}}\) and \(r_{2\mathrm{A}}\) between each electron and the opposite nucleus (see Fig. 4.1(a)) in terms of the coordinates \(x_1, x_2, y_1, y_2, z_1, z_2\) and the nuclear separation \(R\).
(b) Your expressions from (a) should contain one term that’s linear in \(R\), and others of order \(a_0\) (e.g. \(x_1\) and \(z_1\)). Pull out a factor \(R\) from these expressions, and expand \(1/r_{1\mathrm{B}}\), \(1/r_{2\mathrm{A}}\) and \(1/r_{12}\) to second order in the small ratios of magnitude \(a_0/R\) (such as \(x_1/R\) and \(z_1/R\)).
(c) Finally, substitute your answers to (b) in the perturbation term \(\hat{H}'\) in the van der Waals Hamiltonian (4.12) to get the dipole-dipole interaction of equation (4.14).
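
A numerical check can help you catch sign errors in the expansion. The sketch below compares the exact combination \(1/R + 1/r_{12} - 1/r_{1\mathrm{B}} - 1/r_{2\mathrm{A}}\) with the leading term \((x_1 x_2 + y_1 y_2 - 2 z_1 z_2)/R^3\) for one arbitrary choice of electron positions. It assumes that, up to the common prefactor \(e^2/4\pi\varepsilon_0\), these are the forms of the perturbation in (4.12) and of the dipole-dipole term in (4.14); lengths are measured in units of \(a_0\), and the function names are illustrative.

```python
import math

def exact_bracket(r1, r2, R):
    """1/R + 1/r12 - 1/r1B - 1/r2A, with nucleus A at the origin and B at (0, 0, R).

    r1 is the position of electron 1 relative to nucleus A,
    r2 the position of electron 2 relative to nucleus B.
    """
    x1, y1, z1 = r1
    x2, y2, z2 = r2
    r12 = math.sqrt((x2 - x1)**2 + (y2 - y1)**2 + (R + z2 - z1)**2)
    r1B = math.sqrt(x1**2 + y1**2 + (z1 - R)**2)
    r2A = math.sqrt(x2**2 + y2**2 + (z2 + R)**2)
    return 1 / R + 1 / r12 - 1 / r1B - 1 / r2A

def dipole_bracket(r1, r2, R):
    """Leading large-R term (assumed form): (x1*x2 + y1*y2 - 2*z1*z2) / R^3."""
    x1, y1, z1 = r1
    x2, y2, z2 = r2
    return (x1 * x2 + y1 * y2 - 2 * z1 * z2) / R**3

def relative_error(R, r1=(0.3, -0.2, 0.5), r2=(-0.4, 0.1, 0.6)):
    """Relative difference between the exact bracket and the dipole term."""
    exact = exact_bracket(r1, r2, R)
    approx = dipole_bracket(r1, r2, R)
    return abs(exact - approx) / abs(approx)

for R in (10.0, 100.0, 1000.0):
    print(f"R = {R:6.0f} a0: relative error = {relative_error(R):.2e}")
```

The relative error should shrink roughly in proportion to \(a_0/R\), reflecting the third-order terms that the expansion drops.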

Exercise 4.3 (Relativistic corrections to the hydrogen atom)

In this problem, we’ll derive the expressions in equation (4.33) for the expectation values of \(1/r\) and \(1/r^2\) in the unperturbed eigenstates of the hydrogen atom.

(a) For the expectation value of \(1/r\), use the virial Theorem 2.1 to derive the expression given in equation (4.33).

We can find the expectation values of both \(1/r\) and \(1/r^2\) from the Hellmann-Feynman theorem (see Exercise 1.13), which states that if the Hamiltonian depends on a parameter \(u\), then its eigenvalues (i.e., the energies) are also functions of \(u\), and we have

(4.45)#\[\frac{\mathrm{d}E}{\mathrm{d}u} = \Braket{\frac{\partial \hat{H}}{\partial u}}.\]

We’ll apply the Hellmann-Feynman theorem to the effective Hamiltonian of the radial part of the wavefunction of a hydrogen atom, see equation (2.69):

(4.46)#\[\hat{H} = -\frac{\hbar^2}{2\me} \frac{\mathrm{d}^2}{\mathrm{d}r^2} + \frac{\hbar^2}{2\me} \frac{l(l+1)}{r^2} - \frac{e^2}{4 \pi \varepsilon_0} \frac{1}{r}.\]

As the Hamiltonian in equation (4.46) is given in terms of the quantum number \(l\), for the purposes of this problem, we’d also like to write the (unperturbed!) energies in terms of \(l\). We can do so by invoking equation (2.82), which states that a series solution to the Schrödinger equation that terminates must satisfy \(\rho_0 = 2(j_\mathrm{max}+l+1) = 2n\). We used this equation to define the quantum number \(n\), but now we’ll use it to express \(n\) in terms of \(l\), as \(n = l + j_\mathrm{max} + 1 = N + l\) (where, for a given wavefunction, \(N\) is a fixed integer). Substituting this expression into the energy (2.83) of the unperturbed state with quantum number \(n\), we can write \(E_n\) as

(4.47)#\[E_n = - \frac{\me e^4}{2 (4 \pi \varepsilon_0 \hbar)^2 (N+l)^2}.\]

(b) Apply the Hellmann-Feynman theorem to the Hamiltonian (4.46) with energies (4.47) with \(u = e\) to re-derive your expression from (a) for the expectation value of \(1/r\).
(c) Now apply the Hellmann-Feynman theorem to the same Hamiltonian and energies with \(u = l\) to get \(\Braket{1/r^2}\).
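
The last step can be checked numerically. The sketch below works in Hartree atomic units (an assumption made for convenience: \(\hbar = m_\mathrm{e} = e^2/4\pi\varepsilon_0 = 1\), so that lengths are in units of \(a_0\)), in which equation (4.47) becomes \(E = -1/2(N+l)^2\) and \(\partial \hat{H}/\partial l = (l+\frac12)/r^2\). It estimates \(\mathrm{d}E/\mathrm{d}l\) with a finite difference and compares the resulting \(\Braket{1/r^2}\) with a direct numerical integration over the standard normalized radial functions \(R_{10} = 2 e^{-r}\) and \(R_{21} = r e^{-r/2}/\sqrt{24}\). All function names are illustrative.

```python
import math

def energy(N, l):
    """Unperturbed energy of equation (4.47) in Hartree atomic units."""
    return -0.5 / (N + l)**2

def inv_r2_hellmann_feynman(n, l, h=1e-5):
    """<1/r^2> from dE/dl = (l + 1/2) <1/r^2>, with N = n - l held fixed."""
    N = n - l
    dE_dl = (energy(N, l + h) - energy(N, l - h)) / (2 * h)   # central difference
    return dE_dl / (l + 0.5)

def inv_r2_integral(radial, r_max=60.0, steps=200_000):
    """<1/r^2> as the integral of R(r)^2 dr, using the midpoint rule.

    The factor r^2 from the volume element cancels against 1/r^2.
    """
    dr = r_max / steps
    return sum(radial((i + 0.5) * dr)**2 for i in range(steps)) * dr

def radial_1s(r):
    return 2.0 * math.exp(-r)

def radial_2p(r):
    return r * math.exp(-r / 2) / math.sqrt(24.0)

for label, n, l, radial in (("1s", 1, 0, radial_1s), ("2p", 2, 1, radial_2p)):
    hf = inv_r2_hellmann_feynman(n, l)
    direct = inv_r2_integral(radial)
    print(f"{label}: Hellmann-Feynman {hf:.6f}, direct integration {direct:.6f}")
```

Note that \(l\) is treated as a continuous parameter when taking the derivative, while \(N\) is kept fixed; this is exactly the reason for rewriting the energy in the form (4.47).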