9. Lagrangian mechanics*
In the preceding chapters, we studied mechanics based on Newton’s laws of motion. From these laws we can derive equations of motion that describe the dynamics of particles under the action of forces or torques. We also found that the laws of motion lead to three conservation laws. In this chapter, we will construct and apply an alternative approach, which will also allow us to derive conservation laws and equations of motion, in a somewhat more general way than we have done so far. In particular, the method will allow us to easily find the equations of motion of systems with constraints or of systems in which the choice of coordinates is not obvious, and to identify the relation between the conserved quantities and the symmetries of a system.
9.1. Hamilton’s principle and the Euler-Lagrange equations
To illustrate the basic idea of Lagrangian mechanics, we start with a simple case: a particle of mass \(m\) that can move in one dimension under the action of a conservative force \(F(x)\). Writing the force as minus the derivative of a potential energy \(U(x)\), Newton’s second law of motion gives us the equation of motion for this system:
\[ m \ddot{x} = F(x) = -\frac{\mathrm{d}U(x)}{\mathrm{d}x}. \tag{9.1} \]
As we’ve seen in Section 3, the sum of the kinetic energy \(K = \frac12 m \dot{x}^2\) and the potential energy \(U\) of the particle is conserved. Here we’ll work with the difference between the kinetic and potential energy, known as the Lagrangian:
\[ \mathcal{L}(x, \dot{x}) = K(\dot{x}) - U(x) = \frac12 m \dot{x}^2 - U(x). \tag{9.2} \]
In equation (9.2) we’ve written the Lagrangian[1] as a function of the position \(x\) and the velocity \(\dot{x}\) of the particle. Note that, although the potential and kinetic energy (and thus the Lagrangian) do not depend explicitly on time (there is no \(t\) in their definition), they do depend on time implicitly because \(x(t)\) and \(\dot{x}(t)\) do; ultimately, time is the only free parameter in this system. By writing \(\mathcal{L}\) as a function of the functions \(x(t)\) and \(\dot{x}(t)\), we have created a functional: a function of a function.
Because of the (implicit) time dependence of \(\mathcal{L}\), we can integrate it over time; this integral gives us another functional of \(x\), the action:
\[ S[x] = \int_0^T \mathcal{L}\big(x(t), \dot{x}(t)\big) \,\mathrm{d}t. \tag{9.3} \]
In equation (9.3) we used the square brackets to indicate that \(S\) is a functional rather than a function; by substituting the explicit dependence of \(x(t)\) and \(\dot{x}(t)\) on time, \(\mathcal{L}\) in equation (9.3) has become an ordinary function of time.
Equation (9.3) allows us to calculate a value for the action for any trajectory \(x(t)\) that takes a particle from a point \(x_A\) to a point \(x_B\) in a time interval \(T\). Newtonian mechanics tells us that, once we know all the forces, there are only two conditions we can freely select (because Newton’s second law of motion is of second order), which, once set, uniquely determine the particle’s path. So far, we’ve usually set initial conditions: the position and velocity of a particle at \(t=0\), but we could also have set boundary conditions (e.g. the position of the particle at \(t=0\) and \(t=T\)), as we do here. As we know[2] that Newtonian mechanics gives correct predictions about the path taken, calculating an action for alternative paths may seem like a rather futile exercise[3]. However, it turns out that the actual path is always at an extremum of the action (known as a stationary point for functionals). The foundation of Lagrangian mechanics lies in exactly this property, which can be formulated as an alternative axiom:
(Hamilton’s principle)
The time evolution \(x(t)\) of a mechanical system corresponds to a stationary point of the action functional (9.3).
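As a numerical illustration of Hamilton’s principle, the following sketch compares the action of the Newtonian path with that of nearby trial paths. It assumes a particle in uniform gravity with \(U = m g x\), boundary conditions \(x(0) = x(T) = 0\), and perturbations \(h(t) = \varepsilon \sin(\pi t / T)\); the parameter values and function names are illustrative choices, not part of the text.

```python
import math

m, g, T = 1.0, 9.81, 1.0  # assumed illustrative values (SI units)

def lagrangian(x, v):
    # L = K - U for a particle in uniform gravity, with U = m g x
    return 0.5 * m * v**2 - m * g * x

def action(path, velocity, n=2000):
    # Midpoint-rule approximation of S = integral of L(x(t), v(t)) dt over [0, T]
    dt = T / n
    return sum(lagrangian(path((k + 0.5) * dt), velocity((k + 0.5) * dt)) * dt
               for k in range(n))

# Newtonian path (solution of x'' = -g) with x(0) = x(T) = 0
def x_true(t):
    return 0.5 * g * t * (T - t)

def v_true(t):
    return 0.5 * g * (T - 2.0 * t)

def perturbed(eps):
    # Add h(t) = eps * sin(pi t / T), which vanishes at t = 0 and t = T
    def x(t):
        return x_true(t) + eps * math.sin(math.pi * t / T)
    def v(t):
        return v_true(t) + eps * (math.pi / T) * math.cos(math.pi * t / T)
    return x, v

S_true = action(x_true, v_true)
delta_S = {}
for eps in (-0.2, -0.1, 0.1, 0.2):
    x_eps, v_eps = perturbed(eps)
    delta_S[eps] = action(x_eps, v_eps) - S_true
    print(f"eps = {eps:+.1f}: S[x+h] - S[x] = {delta_S[eps]:.5f}")
```

For this particular Lagrangian the difference has no term linear in \(\varepsilon\): it equals \(m \varepsilon^2 \pi^2 / (4T)\), so the Newtonian path is a stationary point (here a minimum) of the action.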
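To find the stationary point(s) of the action functional, we need to find a function \(x(t)\) for which the value of \(S[x]\) is at an extremum. Perhaps surprisingly, doing so is fairly straightforward. Suppose we have a function \(x(t)\) for which \(S[x]\) is at an extremum, and we have a collection of other functions \(h(t)\) that satisfy \(h(0) = h(T) = 0\). If the magnitudes of both \(h(t)\) and its derivative \(\dot{h}(t)\) are relatively small, we can expand the expression in the action functional in \(h\):
\[ S[x+h] = \int_0^T \mathcal{L}(x + h, \dot{x} + \dot{h}) \,\mathrm{d}t = \int_0^T \left[ \mathcal{L}(x, \dot{x}) + \frac{\partial \mathcal{L}}{\partial x} h + \frac{\partial \mathcal{L}}{\partial \dot{x}} \dot{h} \right] \mathrm{d}t + \mathcal{O}(h^2), \tag{9.4} \]
where \(\mathcal{O}(h^2)\) stands for terms that are quadratic in either \(h(t)\) or \(\dot{h}(t)\), and the partial derivatives of \(\mathcal{L}(x, \dot{x})\) treat the Lagrangian \(\mathcal{L}\) as a (regular) function of \(x\) and \(\dot{x}\). Since the zeroth-order term in the expansion in (9.4) is simply the action functional of \(x\), we can express the difference between \(S[x+h]\) and \(S[x]\) in terms of \(h\) and \(\dot{h}\); if we then integrate the term with \(\dot{h}\) by parts (the boundary term vanishes because \(h(0) = h(T) = 0\)), we arrive at an expression that depends on \(h(t)\) alone:
\[ S[x+h] - S[x] = \int_0^T \left[ \frac{\partial \mathcal{L}}{\partial x} h + \frac{\partial \mathcal{L}}{\partial \dot{x}} \dot{h} \right] \mathrm{d}t + \mathcal{O}(h^2) = \int_0^T \left[ \frac{\partial \mathcal{L}}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial \mathcal{L}}{\partial \dot{x}} \right] h(t) \,\mathrm{d}t + \mathcal{O}(h^2). \tag{9.5} \]
Now \(x(t)\) is a stationary point of \(S[x]\) if and only if the left-hand side[4] of equation (9.5) vanishes for any function \(h(t)\). For that to hold, the right-hand side must also vanish, which means that the expression inside the square brackets in the integral must vanish. We have thus arrived at a differential equation involving the Lagrangian; it is known as the Euler-Lagrange equation:
\[ \frac{\partial \mathcal{L}}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial \mathcal{L}}{\partial \dot{x}} = 0. \tag{9.6} \]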
For the Lagrangian in equation (9.2), the Euler-Lagrange equation reproduces Newton’s second law, equation (9.1). For conservative systems, we can thus replace Newton’s second law as an axiom with Hamilton’s principle. So far, however, that just involves extra maths; you may well wonder why we bother re-deriving what we already know. The answer is twofold. First, Hamilton’s principle applies to any set of coordinates, not just Cartesian ones, which makes finding expressions in other coordinate systems much easier. Second, the formalism allows us to easily include constraints on the motion (e.g. fixing the motion to a given surface in three-dimensional space), while the equivalent approach in Newtonian mechanics would yield very cumbersome equations.
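The claim that the Euler-Lagrange equation reproduces Newton’s second law can be checked in a few steps. The sketch below assumes the Lagrangian \(\mathcal{L} = \frac12 m \dot{x}^2 - U(x)\) (kinetic minus potential energy, as described above) and the Euler-Lagrange equation in the form \(\frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial \mathcal{L}}{\partial \dot{x}} = \frac{\partial \mathcal{L}}{\partial x}\):

\[ \frac{\partial \mathcal{L}}{\partial x} = -\frac{\mathrm{d}U}{\mathrm{d}x}, \qquad \frac{\partial \mathcal{L}}{\partial \dot{x}} = m \dot{x}, \qquad \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial \mathcal{L}}{\partial \dot{x}} = m \ddot{x} \quad \Rightarrow \quad m \ddot{x} = -\frac{\mathrm{d}U}{\mathrm{d}x} = F(x). \]

As an illustrative special case (not taken from the text), a mass on a spring with \(U(x) = \frac12 k x^2\) gives \(m \ddot{x} = -k x\), the familiar harmonic oscillator equation.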
9.2. Euler-Lagrange equations in general coordinates
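Before we go to arbitrary coordinate systems, we first expand the Euler-Lagrange equations to multiple dimensions. Fortunately, that is easy: we can write the Lagrangian in terms of vectors \(\bm{x}(t)\) and \(\dot{\bm{x}}(t)\), which gives us an action that is a functional of \(\bm{x}\). The variation follows the same steps, which gives for the equivalent of equations (9.4) and (9.5):
\[ S[\bm{x}+\bm{h}] - S[\bm{x}] = \int_0^T \sum_i \left[ \frac{\partial \mathcal{L}}{\partial x_i} h_i + \frac{\partial \mathcal{L}}{\partial \dot{x}_i} \dot{h}_i \right] \mathrm{d}t + \mathcal{O}(|h|^2) = \int_0^T \sum_i \left[ \frac{\partial \mathcal{L}}{\partial x_i} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial \mathcal{L}}{\partial \dot{x}_i} \right] h_i(t) \,\mathrm{d}t + \mathcal{O}(|h|^2). \tag{9.7} \]
In equation (9.7), \(x_i\) is the \(i\)th component of \(\bm{x}\), \(h_i\) the \(i\)th component of \(\bm{h}\), and \(|h| = |\bm{h}|\) (again under the assumption that both \(\bm{h}\) and \(\dot{\bm{h}}\) are small). As the integral in equation (9.7) has to vanish for any function \(\bm{h}(t)\), it has to vanish for each choice of \(h_i(t)\), and thus we find a separate Euler-Lagrange equation for each variable \(x_i\):
\[ \frac{\partial \mathcal{L}}{\partial x_i} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial \mathcal{L}}{\partial \dot{x}_i} = 0. \tag{9.8} \]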
Joseph-Louis Lagrange (1736-1813)
Joseph-Louis Lagrange (1736-1813) was an Italian/French mathematician, physicist and astronomer, and one of the creators of both the calculus of variations and of analytical mechanics. Born in Italy as the son of a law professor and a mother from a wealthy family, Lagrange initially studied classical antiquity, with (as directed by his father) the objective of becoming a lawyer. Triggered by a paper by Halley that he found by accident, Lagrange taught himself mathematics, and quickly got a position as a mathematics professor, teaching the theories of Euler, with whom he also corresponded. The flow of information went both ways, with Lagrange proposing what became the Euler-Lagrange equations, as well as the method of Lagrange multipliers for dealing with constraints. On Euler’s recommendation, Lagrange succeeded him as director of mathematics of the Prussian Academy of Sciences in Berlin in 1766, where he published extensively on analytical mechanics. A key contribution is his discovery of what are now known as Lagrange points in celestial mechanics, which explain the location of the Trojan asteroids that share their orbit with Jupiter, and which are ideal positions for satellites like the James Webb telescope. Lagrange became a member of the French Academy of Sciences in 1787, when he also moved to Paris (and was later naturalized as a French citizen). In Paris, Lagrange was one of the founders and first professors of the École Polytechnique and the Bureau des Longitudes. In the latter capacity, he played a central role in the introduction of the metric system.
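Now suppose we have another set of coordinates \(\bm{q}\) (these are often referred to as general or generalized coordinates). We can then express the coordinates \(\bm{q}\) in terms of the Cartesian coordinates \(\bm{x}\) through a coordinate transformation, which simply means that the components of \(\bm{q}\) are functions of \(\bm{x}\):
\[ q_i = q_i(x_1, x_2, \ldots, x_n). \tag{9.9} \]
Inversely, we can express the Cartesian coordinates in terms of the general ones as
\[ x_i = x_i(q_1, q_2, \ldots, q_n). \tag{9.10} \]
We can now re-write the action in terms of the generalized coordinates:
\[ S[\bm{x}] = \int_0^T \mathcal{L}\big(\bm{x}(t), \dot{\bm{x}}(t)\big) \,\mathrm{d}t = \int_0^T \tilde{\mathcal{L}}\big(\bm{q}(t), \dot{\bm{q}}(t)\big) \,\mathrm{d}t = S[\bm{q}], \tag{9.11} \]
where \(\tilde{\mathcal{L}}(\bm{q}(t), \dot{\bm{q}}(t))\) is the Lagrangian in terms of the generalized coordinates (which we will usually just write as \(\mathcal{L}\)). We can now apply Hamilton’s principle to the action \(S[\bm{q}]\) in terms of the generalized coordinates, which gives us a set of Euler-Lagrange equations for \(q_i(t)\):
\[ \frac{\partial \mathcal{L}}{\partial q_i} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial \mathcal{L}}{\partial \dot{q}_i} = 0. \tag{9.12} \]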
9.2.1. Example: Equations of motion in polar coordinates
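In Section 6.2 we derived the equations for Newton’s second law in polar coordinates. Doing that using the Lagrangian formalism is considerably easier. We take \(q_1 = r = \sqrt{x^2 + y^2}\) and \(q_2 = \theta = \arctan(y/x)\). In polar coordinates, the Lagrangian reads
\[ \mathcal{L}(r, \theta, \dot{r}, \dot{\theta}) = \frac12 m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) - U(r, \theta). \tag{9.13} \]
Substituting this Lagrangian into equation (9.12) reproduces equations (6.7) and (6.8):
\[ m \left( \ddot{r} - r \dot{\theta}^2 \right) = -\frac{\partial U}{\partial r} = F_r, \qquad m \left( r \ddot{\theta} + 2 \dot{r} \dot{\theta} \right) = -\frac{1}{r} \frac{\partial U}{\partial \theta} = F_\theta. \]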
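A quick way to gain confidence in the polar equations of motion is to integrate them for a case with a known answer. The sketch below assumes a free particle (\(U = 0\)), for which the Euler-Lagrange equations reduce to \(\ddot{r} = r \dot{\theta}^2\) and \(\ddot{\theta} = -2 \dot{r} \dot{\theta} / r\), and uses a hand-written fourth-order Runge-Kutta step; the initial conditions and helper names are illustrative assumptions. The particle should move along a straight line.

```python
import math

def rk4_step(f, state, dt):
    # One classical fourth-order Runge-Kutta step for state' = f(state)
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def free_particle_polar(state):
    # state = [r, theta, rdot, thetadot]; Euler-Lagrange equations with U = 0
    r, theta, rdot, thetadot = state
    return [rdot, thetadot, r * thetadot**2, -2.0 * rdot * thetadot / r]

# Start at (x, y) = (1, 0) with velocity (0, 1): rdot = 0, thetadot = v / r = 1
state = [1.0, 0.0, 0.0, 1.0]
dt, steps = 1e-3, 2000  # integrate up to t = 2
for _ in range(steps):
    state = rk4_step(free_particle_polar, state, dt)

r, theta = state[0], state[1]
x, y = r * math.cos(theta), r * math.sin(theta)
print(f"x = {x:.6f}, y = {y:.6f}")  # expected: the straight line x = 1, y = t
```

Although \(r\) and \(\theta\) both change in a nontrivial way, converting back to Cartesian coordinates recovers uniform motion along \(x = 1\).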
9.3. Euler-Lagrange equations for a system with constraints
Many mechanical systems have constraints: conditions which limit their degrees of freedom. Such constraints are often difficult to account for in a force-based description, as it is often not obvious which force you need to ensure that the system satisfies the constraint. In Lagrangian mechanics however, we can easily include ‘constraint rules’ which ensure that constraints are satisfied, and, as an added bonus (but with a caveat, as we’ll see below), can give you the required forces.
To illustrate the concept, we’ll start with an almost tautological example. Suppose a bead of mass \(m\) is fixed to the end of a massless wire of length \(L\), which you spin around with angular velocity \(\omega\). We know of course that the bead will then execute a circular motion. For the sake of illustration, we will ignore this piece of knowledge, to see it emerge instead from the Lagrangian approach, under the action of a constraint: putting the free end of the wire at the origin, we impose that the mass must be at distance \(L\), i.e., we have \(L = \sqrt{x^2 + y^2}\), taking the mass to move in the \(xy\) plane. In Lagrangian mechanics, we treat the constraint as an integral part of the problem. We absorb it into an extended version of the Lagrangian, with the addition of an extra variable, a Lagrange multiplier. As the only energy in the problem is the kinetic energy of the mass, \(K = \frac12 m \dot{x}^2 + \frac12 m \dot{y}^2\), our extended Lagrangian becomes:
\[ \mathcal{L}'(x, y, \dot{x}, \dot{y}, \lambda) = \frac12 m \left( \dot{x}^2 + \dot{y}^2 \right) - \lambda \left( \sqrt{x^2 + y^2} - L \right). \tag{9.14} \]
Note that we wrote the constraint in the form \(\mathrm{function} = 0\). To find the constrained equations of motion, we now apply the Euler-Lagrange equations to the extended Lagrangian \(\mathcal{L}'\), which gives us three equations (for \(x\), \(y\), and \(\lambda\)):
\begin{align}
m \ddot{x} &= -\lambda \frac{x}{\sqrt{x^2 + y^2}} = -\frac{\lambda}{L} x, \tag{9.15A} \\
m \ddot{y} &= -\lambda \frac{y}{\sqrt{x^2 + y^2}} = -\frac{\lambda}{L} y, \tag{9.15B} \\
\sqrt{x^2 + y^2} &= L. \tag{9.15C}
\end{align}
Equation (9.15)C reproduces our constraint, which we’ve used to put equations (9.15)A and (9.15)B in simpler form. We now get equations of motion for the \(x\) and \(y\) coordinates that satisfy the constraint. In this case, we can immediately write down the solutions, as the equations are those of the simple harmonic oscillator:
\[ x(t) = A \sin(\omega t) + B \cos(\omega t), \qquad y(t) = C \sin(\omega t) + D \cos(\omega t), \]
where \(\omega = \sqrt{\lambda / (m L)}\). If we put the initial position of the bead at \(x=L, y=0\), we get \(B = L\) and \(D = 0\); from the condition that the angular velocity is \(\omega\), we get \(A = 0\) and \(C = L\). Note that by specifying the angular velocity, we get the value of the Lagrange multiplier \(\lambda\). Moreover, the terms on the right hand side of equations (9.15)A and (9.15)B represent forces, which are exactly those forces that will keep the bead in its circular motion. For the current example, \(\lambda\) represents the tension in the wire, equal to \(m L \omega^2\), the familiar centripetal force. Note however that for the interpretation of the Lagrange multiplier as a force, we have to be careful about how we include the constraint. In the current example, we could also have written the constraint as \(x^2 + y^2 - L^2 = 0\). Had we included that condition, the corresponding Lagrange multiplier would not be a force (it wouldn’t even have the dimension of a force), but we would still have gotten the correct equations of motion.
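The relation between the multiplier and the angular velocity can be checked numerically. The sketch below assumes the constrained equations of motion in the form \(m \ddot{x} = -\lambda x / L\) and \(m \ddot{y} = -\lambda y / L\), with the multiplier equal to the centripetal force \(\lambda = m L \omega^2\), and tests them on the circular solution using a finite-difference second derivative; all numerical values and helper names are illustrative assumptions.

```python
import math

m, L, omega = 0.2, 0.5, 3.0   # assumed illustrative values (kg, m, rad/s)
lam = m * L * omega**2        # tension for uniform circular motion of radius L

def x(t):
    return L * math.cos(omega * t)

def y(t):
    return L * math.sin(omega * t)

def second_derivative(f, t, h=1e-4):
    # Central finite-difference approximation of f''(t)
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

t = 0.7  # arbitrary test time
residual_x = m * second_derivative(x, t) + lam * x(t) / L
residual_y = m * second_derivative(y, t) + lam * y(t) / L
constraint = math.hypot(x(t), y(t)) - L

print(f"lambda = {lam:.3f} N")
print(f"residuals: {residual_x:.2e}, {residual_y:.2e}; constraint: {constraint:.2e}")
```

Both residuals vanish up to the finite-difference error, and the constraint is satisfied, so the circular motion with \(\lambda = m L \omega^2\) solves all three equations simultaneously.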
(Rolling cylinder)
Suppose we take a solid cylinder of mass \(m\) and radius \(r\) which we place on the inner surface of a larger cylinder with radius \(R\), see Fig. 9.2. We make sure that the axes of both cylinders are horizontal and keep the larger cylinder fixed. The smaller cylinder can then roll along the larger cylinder’s inner surface under the force of gravity. Find the equation of motion of the rolling cylinder, and the frequency of the resulting oscillatory motion.
Solution We need two coordinates to describe a point on the surface of the smaller cylinder: the angle \(\theta\) that its center makes as measured from the vertical, and the rotation angle \(\phi\) of the smaller cylinder with respect to some reference line. We’re given that the smaller cylinder is rolling, so the rolling condition applies, meaning that the point where its surface touches that of the larger cylinder is momentarily stationary. We can capture this condition as the following constraint:
For the kinetic and potential energy of the smaller cylinder we get:
For the Lagrangian (including a Lagrange multiplier \(\lambda\) for the constraint (9.16)) we thus get:
The Euler-Lagrange equation for \(\lambda\) reproduces the constraint (9.16). The equations for \(\theta\) and \(\phi\) give
We now have three equations for the three variables \(\theta\), \(\phi\) and \(\lambda\). In this case, reducing the system to one equation for one unknown is easy. We use equation (9.19) to express \(\lambda\) in terms of \(\phi\), which we can substitute in equation (9.18):
The constraint (9.16) relates the derivatives of \(\theta\) and \(\phi\); taking the time derivative of this equation, we get \(\ddot{\phi}\) in terms of \(\ddot{\theta}\), which we can use to eliminate \(\ddot{\phi}\) from equation (9.20), and we obtain:
\[ \frac32 (R - r) \ddot{\theta} + g \sin(\theta) = 0. \tag{9.21} \]
We find that the equation of motion (9.21) is mathematically equivalent to that of a simple pendulum. Re-writing the equation in the form \(\ddot{\theta} + \omega^2 \sin(\theta) = 0\), we can read off that the (angular) frequency of small oscillations is given by
\[ \omega = \sqrt{\frac{2 g}{3 (R - r)}}. \]
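The pendulum-like equation of motion can be integrated numerically to check the frequency. The sketch below assumes the standard result for a solid cylinder rolling without slipping inside a fixed cylinder, \(\ddot{\theta} + \omega^2 \sin(\theta) = 0\) with \(\omega^2 = 2g / (3(R - r))\), and measures the period from the first zero crossing of \(\theta(t)\); the parameter values and function names are illustrative assumptions.

```python
import math

g, R, r = 9.81, 0.50, 0.05  # assumed illustrative values (SI units)
omega0 = math.sqrt(2.0 * g / (3.0 * (R - r)))

def deriv(theta, thetadot):
    # theta'' = -omega0^2 sin(theta), written as a first-order system
    return thetadot, -omega0**2 * math.sin(theta)

def rk4(theta, thetadot, dt):
    # One classical fourth-order Runge-Kutta step
    k1 = deriv(theta, thetadot)
    k2 = deriv(theta + 0.5 * dt * k1[0], thetadot + 0.5 * dt * k1[1])
    k3 = deriv(theta + 0.5 * dt * k2[0], thetadot + 0.5 * dt * k2[1])
    k4 = deriv(theta + dt * k3[0], thetadot + dt * k3[1])
    theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    thetadot += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return theta, thetadot

def measured_period(theta0, dt=1e-4):
    # Release from rest at theta0 > 0; the first zero crossing is a quarter period
    theta, thetadot, t = theta0, 0.0, 0.0
    while True:
        new_theta, new_thetadot = rk4(theta, thetadot, dt)
        if new_theta <= 0.0:
            t_cross = t + dt * theta / (theta - new_theta)  # linear interpolation
            return 4.0 * t_cross
        theta, thetadot, t = new_theta, new_thetadot, t + dt

T_formula = 2.0 * math.pi / omega0
T_small = measured_period(0.01)   # small amplitude (radians)
T_large = measured_period(1.0)    # large amplitude, for comparison
print(f"formula: {T_formula:.4f} s, small amplitude: {T_small:.4f} s, "
      f"large amplitude: {T_large:.4f} s")
```

For small amplitudes the measured period agrees with \(2\pi/\omega\); for larger amplitudes it is longer, just as for a simple pendulum.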
9.4. Symmetries and conservation laws
The Lagrangian formalism not only gives us easy methods for transforming coordinates and incorporating constraints, it also allows us to uncover a connection between the symmetries of a mechanical system and its conserved quantities. A symmetry in physics is any operation that does not change the properties of the system, generalizing the notion of symmetry in geometric shapes. In geometry, a square has mirror (or reflection) symmetry about various axes, and rotational symmetry for rotations which are multiples of \(90^\circ\); both are examples of discrete symmetries. A circle has a continuous symmetry under rotations over an arbitrary angle. Likewise, a mechanical system can be symmetric under rotations; an example we encountered before is a radial force field, for which the potential depends on the distance between two objects, but not on the angle with respect to some arbitrary reference. We’ve seen in Section 6.3 that for such a force, the angular momentum is conserved. A system with a radial force field has another symmetry: nothing changes if we translate the system in time. Just like we can define an arbitrary angle as the zero angle \(\theta = 0\), we can define an arbitrary point in time as \(t=0\), but those are choices for convenience; they do not influence the equations of motion of the system, which are independent of both \(\theta\) and \(t\). Indeed, there is also a second conserved quantity, the total energy of the system, and we used both conserved quantities when describing the possible orbits for a particle under the action of a (constant) central force.
A second example of a system with a physical symmetry is billiards, in which balls collide fully elastically. Here too, time does not enter explicitly in the equations, and we have conservation of energy. Moreover, if we move the whole system over an arbitrary displacement in space, nothing changes in the collision itself, and we therefore have symmetry under (spatial) translations. The associated conserved quantity is the total momentum of the system[6].
All three relations between conservation laws and symmetries are special cases of Noether’s theorem, which states that if the Lagrangian of a system is invariant under a continuous symmetry of a variable, there is an associated conserved quantity[7] of the system. To keep things concrete, we will prove Noether’s theorem for the conservation laws we encountered so far.
Emmy Noether (1882-1935)
Emmy Noether (1882-1935) was a German mathematician, who made key contributions both to the development of abstract algebra and to ideas in theoretical physics. In physics, she uncovered a deep connection between symmetry and conservation laws (now known as Noether’s theorem, considered by many as the most important theorem for the development of modern physics): for every continuous symmetry of a system, there exists a conserved quantity. Applications of Noether’s theorem include conservation of energy (corresponding to invariance under time translation, i.e., it doesn’t matter where you set \(t=0\), Theorem 9.1), conservation of momentum (invariance under space translation, i.e., it doesn’t matter where you put the origin) and conservation of angular momentum (invariance under space rotation, i.e., it doesn’t matter in which direction you choose your x-axis), see Theorem 9.2. Similar conservation laws are found in special and general relativity, quantum mechanics, and quantum field theory. Unfortunately, even in the early 20th century, women were still excluded from most academic positions. Noether therefore initially worked for free at the University of Erlangen, getting a paid position in Göttingen in 1915 at the invitation of Hilbert and Klein, who had both been convinced by the quality of her work. Her fame grew through the 1910s and 1920s, gaining worldwide recognition. Due to her Jewish descent, she was dismissed from her academic position by the Nazi government in 1933, and moved to the United States, where she died two years later at age 53. Various institutes and scholarship programs, mostly in Germany, are now named in her honor.
9.4.1. Conservation of energy and the Hamiltonian
(Conservation of energy under time translation symmetry)
If a mechanical system is invariant under translations in time, i.e., if its Lagrangian does not explicitly depend on \(t\), then the system’s total energy \(E = K + U\) is conserved.
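Proof. We consider a general Lagrangian as a function of generalized coordinates[9] \(\bm{q}\) which does not explicitly depend on time, \(\mathcal{L}(\dot{\bm{q}}, \bm{q})\). As we’ve shown in Section 9.2, for each component of \(\bm{q}\), we have a separate equation of motion, given by equation (9.12). We now define the Hamiltonian \(\mathcal{H}\) as
\[ \mathcal{H} = \sum_i \dot{q}_i \frac{\partial \mathcal{L}}{\partial \dot{q}_i} - \mathcal{L}. \]
A simple calculation shows that the Hamiltonian is conserved:
\[ \frac{\mathrm{d}\mathcal{H}}{\mathrm{d}t} = \sum_i \left[ \ddot{q}_i \frac{\partial \mathcal{L}}{\partial \dot{q}_i} + \dot{q}_i \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial \mathcal{L}}{\partial \dot{q}_i} - \frac{\partial \mathcal{L}}{\partial \dot{q}_i} \ddot{q}_i - \frac{\partial \mathcal{L}}{\partial q_i} \dot{q}_i \right] = 0, \]
where the first and third term cancel each other trivially, and the second and fourth term cancel because of the Euler-Lagrange equation (9.12).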
You might complain that while we’ve shown that the newly-introduced Hamiltonian is conserved, we did not prove anything about the energy. They are, however, one and the same, or more precisely, the Hamiltonian is the generalization of the energy to the general coordinates \(\bm{q}\). Returning to the one-dimensional Cartesian coordinate case of Section 9.1, we can easily show that the Hamiltonian we get from the Lagrangian given in equation (9.2) is indeed the total energy:
\[ \mathcal{H} = \dot{x} \frac{\partial \mathcal{L}}{\partial \dot{x}} - \mathcal{L} = m \dot{x}^2 - \left( \frac12 m \dot{x}^2 - U(x) \right) = \frac12 m \dot{x}^2 + U(x) = K + U = E, \]
and similarly for the three-dimensional case. We can make the relation with the energy more explicit in the case of generalized coordinates by observing that the derivative of the Lagrangian with respect to the velocity is simply the momentum:
\[ p = m \dot{x} = \frac{\partial \mathcal{L}}{\partial \dot{x}}. \tag{9.25} \]
Moreover, by Newton’s second law, the time derivative of the momentum is the force, which is the derivative of the Lagrangian with respect to the position:
\[ \dot{p} = F = -\frac{\mathrm{d}U}{\mathrm{d}x} = \frac{\partial \mathcal{L}}{\partial x}. \tag{9.26} \]
Note that equations (9.25) and (9.26) combined give the Euler-Lagrange equation (9.6); again the generalization to three dimensions is straightforward. We can also generalize to general coordinates \(\bm{q}\), introducing the associated general momentum, defined as
\[ p_i = \frac{\partial \mathcal{L}}{\partial \dot{q}_i}. \]
In terms of the general position and momentum, we can thus write for the Hamiltonian
\[ \mathcal{H}(\bm{q}, \bm{p}) = \sum_i p_i \dot{q}_i - \mathcal{L}(\bm{q}, \dot{\bm{q}}), \]
showing that it is indeed the energy expressed in terms of our general coordinates.
As you may know, the Hamiltonian is the central function in quantum mechanics. The Schrödinger equation, which is the quantum mechanical equivalent of Newton’s second law as the central equation around which the theory is built, contains a ‘quantized’ version of the Hamiltonian, which can be split into a kinetic and a potential energy term. In classical mechanics, the Hamiltonian can be used to write the basic equations in a third alternative form (next to Newton’s second law and the Euler-Lagrange equations), known as Hamilton’s equations of motion, given by
\[ \dot{q}_i = \frac{\partial \mathcal{H}}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial \mathcal{H}}{\partial q_i}. \]
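To see Hamilton’s equations and the conservation of the Hamiltonian at work, the sketch below integrates them for a one-dimensional harmonic oscillator. It assumes the standard form \(\dot{x} = \partial \mathcal{H} / \partial p\), \(\dot{p} = -\partial \mathcal{H} / \partial x\) with \(\mathcal{H} = p^2 / (2m) + \frac12 k x^2\), and uses a leapfrog (kick-drift-kick) scheme; the example system, the parameter values, and the choice of integrator are assumptions made for illustration.

```python
import math

m, k = 1.0, 4.0                # assumed illustrative mass and spring constant
omega = math.sqrt(k / m)

def hamiltonian(x, p):
    return p * p / (2.0 * m) + 0.5 * k * x * x

def dH_dp(x, p):
    return p / m               # equals xdot by Hamilton's first equation

def dH_dx(x, p):
    return k * x               # equals -pdot by Hamilton's second equation

x, p = 1.0, 0.0                # released from rest at x = 1
dt, steps = 1e-3, 10000
t_end = dt * steps
E0 = hamiltonian(x, p)
max_drift = 0.0
for _ in range(steps):
    p -= 0.5 * dt * dH_dx(x, p)   # half kick
    x += dt * dH_dp(x, p)         # drift
    p -= 0.5 * dt * dH_dx(x, p)   # half kick
    max_drift = max(max_drift, abs(hamiltonian(x, p) - E0))

print(f"x(t_end) = {x:.5f}, exact = {math.cos(omega * t_end):.5f}")
print(f"largest deviation of H from its initial value: {max_drift:.2e}")
```

The numerical trajectory follows the exact solution \(x(t) = \cos(\omega t)\), and the value of the Hamiltonian stays constant up to a small, bounded integration error.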
9.4.2. Conservation of linear and angular momentum
The conservation laws for the two kinds of momentum are special cases of the following property of a Lagrangian in generalized coordinates.
(Conservation of generalized momentum)
If the Lagrangian of a system with generalized coordinates \(\bm{q}\) does not depend on the coordinate \(q_i\), then the associated general momentum \(p_i\) is conserved.
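Proof. By the Euler-Lagrange equations we have
\[ \frac{\mathrm{d}p_i}{\mathrm{d}t} = \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial \mathcal{L}}{\partial \dot{q}_i} = \frac{\partial \mathcal{L}}{\partial q_i} = 0, \]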
where the last equality follows because \(\mathcal{L}\) is independent of \(q_i\).
A coordinate \(q_i\) on which the Lagrangian does not depend (and of which the associated generalized momentum is thus conserved) is sometimes called a cyclic coordinate.
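The conservation of the momentum associated with a cyclic coordinate can be illustrated numerically. The sketch below assumes a particle in a plane with an attractive central potential \(U(r) = -k/r\), described in polar coordinates, so that \(\theta\) is cyclic and \(p_\theta = m r^2 \dot{\theta}\) should stay constant even though \(r\) and \(\dot{\theta}\) both vary. The potential, the parameter values, and the Runge-Kutta helper are illustrative assumptions.

```python
def rk4_step(f, state, dt):
    # One classical fourth-order Runge-Kutta step for state' = f(state)
    k1 = f(state)
    k2 = f([s + 0.5 * dt * q for s, q in zip(state, k1)])
    k3 = f([s + 0.5 * dt * q for s, q in zip(state, k2)])
    k4 = f([s + dt * q for s, q in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

m, k = 1.0, 1.0  # assumed illustrative values

def central_force_polar(state):
    # Euler-Lagrange equations for L = m/2 (rdot^2 + r^2 thetadot^2) + k / r
    r, theta, rdot, thetadot = state
    return [rdot, thetadot,
            r * thetadot**2 - k / (m * r**2),
            -2.0 * rdot * thetadot / r]

def p_theta(state):
    # Generalized momentum conjugate to theta: dL/d(thetadot) = m r^2 thetadot
    return m * state[0]**2 * state[3]

state = [1.0, 0.0, 0.0, 1.2]  # faster than the circular-orbit value thetadot = 1
p0 = p_theta(state)
max_dev, r_max = 0.0, state[0]
for _ in range(20000):        # integrate up to t = 20 with dt = 1e-3
    state = rk4_step(central_force_polar, state, 1e-3)
    max_dev = max(max_dev, abs(p_theta(state) - p0))
    r_max = max(r_max, state[0])

print(f"largest r reached: {r_max:.3f}; largest change in p_theta: {max_dev:.2e}")
```

The orbit is clearly non-circular (the radius changes substantially), yet \(p_\theta\), which is the angular momentum, remains constant to within the integration error.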
(Conservation of linear momentum in Cartesian coordinates)
For a system described by Cartesian coordinates, if the Lagrangian does not depend on \(x\), \(y\), or \(z\), the associated linear momenta \(p_x\), \(p_y\) and \(p_z\) are conserved.
The proof of Corollary 9.1 is simply that of Theorem 9.2 with \(q_i\) replaced by \(x_i\). A good example is a ball which you throw horizontally. While in the vertical direction there is an external force due to gravity, in the absence of drag, there is no horizontal force, and we thus expect the horizontal momentum to be conserved. Indeed, the associated Lagrangian in three dimensions is given by
\[ \mathcal{L} = \frac12 m \left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right) - m g z, \]
which is independent of both \(x\) and \(y\), so the momenta \(p_x\) and \(p_y\) are conserved by Corollary 9.1.
(Conservation of angular momentum in spherical coordinates)
For a system described by spherical coordinates, if the Lagrangian does not depend on the azimuthal angle \(\phi\), describing rotations about the \(z\)-axis, the angular momentum \(L_z\) about the \(z\)-axis is conserved.
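Proof. The Lagrangian does not depend on \(\phi\) if the potential does not; therefore, in this case the potential is a function of \(r\) and \(\theta\) alone. The most general Lagrangian that satisfies this condition is
\[ \mathcal{L} = \frac12 m \left( \dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \sin^2(\theta) \dot{\phi}^2 \right) - U(r, \theta). \]
Because the Lagrangian does not depend on \(\phi\), the associated general momentum is constant:
\[ p_\phi = \frac{\partial \mathcal{L}}{\partial \dot{\phi}} = m r^2 \sin^2(\theta) \dot{\phi} = \mathrm{constant}. \]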
Note that \(p_\phi\) has the dimension of angular momentum. To see that it is indeed the angular momentum about the \(z\)-axis, we note that \(\rho = r \sin(\theta)\) is the distance to the \(z\)-axis, and \(\dot{\phi}\) is simply the angular velocity around the \(z\)-axis, so we have \(p_\phi = m \rho^2 \dot{\phi} = L_z\).
As Theorem 9.2 holds for generalized coordinates, it can also be applied to cylindrical coordinates (see Exercise 9.2) or any other coordinate system of your liking. In some cases you’ll get conservation laws for both linear and angular momentum. In all cases, the conservation law follows from a symmetry: the system doesn’t change if you change the value of a coordinate.
9.4.3. Noether’s theorem*
In general, we define a continuous symmetry of a Lagrangian with respect to some coordinate \(q_i\) to mean that, to first order, the Lagrangian does not change if the coordinate \(q_i\) changes by a small amount. Therefore, if we apply a coordinate change
\[ q_i(t) \to q_i(t) + \varepsilon M_i(\bm{q}(t)), \tag{9.30} \]
the Lagrangian is invariant under this coordinate change if the change does not introduce any corrections of order \(\varepsilon\) to the Lagrangian. Note that in equation (9.30), the function \(M_i(\bm{q})\) can depend on all coordinates \(q_j\), not just on \(q_i\). It may seem that we are making a weaker statement than before by restricting the definition of the symmetry to first-order effects, but in the derivation of the Euler-Lagrange equations we also ignored higher-order effects, and therefore we implicitly assumed the same in Theorem 9.1 and Theorem 9.2.
(Noether’s theorem)
For each continuous symmetry of the Lagrangian, there is an associated conserved quantity.
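Proof. We consider a coordinate transformation of the type given by equation (9.30). By definition of the continuous symmetry, the transformation does not introduce any changes to the Lagrangian of order \(\varepsilon\), and we therefore have
\[ 0 = \lim_{\varepsilon \to 0} \frac{\mathrm{d}\mathcal{L}}{\mathrm{d}\varepsilon} = \lim_{\varepsilon \to 0} \sum_i \left[ \frac{\partial \mathcal{L}}{\partial q_i} \frac{\partial q_i}{\partial \varepsilon} + \frac{\partial \mathcal{L}}{\partial \dot{q}_i} \frac{\partial \dot{q}_i}{\partial \varepsilon} \right] = \sum_i \left[ \frac{\partial \mathcal{L}}{\partial q_i} M_i + \frac{\partial \mathcal{L}}{\partial \dot{q}_i} \dot{M}_i \right], \tag{9.31} \]
where we could drop the limit after the last equality as it no longer has any dependence on \(\varepsilon\). Using the Euler-Lagrange equations (9.12), we can re-write our expression as
\[ 0 = \sum_i \left[ \left( \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial \mathcal{L}}{\partial \dot{q}_i} \right) M_i + \frac{\partial \mathcal{L}}{\partial \dot{q}_i} \frac{\mathrm{d}M_i}{\mathrm{d}t} \right] = \frac{\mathrm{d}}{\mathrm{d}t} \sum_i \frac{\partial \mathcal{L}}{\partial \dot{q}_i} M_i. \tag{9.32} \]
Equation (9.32) gives us our conserved quantity \(P(\bm{q}, \dot{\bm{q}})\):
\[ P(\bm{q}, \dot{\bm{q}}) = \sum_i \frac{\partial \mathcal{L}}{\partial \dot{q}_i} M_i(\bm{q}). \]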
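A worked example may help to connect the general result to a familiar conservation law. The sketch below assumes a transformation of the form \(q_i \to q_i + \varepsilon M_i(\bm{q})\) with conserved quantity \(P = \sum_i \frac{\partial \mathcal{L}}{\partial \dot{q}_i} M_i\), and applies it to a particle in the \(xy\) plane in a central potential (an illustrative choice), \(\mathcal{L} = \frac12 m (\dot{x}^2 + \dot{y}^2) - U\big(\sqrt{x^2 + y^2}\big)\). An infinitesimal rotation about the origin is

\[ x \to x - \varepsilon y, \qquad y \to y + \varepsilon x, \qquad \text{so that} \quad M_x = -y, \quad M_y = x. \]

To first order in \(\varepsilon\), the kinetic energy changes by \(m \varepsilon (-\dot{x} \dot{y} + \dot{y} \dot{x}) = 0\), and \(x^2 + y^2\) changes only at order \(\varepsilon^2\), so the Lagrangian is invariant. The associated conserved quantity is

\[ P = \frac{\partial \mathcal{L}}{\partial \dot{x}} M_x + \frac{\partial \mathcal{L}}{\partial \dot{y}} M_y = m \dot{x} (-y) + m \dot{y} x = m (x \dot{y} - y \dot{x}) = L_z, \]

which is the angular momentum about the \(z\)-axis, in agreement with the conservation law for a cyclic angular coordinate.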
9.5. Problems
(Constraint forces for a particle sliding off a hemisphere without friction.)
We revisit Exercise 3.17 of a particle sliding off a hemisphere with radius \(R\) without friction. Let \(\theta\) be the angle between the vertical and the current position of the particle.
(a) Write down the Lagrangian of the particle in terms of the (constant) parameter \(R\) and the variable \(\theta\).
(b) From your Lagrangian in (a), find the equation of motion of the particle (which will be valid only as long as the particle touches the hemisphere).
While the procedure you followed in (a) and (b) quickly gave you the correct equation of motion, you’ve done so through the (implicit) assumption that the particle always touches the sphere. If instead we include the constraint as a (to be determined) constraining force, in the form of a Lagrange multiplier, we can also get the value of the force. To that end, we’ll write the Lagrangian in terms of two coordinates: \(r\), the distance of the particle to the center of the hemisphere, and \(\theta\), the same angle as before.
(c) Find the kinetic energy of the particle in terms of \(r\) and \(\theta\).
(d) Write down the total Lagrangian of the particle: \(\mathcal{L} = K - V_\mathrm{grav} - \lambda(R-r)\), where \(\lambda\) is the (still to be determined) Lagrange multiplier that enforces the constraint \(r = R\).
(e) From the Lagrangian, find the equations of motion of both \(r\) and \(\theta\).
(f) Now apply the constraint (the ‘equation of motion’ you get from \(\lambda\)) \(r=R\), which also gives \(\dot{r} = \ddot{r} = 0\), to simplify your equations.
(g) Use the one remaining nontrivial equation to find \(\lambda\), and verify that it gives you the normal force of the sphere on the particle.
(Conservation of momentum)
Prove the following corollary of Theorem 9.2:
(Conservation of linear and angular momentum in cylindrical coordinates)
For a system described by cylindrical coordinates, if the Lagrangian does not depend on either the \(z\)-coordinate or the angular coordinate \(\theta\), describing rotations about the \(z\)-axis, both the linear momentum in the \(z\) direction and the angular momentum \(L_z\) about the \(z\)-axis are conserved.