Lecture Notes on
CALCULUS OF VARIATIONS
A. Salih
Department of Aerospace Engineering
Indian Institute of Space Science & Technology, Trivandrum, India
July 2013
Contents

1 Classical Variational Problems
    1.1 Introduction
    1.2 Variational Problems
        1.2.1 The brachistochrone problem
        1.2.2 Minimum surface-area of revolution
        1.2.3 Fermat's principle of least time
        1.2.4 Principle of least action
        1.2.5 Chaplygin's problem
2 Functionals and Their Variations
    2.1 Functionals
3 The Fundamental Problem
    3.1 Introduction
4 Application: Standard Variational Problems
Bibliography
Chapter 1
Classical Variational Problems
1.1 Introduction
The calculus of variations deals with functionals, which are functions of a function, to put it simply.
For example, the methods of calculus of variations can be used to find an unknown function that
minimizes or maximizes a functional. Many of its methods were developed over two hundred years
ago by Euler (1707-1783), Lagrange (1736-1813), and others. It continues to the present day to bring
important techniques to many branches of engineering and physics.
Figure 1.1: A particle sliding down a curved path.
At each point on the curve the gravitational force is the same (F = mg, where m is the mass of the particle and g is the gravitational
acceleration), but Fn and Ft depend on the steepness of the curve at c. The steeper the curve, the
larger Ft is, and the faster the particle moves. So it would be better if the path close to point P were
steeper, so that the velocity of the object increases rapidly, and then flattened towards point c.
Definitely this sort of curve is longer than the straight line connecting the end points. But the extra
speed that the particle develops just as it is released will more than make up for the extra distance
that it must travel, and it will arrive at Q in less time than it takes along a straight line. The curve
along which the particle takes the least time to go from P to Q is called the Brachistochrone (from
the Greek words for shortest time). This famous problem, known as the Brachistochrone Problem,
was posed by Johann Bernoulli (1667-1748) in 1696. The problem was solved by Johann Bernoulli,
his older brother Jakob Bernoulli, Newton, and L'Hôpital.
Let us begin our own study of the problem by deriving a formula relating the choice of the curve
y to the time required for a particle to fall from P to Q. The instantaneous velocity of the ball along
the curve is v = ds/dt, where s denotes the arc-length. Therefore,

    dt = ds/v = √(dx² + dy²)/v = (√(1 + y′(x)²)/v) dx        (1.1)
Let τ be the time of descent from P to Q along the curve y = y(x). Then,

    τ = ∫_0^τ dt = ∫_0^S ds/v        (1.2)

where S is the total arc-length of the curve. If the origin of the coordinate system is taken as the
starting point P, we have, using (1.1),

    τ = ∫_0^{x₂} (√(1 + y′(x)²)/v) dx        (1.3)
To obtain an expression for v we use the fact that energy is conserved throughout the motion. Thus,
the total energy at any time t must be the same as the total energy at time zero (corresponding to
location P), which we may take to be zero; that is,

    (1/2)mv² − mgy = 0

where y is measured positive downward, so that v = √(2gy). Substituting into (1.3) gives

    τ[y] = (1/√(2g)) ∫_0^{x₂} √((1 + y′(x)²)/y(x)) dx        (1.4)

where we have explicitly noted that τ depends on the curve y(x). Notice that we use square brackets
for a functional, to signify the fact that its argument is a function. Equation (1.4) defines a functional.
To experiment with formula (1.4), first we suppose that Q is the point with coordinates (1, 1) and
normalize the acceleration of gravity to g = 1/2, so that v = √y. Then the straight line segment joining
P and Q lies in the line y(x) = x and we can compute the time of descent easily:

    τ = ∫_0^1 √(2/x) dx = 2√2 ≈ 2.8284
If the curve is the circular arc with a vertical tangent at P (i.e., its center is at (1, 0) and its radius is
equal to 1), then

    y(x) = √(1 − (x − 1)²) = √(2x − x²)

and the time of descent is

    τ = ∫_0^1 dx/(2x − x²)^{3/4} ≈ 2.6220
This is an improvement of about 7%, and shows that the shortest path does not yield the shortest
time. If the curve is the arc of the parabola with a vertical tangent at P, then y(x) = √x, and integrating
formula (1.4) numerically we obtain

    τ = (1/2) ∫_0^1 √(1 + 4x)/x^{3/4} dx ≈ 2.5872

which is slightly better than the circular arc. But is it the best result possible? That is, is this parabolic
arc the brachistochrone for points P(0, 0) and Q(1, 1)?
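As a check on these numbers, the three descent-time integrals can be evaluated numerically. The following sketch uses plain Python; the substitution x = u⁴, which tames the integrable singularity of the integrand at x = 0, is an implementation choice of this illustration, not part of the original discussion.

```python
import math

def descent_time(integrand, n=20000):
    """Evaluate T = ∫₀¹ g(x) dx, where g may blow up (integrably) at x = 0.
    Substituting x = u⁴ (dx = 4u³ du) removes the singularity; a plain
    midpoint rule is then accurate."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += integrand(u ** 4) * 4 * u ** 3 * h
    return total

# With g = 1/2 we have v = sqrt(y), so the integrand is sqrt((1 + y'(x)^2)/y(x)).
line     = lambda x: math.sqrt(2.0 / x)                       # y = x
circle   = lambda x: (2 * x - x * x) ** -0.75                 # y = sqrt(2x - x^2)
parabola = lambda x: 0.5 * math.sqrt(1 + 4 * x) / x ** 0.75   # y = sqrt(x)

for name, g in [("line", line), ("circle", circle), ("parabola", parabola)]:
    print(name, round(descent_time(g), 4))
```

Running this reproduces the three values quoted above: 2.8284 for the straight line, about 2.6220 for the circular arc, and about 2.5872 for the parabola.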
Clearly it would be tedious to choose y(x) one after another and look for the shortest time. In our
situation, for P(0, 0) and Q(x2 , y2 ) fixed, we have a collection F of candidate functions, namely all
those that are differentiable and whose graphs pass through both P and Q. To each element y(x) of
F we associate a number τ[y] according to formula (1.4). Thus there is a well-defined mapping J from the
set F of relevant functions to the set R of real numbers. Such a mapping from a set of functions to
a set of numbers is called a functional. The Brachistochrone Problem can thus be stated:
Find the function y(x) that minimizes the functional
    τ = J[y] = (1/√(2g)) ∫_0^{x₂} √((1 + y′(x)²)/y(x)) dx        (1.5)
subject to the conditions y(0) = 0 and y(x2 ) = y2 > 0.
We stated earlier that the importance of the brachistochrone problem is that it directed attention
to the systematic study of problems of a certain type. These are problems in which a fixed rule (a
functional J ) assigns a numerical value J[y] to each function y(x) in a particular set F of functions,
subject to constraints such as the endpoint conditions in the brachistochrone problem, and the goal
is to find the element y of F that either maximizes or minimizes J[y].
    S = 2π ∫_{x₁}^{x₂} y ds = 2π ∫_{x₁}^{x₂} y √(1 + y′(x)²) dx        (1.6)
It may be noted that this problem has a discontinuous solution (discovered by Goldschmidt),
obtained by revolving the curve which is the union of three lines: the vertical line from P to the point
(x1 , 0), the vertical line from Q to (x2 , 0) and the segment of the x-axis from x1 to x2 .
In an inhomogeneous planar optical medium in the xy-plane, the speed of light, v(x, y), varies from point to
point, depending on the optical properties of the medium. Speed equals the time derivative of the distance
travelled, namely, the arc length of the curve y = y(x) traced by the light ray. The ratio n = c/v is called the
refractive index of the medium, where c is the speed of light in vacuum. The time required to cover the
distance between two points P(x₁, y₁) and Q(x₂, y₂) is
    τ = ∫_0^S ds/v(x, y) = ∫_{x₁}^{x₂} (√(1 + y′(x)²)/v(x, y)) dx        (1.7)
where S is the total arc-length of the curve. Fermat's principle states that, to travel from point P to
point Q, the light ray follows the curve y = y(x) that minimizes the functional (1.7) subject to the
boundary conditions y(x₁) = y₁ and y(x₂) = y₂.
If the medium is homogeneous, e.g., a vacuum, then v(x, y) = c is constant, and we have

    τ = (1/c) ∫_{x₁}^{x₂} √(1 + y′(x)²) dx

whose minimizer is the straight line connecting the points P and Q. In an inhomogeneous medium,
the path taken by the light ray is no longer evident, and we are in need of a systematic method for
solving the minimization problem. Indeed, all of the known laws of geometric optics, lens design,
focusing, refraction, etc., will be consequences of the geometric and analytic properties of solutions
to Fermat's minimization principle.
According to the principle of least action, a particle under the influence of a gravitational field moves
on a path along which the integral of the kinetic energy is minimal. As such, it is a variational problem for

    E = 2 ∫ E_k dt = ∫ mv² dt

where m and v are the mass and velocity of the particle respectively. Since

    v = ds/dt   and   ds = √(1 + y′(x)²) dx

we have

    E = m ∫ v ds = m ∫ v √(1 + y′(x)²) dx        (1.8)

Since the gravitational field induces the particle's motion, its speed is related to its height, as known from
elementary physics:

    v² = u² − 2gy

where u is an initial speed with as yet undefined direction. Substituting into equation (1.8) yields the
functional

    E = m ∫ √(u² − 2gy) √(1 + y′(x)²) dx        (1.9)

These are some of the problems from physics and geometry that can be cast in variational form.
Chapter 2
Functionals and Their Variations
2.1 Functionals
As we have seen in the last section, there exists a great variety of physical problems that deal with
functionals, which are functions of a function. We are familiar with the definition of a function. A
function can be regarded as a rule that maps one number (or a set of numbers) to another value. For
example,
    f(x) = x² + 2x
is a function, which maps x = 2 to f(2) = 8, and x = 3 to f(3) = 15, etc. On the other hand, a
functional is a mapping from a function (or a set of functions) to a value. That is, a functional is
a rule that assigns a real number to each function y(x) in a well-defined class. Like a function, a
functional is a rule, but its domain is some set of functions rather than a set of real numbers. We can
consider F[y(x)] as a functional for a fixed value of x. For example,

    F[y(x)] = 3y² − y + 10,  where  y(x) = eˣ + cos x − x,

is a functional of y(x) for each fixed x. Consider next the functional

    J[y] = ∫_a^b y(x) dx
Here J gives the area under the curve y = y(x). Hence J is not a function of x and its value will be a
number. However, this number depends on the particular form of y(x) and hence J[y] is a functional.
For a = 0 and b = π, the value of the functional when y(x) = x is

    J[y] = ∫_0^π x dx = π²/2 ≈ 4.93

and when y(x) = sin x,

    J[y] = ∫_0^π sin x dx = 2

Therefore the given functional J[y] maps y(x) = x to π²/2 and maps y(x) = sin x to 2. Because an
integral maps a function to a number, a functional usually involves an integral. The following form
of functional often appears in the calculus of variations,
    J[y] = ∫_a^b F(x, y, y′) dx        (2.1)
The fundamental problem of the calculus of variations is to find the extremum (maximum or minimum)
of the functional (2.1).
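The idea that a functional maps a function to a number is easy to make concrete numerically. A minimal sketch in plain Python follows; the midpoint quadrature is an implementation choice of the illustration, not part of the text.

```python
import math

def J(y, a=0.0, b=math.pi, n=10000):
    """Approximate the functional J[y] = ∫_a^b y(x) dx by the midpoint rule."""
    h = (b - a) / n
    return sum(y(a + (i + 0.5) * h) for i in range(n)) * h

# J maps functions to numbers: y(x) = x goes to π²/2 ≈ 4.93, y(x) = sin x goes to 2
print(round(J(lambda x: x), 4), round(J(math.sin), 4))
```

The same routine accepts any integrable y(x), which is precisely what distinguishes a functional from an ordinary function of a real variable.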
Consider the Taylor series expansion of a function f(x) about x:

    f(x + Δx) = f(x) + (df/dx) Δx + (d²f/dx²) Δx²/2! + (d³f/dx³) Δx³/3! + ⋯        (2.2)

so that

    Δf ≡ f(x + Δx) − f(x) = (df/dx) Δx + (d²f/dx²) Δx²/2! + (d³f/dx³) Δx³/3! + ⋯

By definition, the differential df of the function f(x) is how much f changes if its argument, x,
changes by an infinitesimal amount Δx. That is,

    df = lim_{Δx→0} Δf = (df/dx) Δx        (2.3)

Comparing (2.2) and (2.3), we see that the differential df is the linear part of the total change Δf.
That is,

    Δf = df + higher-order terms in Δx        (2.4)
In line with the definition of the differential of a function f(x), we now introduce the concept of the
variation (or differential) of a functional F[y(x)]. Let y(x) be changed to y(x) + δy(x), where δy(x) is
the vertical displacement of the curve y(x). It is known as the variation of y and is denoted by δy.
We introduce an alternative function of the form

    Y(x) = y(x) + δy(x)        (2.5)

This is illustrated in figure 2.1, where y(x) is shown in red and Y(x) is shown in blue.
By definition, the total change in the functional is given by

    ΔF = F[Y] − F[y] = F[y + δy] − F[y]        (2.6)

If η(x) is an arbitrary differentiable function that vanishes at the boundaries of the domain, i.e.,
η(a) = 0 and η(b) = 0, then the variation δy(x) can be represented as

    δy(x) = εη(x)        (2.7)

Figure 2.1: Plot of y(x) and a small variation Y(x) from it.

where ε is an arbitrary parameter independent of x. This definition enables us to write equation (2.5)
in the following form:

    Y = y + εη        (2.8)
Now from (2.6), the total change in the functional F is given by

    ΔF = F[y + εη] − F[y]        (2.9)

Expanding in powers of ε,

    ΔF = F[y + εη] − F[y] = (dF/dy) εη + (d²F/dy²) (εη)²/2! + (d³F/dy³) (εη)³/3! + ⋯        (2.10)

       = (dF/dy) εη + higher-order terms        (2.11)
By definition, the first variation of a functional F[y], denoted by δF, is how much F changes if its
argument, y, changes by an infinitesimal amount δy. Therefore,

    δF = (dF/dy) εη = (dF/dy) δy        (2.12)

which shows that δF is given by the linear part of equation (2.11). Thus, the change in the functional
F[y] and its first variation are related by the equation

    ΔF = δF + higher-order terms        (2.13)
Let us now define what is called the Gateaux derivative or Gateaux variation in the direction of η(x).
It is denoted by δF[y; η] and is defined as

    δF[y; η] = lim_{ε→0} (F[y + εη] − F[y])/ε = lim_{ε→0} ΔF/ε = (d/dε) F[y + εη] |_{ε=0}        (2.14)
Note that the first variation and the Gateaux variation are related through the parameter ε, i.e.,
δF^(fv) = ε δF^(gv), where we have denoted the first variation by δF^(fv) and the Gateaux variation
by δF^(gv). Unfortunately, in the literature, these two variations are denoted by the same symbol δF.
Let us look at the meaning of ε and η geometrically. Since y is the unknown function to be
found so as to extremize a functional, we want to see what happens to the functional F[y] when we
perturb this function slightly. For this, we take another function η and multiply it by a small number
ε. We add εη to y and look at the value of F[y + εη]. That is, we look at the perturbed value of
the functional due to the perturbation εη. This is the shaded area shown in figure 2.2. Now as ε → 0,
we consider the limit of the shaded area divided by ε. If this limit exists, it is called the
Gateaux variation of F[y] at y for an arbitrary but fixed function η.

Figure 2.2: The function y(x) and its perturbation y + εη.
Functionals in general depend on the derivatives of y as well. Consider a functional of the form

    F[x, y, y′, y″]

for fixed values of x. If y changes to y + εη, then y′ changes to y′ + εη′ and y″ changes to y″ + εη″.
From equation (2.8), we have

    Y = y + εη,   Y′ = y′ + εη′,   and   Y″ = y″ + εη″

The new value of the functional is then

    F[x, Y, Y′, Y″] = F[x, y + εη, y′ + εη′, y″ + εη″]

so that

    ΔF = F[x, y + εη, y′ + εη′, y″ + εη″] − F[x, y, y′, y″]        (2.15)

Expanding in a Taylor series,

    F[x, y + εη, y′ + εη′, y″ + εη″] = F[x, y, y′, y″] + (∂F/∂y) εη + (∂F/∂y′) εη′ + (∂F/∂y″) εη″
        + (1/2!)[(∂²F/∂y²) ε²η² + (∂²F/∂y′²) ε²η′² + (∂²F/∂y″²) ε²η″²
        + 2(∂²F/∂y∂y′) ε²ηη′ + 2(∂²F/∂y∂y″) ε²ηη″ + 2(∂²F/∂y′∂y″) ε²η′η″] + ⋯
Rearranging the above Taylor series expansion, we obtain the change in the functional F:

    ΔF = (∂F/∂y) εη + (∂F/∂y′) εη′ + (∂F/∂y″) εη″
        + (1/2!)[(∂²F/∂y²) ε²η² + (∂²F/∂y′²) ε²η′² + (∂²F/∂y″²) ε²η″²
        + 2(∂²F/∂y∂y′) ε²ηη′ + 2(∂²F/∂y∂y″) ε²ηη″ + 2(∂²F/∂y′∂y″) ε²η′η″] + ⋯
In analogy with the definition for a function, the sum of the linear part in ΔF is called the first
variation of the functional F. Therefore,

    δF = (∂F/∂y) εη + (∂F/∂y′) εη′ + (∂F/∂y″) εη″        (2.16)

Since

    δy = εη,   δy′ = εη′,   δy″ = εη″

we have

    δF = (∂F/∂y) δy + (∂F/∂y′) δy′ + (∂F/∂y″) δy″        (2.17)
Now, the total differential dF of a function F(x, y, y′, y″), when x is considered fixed, is given by

    dF = (∂F/∂y) dy + (∂F/∂y′) dy′ + (∂F/∂y″) dy″

Formula (2.17) for δF has the same form as the above formula for dF. Thus the variation of F is
given by the same formula as the differential of F, if x is considered to be fixed.
It is to be noted that the differential of a function is the first-order approximation to the change in
that function along a particular curve, while the variation of a functional is the first-order approximation
to the change in the functional from one curve to another.

We mention here that the sum of the terms in ε and ε² is called the second variation of F, and the
sum of the terms in ε, ε², and ε³ is called the third variation of F. However, when the term variation is
used alone, the first variation is meant.
Some rules of variational calculus

The variational operator δ follows the rules of the differential operator d of calculus. Let F₁ and F₂ be
any continuous and differentiable functionals. Then we have the following results:

    δ(Fⁿ) = nF^{n−1} δF
    δ(F₁ + F₂) = δF₁ + δF₂
    δ(F₁F₂) = F₁ δF₂ + F₂ δF₁
    δ(F₁/F₂) = (F₂ δF₁ − F₁ δF₂)/F₂²
It is easy to show that the operators δ and d/dx are commutative. This can
be written mathematically as

    (d/dx)(δy) = δ(dy/dx)

The proof is as follows:

    (d/dx)(δy) = (d/dx)(εη) = ε (dη/dx) = εη′ = δy′ = δ(dy/dx)

That is, the differential of the variation of a function is identical to the variation of the differential of
the same function.
Another commutative property states that the variation of the integral of a functional F is the same as the integral of the variation of the same functional; mathematically,

    δ ∫ F dx = ∫ δF dx

Note that the two integrals must be evaluated between the same two limits.
Consider now the definite integral of F(x, y, y′, y″),

    J[y] = ∫_a^b F(x, y, y′, y″) dx        (2.18)

Perturbing y gives

    J[y + εη] = ∫_a^b F[x, y + εη, y′ + εη′, y″ + εη″] dx

so that the change in J is

    ΔJ = ∫_a^b F[x, y + εη, y′ + εη′, y″ + εη″] dx − ∫_a^b F(x, y, y′, y″) dx        (2.19)

and the Gateaux variation is δJ[y; η] = (d/dε) J[y + εη] |_{ε=0}.
Example 2.1
Consider the functional

    J[y] = ∫_0^1 (x² y′(x)² + y(x)²) dx

with y(0) = 0 and y(1) = 1. Calculate ΔJ and δJ[y; η] when y(x) = x and η(x) = x².

We first evaluate J[y]:

    J[y] = ∫_0^1 (x² y′² + y²) dx = ∫_0^1 (x² · 1 + x²) dx = ∫_0^1 2x² dx = 2/3

Next,

    J[y + εη] = ∫_0^1 [x² (1 + 2εx)² + (x + εx²)²] dx
              = ∫_0^1 (2x² + 6εx³ + 5ε²x⁴) dx
              = 2/3 + (3/2)ε + ε²

Hence, the change in the functional is

    ΔJ = J[y + εη] − J[y] = (3/2)ε + ε²

Then the derivative of the functional is

    (d/dε) J[y + εη] = 3/2 + 2ε

Evaluating this derivative at ε = 0 gives the Gateaux derivative

    δJ[y; η] = (d/dε) J[y + εη] |_{ε=0} = 3/2
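The Gateaux derivative of Example 2.1 can also be checked numerically, by differencing the perturbed functional with respect to ε. This is a sketch; the quadrature rule and step size are illustrative choices, not part of the example.

```python
import math

def J(y, yp, n=20000):
    """Midpoint-rule value of J[y] = ∫₀¹ (x² y′(x)² + y(x)²) dx."""
    h = 1.0 / n
    s = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        s += (x * x * yp(x) ** 2 + y(x) ** 2) * h
    return s

def J_perturbed(eps):
    # y(x) = x, η(x) = x², so y + εη = x + εx² and (y + εη)′ = 1 + 2εx
    return J(lambda x: x + eps * x * x, lambda x: 1 + 2 * eps * x)

# Central-difference estimate of dJ/dε at ε = 0, i.e. the Gateaux derivative
eps = 1e-5
gateaux = (J_perturbed(eps) - J_perturbed(-eps)) / (2 * eps)
print(round(gateaux, 4))  # ≈ 3/2
```

Because J[y + εη] is exactly quadratic in ε here, the central difference recovers the coefficient 3/2 up to quadrature error only.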
Chapter 3
The Fundamental Problem
3.1 Introduction
A fundamental problem of the calculus of variations can be stated as follows: given a functional J
and a well-defined set of functions A, determine which functions in A afford a minimum (or maximum)
value to J. The word minimum can be interpreted as a local minimum or an absolute minimum, i.e., a
minimum relative to all elements in A. The well-defined set A is called the set of admissible functions;
these are the competing functions for extremizing J. For example, the set of
admissible functions might be the set of all continuous functions on an interval [a, b], or the set of all
continuously differentiable functions on [a, b] satisfying conditions such as f(a) = 0.
Classical calculus of variations restricts itself to functionals that are defined by certain integrals
and to the determination of both necessary and sufficient conditions for extrema. The problem of
extremizing a functional J over the set A is called a variational problem. Several examples have already
been presented in section 1.2. To a certain degree the calculus of variations could be termed the
calculus of functionals. In the present discussion we restrict ourselves to an analysis of necessary
conditions for extrema. An elementary treatment of sufficient conditions can be found in Gelfand and
Fomin.
Even the preceding limited collection of examples of variational problems should convince the
reader of the tremendous practical utility of the calculus of variations. Let us now discuss the most
basic analytical techniques for solving such minimization problems. Let us concentrate on the simplest
class of variational problems, in which the unknown is a continuously differentiable scalar function, and
the functional to be minimized depends upon at most its second derivative. As already mentioned,
the basic minimization problem, then, is to determine a suitable function y = y(x) that minimizes the
objective functional
    J[y] = ∫_a^b F(x, y, y′, y″) dx,   y ∈ A        (3.1)

where F(x, y, y′, y″) is some given function and A is an admissible class of functions. The integrand F is
known as the Lagrangian for the variational problem. We assume that the Lagrangian is continuously
differentiable in each of its four arguments x, y, y′, and y″.
We note here that all the problems discussed in section 1.2 have a functional of the form

    J[y] = ∫_a^b F(x, y, y′) dx,   y ∈ A        (3.2)

with Lagrangian

    F = √((1 + y′²)/y)             Brachistochrone Problem
    F = y √(1 + y′²)               Minimum Surface-Area of Revolution
    F = √(1 + y′²)                 Fermat's Principle of Least Time (homogeneous medium)
    F = √(u² − 2gy) √(1 + y′²)     Principle of Least Action
Recall from differential calculus that a necessary condition for a point x₀ to be a local minimum of a
function f(x) is

    f′(x₀) = 0        (3.3a)

and a sufficient condition, provided f″ exists, is

    f′(x₀) = 0   and   f″(x₀) > 0        (3.3b)

Again, similar conditions can be formulated for local maxima. If (3.3a) holds, we
say f is stationary at x₀ and that x₀ is an extreme point for f.
Similarly, the necessary condition for the functional J[y] to have an extremum at ŷ is that its
variation vanishes for y = ŷ. That is,

    δJ[y; η] = 0        (3.4)

for y = ŷ and for all admissible variations η.
The fact that the condition (3.4) holds for all admissible variations often allows us to eliminate
η from the condition and obtain an equation just in terms of y, which can then be solved for y.
Generally the equation for y is a differential equation. Since (3.4) is only a necessary condition, we are not
guaranteed that its solutions actually provide a minimum. Therefore the solutions y of (3.4) are
called (local) extremals or stationary functions, and they are the candidates for maxima and minima. If
δJ[y; η] = 0, we say J is stationary at y in the direction η.

Based on the variations δy and δy′, we distinguish between a strong
extremum and a weak extremum. A strong extremum occurs when δy is small but δy′ may be large,
while a weak extremum occurs when both δy and δy′ are small.
Example 3.1
Consider the functional

    J[y] = ∫_0^1 (1 + y′(x)²) dx

with

    y = x,   η = x(1 − x)

Then y′ = 1 and η′ = 1 − 2x, and

    J[y + εη] = ∫_0^1 [1 + (y′(x) + εη′(x))²] dx
              = ∫_0^1 [1 + (1 + ε(1 − 2x))²] dx
              = 2 + ε²/3

Then the derivative of the functional is

    (d/dε) J[y + εη] = 2ε/3

Evaluating this derivative at ε = 0 gives the Gateaux derivative

    δJ[y; η] = (d/dε) J[y + εη] |_{ε=0} = 0

Hence we conclude that the variation δJ[y; η] = 0 and J is stationary at y = x in the direction η = x(1 − x).
Example 3.2
Consider the functional

    J[y] = ∫_0^{2π} (1 + y′(x)²) dx

with

    y = x,   η = sin x

Then y′ = 1 and η′ = cos x, and

    J[y + εη] = ∫_0^{2π} [1 + (y′(x) + εη′(x))²] dx
              = ∫_0^{2π} [1 + (1 + ε cos x)²] dx
              = (4 + ε²)π

Then the derivative of the functional is

    (d/dε) J[y + εη] = 2πε

Evaluating this derivative at ε = 0 gives the Gateaux derivative

    δJ[y; η] = (d/dε) J[y + εη] |_{ε=0} = 0

Hence we conclude that the variation δJ[y; η] = 0 and J is stationary at y = x in the direction η = sin x.
Example 3.3
Consider the functional

    J[y] = ∫_0^1 (1 + y′(x)²) dx

with

    y = x²,   η = x(1 − x)

Then y′ = 2x and η′ = 1 − 2x, and

    J[y + εη] = ∫_0^1 [1 + (2x + ε(1 − 2x))²] dx
              = (7 − 2ε + ε²)/3

Then the derivative of the functional is

    (d/dε) J[y + εη] = (2/3)(ε − 1)

Evaluating this derivative at ε = 0 gives the Gateaux derivative

    δJ[y; η] = (d/dε) J[y + εη] |_{ε=0} = −2/3

Hence δJ[y; η] ≠ 0, and J is not stationary at y = x² in the direction η = x(1 − x).
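Examples 3.1 and 3.3 can be verified numerically in the same way, by estimating δJ[y; η] = (d/dε) J[y + εη] |_{ε=0} with a central difference. The sketch below passes y′ and η′ directly, since the functional depends only on the derivative; the quadrature and step size are illustrative choices.

```python
import math

def J(yp, n=20000):
    """Midpoint rule for J[y] = ∫₀¹ (1 + y′(x)²) dx."""
    h = 1.0 / n
    return sum(1 + yp((i + 0.5) * h) ** 2 for i in range(n)) * h

def gateaux(yp, etap, eps=1e-5):
    """Central-difference estimate of δJ[y; η] = d/dε J[y + εη] at ε = 0."""
    Jp = J(lambda x: yp(x) + eps * etap(x))
    Jm = J(lambda x: yp(x) - eps * etap(x))
    return (Jp - Jm) / (2 * eps)

etap = lambda x: 1 - 2 * x           # η = x(1 − x), so η′ = 1 − 2x
print(round(gateaux(lambda x: 1.0, etap), 6))    # Example 3.1: y = x,  δJ = 0
print(round(gateaux(lambda x: 2 * x, etap), 6))  # Example 3.3: y = x², δJ = −2/3
```

The first estimate vanishes while the second does not, matching the conclusions of the two examples.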
Consider the problem of finding the function y = y(x) for which the functional

    J[y] = ∫_a^b F(x, y, y′) dx        (3.5)

is a minimum. Here y ∈ C²[a, b]¹ and F is a given function that is twice continuously differentiable
on [a, b] × ℝ². In order to uniquely specify a minimizing function, we must impose suitable boundary
conditions. Any type of boundary conditions, including Dirichlet (essential) and Neumann (natural)
boundary conditions, may be prescribed. In the interests of brevity, we shall impose the Dirichlet
boundary conditions of the form

    y(a) = α,   y(b) = β

That is, the graphs of the admissible functions pass through the end points (a, α) and (b, β).

¹C²[a, b] is the set of all continuous functions on an interval [a, b] whose second derivative is also continuous.
We seek a necessary condition for the functional J[y] to be a minimum. For this, we need to
compute the Gateaux variation of J. Let y(x) be a local minimum and η(x) a twice continuously
differentiable function satisfying η(a) = η(b) = 0. Then, Y = y + εη is an admissible function and the
new functional becomes

    J[Y] = ∫_a^b F[x, Y, Y′] dx = ∫_a^b F[x, y + εη, y′ + εη′] dx        (3.6)

Differentiating with respect to ε,

    (d/dε) J[Y] = (d/dε) ∫_a^b F[x, Y, Y′] dx
                = ∫_a^b [(∂F/∂Y)(∂Y/∂ε) + (∂F/∂Y′)(∂Y′/∂ε)] dx
                = ∫_a^b [(∂F/∂Y) η + (∂F/∂Y′) η′] dx

and hence

    (d/dε) J[y + εη] |_{ε=0} = ∫_a^b [(∂F/∂y) η + (∂F/∂y′) η′] dx        (3.7)
As we have seen earlier, the necessary condition for the functional J[y] to have an extremum at y is
that its variation vanishes at y. That is,

    δJ[y; η] = (d/dε) J[y + εη] |_{ε=0} = 0        (3.8)

Therefore, from (3.7) the necessary condition for the functional J[y] to have an extremum at y is given
by

    ∫_a^b [(∂F/∂y) η + (∂F/∂y′) η′] dx = 0        (3.9)

for all η ∈ C²[a, b] with η(a) = η(b) = 0.
The same condition can be obtained using the variational operator δ. Setting

    δJ = δ ∫_a^b F(x, y, y′) dx = ∫_a^b δF dx = 0

and expanding δF,

    ∫_a^b δF dx = ∫_a^b [(∂F/∂y) δy + (∂F/∂y′) δy′] dx = 0

To eliminate the derivative δy′ = (δy)′ from this condition, integrate the second term by parts using
∫_a^b u v′ dx = [uv]_a^b − ∫_a^b u′ v dx, with u = ∂F/∂y′ and v = δy:

    ∫_a^b [(∂F/∂y) − (d/dx)(∂F/∂y′)] δy dx + [(∂F/∂y′) δy]_a^b = 0        (3.10)

Since δy(a) = δy(b) = 0 (equivalently, η(a) = η(b) = 0), the last term on the right-hand side vanishes
and thus the condition (3.10) becomes

    ∫_a^b [(∂F/∂y) − (d/dx)(∂F/∂y′)] δy dx = 0        (3.11)

Since η(x), and hence δy(x) = εη(x), is an arbitrary admissible function, equation (3.11) can hold
only if the bracketed integrand vanishes identically on [a, b] (the fundamental lemma of the calculus
of variations, in the form due to Du Bois-Reymond). Therefore, we have

    (∂F/∂y) − (d/dx)(∂F/∂y′) = 0
To summarize: if y(x) extremizes the functional

    J[y] = ∫_a^b F(x, y, y′) dx

subject to y(a) = α and y(b) = β, then y must satisfy the equation

    (∂F/∂y) − (d/dx)(∂F/∂y′) = 0,   x ∈ [a, b]        (3.12a)
(3.12a)
Equation (3.12a) is called the EulerLagrange equation or simply Euler equation. There are two
important aspects of the derivation of the EulerLagrange equation that deserve close inspection.
First, it provides a necessary condition for a local minimum but not a sufficient one. It is analogous
to the derivative condition f (x) = 0 in differential calculus. Therefore its solutions are not necessarily
local minima. It is a second-order ordinary differential equation with a solution that is required to
satisfy two conditions at the boundaries of the domain of solution. Such boundary value problems
may have no solution, one unique solution, or multiple solutions depending on the situation. A case
with multiple solutions will imply that more than one paths from point (a, ) to point (b, ) satisfy
the EulerLagrange equation. However, not all of these paths will necessarily minimize the functional
J[y]. A second important aspect of the EulerLagrange equation is related to our assumption that the
curve y(x) C2 [a, b]. Indeed, our considerations focused only on such smooth functions. However,
the actual path that extremizes an integral might be one with a corner or a kink. Such paths are
not relevant for the use of the EulerLagrange equation in Newtonian mechanics. However, they are
often the true solutions in other problems in the calculus of variations, as we have seen in the case of
physics of soap films.
It may be worthwhile to note that if y is treated as the independent variable and x as the dependent
variable, x = x(y), then the Euler–Lagrange equation (3.12a) takes the form

    (∂F/∂x) − (d/dy)(∂F/∂x′) = 0,   y ∈ [α, β]        (3.12b)

where x′ = dx/dy.
Thus far we have used boundary conditions of the form

    y(a) = α,   y(b) = β

where α and β are constants. This is called the essential (or Dirichlet) boundary condition. In some
applications, we may need to apply other types of boundary conditions to the function y(x).

If we still want the last term in equation (3.10) to vanish (so that we obtain the familiar Euler–
Lagrange equation), while allowing η(a) and η(b) to be non-zero, then we need to have

    (∂F/∂y′)|_{x=a} = 0   and   (∂F/∂y′)|_{x=b} = 0

This is called a natural (or Neumann) boundary condition. A system may also have a natural boundary
condition at one end (x = a) and an essential boundary condition at the other end (x = b).
Consider the total derivative of F(x, y, y′) with respect to x:

    dF/dx = ∂F/∂x + (∂F/∂y)(dy/dx) + (∂F/∂y′)(dy′/dx) = ∂F/∂x + y′ (∂F/∂y) + y″ (∂F/∂y′)        (3.13)

But we have

    (d/dx)[y′ (∂F/∂y′)] = y″ (∂F/∂y′) + y′ (d/dx)(∂F/∂y′)        (3.14)

Subtracting (3.14) from (3.13) gives

    dF/dx − (d/dx)[y′ (∂F/∂y′)] = ∂F/∂x + y′ (∂F/∂y) − y′ (d/dx)(∂F/∂y′)

Rewriting the above equation,

    (d/dx)[F − y′ (∂F/∂y′)] − ∂F/∂x = y′ [(∂F/∂y) − (d/dx)(∂F/∂y′)]

By the Euler–Lagrange equation (3.12a) we see that the right-hand side of the above equation is zero.
Thus,

    (d/dx)[F − y′ (∂F/∂y′)] − ∂F/∂x = 0        (3.15)

Equation (3.15) is another useful form of the Euler–Lagrange equation.

Case I. If F is independent of x, then ∂F/∂x = 0 and (3.15) can be integrated at once to give

    F − y′ (∂F/∂y′) = C        (3.16)

Thus, the extremizing function y is obtained as the solution of a first-order differential equation (3.16)
involving y and y′ only. This simplified form of the Euler–Lagrange equation (3.16) is known as the
Beltrami identity. The combination F − y′ F_{y′} that appears on the left of the Beltrami identity is
sometimes referred to as the Hamiltonian.
Case II. If F is independent of y, then ∂F/∂y = 0 and the Euler–Lagrange equation (3.12a)
becomes

    (d/dx)(∂F/∂y′) = 0

which integrates to

    ∂F/∂y′ = k        (3.17)

where k is a constant. Note that equation (3.17) is a first-order differential equation involving x and
y′.

Case III. If F is independent of y′, then ∂F/∂y′ = 0 and the Euler–Lagrange equation
(3.12a) becomes

    ∂F/∂y = 0

which is no longer a differential equation: it is a finite equation whose solution, when it exists, gives
y directly as a function of x alone.
Consider next a functional containing second derivatives,

    J[y] = ∫_a^b F(x, y, y′, y″) dx        (3.18)

subject to the essential boundary conditions

    y(a) = α,   y′(a) = α′,   y(b) = β,   y′(b) = β′

Here y ∈ C⁴[a, b] and F is a given function that is twice continuously differentiable in its arguments.
The necessary condition for the functional J[y] to be a minimum is that the function y(x) satisfies
the following Euler–Lagrange equation:

    (∂F/∂y) − (d/dx)(∂F/∂y′) + (d²/dx²)(∂F/∂y″) = 0        (3.19)
Instead of the Dirichlet-type boundary conditions we may also prescribe Neumann-type (natural)
boundary conditions of the form

    [(∂F/∂y′) − (d/dx)(∂F/∂y″)]|_{x=a} = 0,   (∂F/∂y″)|_{x=a} = 0
    [(∂F/∂y′) − (d/dx)(∂F/∂y″)]|_{x=b} = 0,   (∂F/∂y″)|_{x=b} = 0
In general, when the functional contains higher derivatives of y(x), the function which extremizes the
functional

    J[y] = ∫_a^b F(x, y, y′, y″, …, y⁽ⁿ⁾) dx        (3.20)

must satisfy

    (∂F/∂y) − (d/dx)(∂F/∂y′) + (d²/dx²)(∂F/∂y″) − ⋯ + (−1)ⁿ (dⁿ/dxⁿ)(∂F/∂y⁽ⁿ⁾) = 0        (3.21)

Equation (3.21) is a differential equation of order 2n and is called the Euler–Poisson equation. The general
solution of this contains 2n arbitrary constants, which may be determined from the 2n boundary
conditions.
For a functional of a function u(x, y) of two independent variables,

    J[u] = ∬ F(x, y, u, u_x, u_y) dx dy        (3.22)

the corresponding necessary condition is

    (∂F/∂u) − (∂/∂x)(∂F/∂u_x) − (∂/∂y)(∂F/∂u_y) = 0        (3.23)

This second-order partial differential equation that must be satisfied by the extremizing function u(x, y)
is called the Ostrogradsky equation, after the Russian mathematician M. Ostrogradsky.
Functionals may also depend on several functions of a single independent variable:

    J[y₁, …, yₙ] = ∫_a^b F(x, y₁, y₂, …, yₙ, y₁′, y₂′, …, yₙ′) dx        (3.24)

where the functions y₁, y₂, …, yₙ satisfy prescribed essential boundary conditions of the form

    yᵢ(a) = y_{ia},   yᵢ(b) = y_{ib},   (i = 1, 2, …, n)

Here yᵢ ∈ C²[a, b] and F is a given function that is twice continuously differentiable in its arguments. The
necessary condition for the functional J to be a minimum is that the functions yᵢ(x) (i = 1, 2, …, n)
satisfy the following Euler–Lagrange equations:

    (∂F/∂yᵢ) − (d/dx)(∂F/∂yᵢ′) = 0,   (i = 1, 2, …, n)        (3.25)

Thus for an extremum of J, the necessary conditions are that the n differential equations shown in
equation (3.25) be satisfied. From the solutions of this system of differential equations, we can
determine the functions yᵢ which make the integral (3.24) an extremum.
So far we have considered curves given explicitly as y = y(x) in the two-dimensional case with one
independent variable, x. However, it is often more convenient to consider functionals of functions
given in parametric form. Thus, we consider here the case where the curve is given parametrically as

    x = x(t),   y = y(t)        (3.26)

Suppose that the function y in the functional

    J[y] = ∫_{x₁}^{x₂} F(x, y, y′) dx        (3.27)

is given in the parametric form (3.26). Then (3.27) can be written as

    ∫_{t₁}^{t₂} F(x(t), y(t), ẏ(t)/ẋ(t)) ẋ(t) dt = ∫_{t₁}^{t₂} Φ(x, y, ẋ, ẏ) dt        (3.28)

where the overdot denotes differentiation with respect to t. The integrand Φ appearing on the right-hand
side of (3.28) does not involve t explicitly. It is possible to show that the value of the functional
given by (3.28) depends only on the function y = y(x) defined by the parametric equations x = x(t),
y = y(t), and not on the particular choice of the parametrization x(t), y(t) itself.
Suppose we have a functional of the form

    J = ∫_{t₁}^{t₂} Φ(x, y, ẋ, ẏ) dt        (3.29)

in which the curve is parameterized. The variational problem for (3.29) leads to the pair of Euler–
Lagrange equations

    (∂Φ/∂x) − (d/dt)(∂Φ/∂ẋ) = 0,   (∂Φ/∂y) − (d/dt)(∂Φ/∂ẏ) = 0        (3.30)

which must be equivalent to the single Euler–Lagrange equation

    (∂F/∂y) − (d/dx)(∂F/∂y′) = 0

corresponding to the variational problem for the original functional (3.27).
Chapter 4
Application: Standard Variational
Problems
This section deals with several classical problems to illustrate the methodology. The problem of finding
the minimal path between two points in space will be addressed here.
The length of the plane curve y = y(x) joining P(x₁, y₁) and Q(x₂, y₂) is

    L = J[y] = ∫_P^Q ds = ∫_{x₁}^{x₂} √(1 + y′(x)²) dx
The variational problem is to find the plane curve whose length is shortest, i.e., to determine the
function y(x) which minimizes the functional J[y]. The curve y(x) which minimizes the functional J[y]
is determined by solving the Euler–Lagrange equation (3.12a),

    (∂F/∂y) − (d/dx)(∂F/∂y′) = 0

with

    F = √(1 + y′(x)²)
This is a special case in which F is independent of x and y. Then according to (3.17) the EL equation
reduces to

    ∂F/∂y′ = k

where k is a constant. The derivative is

    ∂F/∂y′ = 2y′/(2√(1 + y′(x)²)) = y′/√(1 + y′²) = k

Therefore,

    y′ = k √(1 + y′²)   ⟹   y′ = k/√(1 − k²) = m

a constant. Integrating, y = mx + c, where the constants m and c are to be found using the boundary
conditions y(x₁) = y₁ and y(x₂) = y₂. Thus, the shortest path is the straight line joining the two points
P(x₁, y₁) and Q(x₂, y₂),

    y = ((y₂ − y₁)/(x₂ − x₁)) x + (x₂y₁ − x₁y₂)/(x₂ − x₁)
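A quick numerical sanity check confirms that the straight line is indeed a stationary curve of the length functional. The sketch below uses the line y = 2x between (0, 0) and (1, 2); the particular perturbation η = sin(πx), which vanishes at both end points, is an arbitrary choice of the illustration.

```python
import math

def length(yp, n=20000):
    """Midpoint rule for L[y] = ∫₀¹ √(1 + y′(x)²) dx."""
    h = 1.0 / n
    return sum(math.sqrt(1 + yp((i + 0.5) * h) ** 2) for i in range(n)) * h

# Straight line y = 2x (y′ = 2); perturbation η = sin(πx), η′ = π cos(πx)
def L_pert(eps):
    return length(lambda x: 2 + eps * math.pi * math.cos(math.pi * x))

eps = 1e-5
dL = (L_pert(eps) - L_pert(-eps)) / (2 * eps)      # δL[y; η] ≈ 0
print(abs(dL) < 1e-6, L_pert(0.2) > L_pert(0.0))   # stationary, and a minimum
```

The first variation vanishes to machine-level tolerance, while any finite perturbation strictly increases the length, consistent with the straight line being the minimizer.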
Consider now the brachistochrone functional (1.5), with Lagrangian F = √((1 + y′²)/y), which is
independent of x. The Beltrami identity (3.16) gives

    F − y′ (∂F/∂y′) = B

where B is a constant. Now

    ∂F/∂y′ = (1/√y) · y′/√(1 + y′²)

so that

    F − y′ (∂F/∂y′) = √((1 + y′²)/y) − y′²/(√y √(1 + y′²)) = 1/(√y √(1 + y′²)) = B

Squaring and rearranging, with C = 1/B²,

    y (1 + y′²) = C,   that is,   y [1 + (dy/dx)²] = C
That is, the solution to the brachistochrone problem is the solution y = y(x) of the above ordinary
differential equation. This is a well known differential equation whose solution1 is called the cycloid.
A cycloid is the locus of a point fixed on the circumference of a circle as the circle rolls on a flat
horizontal surface. We can show that there is one and only one cycloid passing through points P and
Q. Its equation is given in the parametric form
    x(\theta) = a(\theta - \sin\theta), \qquad y(\theta) = a(1 - \cos\theta)        (4.2)

where a = C/2 is the radius of the rolling circle and \theta is the angle of rotation. Using the condition
that the curve (cycloid) passes through Q(x2 , y2 ), the value of the constant a can be determined.
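Determining a from the terminal point can be done numerically. A minimal sketch with an assumed endpoint Q(x2, y2) = (1, 1), y measured downward from P(0, 0): the endpoint condition (1 − cos θ)/(θ − sin θ) = y2/x2 is solved for the final parameter value by bisection.

```python
import math

# Hypothetical endpoint Q(x2, y2) = (1, 1), with y measured downward from P(0, 0).
x2, y2 = 1.0, 1.0

# Endpoint condition: (1 - cos t)*x2 - (t - sin t)*y2 = 0 for the final
# parameter value t2; solve by bisection (sign change on (0, 2*pi)).
def g(t):
    return (1.0 - math.cos(t)) * x2 - (t - math.sin(t)) * y2

lo, hi = 1e-6, 2.0 * math.pi
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if g(lo) * g(mid) <= 0:
        hi = mid
    else:
        lo = mid

t2 = 0.5 * (lo + hi)
a = x2 / (t2 - math.sin(t2))       # radius of the rolling circle

# The cycloid indeed passes through Q:
assert abs(a * (t2 - math.sin(t2)) - x2) < 1e-9
assert abs(a * (1.0 - math.cos(t2)) - y2) < 1e-9
```

For this endpoint the final angle comes out near 2.41 rad and a is about 0.57.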
Figure 4.1: The cycloid acts as a brachistochrone.
Another remarkable characteristic of the brachistochrone is that when two particles at rest are released simultaneously from two different points M and N of the curve, they reach the terminal point of the curve at the same time, provided the terminal point is the lowest point of the path. Such a curve is called an isochrone or a tautochrone. This is counterintuitive, since the particles clearly have different distances to cover; however, because they move under gravity and the slope of the curve differs at the two locations, the particle starting from the higher location gathers much greater speed than the one starting lower down. Hence the brachistochrone problem may also be posed with a specified terminal point and a variable starting point, leading to the class of variational problems with open boundary.
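The tautochrone property can be verified numerically: integrating the descent time dt = ds/v along the cycloid, with v = sqrt(2g Δy), from two different release angles gives the same time, namely π√(a/g). The release points and constants below are assumed for illustration.

```python
import math

def descent_time(theta0, a=1.0, g=9.81, n=200_000):
    """Time to slide from release angle theta0 to the lowest point (theta = pi)
    on the cycloid x = a(t - sin t), y = a(1 - cos t), y measured downward."""
    h = (math.pi - theta0) / n
    T = 0.0
    for i in range(n):
        t = theta0 + (i + 0.5) * h          # midpoint rule tames the 1/sqrt singularity
        ds = 2.0 * a * math.sin(t / 2.0)    # |ds/dtheta| on the cycloid
        v = math.sqrt(2.0 * g * a * (math.cos(theta0) - math.cos(t)))
        T += ds / v * h
    return T

T_M = descent_time(0.3)   # released high on the curve
T_N = descent_time(2.0)   # released much lower
T_exact = math.pi * math.sqrt(1.0 / 9.81)

assert abs(T_M - T_N) < 1e-2
assert abs(T_M - T_exact) < 1e-2
```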
Figure 4.2: The tautochrone
    S = J[y] = 2\pi \int_{x_1}^{x_2} y(x)\sqrt{1 + y'(x)^2} \, dx        (4.3)

Here

    F = y\sqrt{1 + y'^2}
which is independent of x, and therefore we can apply the Beltrami identity (3.16)

    F - y'\, \partial F/\partial y' = a

where a is a constant. Now

    \partial F/\partial y' = \frac{y\, y'}{\sqrt{1 + y'^2}}

so that

    y\sqrt{1 + y'^2} - \frac{y\, y'^2}{\sqrt{1 + y'^2}} = a

which on simplification yields

    \frac{y}{\sqrt{1 + y'^2}} = a
    \quad\Longrightarrow\quad
    y = a\sqrt{1 + y'^2}
The solution to the minimization problem therefore reduces to solving the above differential equation. Fortunately this nonlinear, first-order differential equation is elementary. We recast it as

    \frac{dy}{dx} = y' = \frac{\sqrt{y^2 - a^2}}{a}

Separating variables and integrating,

    \frac{x}{a} + b = \int \frac{dy}{\sqrt{y^2 - a^2}}

where b is the constant of integration. Substituting y = a\cosh\theta, so that dy = a\sinh\theta\, d\theta, into the right-hand side gives

    \frac{x}{a} + b = \int \frac{a\sinh\theta \, d\theta}{a\sqrt{\cosh^2\theta - 1}} = \int d\theta = \theta = \cosh^{-1}\frac{y}{a}

Therefore, we have

    y = a\cosh\left(\frac{x}{a} + b\right)        (4.4)

The constants a and b are determined using the end (boundary) conditions

    y(x_1) = y_1 \qquad \text{and} \qquad y(x_2) = y_2

Equation (4.4) represents a catenary. The surface generated by rotating the catenary is called a catenoid.
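Determining a and b in (4.4) from the end conditions generally requires a numerical root find. A minimal sketch, assuming endpoints of equal height P(0, 1.5) and Q(1, 1.5) so that, by symmetry, the vertex sits at x = 1/2 and b = -1/(2a):

```python
import math

# Remaining condition a*cosh(1/(2a)) = 1.5, solved for a by bisection
# (assumed data; the shallow-catenary root is bracketed on [0.6, 10]).
y_end = 1.5

def f(a):
    return a * math.cosh(0.5 / a) - y_end

lo, hi = 0.6, 10.0            # f is increasing here, with a sign change
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if f(lo) * f(mid) <= 0:
        hi = mid
    else:
        lo = mid

a = 0.5 * (lo + hi)
b = -0.5 / a

def y(x):
    return a * math.cosh(x / a + b)

assert abs(y(0.0) - y_end) < 1e-9    # end conditions are met
assert abs(y(1.0) - y_end) < 1e-9
```

Note that the condition can have two roots (a deep and a shallow catenary); the bracket above picks out the shallow one.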
For Fermat's problem we write the path as x = x(y), with x' = dx/dy, so that the integrand of the travel-time functional is

    F = n(y)\sqrt{1 + x'^2}

Since F does not contain x explicitly, the Euler-Lagrange equation reduces to

    \partial F/\partial x' = a

where a is a constant. It follows that

    \frac{n(y)\, x'}{\sqrt{1 + x'^2}} = a

Solving for x', we get

    n^2 x'^2 = a^2(1 + x'^2)
    \quad\Longrightarrow\quad
    x' = \frac{dx}{dy} = \frac{a}{\sqrt{n^2 - a^2}}

The solution to the minimization problem therefore reduces to solving the above differential equation. Depending on the particular model of the speed of light (or refractive index) in the medium, the result varies.
Case I. If the refractive index is inversely proportional to the height in the medium, n(y) = 1/y, the minimization problem becomes

    \frac{dx}{dy} = \frac{a}{\sqrt{1/y^2 - a^2}} = \frac{y}{\sqrt{b^2 - y^2}}, \qquad b = \frac{1}{a}

Integrating,

    x + d = \int \frac{y\, dy}{\sqrt{b^2 - y^2}}

With the substitution z = b^2 - y^2, dz = -2y\, dy, this gives

    x + d = -\frac{1}{2}\int \frac{dz}{\sqrt{z}} = -\sqrt{b^2 - y^2}

or

    (x + d)^2 = b^2 - y^2
    \quad\Longrightarrow\quad
    (x + d)^2 + y^2 = b^2

so the light rays are arcs of circles centered on the line y = 0.
Case II. Suppose instead that the refractive index varies linearly with height,

    n(y) = n_0(1 + \alpha y)

where n_0 and \alpha are constants. The minimization problem then becomes

    \frac{dx}{dy} = \frac{a}{\sqrt{n_0^2(1 + \alpha y)^2 - a^2}} = \frac{b}{\sqrt{(1/\alpha + y)^2 - b^2}}, \qquad b = \frac{a}{n_0\alpha}

Integrating,

    x - x_0 = \int \frac{b\, dy}{\sqrt{(1/\alpha + y)^2 - b^2}}

where x_0 is the constant of integration. Let u = 1/\alpha + y; then the above equation becomes

    x - x_0 = b\int \frac{du}{\sqrt{u^2 - b^2}} = b\cosh^{-1}\frac{u}{b}

or

    x - x_0 = b\cosh^{-1}\frac{1/\alpha + y}{b}

Therefore,

    y = -\frac{1}{\alpha} + b\cosh\left(\frac{x - x_0}{b}\right)

so the ray paths are catenaries.
Suppose light travels from P_1(x_1, y_1) in a medium with speed v_1 to P_2(x_2, y_2) in a medium with speed v_2, crossing the interface y = y_0 at the point (x, y_0). The total travel time is

    \tau(x) = \frac{\sqrt{(x - x_1)^2 + (y_0 - y_1)^2}}{v_1} + \frac{\sqrt{(x - x_2)^2 + (y_0 - y_2)^2}}{v_2}

Fermat's principle of least time says that the path taken by the light ray will be the one for which x minimizes \tau(x). From calculus, it follows that

    \frac{d\tau}{dx} = 0

which gives

    \frac{x - x_1}{v_1\sqrt{(x - x_1)^2 + (y_0 - y_1)^2}} + \frac{x - x_2}{v_2\sqrt{(x - x_2)^2 + (y_0 - y_2)^2}} = 0

The solution of this equation yields the x location at which the ray crosses the boundary, and produces the well-known Snell's law,

    \frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2}        (4.5)

where the angles are measured with respect to the normal of the boundary between the two media.
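Snell's law can be confirmed numerically by minimizing τ(x) directly. The data below (positions, speeds, interface location) are assumed for illustration:

```python
import math

# Assumed data: P1(0, 2) in medium 1, P2(3, -1) in medium 2,
# interface at y0 = 0, speeds v1 = 1.0 and v2 = 0.7.
x1, y1, v1 = 0.0, 2.0, 1.0
x2, y2, v2 = 3.0, -1.0, 0.7
y0 = 0.0

def tau(x):
    """Travel time through the crossing point (x, y0)."""
    return (math.hypot(x - x1, y0 - y1) / v1 +
            math.hypot(x - x2, y0 - y2) / v2)

# Ternary search for the minimizer of the (unimodal) travel time.
lo, hi = x1, x2
for _ in range(200):
    m1 = lo + (hi - lo) / 3.0
    m2 = hi - (hi - lo) / 3.0
    if tau(m1) < tau(m2):
        hi = m2
    else:
        lo = m1
x = 0.5 * (lo + hi)

# Angles measured from the normal to the interface.
sin1 = (x - x1) / math.hypot(x - x1, y0 - y1)
sin2 = (x2 - x) / math.hypot(x - x2, y0 - y2)
assert abs(sin1 / v1 - sin2 / v2) < 1e-6   # Snell's law (4.5)
```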
The deflection y(x) of a simply supported beam satisfies

    EI\,\frac{d^2 y}{dx^2} - M(x) = 0        (4.6)

    y(0) = 0 \qquad \text{and} \qquad y(L) = 0

where E is the Young's modulus, I is the second moment of area of the cross-section of the beam, and L is the span of the beam. The product EI, called the flexural rigidity, represents the resistance offered by the beam to deflection, and M(x) is the bending moment. In the present problem M = M_0 is a constant. Therefore,

    EI\, y'' - M_0 = 0

This standard differential equation can be readily integrated to obtain the deflection curve. The solution is given by

    y(x) = \frac{M_0}{2EI}\, x(x - L)        (4.7)
This beam deflection problem can also be solved by using variational methods.

Figure 4.4: Simply supported beam

To do this we need to recast the problem as a variational problem using an appropriate variational statement. Here we use the principle of minimum potential energy, which states that
    F = \frac{EI}{2}\left(\frac{dy}{dx}\right)^2 + M_0 y = \frac{EI}{2} y'^2 + M_0 y

and the Euler-Lagrange equation is

    \partial F/\partial y - (d/dx)(\partial F/\partial y') = 0

We compute

    \partial F/\partial y = M_0 \qquad \text{and} \qquad \partial F/\partial y' = EI\, y'

Substituting these results in the Euler-Lagrange equation gives

    M_0 - \frac{d}{dx}(EI\, y') = 0

Integrating once,

    EI\, y' = M_0 x + c_1

Integrating again,

    EI\, y = M_0 \frac{x^2}{2} + c_1 x + c_2

where the constants c_1 and c_2 are to be found using the boundary conditions y(0) = 0 and y(L) = 0. Thus, we have

    c_1 = -\frac{M_0 L}{2} \qquad \text{and} \qquad c_2 = 0

Substitution of these values in the general solution gives the equation of the deflection curve

    y = \frac{M_0}{2EI}\, x(x - L)
- Multiply the left-hand side of the differential equation L(y) with the variation \delta y of the dependent variable y and integrate over the domain of the problem.
- Use integration by parts to transfer the derivatives to the variation \delta y.
- Express the boundary integrals in terms of the specified boundary conditions.
- Bring the variational operator \delta outside the integrals.
The procedure is best illustrated with an example. We take the problem of the deflection of a beam governed by equation (4.6). Since the differential equation holds good at all points of the system, we can write

    \left(EI\,\frac{d^2 y}{dx^2} - M_0\right)\delta y = 0

where \delta y is an arbitrary variation of y, vanishing wherever y is prescribed. Integrating over the domain of the problem,

    \delta J = \int_0^L \left(EI\,\frac{d^2 y}{dx^2} - M_0\right)\delta y \, dx = 0

that is,

    \delta J = \int_0^L EI\,\frac{d^2 y}{dx^2}\,\delta y \, dx - \int_0^L M_0\,\delta y \, dx

Now the first integral on the right-hand side can be integrated by parts (\int u\,v'\,dx = uv - \int u'\,v\,dx) by letting u = \delta y and v' = EI\, d^2y/dx^2. Thus

    \delta J = \left[\delta y\, EI\,\frac{dy}{dx}\right]_0^L - \int_0^L EI\,\frac{d(\delta y)}{dx}\frac{dy}{dx}\, dx - \int_0^L M_0\,\delta y \, dx

The first term vanishes if we assume either homogeneous Dirichlet or homogeneous Neumann conditions at the boundaries, that is,

    y(0) = y(L) = 0 \quad (\text{so that } \delta y(0) = \delta y(L) = 0)
    \qquad \text{or} \qquad
    \left.\frac{dy}{dx}\right|_0 = \left.\frac{dy}{dx}\right|_L = 0

Hence

    \delta J = -\int_0^L EI\,\frac{dy}{dx}\frac{d(\delta y)}{dx}\, dx - \int_0^L M_0\,\delta y \, dx
             = -\,\delta \int_0^L \left[\frac{EI}{2}\left(\frac{dy}{dx}\right)^2 + M_0 y\right] dx

Therefore, the variational statement is \delta\Pi = 0 with the functional

    \Pi[y] = \int_0^L \left[\frac{EI}{2}\left(\frac{dy}{dx}\right)^2 + M_0 y\right] dx        (4.8)
Some standard differential equations and their functionals are given below. If the differential equation is of the form

    \frac{d^2\phi}{dx^2} + P(x)\phi + Q(x) = 0, \qquad x \in [a, b]        (4.9a)

the corresponding functional is

    J[\phi] = \frac{1}{2}\int_a^b \left[\left(\frac{d\phi}{dx}\right)^2 - P\phi^2 - 2Q\phi\right] dx        (4.9b)

If the differential equation is of the form

    \nabla^2\phi + p^2\phi = q, \qquad x \in D        (4.10a)

the corresponding functional is

    J[\phi] = \frac{1}{2}\int_D \left(|\nabla\phi|^2 - p^2\phi^2 + 2q\phi\right) dD        (4.10b)

where

    |\nabla\phi|^2 = \left(\frac{\partial\phi}{\partial x}\right)^2 + \left(\frac{\partial\phi}{\partial y}\right)^2
Example 4.1
Find the functional for the ordinary differential equation

    \frac{d^2 y}{dx^2} + 3y + x = 0, \qquad 0 < x < 1

Comparing with (4.9a), P = 3 and Q = x, so by (4.9b)

    J[y] = \frac{1}{2}\int_0^1 \left[\left(\frac{dy}{dx}\right)^2 - 3y^2 - 2xy\right] dx

As a check, we may write the Euler-Lagrange equation

    \partial F/\partial y - (d/dx)(\partial F/\partial y') = 0

for the above functional to recover the original differential equation. That is (the overall factor 1/2 does not affect the equation),

    -6y - 2x - \frac{d}{dx}\left(2\frac{dy}{dx}\right) = 0
    \quad\Longrightarrow\quad
    \frac{d^2 y}{dx^2} + 3y + x = 0
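The stationarity can also be checked numerically: at a solution of the differential equation (here with assumed boundary values y(0) = y(1) = 0, which are not part of the example), perturbing y by an admissible variation changes the discretized J only at second order in the perturbation size.

```python
import math

# A particular solution of y'' + 3y + x = 0 with assumed ends y(0) = y(1) = 0:
B = 1.0 / (3.0 * math.sin(math.sqrt(3.0)))
def y(x):
    return B * math.sin(math.sqrt(3.0) * x) - x / 3.0

def J(f, n=4000):
    """Discretized J[y] = (1/2) * integral of (y'^2 - 3y^2 - 2xy) on [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        d = (f(x + 0.5 * h) - f(x - 0.5 * h)) / h   # centered derivative
        total += 0.5 * (d * d - 3.0 * f(x) ** 2 - 2.0 * x * f(x)) * h
    return total

eta = lambda x: math.sin(math.pi * x)               # admissible variation
J0 = J(y)
for eps in (1e-3, -1e-3):
    Jp = J(lambda x, e=eps: y(x) + e * eta(x))
    # J is stationary at the solution: the change is O(eps^2), not O(eps)
    assert abs(Jp - J0) < 5.0 * eps ** 2 + 1e-9
```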
Consider a functional of the form

    J[\phi] = \int_D F(x, y, \phi, \phi_x, \phi_y) \, dS        (4.11)

Our objective is to minimize this integral. In the Rayleigh-Ritz method, we select a linearly independent set of functions, called basis functions, u_n, and construct an approximate solution of equation (4.11) satisfying some prescribed boundary conditions. The solution is in the form of a finite series

    \tilde\phi = u_0 + \sum_{n=1}^{N} a_n u_n        (4.12)
where u_0 meets the nonhomogeneous boundary conditions, if any, and the u_n satisfy homogeneous boundary conditions. The unknown coefficients a_n are to be determined, and \tilde\phi is an approximation to the exact solution \phi. Substitution of the approximate solution into equation (4.11) turns the functional into an ordinary function of the N coefficients a_1, a_2, \ldots, a_N. That is,

    J(\tilde\phi) = J(a_1, a_2, \ldots, a_N)

The minimum of this function is obtained when its partial derivative with respect to each coefficient is zero, that is,

    \partial J/\partial a_1 = 0, \quad \partial J/\partial a_2 = 0, \quad \ldots, \quad \partial J/\partial a_N = 0

or

    \partial J/\partial a_n = 0, \qquad n = 1, 2, \ldots, N        (4.13)

Thus we obtain a system of N linear algebraic equations which can be solved for the a_n. These a_n are then substituted into the approximate solution (4.12). Now, if \tilde\phi \to \phi in some sense as N \to \infty, the procedure is said to converge to the exact solution.

The basis functions are selected to satisfy the prescribed boundary conditions of the problem: u_0 is chosen to satisfy the inhomogeneous boundary conditions, while the u_n (n = 1, 2, \ldots, N) are selected to satisfy the homogeneous boundary conditions. Note that u_0 = 0 if the prescribed boundary conditions are all homogeneous (Dirichlet conditions). The Rayleigh-Ritz method has two major limitations. First, the variational principle in equation (4.11) may not exist for some problems, such as non-self-adjoint equations (odd-order derivatives). Second, it is difficult, if not impossible, to find functions u_0 satisfying the global boundary conditions for domains with complicated geometries.
Example 4.2
Use the Rayleigh-Ritz method to solve the beam deflection problem given by the variational principle (4.8):

    \Pi[y] = \int_0^L \left[\frac{EI}{2}\left(\frac{dy}{dx}\right)^2 + M_0 y\right] dx

with the boundary conditions y(0) = 0 = y(L). The exact solution of this minimization problem is

    y(x) = \frac{M_0}{2EI}\, x(x - L)

We seek an approximate solution of the form

    \tilde y = u_0 + \sum_{n=1}^{N} a_n u_n

where u_0 = 0. Possible choices for the basis functions are polynomials of the form

    \tilde y = \sum_{n=1}^{N} a_n x^n

or trigonometric functions of the form

    \tilde y = \sum_{n=1}^{N} a_n \sin\frac{n\pi x}{L}

Trigonometric approximation: We first explore the trigonometric case with N = 1, that is,

    \tilde y = a\sin\lambda x

The assumed solution should satisfy both boundary conditions; setting \lambda = \pi/L achieves this. Thus, we have

    \tilde y = a\sin\frac{\pi x}{L}
Here a is the undetermined parameter, to be selected so that the functional \Pi[\tilde y] is a minimum. Substituting the above approximate solution into the functional gives

    \Pi(a) = \int_0^L \left[\frac{EI}{2}\left(\frac{a\pi}{L}\cos\frac{\pi x}{L}\right)^2 + M_0 a\sin\frac{\pi x}{L}\right] dx

Evaluating the integral yields

    \Pi(a) = \frac{EI\pi^2}{4L}\, a^2 + \frac{2M_0 L}{\pi}\, a

At this point observe that \Pi(a) is an ordinary function of the unknown a. The function \Pi(a) is a minimum when

    \frac{d\Pi}{da} = 0
    \quad\Longrightarrow\quad
    \frac{EI\pi^2}{2L}\, a + \frac{2M_0 L}{\pi} = 0
    \quad\Longrightarrow\quad
    a = -\frac{4M_0 L^2}{\pi^3 EI}

Hence the approximate solution is

    \tilde y = -\frac{4M_0 L^2}{\pi^3 EI}\sin\frac{\pi x}{L}
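It is instructive to compare the one-term trigonometric solution with the exact deflection. With illustrative values M0 = EI = L = 1, the midspan deflections differ by about 3 percent:

```python
import math

# Illustrative values (assumed, not from the text): M0 = EI = L = 1.
M0 = EI = L = 1.0

a_ritz = -4.0 * M0 * L**2 / (math.pi**3 * EI)           # one-term Ritz coefficient
y_ritz = lambda x: a_ritz * math.sin(math.pi * x / L)   # approximate deflection
y_exact = lambda x: M0 / (2.0 * EI) * x * (x - L)       # exact deflection (4.7)

mid_ritz, mid_exact = y_ritz(L / 2), y_exact(L / 2)
rel_err = abs(mid_ritz - mid_exact) / abs(mid_exact)

assert abs(mid_exact + 0.125) < 1e-12   # exact midspan deflection -M0*L^2/(8EI)
assert rel_err < 0.04                   # one sine term is already within ~3%
```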
Polynomial approximation: Next, we try the polynomial case with N = 2, that is,

    \tilde y = a_1 x + a_2 x^2

The assumed solution satisfies the boundary condition y(0) = 0. Application of the second boundary condition yields

    0 = a_1 L + a_2 L^2
    \quad\Longrightarrow\quad
    a_1 = -a_2 L

Hence the approximate solution which satisfies both boundary conditions is given by

    \tilde y = a\, x(x - L)
where we have dropped the subscript of a. Substituting the above approximate solution into the functional gives

    \Pi(a) = \int_0^L \left\{\frac{EI}{2}\left[a(2x - L)\right]^2 + M_0\, a\, x(x - L)\right\} dx
    = \frac{EI}{2}\int_0^L \left(4a^2 x^2 - 4a^2 Lx + a^2 L^2\right) dx + \int_0^L M_0\left(a x^2 - a L x\right) dx
    = \frac{EI}{2}\left(\frac{4a^2 L^3}{3} - 2a^2 L^3 + a^2 L^3\right) + M_0\left(\frac{a L^3}{3} - \frac{a L^3}{2}\right)
    = \frac{EI}{2}\frac{a^2 L^3}{3} - M_0\frac{a L^3}{6}

The function \Pi(a) is a minimum when

    \frac{d\Pi}{da} = 0
    \quad\Longrightarrow\quad
    \frac{EI}{2}\frac{2aL^3}{3} - \frac{M_0 L^3}{6} = 0
    \quad\Longrightarrow\quad
    a = \frac{M_0}{2EI}

so that

    \tilde y = \frac{M_0}{2EI}\, x(x - L)

We see that this is the exact solution of the problem. This has happened because the assumed approximate solution was of the same form as the exact solution.
The river-crossing problem (a boat of speed c crossing a stream of speed v) leads to the crossing-time functional

    J[y] = \int_0^b \frac{\sqrt{c^2(1 + y'^2) - v^2} - v y'}{c^2 - v^2}\, dx        (4.14)

with

    y(0) = 0 \qquad \text{and} \qquad y(b) \text{ unspecified}

Such a problem is called a free endpoint problem, and if y(x) is an extremal, then a certain condition must hold at x = b. Conditions of this type, called natural boundary conditions (Neumann type), are discussed here.
Consider the functional

    J[Y] = \int_a^b F(x, Y, Y')\, dx = \int_a^b F(x, y + \epsilon\eta, y' + \epsilon\eta')\, dx        (4.15)

If y(x) is not prescribed at the end points, then the end points are treated as variable in the y direction. In this case, the condition for the functional J[y] in (4.15) to have a minimum is given by equation (3.10), which is repeated below:

    \int_a^b \left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right)\eta\, dx + \left[\frac{\partial F}{\partial y'}\,\eta\right]_a^b = 0        (4.16)
There are four possible cases in which the above conditions can be met. We show these diagrammatically in Figure 4.6.

Figure 4.6: The four possible cases of varying end points in the direction of y.
Case (i). In this case we have fixed conditions at both boundaries, in the form

    y(a) = \alpha, \qquad y(b) = \beta

Here \eta must satisfy \eta(a) = \eta(b) = 0. Equation (4.16) then becomes

    \int_a^b \left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right)\eta\, dx = 0

Since \eta(x) is an arbitrary admissible function, the above equation is satisfied only when the integrand is zero. Thus, we have

    \partial F/\partial y - (d/dx)(\partial F/\partial y') = 0        (4.17)

This is just our standard Euler-Lagrange equation (3.12a).
Case (ii). In this case only the left boundary condition is specified; the right boundary condition is unspecified. Thus, we have \eta(a) = 0 but \eta(b) \neq 0. Equation (4.16) then becomes

    \int_a^b \left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right)\eta\, dx + \left[\frac{\partial F}{\partial y'}\,\eta\right]_{x=b} = 0

Since \eta(x) is an arbitrary admissible function and \eta(b) does not vanish, the above equation is satisfied only if

    \partial F/\partial y - (d/dx)(\partial F/\partial y') = 0
    \qquad \text{and} \qquad
    \left.\frac{\partial F}{\partial y'}\right|_{x=b} = 0        (4.18)

That is, the natural boundary condition is \partial F/\partial y' = 0 at x = b.
Case (iii). In this case only the right boundary condition is specified; the left boundary condition is unspecified. Thus, we have \eta(b) = 0 but \eta(a) \neq 0. Equation (4.16) then becomes

    \int_a^b \left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right)\eta\, dx + \left[\frac{\partial F}{\partial y'}\,\eta\right]_{x=a} = 0

Since \eta(x) is an arbitrary admissible function and \eta(a) does not vanish, the above equation is satisfied only if

    \partial F/\partial y - (d/dx)(\partial F/\partial y') = 0
    \qquad \text{and} \qquad
    \left.\frac{\partial F}{\partial y'}\right|_{x=a} = 0        (4.19)
Case (iv). If the boundary values are unspecified at both a and b, we can choose \eta(x) which vanishes at neither a nor b. Thus the necessary conditions for the functional to be a minimum are, in addition to the Euler-Lagrange equation,

    \left.\frac{\partial F}{\partial y'}\right|_{x=a} = 0
    \qquad \text{and} \qquad
    \left.\frac{\partial F}{\partial y'}\right|_{x=b} = 0        (4.20)

So when a condition is not specified at a boundary, we first find the general solution of the Euler-Lagrange equation (4.17). The arbitrary constants in the solution are then determined by using the natural boundary conditions given by (4.18), (4.19), or (4.20).
Consider, for example, the shortest-path problem, where F = \sqrt{1 + y'(x)^2}. We have seen that if the points are fixed in space, say A(x_1, y_1) and B(x_2, y_2), the curve joining the two points which has the shortest length is the straight line given by

    y = \frac{y_2 - y_1}{x_2 - x_1}\, x + \frac{x_2 y_1 - x_1 y_2}{x_2 - x_1}

If the y values are not given at x_1 and x_2, we can still solve the Euler-Lagrange equation to get the equation of the curve. As we have seen earlier, it is a straight line, y = mx + c. Now we have, from the natural boundary conditions (4.20),

    \left.\frac{\partial F}{\partial y'}\right|_{x=x_1} = 0
    \qquad \text{and} \qquad
    \left.\frac{\partial F}{\partial y'}\right|_{x=x_2} = 0

Therefore, for the left boundary at x_1,

    \left.\frac{\partial F}{\partial y'}\right|_{x=x_1} = \frac{y'(x_1)}{\sqrt{1 + y'^2}} = 0
    \quad\Longrightarrow\quad
    y'(x_1) = \left.\frac{dy}{dx}\right|_{x_1} = 0

Since y = mx + c, we get m = 0 at x = x_1. In a similar manner we get the condition at the right boundary (x_2) as m = 0. Hence y = c is the equation of the curve of shortest length.
Example 4.4
Find the differential equation and boundary conditions for the extremal of the variational problem

    J[y] = \int_0^b \left(p\, y'^2 - q\, y^2\right) dx

with

    y(0) = 0 \qquad \text{and} \qquad y(b) \text{ unspecified}

where p = p(x) and q = q(x) are positive smooth functions on [0, b]. In this case

    F = p\, y'^2 - q\, y^2

Therefore,

    \partial F/\partial y = -2qy
    \qquad \text{and} \qquad
    \partial F/\partial y' = 2p\, y'

The Euler-Lagrange equation is

    \frac{d}{dx}(p\, y') + q\, y = 0, \qquad 0 \le x \le b

and the natural boundary condition at x = b is

    2p(b)\, y'(b) = 0

Since p > 0, the boundary conditions are

    y(0) = 0 \qquad \text{and} \qquad y'(b) = 0
Example 4.5
We find the natural boundary condition at x = b for the river-crossing problem defined by the functional (4.14). In this case

    F = \frac{\sqrt{c^2(1 + y'^2) - v^2} - v y'}{c^2 - v^2}

Therefore,

    \frac{\partial F}{\partial y'} = \frac{1}{c^2 - v^2}\left[\frac{c^2 y'}{\left(c^2(1 + y'^2) - v^2\right)^{1/2}} - v\right]

The natural boundary condition (4.18) becomes

    \frac{c^2 y'(b)}{\left[c^2\left(1 + y'(b)^2\right) - v(b)^2\right]^{1/2}} - v(b) = 0

which on simplification yields

    y'(b) = \frac{v(b)}{c}

Thus the slope at which the boat enters the bank at x = b is the ratio of the water speed at the bank to the boat speed in still water.
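A quick numeric check of the boundary-condition algebra, with assumed sample speeds:

```python
import math

# Assumed sample values: c = 2.0 (boat speed), v(b) = 0.8 (water speed at bank).
c, vb = 2.0, 0.8
yp = vb / c                      # claimed entry slope y'(b) = v(b)/c

residual = c**2 * yp / math.sqrt(c**2 * (1.0 + yp**2) - vb**2) - vb
assert abs(residual) < 1e-12     # y'(b) = v(b)/c satisfies the condition
```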
Chapter 5
Variational Problems with Constraints

5.1 Introduction

In the simplest variational problems of the last few sections, the class of admissible functions was specified by conditions imposed at the end points of the curve. However, many applications of the calculus of variations lead to problems in which not only boundary conditions, but also conditions of a quite different type, known as side conditions or constraints, are imposed on the admissible curves.

There are three types of variational problems with constraints: isoperimetric problems, holonomic problems, and non-holonomic problems. They differ in the way the constraints are specified. In an isoperimetric problem the constraint is given in terms of an integral; in holonomic problems the constraint is given in terms of a function which does not involve derivatives; and in non-holonomic problems the constraints are given by differential equations.
Suppose we wish to find the extremum of the functional

    J[y] = \int_a^b F(x, y, y')\, dx        (5.1)

    y(a) = \alpha, \qquad y(b) = \beta

subject to the condition that another functional

    K[y] = \int_a^b G(x, y, y')\, dx        (5.2)

assumes a given prescribed value \ell. To solve this problem, we use the method of Lagrange multipliers. Let J[y] have an extremum for y = y(x). Then, if y = y(x) is not an extremal of K[y], there exists a constant \lambda such that y = y(x) is an extremal of the functional

    J[y] + \lambda K[y] = \int_a^b (F + \lambda G)\, dx

that is, y = y(x) satisfies the Euler-Lagrange equation

    \frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'} + \lambda\left(\frac{\partial G}{\partial y} - \frac{d}{dx}\frac{\partial G}{\partial y'}\right) = 0        (5.3)

The general solution of equation (5.3) contains two constants of integration, say c_1 and c_2, and the unknown Lagrange multiplier \lambda. These unknowns are determined using the two end conditions y(a) = \alpha, y(b) = \beta and the side condition (5.2).
Example 5.1
Consider a curve in the upper half-plane having a fixed length \ell and passing through the points (-a, 0) and (a, 0). Find the maximum area enclosed by the curve and the interval [-a, a].

This is an isoperimetric problem. Here we need to find y = y(x) which maximizes the integral

    J[y] = \int_{-a}^{a} y\, dx, \qquad y(-a) = y(a) = 0

subject to the side condition

    K[y] = \int_{-a}^{a} \sqrt{1 + y'^2}\, dx = \ell

We first form the functional

    J[y] + \lambda K[y] = \int_{-a}^{a} \left(y + \lambda\sqrt{1 + y'^2}\right) dx

The corresponding Euler-Lagrange equation has the first integral

    x + \frac{\lambda y'}{\sqrt{1 + y'^2}} = C_1

which, after solving for y' and integrating once more, gives

    (x - C_1)^2 + (y - C_2)^2 = \lambda^2

This equation represents a family of circles. The constants C_1, C_2, and \lambda are determined from the conditions

    y(-a) = y(a) = 0 \qquad \text{and} \qquad K[y] = \ell
Example 5.2
A rope of length \ell with constant weight per unit length w hangs from two fixed points (x_1, y_1) and (x_2, y_2) in the plane. Determine the shape of the hanging rope.

Let y(x) be an arbitrary configuration of the rope, with the y axis adjusted so that y(x) > 0. A small element of length ds at (x, y) has weight w\,ds and potential energy wy\,ds relative to y = 0. Therefore the total potential energy of the rope hanging in the arbitrary configuration is given by the functional

    J[y] = \int wy\, ds = \int_{x_1}^{x_2} wy\sqrt{1 + y'^2}\, dx

It is known that the rope will assume a shape that minimizes the potential energy. Thus we are faced with minimizing the above functional subject to the isoperimetric condition

    K[y] = \int_{x_1}^{x_2} \sqrt{1 + y'^2}\, dx = \ell

We form

    J + \lambda K = \int_{x_1}^{x_2} (wy + \lambda)\sqrt{1 + y'^2}\, dx

Since F and G are independent of x, the Euler-Lagrange equation (5.3) can be written as (see equation (3.16))

    F + \lambda G - y'\left(\frac{\partial F}{\partial y'} + \lambda\frac{\partial G}{\partial y'}\right) = C
Therefore, we have

    (wy + \lambda)\sqrt{1 + y'^2} - (wy + \lambda)\frac{y'^2}{\sqrt{1 + y'^2}} = C

or

    \frac{wy + \lambda}{\sqrt{1 + y'^2}} = C

Solving for y' and separating variables yields

    \frac{dy}{\sqrt{(wy + \lambda)^2 - C^2}} = \frac{dx}{C}

so that

    \int \frac{dy}{\sqrt{(wy + \lambda)^2 - C^2}} = \frac{x}{C} + C_1

The left-hand side is a standard integral which can be evaluated by the formula

    \int \frac{du}{\sqrt{u^2 - C^2}} = \cosh^{-1}\frac{u}{C}

Thus, we have

    \frac{1}{w}\cosh^{-1}\frac{wy + \lambda}{C} = \frac{x}{C} + C_1

Solving for y, we get

    y = -\frac{\lambda}{w} + \frac{C}{w}\cosh\left(\frac{wx}{C} + C_2\right)

The above equation represents a catenary; therefore the shape of a hanging rope is a catenary. The constants C, C_2, and \lambda may be determined from the endpoint conditions y(x_1) = y_1, y(x_2) = y_2, and the side condition of fixed length \ell. In practice this calculation may be difficult.
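As noted, determining C, C2, and λ is usually a numerical task. A minimal sketch for symmetric supports, with assumed data (supports at (0, 1) and (1, 1), length 1.5, w = 1): symmetry puts the vertex at x = 1/2, and the length condition reduces to one bisection for the catenary parameter s = C/w.

```python
import math

# Assumed data: supports at (0, 1) and (1, 1), rope length l = 1.5, w = 1.
l, w = 1.5, 1.0
x1, y1, x2, y2 = 0.0, 1.0, 1.0, 1.0

# Length condition for a symmetric catenary: 2*s*sinh(1/(2s)) = l.
def f(s):
    return 2.0 * s * math.sinh(0.5 / s) - l

lo, hi = 0.1, 10.0                 # f goes from + to - on this bracket
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if f(lo) * f(mid) <= 0:
        hi = mid
    else:
        lo = mid
s = 0.5 * (lo + hi)

lam = w * (s * math.cosh(0.5 / s) - y1)        # multiplier from y(x1) = y1
def y(x):
    return -lam / w + s * math.cosh((x - 0.5) / s)

# Check the end conditions and the prescribed length numerically.
n = 20_000
length = sum(math.hypot(1.0 / n, y((i + 1) / n) - y(i / n)) for i in range(n))
assert abs(y(x1) - y1) < 1e-9 and abs(y(x2) - y2) < 1e-9
assert abs(length - l) < 1e-4
```

For unequal support heights the symmetry shortcut fails and a two- or three-variable root find is needed instead.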
Example 5.3
Minimize the functional

    J[y] = \int_0^1 y'^2\, dx, \qquad y(0) = y(1) = 0

subject to the side condition

    K[y] = \int_0^1 y^2\, dx = \ell^2

We form

    J + \lambda K = \int_0^1 \left(y'^2 + \lambda y^2\right) dx

The Euler-Lagrange equation (5.3) gives

    2\lambda y - \frac{d}{dx}(2y') = 0
    \quad\Longrightarrow\quad
    \frac{d^2 y}{dx^2} - \lambda y = 0

This is a linear second-order equation with constant coefficients. Recall that for an equation of the form

    a\frac{d^2 y}{dx^2} + b\frac{dy}{dx} + cy = 0

the roots of the auxiliary equation are

    r_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a}, \qquad r_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a}

We distinguish three cases according to the sign of the discriminant D = b^2 - 4ac.

Case I: If D > 0, the general solution is given by

    y = c_1 e^{r_1 x} + c_2 e^{r_2 x}

Case II: If D = 0, the auxiliary equation has only a single root, r = -b/2a. The general solution is then given by

    y = c_1 e^{rx} + c_2 x e^{rx}

Case III: If D < 0, the roots of the auxiliary equation are the complex numbers

    r_1 = \mu + i\omega, \qquad r_2 = \mu - i\omega

where \mu = -b/2a and \omega = \sqrt{4ac - b^2}/2a.

Here a = 1, b = 0, c = -\lambda, so D = 4\lambda.

Case I: If \lambda > 0, the roots are r_1 = \sqrt{\lambda} and r_2 = -\sqrt{\lambda}, so

    y = c_1 e^{\sqrt{\lambda}\, x} + c_2 e^{-\sqrt{\lambda}\, x}

The constants c_1 and c_2 can be determined using the boundary conditions y(0) = y(1) = 0. Using the condition y(0) = 0,

    0 = c_1 + c_2
    \quad\Longrightarrow\quad
    c_1 = -c_2 = c

Therefore,

    y = c\left(e^{\sqrt{\lambda}\, x} - e^{-\sqrt{\lambda}\, x}\right)

and using the condition y(1) = 0,

    0 = c\left(e^{\sqrt{\lambda}} - e^{-\sqrt{\lambda}}\right)

Since \lambda is assumed positive, the above equation is satisfied only when c = 0. Hence we get the trivial solution y = 0 for \lambda > 0.

Case II: If \lambda = 0, we have D = 0, with the single root r = -b/2a = 0. The general solution then becomes

    y = c_1 + c_2 x

Using the boundary conditions, we get c_1 = 0 = c_2. Again we get the trivial solution y = 0 for this case.

Case III: If \lambda < 0, we have D < 0. In this case the roots of the auxiliary equation are the complex numbers

    r_1 = i\sqrt{-\lambda}, \qquad r_2 = -i\sqrt{-\lambda}

The solution satisfying y(0) = 0 is y = c\sin\sqrt{-\lambda}\, x, and the condition y(1) = 0 requires

    \sin\sqrt{-\lambda} = 0
    \quad\Longrightarrow\quad
    \sqrt{-\lambda} = n\pi, \qquad n = 1, 2, 3, \ldots

so that

    y = c\sin n\pi x, \qquad n = 1, 2, 3, \ldots

So we have here an infinite number of solutions, and we can construct the most general solution as

    y = \sum_{n=1}^{\infty} c_n \sin n\pi x

Applying the side condition

    \int_0^1 y^2\, dx = \ell^2

and noting that J[y] is smallest for the lowest mode n = 1, we obtain

    c_1 = \sqrt{2}\,\ell

so that the minimizing function is

    y = \sqrt{2}\,\ell\, \sin\pi x
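The selection of the lowest mode can be checked numerically: among the modes y_n = sqrt(2) ℓ sin nπx, all of which satisfy the side condition, the value of J grows like n²π²ℓ², so n = 1 gives the minimum.

```python
import math

# Side condition integral of y^2 over [0, 1] equals l^2 when c = sqrt(2)*l
# (taking l = 1 for illustration); compare J[y] = integral of y'^2 for n = 1, 2, 3.
l = 1.0
c = math.sqrt(2.0) * l

def J(n, m=20_000):
    h = 1.0 / m
    return sum((c * n * math.pi * math.cos(n * math.pi * (i + 0.5) * h)) ** 2
               for i in range(m)) * h

vals = [J(n) for n in (1, 2, 3)]
assert abs(vals[0] - math.pi ** 2 * l ** 2) < 1e-3   # J = pi^2 * l^2 for n = 1
assert vals[0] < vals[1] < vals[2]                   # higher modes cost more
```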
Chapter 6
Principle of Least Action
6.1 Introduction
Most of the applications of the calculus of variations examined so far have been geometrical in nature. We now explore applications of the calculus of variations to problems in classical mechanics. It is assumed that the time evolution of a mechanical system is completely determined if its state is known at some given instant. This is expressed by the fact that the dynamical variables satisfy a set of differential equations (the equations of motion of the system) as functions of time, along with initial conditions. The method of classical dynamics, therefore, consists of listing the dynamical variables and discovering the equations of motion that predict the system's evolution in time.

One method of obtaining the equations of motion is from a variational principle. This method is based upon the idea that a system should evolve along the path of least resistance. Principles of this sort have a long history. In the seventeenth century, Fermat's principle that light rays travel along the path of shortest time was announced. For mechanical systems, Maupertuis's principle of least action stated that a system should evolve from one state to another in such a way that the action (a vaguely defined term with the units of energy times time) is smallest. Lagrange and Gauss were also advocates of similar principles. In the early part of the nineteenth century, however, W. R. Hamilton stated what has become an extremely powerful principle that can be generalized to formulate virtually all fundamental laws of physics.
A particle is a body whose dimensions may be neglected in describing its motion. The possibility of doing so depends on the conditions of the problem concerned. For example, the planets may be regarded as particles in considering their motion about the Sun, but not in considering their rotation about their axes. The position of a particle in space is defined by its radius vector r, whose components in Cartesian coordinates are x, y, z.
According to the principle of least action, every mechanical system is characterized by a definite function L(r_1, r_2, \ldots, r_N, \dot{r}_1, \dot{r}_2, \ldots, \dot{r}_N, t), or briefly L(r, \dot{r}, t), and the motion of the system is such that a certain condition is satisfied.

Let the system occupy positions defined by the sets of radius vectors r^{(0)} at an instant t_0 and r^{(1)} at an instant t_1. Then, according to the principle of least action, the system moves between these positions in such a way that the integral

    J = \int_{t_0}^{t_1} L(r, \dot{r}, t)\, dt        (6.1)

takes the least possible value. The function L is called the Lagrangian of the system concerned, and the integral (6.1) is called the action integral. Thus, the principle of least action may be stated as follows:

Among all the paths that a system of particles could take to go from an initial position at t_0 to a final position at t_1, the paths that the particles actually take are the ones that minimize the action integral (6.1).

That the Lagrangian contains only r and \dot{r}, but not the higher derivatives of r, is due to the fact that the mechanical state of the system is completely defined when the positions and the velocities are given.
For a single particle of mass m moving under a force F, Newton's second law gives

    m\ddot{r} - F = 0        (6.2)
Let r(t) be the resulting path within the time interval t_0 < t < t_1. Let us also consider another path r^*(t) = r(t) + \delta r(t) associated with another law of motion, such that

    \delta r(t_0) = 0, \qquad \delta r(t_1) = 0        (6.3)

where \delta r is the variation in r, so that the varied path is given by r^*(t). We take the dot product of equation (6.2) with \delta r and integrate from t_0 to t_1:

    \int_{t_0}^{t_1} \left(m\ddot{r}\cdot\delta r - F\cdot\delta r\right) dt = 0        (6.4)

The kinetic energy of the particle is

    T = \frac{1}{2} m\dot{r}^2 = \frac{1}{2} m\,\dot{r}\cdot\dot{r}

The variational operator \delta follows the rules of the differential operator d of calculus. Therefore the variation in kinetic energy due to \delta r can be expressed as

    \delta T = \delta\left(\frac{1}{2} m\,\dot{r}\cdot\dot{r}\right) = m\,\dot{r}\cdot\delta\dot{r}

By the product rule of differentiation, we have

    \frac{d}{dt}(\dot{r}\cdot\delta r) = \ddot{r}\cdot\delta r + \dot{r}\cdot\delta\dot{r}

Using the above equation, the variation in kinetic energy can be written as

    \delta T = \frac{d}{dt}(m\,\dot{r}\cdot\delta r) - m\,\ddot{r}\cdot\delta r

Integrating from t_0 to t_1,

    \int_{t_0}^{t_1} \delta T\, dt = \int_{t_0}^{t_1} \frac{d}{dt}(m\,\dot{r}\cdot\delta r)\, dt - \int_{t_0}^{t_1} m\,\ddot{r}\cdot\delta r\, dt
    = \left[m\,\dot{r}\cdot\delta r\right]_{t_0}^{t_1} - \int_{t_0}^{t_1} m\,\ddot{r}\cdot\delta r\, dt

By equation (6.3), the first term on the right-hand side of the above equation is zero. Therefore,

    \int_{t_0}^{t_1} \delta T\, dt = -\int_{t_0}^{t_1} m\,\ddot{r}\cdot\delta r\, dt

Substituting this in equation (6.4),

    \int_{t_0}^{t_1} \left(\delta T + F\cdot\delta r\right) dt = 0        (6.5)

This is the most general form of Hamilton's principle for a single particle under a general force field. It says that the path of motion is such that, along it, the integral of the sum of the variation \delta T of the kinetic energy and F\cdot\delta r must be stationary for paths satisfying \delta r(t_0) = \delta r(t_1) = 0.
If the force field is conservative, then there is a force potential U(x, y, z) such that

    F = -\nabla U

The second term in the integrand of equation (6.4) can then be written as

    F\cdot\delta r = F_x\delta x + F_y\delta y + F_z\delta z = -\left(\frac{\partial U}{\partial x}\delta x + \frac{\partial U}{\partial y}\delta y + \frac{\partial U}{\partial z}\delta z\right) = -\delta U

Hence, Hamilton's principle can be written as

    \int_{t_0}^{t_1} (\delta T - \delta U)\, dt = \delta\int_{t_0}^{t_1} (T - U)\, dt = 0        (6.6)

Defining

    L = T - U        (6.7)

equation (6.6) can be written as

    \delta\int_{t_0}^{t_1} L\, dt = 0        (6.8)

The integral

    J = \int_{t_0}^{t_1} L\, dt        (6.9)

is called the action integral. Here we identify L as the Lagrangian of the system: it is the difference between the particle's kinetic energy T and potential energy U. Equation (6.8) states that the variation of the action integral J is equal to zero. This is precisely the necessary condition for the action integral to have a minimum. Thus, Hamilton's principle for a conservative system may be stated as follows:

Among all the possible paths passing through two fixed points, corresponding to the times t_0 and t_1, the true motion is performed on that path for which the action integral is a minimum.

Note: The above derivation of Hamilton's principle can easily be extended to a system of particles by summation, and to a continuous system. It is equally valid for a general dynamical system consisting of particles and rigid bodies.
For a single particle the kinetic energy is

    T = \frac{1}{2} m\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right)

We assume that the particle has a potential energy U, which can be expressed in the form

    U = U(t, x, y, z)

such that the negative of the partial derivative of the potential energy in a given direction gives the force in that direction (see equation (C.12)). That is,

    F_x = -\frac{\partial U}{\partial x}, \qquad F_y = -\frac{\partial U}{\partial y}, \qquad F_z = -\frac{\partial U}{\partial z}
If the functional (action integral) (6.9) has a minimum, the following Euler-Lagrange equations must be satisfied:

    \frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial\dot{x}} = 0, \qquad
    \frac{\partial L}{\partial y} - \frac{d}{dt}\frac{\partial L}{\partial\dot{y}} = 0, \qquad
    \frac{\partial L}{\partial z} - \frac{d}{dt}\frac{\partial L}{\partial\dot{z}} = 0        (6.10)

Since L = T - U, and the potential energy U depends only on t, x, y, z while the kinetic energy T depends on the velocity components \dot{x}, \dot{y}, \dot{z}, we can write equations (6.10) in the form

    -\frac{\partial U}{\partial x} - \frac{d}{dt}\frac{\partial T}{\partial\dot{x}} = 0, \qquad
    -\frac{\partial U}{\partial y} - \frac{d}{dt}\frac{\partial T}{\partial\dot{y}} = 0, \qquad
    -\frac{\partial U}{\partial z} - \frac{d}{dt}\frac{\partial T}{\partial\dot{z}} = 0        (6.11)

or

    -\frac{\partial U}{\partial x} - \frac{d}{dt}(m\dot{x}) = 0, \qquad
    -\frac{\partial U}{\partial y} - \frac{d}{dt}(m\dot{y}) = 0, \qquad
    -\frac{\partial U}{\partial z} - \frac{d}{dt}(m\dot{z}) = 0        (6.12)

Since the negative of the partial derivative of the potential energy in a given direction gives the force in that direction, the system of equations (6.12) reduces to

    m\ddot{x} = F_x, \qquad m\ddot{y} = F_y, \qquad m\ddot{z} = F_z        (6.13)

This is just Newton's second law of motion for a single particle.
Suppose the N particles of a system are subject to p independent constraints of the form

    f_1(r_1, \ldots, r_N) = 0
    \quad\vdots\quad
    f_p(r_1, \ldots, r_N) = 0        (6.14)

Due to the existence of the constraints, the 3N coordinates of the particles are not independent; the number of independent coordinates is

    n = 3N - p

Here n is the smallest possible number of variables required to describe the configuration of the system. It is equal to the number of degrees of freedom of the system. For example, a system of two particles kept at a fixed distance from each other has 6 - 1 = 5 degrees of freedom.

If the number of particles is large, the presence of constraints makes the determination of the coordinates x_i, y_i, z_i a difficult task. We shall attach to the n degrees of freedom a set of n independent variables q_1, \ldots, q_n called generalized coordinates or Lagrangian variables. Like Cartesian coordinates, generalized coordinates are assumed to be functions of time and completely specify the state of the system at any instant. Further, we assume that there are no relations among the q_i, so that they may be regarded as independent.

Unlike Cartesian coordinates, the generalized coordinates do not necessarily have the dimension of length. In general, Lagrangian coordinates may be any suitable geometrical objects: line segments, arcs, angles, etc. The choice of generalized coordinates is somewhat arbitrary. If the system is not subject to constraints, we can choose the generalized coordinates as the 3N Cartesian coordinates of the particles themselves, but there are also other possible choices. For example, the position of a particle can also be defined by its cylindrical coordinates r, \theta, z or its spherical coordinates r, \theta, \varphi.

The 3N Cartesian coordinates r_i are then expressed in terms of q_1, \ldots, q_n by

    r_i = r_i(q_1, \ldots, q_n, t), \qquad (i = 1, \ldots, N)        (6.15)
The solution of a mechanical problem in generalized coordinates consists of finding

    q_j = q_j(t), \qquad (j = 1, \ldots, n)

Once these equations are known, by means of equation (6.15) the motion of the particles in real space can also be determined. The time derivatives of q_1, \ldots, q_n,

    \dot{q}_j = \frac{dq_j}{dt}, \qquad (j = 1, \ldots, n)

are called generalized velocities. In view of (6.15), the generalized velocities are related to the real velocities \dot{r}_i by

    v_i = \dot{r}_i = \frac{dr_i}{dt} = \sum_{j=1}^{n} \frac{\partial r_i}{\partial q_j}\dot{q}_j + \frac{\partial r_i}{\partial t}, \qquad (i = 1, \ldots, N)        (6.16)
The kinetic energy T of the system in terms of the generalized coordinates q_1, \ldots, q_n and the generalized velocities \dot{q}_1, \ldots, \dot{q}_n may be obtained as

    T = \frac{1}{2}\sum_{i=1}^{N} m_i\, v_i\cdot v_i
      = \frac{1}{2}\sum_{i=1}^{N} m_i \left(\sum_{j=1}^{n} \frac{\partial r_i}{\partial q_j}\dot{q}_j + \frac{\partial r_i}{\partial t}\right)\cdot\left(\sum_{k=1}^{n} \frac{\partial r_i}{\partial q_k}\dot{q}_k + \frac{\partial r_i}{\partial t}\right)

If the constraints do not explicitly depend on time, the above expression for the kinetic energy simplifies to

    T = \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n} a_{jk}\, \dot{q}_j\dot{q}_k        (6.17)

where

    a_{jk} = \sum_{i=1}^{N} m_i \frac{\partial r_i}{\partial q_j}\cdot\frac{\partial r_i}{\partial q_k}
           = \sum_{i=1}^{N} m_i \left(\frac{\partial x_i}{\partial q_j}\frac{\partial x_i}{\partial q_k} + \frac{\partial y_i}{\partial q_j}\frac{\partial y_i}{\partial q_k} + \frac{\partial z_i}{\partial q_j}\frac{\partial z_i}{\partial q_k}\right)
It can be shown that T \ge 0, with equality only if all of \dot{q}_1, \ldots, \dot{q}_n are zero. For example, the kinetic energy of a particle of mass m in spherical coordinates is

    T = \frac{1}{2} m\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\varphi}^2\right)

We can therefore conclude that, in general, the kinetic energy has the functional dependence

    T = T(q, \dot{q}, t)        (6.18)

The potential energy U can also be written in generalized coordinates. U may be a function of q_1, \ldots, q_n, t; that is,

    U = U(q_1, \ldots, q_n, t), \quad \text{or briefly} \quad U = U(q, t)        (6.19)
Hamilton's principle states that the action integral

    J = \int_{t_0}^{t_1} L(q, \dot{q}, t)\, dt        (6.20a)

is a minimum for the functions q_j(t) which describe the actual time evolution of the system. Using the variational operator, Hamilton's principle can be stated concisely as

    \delta J = \delta\int_{t_0}^{t_1} L(q, \dot{q}, t)\, dt = 0        (6.20b)

where

    L(q, \dot{q}, t) = T(q, \dot{q}, t) - U(q, t)        (6.21)

and T and U are the kinetic energy and potential energy of the system, respectively.

To find the paths q_j(t) (j = 1, 2, \ldots, n) that minimize the action integral, we solve the associated Euler-Lagrange equations. Here the action integral (6.20a) is similar to the functional (3.24), where the paths q_j(t) play the role of the functions y_j(x), the Lagrangian L plays the role of the integrand F, and the parameter t plays the role of x. The necessary condition for the action functional J to be an extremum is that the Lagrangian L satisfy the Euler-Lagrange equations (3.25):

    \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = 0, \qquad (j = 1, \ldots, n)        (6.22)

For mechanical systems, the set of equations (6.22) is known as Lagrange's equations rather than the Euler-Lagrange equations. There is one Lagrange equation for each degree of freedom. They form a set of second-order differential equations for the paths q_j(t). The solutions of these equations, subject to the boundary conditions at t = t_0 and t = t_1, are the paths q_j(t) that minimize the action integral (6.20a).
In the Lagrangian formulation, the motion of the system is described by energy considerations rather than by forces. Further, since the Lagrangian L is a scalar, the form of the equations is independent of the coordinate system. The formulation is also advantageous because constraints can be handled in an easier way.
Since, for a conservative mechanical system, the Lagrangian is L = T - U, where the potential energy U depends only on q and t while the kinetic energy T depends also on the generalized velocities \dot{q}, we can write Lagrange's equations (6.22) in the form

\[
\frac{d}{dt}\left( \frac{\partial T}{\partial \dot{q}_j} \right) - \frac{\partial T}{\partial q_j} + \frac{\partial U}{\partial q_j} \;=\; 0 \qquad (j = 1, \ldots, n)    (6.23)
\]
Equation (6.23) gives Lagrange's equations for a conservative mechanical system with n degrees of freedom.
In view of equation (3.16) (the Beltrami identity), if the Lagrangian L does not depend explicitly on time t, that is, \partial L / \partial t = 0, then Lagrange's equations (6.22) admit the first integral

\[
\sum_{j=1}^{n} \dot{q}_j\, \frac{\partial L}{\partial \dot{q}_j} - L \;=\; C    (6.24)
\]

Thus, the extremizing functions q_j are obtained as the solution of a first-order differential equation involving q and \dot{q} only.
Harmonic oscillator

Consider a mass m attached to a linear spring of stiffness k and moving along the x-axis (figure omitted). The kinetic energy of the mass is

\[
T \;=\; \frac{1}{2} m \dot{x}^2
\]

and the potential energy can be found from the restoring force through the relation

\[
F \;=\; -kx \;=\; -\frac{\partial U}{\partial x}
\]
Integrating, we obtain the potential energy

\[
U \;=\; \frac{1}{2} k x^2 + c
\]

where c can be set equal to zero by taking the equilibrium position as the reference level. The Lagrangian is given by

\[
L \;=\; T - U \;=\; \frac{1}{2} m \dot{x}^2 - \frac{1}{2} k x^2
\]
By the principle of least action, the motion takes place so that

\[
J(x) \;=\; \int_{0}^{t_1} \left( \frac{1}{2} m \dot{x}^2 - \frac{1}{2} k x^2 \right) dt
\]
is stationary. Hence Lagrange's equation (6.22) becomes

\[
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{x}} \right) - \frac{\partial L}{\partial x} \;=\; 0
\]

where

\[
\frac{\partial L}{\partial x} \;=\; -kx \qquad \text{and} \qquad \frac{\partial L}{\partial \dot{x}} \;=\; m\dot{x}
\]

Plugging into Lagrange's equation, we obtain

\[
\frac{d}{dt}\left( m\dot{x} \right) + kx \;=\; 0
\]

or

\[
\ddot{x} + \frac{k}{m}\, x \;=\; 0    (6.25)
\]
Equation (6.25), which expresses Newton's second law that force equals mass times acceleration, is the equation of simple harmonic motion. Its general solution is given by

\[
x(t) \;=\; c_1 \sin\sqrt{\frac{k}{m}}\, t + c_2 \cos\sqrt{\frac{k}{m}}\, t
\]
The constants c_1 and c_2 are determined from the initial conditions

\[
x(0) \;=\; x_0 \qquad \text{and} \qquad \dot{x}(0) \;=\; 0
\]
where x_0 is the semi-amplitude of oscillation. So we have c_1 = 0 and c_2 = x_0. The solution is therefore

\[
x \;=\; x_0 \cos\sqrt{\frac{k}{m}}\, t \;=\; x_0 \cos\omega t
\]

where \omega = \sqrt{k/m} is the angular frequency. The frequency and the period of the oscillation are

\[
f \;=\; \frac{\sqrt{k/m}}{2\pi}, \qquad T \;=\; 2\pi \sqrt{\frac{m}{k}}
\]
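As a numerical sketch (Python; the parameter values are illustrative), the equation of motion (6.25) can be integrated with a Runge-Kutta scheme and compared against the analytic solution x_0 \cos\omega t:

```python
import math

# Illustrative parameters
m, k, x0 = 1.5, 6.0, 0.8
omega = math.sqrt(k / m)

def rk4_step(x, v, dt):
    # One RK4 step for x' = v, v' = -(k/m) x
    def acc(x):
        return -(k / m) * x
    k1x, k1v = v, acc(x)
    k2x, k2v = v + 0.5 * dt * k1v, acc(x + 0.5 * dt * k1x)
    k3x, k3v = v + 0.5 * dt * k2v, acc(x + 0.5 * dt * k2x)
    k4x, k4v = v + dt * k3v, acc(x + dt * k3x)
    x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x, v

x, v, dt, n = x0, 0.0, 0.001, 2000
for _ in range(n):
    x, v = rk4_step(x, v, dt)
t = n * dt
exact = x0 * math.cos(omega * t)
print(abs(x - exact) < 1e-6)   # numerical and analytic solutions agree
```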
Simple pendulum

Consider a simple pendulum: a bob of mass m suspended from a fixed point O by a rod of length l, swinging through the angle \theta measured from the vertical (figure omitted). The kinetic energy of the bob is

\[
T \;=\; \frac{1}{2} m v^2 \;=\; \frac{1}{2} m (l\dot{\theta})^2 \;=\; \frac{1}{2} m l^2 \dot{\theta}^2
\]
The potential energy of the bob is mg times its height above the equilibrium position, i.e.,

\[
U \;=\; mg\,(OA - OB) \;=\; mg\,(l - l\cos\theta) \;=\; mgl\,(1 - \cos\theta)
\]

where the zero potential energy level has been taken as the rest position of the bob. The Lagrangian is given by

\[
L \;=\; T - U \;=\; \frac{1}{2} m l^2 \dot{\theta}^2 - mgl\,(1 - \cos\theta)
\]
By the principle of least action, the motion takes place so that

\[
J(\theta) \;=\; \int_{0}^{t_1} \left[ \frac{1}{2} m l^2 \dot{\theta}^2 - mgl\,(1 - \cos\theta) \right] dt
\]
is stationary. Hence Lagrange's equation (6.22) becomes

\[
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\theta}} \right) - \frac{\partial L}{\partial \theta} \;=\; 0
\]

where

\[
\frac{\partial L}{\partial \theta} \;=\; -mgl\sin\theta \qquad \text{and} \qquad \frac{\partial L}{\partial \dot{\theta}} \;=\; m l^2 \dot{\theta}
\]

so that

\[
\frac{d}{dt}\left( m l^2 \dot{\theta} \right) + mgl\sin\theta \;=\; 0
\]

or

\[
\ddot{\theta} + \frac{g}{l} \sin\theta \;=\; 0    (6.26)
\]
Equation (6.26) is the governing equation for the system. If the amplitude of oscillation is small enough, \sin\theta \approx \theta, and the equation becomes

\[
\ddot{\theta} + \frac{g}{l}\, \theta \;=\; 0
\]

This is the equation of simple harmonic motion, and its general solution is given by

\[
\theta \;=\; c_1 \sin\sqrt{\frac{g}{l}}\, t + c_2 \cos\sqrt{\frac{g}{l}}\, t
\]
The constants c_1 and c_2 are determined from the initial conditions

\[
\theta(0) \;=\; \theta_0 \qquad \text{and} \qquad \dot{\theta}(0) \;=\; 0
\]

The angular frequency, the frequency, and the period of the oscillation are

\[
\omega \;=\; \sqrt{g/l}, \qquad f \;=\; \frac{\sqrt{g/l}}{2\pi}, \qquad T \;=\; 2\pi \sqrt{\frac{l}{g}}
\]
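The small-angle period can be checked numerically against the full nonlinear equation (6.26). The sketch below (Python; illustrative parameters) integrates \ddot{\theta} = -(g/l)\sin\theta over one linearized period T = 2\pi\sqrt{l/g} and confirms that, for a small amplitude, the bob returns essentially to its starting angle:

```python
import math

g, l, th0 = 9.81, 1.0, 0.02   # small amplitude in radians; illustrative values

def step(th, w, dt):
    # One RK4 step for th' = w, w' = -(g/l) sin(th)
    f = lambda a: -(g / l) * math.sin(a)
    k1t, k1w = w, f(th)
    k2t, k2w = w + 0.5 * dt * k1w, f(th + 0.5 * dt * k1t)
    k3t, k3w = w + 0.5 * dt * k2w, f(th + 0.5 * dt * k2t)
    k4t, k4w = w + dt * k3w, f(th + dt * k3t)
    return (th + dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6,
            w + dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6)

th, w, dt = th0, 0.0, 0.0005
T_lin = 2 * math.pi * math.sqrt(l / g)   # small-angle period
n = round(T_lin / dt)
for _ in range(n):
    th, w = step(th, w, dt)
# After one linearized period the bob is back near th0
print(abs(th - th0) < 1e-6)
```

For larger amplitudes the true period exceeds 2\pi\sqrt{l/g}, so the same check would fail; that is precisely the content of the small-angle approximation.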
Compound pendulum

A rigid body pivoted about a horizontal axis which does not coincide with the center of mass, and able to oscillate freely, is called a compound pendulum. Let O and A be, respectively, the point of suspension through which the axis passes and the center of mass. Let the mass of the pendulum be m, its moment of inertia about the axis of rotation I, and the distance OA = l.
With \theta the angle between OA and the downward vertical (figure omitted), the kinetic energy of rotation is

\[
T \;=\; \frac{1}{2} I \dot{\theta}^2
\]

and the potential energy is
\[
U \;=\; -mgl\cos\theta
\]

where the zero potential energy level has been taken at the horizontal plane passing through the point O. The Lagrangian is given by

\[
L \;=\; T - U \;=\; \frac{1}{2} I \dot{\theta}^2 + mgl\cos\theta
\]
By the principle of least action, the motion takes place so that

\[
J(\theta) \;=\; \int_{0}^{t_1} \left[ \frac{1}{2} I \dot{\theta}^2 + mgl\cos\theta \right] dt
\]

is stationary. Lagrange's equation

\[
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\theta}} \right) - \frac{\partial L}{\partial \theta} \;=\; 0
\]

with

\[
\frac{\partial L}{\partial \theta} \;=\; -mgl\sin\theta \qquad \text{and} \qquad \frac{\partial L}{\partial \dot{\theta}} \;=\; I\dot{\theta}
\]

gives

\[
\frac{d}{dt}\left( I \dot{\theta} \right) + mgl\sin\theta \;=\; 0
\]

or

\[
\ddot{\theta} + \frac{mgl}{I} \sin\theta \;=\; 0    (6.27)
\]
Equation (6.27) is the governing equation for the compound pendulum. If the amplitude of oscillation is small enough, \sin\theta \approx \theta, the equation becomes

\[
\ddot{\theta} + \frac{mgl}{I}\, \theta \;=\; 0
\]

This is the equation of simple harmonic motion, and its general solution is given by

\[
\theta \;=\; c_1 \sin\sqrt{\frac{mgl}{I}}\, t + c_2 \cos\sqrt{\frac{mgl}{I}}\, t
\]

The period of the oscillation is

\[
T \;=\; 2\pi \sqrt{\frac{I}{mgl}}
\]
Motion in a central force field

Consider a particle of mass m moving in a plane under an attractive inverse-square force directed toward the origin (figure omitted). In polar coordinates (r, \theta) the kinetic energy of the particle is

\[
T \;=\; \frac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right)
\]

The force on the particle is

\[
F \;=\; -\frac{k}{r^2}
\]

where the minus sign accounts for the attractive nature of the force. Thus, the potential energy of the particle is

\[
U \;=\; -\int_{\infty}^{r} F\, dr \;=\; -\int_{\infty}^{r} \left( -\frac{k}{r^2} \right) dr \;=\; -\frac{k}{r}
\]

where the zero potential energy level has been taken at r = \infty. The Lagrangian is given by

\[
L \;=\; T - U \;=\; \frac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) + \frac{k}{r}
\]
The Lagrange equations for the coordinates r and \theta are

\[
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{r}} \right) - \frac{\partial L}{\partial r} \;=\; 0, \qquad
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\theta}} \right) - \frac{\partial L}{\partial \theta} \;=\; 0
\]

where

\[
\frac{\partial L}{\partial r} \;=\; m r \dot{\theta}^2 - \frac{k}{r^2}, \qquad
\frac{\partial L}{\partial \dot{r}} \;=\; m \dot{r}, \qquad
\frac{\partial L}{\partial \theta} \;=\; 0, \qquad
\frac{\partial L}{\partial \dot{\theta}} \;=\; m r^2 \dot{\theta}
\]

Substituting, we obtain

\[
\frac{d}{dt}\left( m r^2 \dot{\theta} \right) \;=\; 0, \qquad
\frac{d}{dt}\left( m \dot{r} \right) - m r \dot{\theta}^2 + \frac{k}{r^2} \;=\; 0
\]

or

\[
m r^2 \dot{\theta} \;=\; \text{const.}, \qquad
m \ddot{r} - m r \dot{\theta}^2 + \frac{k}{r^2} \;=\; 0    (6.28)
\]
This coupled pair of ordinary differential equations can be solved exactly to determine the path of a particle in a central force field. Note that the first equation tells us that the angular momentum (m r^2 \dot{\theta}) of the particle is conserved during the motion. The term

\[
m r \dot{\theta}^2 \;=\; \frac{m v^2}{r} \qquad (v = r\dot{\theta})
\]

in the second equation represents the centrifugal force on the particle. The equation states that the net force in the radial direction is the sum of the centrally directed attraction (-k/r^2) and the centrifugal force (m r \dot{\theta}^2).
The case of a satellite travelling about a spherical earth is an example of a particle moving in a
central force field.
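The conservation of angular momentum asserted by the first of equations (6.28) can be checked numerically. The sketch below (Python; illustrative parameters) integrates the pair of equations, expanding d(r^2\dot{\theta})/dt = 0 into \ddot{\theta} = -2\dot{r}\dot{\theta}/r, and verifies that m r^2 \dot{\theta} stays constant:

```python
import math

m, k = 1.0, 1.0                            # illustrative values
state = [1.0, 0.0, 0.0, 1.1]               # r, rdot, theta, thetadot (mildly eccentric orbit)

def deriv(s):
    r, rdot, th, thdot = s
    # From (6.28): m r'' = m r thdot^2 - k/r^2 ; expanding d/dt(r^2 thdot) = 0
    rddot = r * thdot**2 - k / (m * r**2)
    thddot = -2.0 * rdot * thdot / r
    return [rdot, rddot, thdot, thddot]

def rk4(s, dt):
    k1 = deriv(s)
    k2 = deriv([a + 0.5 * dt * b for a, b in zip(s, k1)])
    k3 = deriv([a + 0.5 * dt * b for a, b in zip(s, k2)])
    k4 = deriv([a + dt * b for a, b in zip(s, k3)])
    return [a + dt * (p + 2 * q + 2 * u + w) / 6
            for a, p, q, u, w in zip(s, k1, k2, k3, k4)]

L0 = m * state[0]**2 * state[3]            # angular momentum m r^2 thetadot
for _ in range(5000):
    state = rk4(state, 0.001)
L1 = m * state[0]**2 * state[3]
print(abs(L1 - L0) < 1e-6)                 # angular momentum is conserved
```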
Double pendulum
A double pendulum is a pendulum with another pendulum attached to its end as shown in figure 6.6.
Using Lagranges formulation, determine the equation of motion of the double pendulum.
Figure 6.6: A double pendulum. The upper rod of length l_1 makes angle \theta_1 with the vertical and carries mass m_1 at (x_1, y_1); the lower rod of length l_2 makes angle \theta_2 and carries mass m_2 at (x_2, y_2). (Figure omitted.)
We use rectangular coordinates with (x_1, y_1) and (x_2, y_2) as the coordinates of the two masses m_1 and m_2, respectively. Then, in terms of the independent (generalized) coordinates \theta_1 and \theta_2, we have (choosing y positive downward)

\[
x_1 \;=\; l_1 \sin\theta_1, \qquad y_1 \;=\; l_1 \cos\theta_1
\]
\[
x_2 \;=\; l_1 \sin\theta_1 + l_2 \sin\theta_2, \qquad y_2 \;=\; l_1 \cos\theta_1 + l_2 \cos\theta_2
\]
From these transformation equations, we can find the following derivatives:

\[
\dot{x}_1 \;=\; l_1 \dot{\theta}_1 \cos\theta_1, \qquad \dot{y}_1 \;=\; -l_1 \dot{\theta}_1 \sin\theta_1
\]
\[
\dot{x}_2 \;=\; l_1 \dot{\theta}_1 \cos\theta_1 + l_2 \dot{\theta}_2 \cos\theta_2, \qquad \dot{y}_2 \;=\; -l_1 \dot{\theta}_1 \sin\theta_1 - l_2 \dot{\theta}_2 \sin\theta_2
\]
The kinetic energy of the system is given by

\[
T \;=\; T_1 + T_2 \;=\; \frac{1}{2} m_1 \left( \dot{x}_1^2 + \dot{y}_1^2 \right) + \frac{1}{2} m_2 \left( \dot{x}_2^2 + \dot{y}_2^2 \right)
\;=\; \frac{1}{2} m_1 l_1^2 \dot{\theta}_1^2 + \frac{1}{2} m_2 \left[ l_1^2 \dot{\theta}_1^2 + l_2^2 \dot{\theta}_2^2 + 2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \cos(\theta_1 - \theta_2) \right]
\]

The potential energy of the system above its equilibrium position (a distance l_1 + l_2 below the point of suspension) is

\[
U \;=\; m_1 g \left( l_1 + l_2 - l_1 \cos\theta_1 \right) + m_2 g \left( l_1 + l_2 - l_1 \cos\theta_1 - l_2 \cos\theta_2 \right)
\]

The Lagrangian is therefore

\[
L \;=\; T - U \;=\; \frac{1}{2} m_1 l_1^2 \dot{\theta}_1^2 + \frac{1}{2} m_2 \left[ l_1^2 \dot{\theta}_1^2 + l_2^2 \dot{\theta}_2^2 + 2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \cos(\theta_1 - \theta_2) \right]
- m_1 g \left( l_1 + l_2 - l_1 \cos\theta_1 \right) - m_2 g \left( l_1 + l_2 - l_1 \cos\theta_1 - l_2 \cos\theta_2 \right)
\]
By the principle of least action, the motion takes place so that

\[
J(\theta_1, \theta_2) \;=\; \int_{0}^{t_1} L\, dt
\]

is stationary. The corresponding Lagrange equations are

\[
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\theta}_1} \right) - \frac{\partial L}{\partial \theta_1} \;=\; 0, \qquad
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\theta}_2} \right) - \frac{\partial L}{\partial \theta_2} \;=\; 0
\]

where

\[
\frac{\partial L}{\partial \theta_1} \;=\; -m_2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) - (m_1 + m_2)\, g\, l_1 \sin\theta_1
\]
\[
\frac{\partial L}{\partial \dot{\theta}_1} \;=\; (m_1 + m_2)\, l_1^2 \dot{\theta}_1 + m_2 l_1 l_2 \dot{\theta}_2 \cos(\theta_1 - \theta_2)
\]
\[
\frac{\partial L}{\partial \theta_2} \;=\; m_2 l_1 l_2 \dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) - m_2\, g\, l_2 \sin\theta_2
\]
\[
\frac{\partial L}{\partial \dot{\theta}_2} \;=\; m_2 l_2^2 \dot{\theta}_2 + m_2 l_1 l_2 \dot{\theta}_1 \cos(\theta_1 - \theta_2)
\]
Plugging these into the Lagrange equations, and after some manipulation, we obtain

\[
(m_1 + m_2)\, l_1 \ddot{\theta}_1 + m_2 l_2 \ddot{\theta}_2 \cos(\theta_1 - \theta_2) + m_2 l_2 \dot{\theta}_2^2 \sin(\theta_1 - \theta_2) \;=\; -(m_1 + m_2)\, g \sin\theta_1    (6.29a)
\]
\[
l_2 \ddot{\theta}_2 + l_1 \ddot{\theta}_1 \cos(\theta_1 - \theta_2) - l_1 \dot{\theta}_1^2 \sin(\theta_1 - \theta_2) \;=\; -g \sin\theta_2    (6.29b)
\]

The system of equations (6.29) governs the double pendulum with masses m_1 and m_2 and lengths l_1 and l_2.
If m_1 = m_2, the system of equations (6.29) assumes the form

\[
2 l_1 \ddot{\theta}_1 + l_2 \ddot{\theta}_2 \cos(\theta_1 - \theta_2) + l_2 \dot{\theta}_2^2 \sin(\theta_1 - \theta_2) \;=\; -2 g \sin\theta_1    (6.30a)
\]
\[
l_2 \ddot{\theta}_2 + l_1 \ddot{\theta}_1 \cos(\theta_1 - \theta_2) - l_1 \dot{\theta}_1^2 \sin(\theta_1 - \theta_2) \;=\; -g \sin\theta_2    (6.30b)
\]

If, in addition, l_1 = l_2 = l, we obtain

\[
2 \ddot{\theta}_1 + \ddot{\theta}_2 \cos(\theta_1 - \theta_2) + \dot{\theta}_2^2 \sin(\theta_1 - \theta_2) \;=\; -\frac{2g}{l} \sin\theta_1    (6.31a)
\]
\[
\ddot{\theta}_2 + \ddot{\theta}_1 \cos(\theta_1 - \theta_2) - \dot{\theta}_1^2 \sin(\theta_1 - \theta_2) \;=\; -\frac{g}{l} \sin\theta_2    (6.31b)
\]
If the amplitude of oscillation is small enough, \sin\theta \approx \theta and \cos\theta \approx 1, and neglecting the terms involving \dot{\theta}^2, the system of equations (6.31) becomes

\[
2 \ddot{\theta}_1 + \ddot{\theta}_2 \;=\; -\frac{2g}{l}\, \theta_1    (6.32a)
\]
\[
\ddot{\theta}_1 + \ddot{\theta}_2 \;=\; -\frac{g}{l}\, \theta_2    (6.32b)
\]
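The linearized system (6.32) admits normal-mode solutions \theta_j = a_j \cos\omega t. Writing it in matrix form M\ddot{\theta} = -K\theta with M = [[2, 1], [1, 1]] and K = (g/l)\,\mathrm{diag}(2, 1), the mode frequencies follow from \det(K - \omega^2 M) = 0, which expands to a quadratic in \omega^2 with roots \omega^2 = (g/l)(2 \pm \sqrt{2}). The sketch below (Python; illustrative g and l) solves that quadratic and checks it against the closed form:

```python
import math

g, l = 9.81, 1.0   # illustrative values
k = g / l

# det(K - L*M) = (2k - 2L)(k - L) - L^2 = L^2 - 4 k L + 2 k^2 = 0, L = omega^2
a, b, c = 1.0, -4.0 * k, 2.0 * k**2
disc = math.sqrt(b * b - 4 * a * c)
L_fast = (-b + disc) / (2 * a)   # in-phase/anti-phase fast mode
L_slow = (-b - disc) / (2 * a)   # slow mode

# Closed form: omega^2 = (g/l)(2 +/- sqrt(2))
print(abs(L_fast - k * (2 + math.sqrt(2))) < 1e-9)
print(abs(L_slow - k * (2 - math.sqrt(2))) < 1e-9)
```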
Vibrating string

We now consider the transverse vibrations of an elastic string. We place the string along the x-axis, stretch it to length \ell, and fasten it at the ends x = 0 and x = \ell. We then distort the string by displacing it from its equilibrium position, release it, and allow it to vibrate. The problem is to find the deflection u(x, t) at any point x and at any time t > 0. The usual simplifying assumptions are made: the string is perfectly flexible and homogeneous, and the deflections are small.
Since the string is fixed at the ends, the boundary conditions are

\[
u(0, t) \;=\; u(\ell, t) \;=\; 0
\]

If \rho is the mass of the undeflected string per unit length, the kinetic energy of the string is

\[
T \;=\; \frac{\rho}{2} \int_{0}^{\ell} \left( \frac{\partial u}{\partial t} \right)^2 dx
\]
The potential energy is related to the elongation (stretching) of the string. The arc length of the elastic string is

\[
S \;=\; \int_{0}^{\ell} \sqrt{1 + \left( \frac{\partial u}{\partial x} \right)^2}\, dx
\]

and the elongation due to the transverse motion is

\[
\Delta S \;=\; S - \ell \;=\; \int_{0}^{\ell} \left[ \sqrt{1 + \left( \frac{\partial u}{\partial x} \right)^2} - 1 \right] dx
\]

Since

\[
\left( \frac{\partial u}{\partial x} \right)^2 \;<\; 1
\]
it is reasonable to approximate

\[
\sqrt{1 + \left( \frac{\partial u}{\partial x} \right)^2} \;\approx\; 1 + \frac{1}{2} \left( \frac{\partial u}{\partial x} \right)^2
\]

so that

\[
\Delta S \;\approx\; \frac{1}{2} \int_{0}^{\ell} \left( \frac{\partial u}{\partial x} \right)^2 dx
\]
If F is the tension in the string, the potential energy due to the elongation is

\[
U \;=\; F\, \Delta S \;=\; \frac{F}{2} \int_{0}^{\ell} \left( \frac{\partial u}{\partial x} \right)^2 dx
\]

The Lagrangian is therefore

\[
L \;=\; T - U \;=\; \frac{1}{2} \int_{0}^{\ell} \left[ \rho \left( \frac{\partial u}{\partial t} \right)^2 - F \left( \frac{\partial u}{\partial x} \right)^2 \right] dx
\;=\; \frac{1}{2} \int_{0}^{\ell} \left[ \rho\, u_t^2 - F\, u_x^2 \right] dx
\]
By Hamilton's principle, the motion takes place so that

\[
J(u) \;=\; \frac{1}{2} \int_{0}^{t_1} \int_{0}^{\ell} \left[ \rho \left( \frac{\partial u}{\partial t} \right)^2 - F \left( \frac{\partial u}{\partial x} \right)^2 \right] dx\, dt
\]
is stationary. Here the unknown function u depends on the two independent variables x and t. Further, the Lagrangian L does not depend explicitly on x, t, or u. Thus, the appropriate Lagrange equation is (3.23). Noting that \partial L / \partial u = 0, the Lagrange equation takes the form

\[
\frac{\partial}{\partial t}\left( \frac{\partial L}{\partial u_t} \right) + \frac{\partial}{\partial x}\left( \frac{\partial L}{\partial u_x} \right) \;=\; 0
\]
The partial derivatives can be evaluated using the Leibniz rule of differentiation under the integral sign (see below) as follows:

\[
\frac{\partial L}{\partial u_t} \;=\; \frac{\partial}{\partial u_t} \left[ \frac{\rho}{2} \int_{0}^{\ell} u_t^2\, dx \right]
\;=\; \frac{\rho}{2} \int_{0}^{\ell} \frac{\partial}{\partial u_t} \left( u_t^2 \right) dx
\;=\; \rho \int_{0}^{\ell} u_t\, dx
\]

and

\[
\frac{\partial L}{\partial u_x} \;=\; -\frac{\partial}{\partial u_x} \left[ \frac{F}{2} \int_{0}^{\ell} u_x^2\, dx \right]
\;=\; -F \int_{0}^{\ell} u_x\, dx
\]

Substituting into the Lagrange equation gives

\[
\frac{\partial}{\partial t} \left[ \rho \int_{0}^{\ell} u_t\, dx \right] - \frac{\partial}{\partial x} \left[ F \int_{0}^{\ell} u_x\, dx \right] \;=\; 0
\qquad \Longrightarrow \qquad
\int_{0}^{\ell} \left( \rho\, u_{tt} - F\, u_{xx} \right) dx \;=\; 0
\]
Since this equation holds for arbitrary limits of integration, the integrand itself must vanish, and we obtain

\[
\frac{\partial^2 u}{\partial t^2} \;=\; c^2\, \frac{\partial^2 u}{\partial x^2}    (6.33)
\]

where the physical constant F/\rho is denoted by c^2 to indicate that this constant is positive. This is the well-known one-dimensional wave equation.
The Leibniz rule used above is

\[
\frac{d}{dt} \int_{a(t)}^{b(t)} u(x, t)\, dx \;=\; \int_{a(t)}^{b(t)} \frac{\partial u(x, t)}{\partial t}\, dx + u(b(t), t)\, \frac{db}{dt} - u(a(t), t)\, \frac{da}{dt}
\]
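The fundamental standing-wave mode u(x, t) = \sin(\pi x/\ell)\cos(\pi c t/\ell) satisfies both the boundary conditions and the wave equation (6.33). The sketch below (Python; illustrative c and \ell) verifies this by comparing second finite differences in t and x at a sample point:

```python
import math

c, ell = 2.0, 1.0   # wave speed and string length (illustrative)

def u(x, t):
    # Fundamental standing-wave mode of the fixed-fixed string
    return math.sin(math.pi * x / ell) * math.cos(math.pi * c * t / ell)

def second_diff(f, h=1e-4):
    # Central second difference approximating f''(0)
    return (f(h) - 2 * f(0.0) + f(-h)) / h**2

x0, t0 = 0.3, 0.7   # sample point (illustrative)
utt = second_diff(lambda s: u(x0, t0 + s))
uxx = second_diff(lambda s: u(x0 + s, t0))
print(abs(utt - c**2 * uxx) < 1e-4)   # u_tt = c^2 u_xx holds
```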
Appendix A
Solution of Brachistochrone Problem
The Euler-Lagrange equation for the brachistochrone problem reduces to

\[
y \left( 1 + y'^2 \right) \;=\; C \qquad \Longrightarrow \qquad y \;=\; \frac{C}{1 + y'^2}
\]
To solve this differential equation, we substitute y' = \cot\theta, where \theta is a parameter. Then we have

\[
y \;=\; \frac{C}{1 + \cot^2\theta} \;=\; C \sin^2\theta \;=\; \frac{C}{2} \left( 1 - \cos 2\theta \right)
\]
Differentiating and using dx = dy / y', we obtain

\[
dx \;=\; \frac{dy}{y'} \;=\; \frac{2C \sin\theta \cos\theta\, d\theta}{\cot\theta} \;=\; 2C \sin^2\theta\, d\theta \;=\; C \left( 1 - \cos 2\theta \right) d\theta
\]
Integrating the above differential equation, we obtain

\[
x \;=\; C \left( \theta - \frac{\sin 2\theta}{2} \right) + D
\]

where the constant of integration D is determined from the condition that the curve passes through the origin: x = 0 when \theta = 0 (where y = 0), giving D = 0.
Putting 2\theta = \phi, we can write

\[
x \;=\; \frac{C}{2} \left( \phi - \sin\phi \right)
\]

and

\[
y \;=\; \frac{C}{2} \left( 1 - \cos\phi \right)
\]

These are the parametric equations of a cycloid.
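The parametric solution can be checked numerically: along the cycloid, the first integral y(1 + y'^2) must equal the constant C at every point. The sketch below (Python; C chosen arbitrarily) evaluates the slope dy/dx by central differences in the parameter and verifies the invariant:

```python
import math

C = 2.0   # constant in the first integral y (1 + y'^2) = C (illustrative value)

def cycloid(phi):
    # Parametric solution: x = (C/2)(phi - sin phi), y = (C/2)(1 - cos phi)
    return 0.5 * C * (phi - math.sin(phi)), 0.5 * C * (1 - math.cos(phi))

def slope(phi, h=1e-6):
    # dy/dx along the curve, via central differences in the parameter
    (xa, ya), (xb, yb) = cycloid(phi - h), cycloid(phi + h)
    return (yb - ya) / (xb - xa)

invariants = []
for phi in (0.5, 1.0, 2.0, 3.0):
    _, y = cycloid(phi)
    invariants.append(y * (1 + slope(phi) ** 2))

# Each value should equal C, confirming the cycloid solves the equation
print(all(abs(inv - C) < 1e-6 for inv in invariants))
```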
Appendix B
Method of Lagrange Multipliers
The method of Lagrange multipliers is a technique for finding a maximum or minimum of a function F(x, y, z) subject to a constraint (also called a side condition) of the form G(x, y, z) = 0.
The geometric basis of the Lagrange multiplier method is easiest to explain when the functions have two variables, so we start by trying to find the extreme values of F(x, y) subject to a constraint of the form G(x, y) = 0. In other words, we seek the extreme values of F(x, y) when the point (x, y) is restricted to lie on the level curve G(x, y) = 0. Figure B.1 shows this curve together with several level curves F(x, y) = c, where c is a constant. To maximize F(x, y) subject to G(x, y) = 0 is to find the largest value of c such that the level curve F(x, y) = c intersects G(x, y) = 0. It appears from the figure that this happens when the two curves just touch each other, that is, when they have a common tangent line. The normal lines at the point (x_0, y_0) where they touch are then identical, so the gradient vectors are parallel. That is,
Figure B.1: Level curves F(x, y) = constant together with the constraint curve G(x, y) = 0. (Figure omitted.)
\[
\nabla F(x_0, y_0) \;=\; \lambda\, \nabla G(x_0, y_0)
\]

for some scalar \lambda. The scalar parameter \lambda is called a Lagrange multiplier. The procedure based on the above equation is as follows. From the chain rule, we have
\[
dF \;=\; \frac{\partial F}{\partial x}\, dx + \frac{\partial F}{\partial y}\, dy \;=\; 0, \qquad
dG \;=\; \frac{\partial G}{\partial x}\, dx + \frac{\partial G}{\partial y}\, dy \;=\; 0
\]

Multiplying the second of these equations by \lambda, adding it to the first, and noting that the resulting coefficients of dx and dy must vanish separately, we obtain

\[
\frac{\partial F}{\partial x} + \lambda\, \frac{\partial G}{\partial x} \;=\; 0, \qquad
\frac{\partial F}{\partial y} + \lambda\, \frac{\partial G}{\partial y} \;=\; 0
\]
As can be seen, the above two equations are the components of the vector equation

\[
\nabla F + \lambda\, \nabla G \;=\; 0    (B.1)
\]
Thus, the maximum and minimum values of F(x, y) subject to the constraint G(x, y) = 0 can be found by solving the following set of equations:

\[
\frac{\partial F}{\partial x} + \lambda\, \frac{\partial G}{\partial x} \;=\; 0, \qquad
\frac{\partial F}{\partial y} + \lambda\, \frac{\partial G}{\partial y} \;=\; 0, \qquad
G(x, y) \;=\; 0    (B.2)
\]

This is a system of three equations in the three unknowns x, y, and \lambda, but it is not necessary to find explicit values for \lambda.
If the function F to be extremized and the side condition G are functions of three independent variables x, y, and z, the following system of equations is solved to obtain the minimum or maximum value of F:

\[
\frac{\partial F}{\partial x} + \lambda\, \frac{\partial G}{\partial x} \;=\; 0, \qquad
\frac{\partial F}{\partial y} + \lambda\, \frac{\partial G}{\partial y} \;=\; 0, \qquad
\frac{\partial F}{\partial z} + \lambda\, \frac{\partial G}{\partial z} \;=\; 0, \qquad
G(x, y, z) \;=\; 0    (B.3)
\]

This is a system of four equations in the four unknowns x, y, z, and \lambda.
Example B.1
A rectangular box without a lid is to be made from 27 m2 of cardboard. Find the maximum volume
of such a box.
Method I. We first solve this relatively simple problem without using Lagrange multipliers. Let the length, width, and height of the box (in meters) be x, y, and z. Then the volume of the box is

\[
V \;=\; xyz
\]

We can express V as a function of just the two variables x and y by using the fact that the area of the five sides of the box is

\[
xy + 2yz + 2xz \;=\; 27
\]

Solving this equation for z, we get

\[
z \;=\; \frac{27 - xy}{2(x + y)}
\]

so that

\[
V \;=\; xy\, \frac{27 - xy}{2(x + y)} \;=\; \frac{27xy - x^2 y^2}{2(x + y)}
\]

Computing the partial derivatives,

\[
\frac{\partial V}{\partial x} \;=\; \frac{y^2 \left( 27 - 2xy - x^2 \right)}{2(x + y)^2}, \qquad
\frac{\partial V}{\partial y} \;=\; \frac{x^2 \left( 27 - 2xy - y^2 \right)}{2(x + y)^2}
\]

Setting both derivatives to zero gives 27 - 2xy - x^2 = 0 and 27 - 2xy - y^2 = 0, whence x = y = 3, z = 1.5, and the maximum volume is V = 13.5 m^2 \cdot m = 13.5 m^3.
Method II. Using the method of Lagrange multipliers, we maximize V = xyz subject to the constraint G(x, y, z) = xy + 2yz + 2xz - 27 = 0. We look for values of x, y, z, and \lambda such that

\[
\frac{\partial V}{\partial x} + \lambda\, \frac{\partial G}{\partial x} \;=\; 0, \qquad
\frac{\partial V}{\partial y} + \lambda\, \frac{\partial G}{\partial y} \;=\; 0, \qquad
\frac{\partial V}{\partial z} + \lambda\, \frac{\partial G}{\partial z} \;=\; 0, \qquad
xy + 2yz + 2xz \;=\; 27
\]
which become

\[
yz + \lambda\, (y + 2z) \;=\; 0    (B.4)
\]
\[
xz + \lambda\, (x + 2z) \;=\; 0    (B.5)
\]
\[
xy + \lambda\, (2y + 2x) \;=\; 0    (B.6)
\]
\[
xy + 2yz + 2xz \;=\; 27    (B.7)
\]

To solve this system of equations in a convenient manner, we multiply equation (B.4) by x, (B.5) by y, and (B.6) by z, so that the left-hand sides of these equations become identical. Thus, we have

\[
xyz + \lambda\, (xy + 2xz) \;=\; 0    (B.8)
\]
\[
xyz + \lambda\, (xy + 2yz) \;=\; 0    (B.9)
\]
\[
xyz + \lambda\, (2yz + 2xz) \;=\; 0    (B.10)
\]
We observe that \lambda \neq 0, because \lambda = 0 would imply xy = yz = xz = 0, contradicting equation (B.7). Therefore, from equations (B.8) and (B.9) we have xz = yz; since z cannot be zero, x = y. From equations (B.9) and (B.10) we have y = 2z. If we now put x = y = 2z in equation (B.7), we get

\[
12 z^2 \;=\; 27
\]

Since x, y, and z are all positive, z = 1.5, and so x = 3 and y = 3.
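The solution can be verified directly: with x = y = 3 and z = 1.5, equation (B.4) fixes \lambda, and the remaining equations (B.5)-(B.7) must then hold automatically. A short numerical check (Python):

```python
x, y, z = 3.0, 3.0, 1.5
# From (B.4): lambda = -yz / (y + 2z); the other equations must then hold too.
lam = -y * z / (y + 2 * z)
print(abs(y * z + lam * (y + 2 * z)) < 1e-12)       # (B.4)
print(abs(x * z + lam * (x + 2 * z)) < 1e-12)       # (B.5)
print(abs(x * y + lam * (2 * y + 2 * x)) < 1e-12)   # (B.6)
print(abs(x * y + 2 * y * z + 2 * x * z - 27) < 1e-12)  # (B.7)
print(x * y * z)   # maximum volume -> 13.5
```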
Example B.2
Here we demonstrate how the Lagrange multiplier method can be used to prove Snell's law, in the case of an inhomogeneous optical medium consisting of two homogeneous media in which the speed of light is piecewise constant. Suppose that light travels from a point P_1(x_1, y_1) in a homogeneous medium M_1, with constant speed v_1, to a point P_2(x_2, y_2) in another homogeneous medium M_2, with constant speed v_2. The two media are separated by the line y = y_0.
From the geometry, the transit time of the light is

\[
T \;=\; \frac{y_1 / \cos\theta_1}{v_1} + \frac{y_2 / \cos\theta_2}{v_2}
\]

subject to the constraint that the horizontal span is fixed:

\[
L \;=\; x_1 + x_2 \;=\; y_1 \tan\theta_1 + y_2 \tan\theta_2
\]
Applying the condition (B.2),

\[
\frac{\partial T}{\partial \theta_1} + \lambda\, \frac{\partial L}{\partial \theta_1} \;=\; \frac{y_1}{v_1} \sec\theta_1 \tan\theta_1 + \lambda\, y_1 \sec^2\theta_1 \;=\; 0
\]
\[
\frac{\partial T}{\partial \theta_2} + \lambda\, \frac{\partial L}{\partial \theta_2} \;=\; \frac{y_2}{v_2} \sec\theta_2 \tan\theta_2 + \lambda\, y_2 \sec^2\theta_2 \;=\; 0
\]
These give as the only solution

\[
\sin\theta_1 \;=\; -\lambda v_1, \qquad \sin\theta_2 \;=\; -\lambda v_2
\]

or

\[
\frac{\sin\theta_1}{v_1} \;=\; \frac{\sin\theta_2}{v_2}
\]

where the angles are measured with respect to the normal of the boundary between the two media.
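Snell's law can also be recovered directly from Fermat's principle by numerical minimization. The sketch below (Python; all geometric values are illustrative) minimizes the travel time over the crossing point on the interface by ternary search, then checks that the resulting angles satisfy \sin\theta_1 / v_1 = \sin\theta_2 / v_2:

```python
import math

v1, v2 = 1.0, 0.7           # speeds in the two media (illustrative)
y1, y2, L = 1.0, 1.5, 2.0   # heights above/below the interface, horizontal span

def travel_time(x):
    # Straight segments P1 -> (x, interface) -> P2
    return math.hypot(x, y1) / v1 + math.hypot(L - x, y2) / v2

# Ternary search for the minimizing crossing point on [0, L]
lo, hi = 0.0, L
for _ in range(200):
    m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
    if travel_time(m1) < travel_time(m2):
        hi = m2
    else:
        lo = m1
x = 0.5 * (lo + hi)

sin1 = x / math.hypot(x, y1)
sin2 = (L - x) / math.hypot(L - x, y2)
print(abs(sin1 / v1 - sin2 / v2) < 1e-6)   # Snell's law holds at the optimum
```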
Appendix C
Work-Energy and Energy Conservation
Theorems
C.1 Work-Energy Theorem
The kinetic energy of a particle of mass m, moving with speed v, is defined as

\[
T \;=\; \frac{1}{2} m v^2    (C.1)
\]
Let a particle move from point 1 to point 2 under the action of a force \mathbf{F}. The total work done on the particle by the force, as it moves from 1 to 2, is by definition the line integral

\[
W_{12} \;=\; \int_{1}^{2} \mathbf{F} \cdot d\mathbf{s}    (C.2)
\]
where d\mathbf{s} = \mathbf{v}\, dt is the displacement vector along the particle's trajectory. Now, if the particle undergoes an infinitesimal displacement d\mathbf{s} under the action of a force \mathbf{F}, the scalar product

\[
dW \;=\; \mathbf{F} \cdot d\mathbf{s}    (C.3)
\]

is the infinitesimal work done by the force \mathbf{F} as the particle undergoes the displacement d\mathbf{s} along its trajectory. We use Newton's second law of motion,
\[
\mathbf{F} \;=\; \frac{d(m\mathbf{v})}{dt}
\]

so that

\[
dW \;=\; \frac{d(m\mathbf{v})}{dt} \cdot \mathbf{v}\, dt \;=\; m\, \frac{d\mathbf{v}}{dt} \cdot \mathbf{v}\, dt \;=\; d\!\left( \frac{1}{2} m v^2 \right)
\]
Since the scalar quantity \frac{1}{2} m v^2 is the kinetic energy of the particle, it follows that

\[
dW \;=\; dT    (C.4)
\]
Equation (C.4) is the differential form of the work-energy theorem: the differential work of the resultant of the forces acting on a particle is equal, at any time, to the differential change in the kinetic energy of the particle. Integrating equation (C.3) between point 1 and point 2, corresponding to the velocities v_1 and v_2 of the particle, we get

\[
W_{12} \;=\; \int_{1}^{2} \mathbf{F} \cdot d\mathbf{s} \;=\; \int_{1}^{2} \frac{d(m\mathbf{v})}{dt} \cdot \mathbf{v}\, dt \;=\; \int_{v_1}^{v_2} d\!\left( \frac{1}{2} m v^2 \right) \;=\; \frac{1}{2} m v_2^2 - \frac{1}{2} m v_1^2 \;=\; T_2 - T_1    (C.5)
\]
This is the work-energy theorem, which can be stated as follows: the work done by the force \mathbf{F} acting on a particle as it moves from point 1 to point 2 along its trajectory is equal to the change in the kinetic energy (T_2 - T_1) of the particle during the given displacement.
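The theorem (C.5) lends itself to a direct numerical check: evaluate the line integral W_{12} = \int \mathbf{F} \cdot \mathbf{v}\, dt along a trajectory and compare it with T_2 - T_1. The sketch below (Python; illustrative values) does this for projectile motion under constant gravity:

```python
import math

m, g = 1.0, 9.81          # illustrative values
vx0, vy0 = 3.0, 4.0       # initial velocity components

def v(t):
    # Velocity under constant gravity (y measured upward)
    return (vx0, vy0 - g * t)

def F():
    return (0.0, -m * g)  # the only force acting: gravity

# W = integral of F . v dt over [0, t1], via the trapezoidal rule
t1, n = 2.0, 20000
dt = t1 / n
W = 0.0
for i in range(n):
    fa = sum(f * c for f, c in zip(F(), v(i * dt)))
    fb = sum(f * c for f, c in zip(F(), v((i + 1) * dt)))
    W += 0.5 * (fa + fb) * dt

def T(t):
    vx, vy = v(t)
    return 0.5 * m * (vx * vx + vy * vy)

print(abs(W - (T(t1) - T(0.0))) < 1e-8)   # W_12 = T_2 - T_1
```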
If the force can be derived from a scalar function \Phi as

\[
\mathbf{F} \;=\; \nabla \Phi    (C.6)
\]

we shall say that the vector field \mathbf{F} is a potential field. The scalar function \Phi(x, y, z, t) is then called the potential function of the field. The vector field \mathbf{F} is called conservative if \Phi does not explicitly depend on time. The potential function \Phi(x, y, z), in this case, is called the force potential.
It is easy to show that if the force field is conservative, the work done in moving the particle from 1 to 2 is independent of the path connecting 1 and 2. From equation (C.2), the total work done on the particle by the force \mathbf{F} as it moves from 1 to 2 is

\[
W_{12} \;=\; \int_{1}^{2} \mathbf{F} \cdot d\mathbf{s}
\]

Since the force field is conservative, from equation (C.6) we can write

\[
W_{12} \;=\; \int_{1}^{2} \mathbf{F} \cdot d\mathbf{s} \;=\; \int_{1}^{2} \nabla \Phi \cdot d\mathbf{s} \;=\; \int_{1}^{2} \frac{d\Phi}{ds}\, ds \;=\; \int_{1}^{2} d\Phi \;=\; \Phi_2 - \Phi_1    (C.7)
\]
The total work done is equal to the difference in force potential, no matter how the particle moves from 1 to 2. We can also write the following differential relation:

\[
dW \;=\; \mathbf{F} \cdot d\mathbf{s} \;=\; d\Phi    (C.8)
\]
If we now write \Phi(x, y, z) = -U(x, y, z) (inserting a minus sign for reasons of convention) and express the force as

\[
\mathbf{F} \;=\; -\nabla U    (C.9)
\]

then the scalar function U is known as the potential energy of the particle. When \mathbf{F} is expressed as in the above equation, the line integral of equation (C.2) becomes

\[
W_{12} \;=\; U_1 - U_2    (C.10)
\]

The total work done is equal to the difference U_1 - U_2, no matter how the particle moves from 1 to 2.
It may be noted that the line integral of the field \mathbf{F} = -\nabla U along a closed curve C (called the circulation) is zero, as shown below:

\[
\oint_{C} \mathbf{F} \cdot d\mathbf{s} \;=\; -\oint_{C} dU \;=\; 0
\]
Comparing equations (C.5) and (C.10), we deduce that T_1 + U_1 = T_2 + U_2. This says that the quantity T + U stays constant as the particle moves from point 1 to point 2. Since 1 and 2 are arbitrary points, we have obtained the statement of conservation of total mechanical energy:

\[
E \;=\; T + U \;=\; \text{constant}    (C.11)
\]

Thus, the energy conservation theorem can be stated as follows: the total energy of a particle in a conservative force field is constant.
It is instructive to note that equation (C.6) does not uniquely determine the function \Phi. We could as well define \mathbf{F} = \nabla(\Phi + c), where c is a constant. Hence the choice of the zero level of \Phi, and consequently of U, is arbitrary.
We can verify directly from equation (C.11) that the total energy is a constant of the motion. We have

\[
\frac{dE}{dt} \;=\; \frac{dT}{dt} + \frac{dU}{dt}
\]
The kinetic energy term can be written as

\[
\frac{dT}{dt} \;=\; \frac{1}{2} m\, \frac{dv^2}{dt} \;=\; m\, \frac{d\mathbf{v}}{dt} \cdot \mathbf{v} \;=\; \mathbf{F} \cdot \mathbf{v}
\]
The potential energy U depends on time only through the changing position of the particle: U = U(\mathbf{s}(t)) = U(x(t), y(t), z(t)). We therefore have

\[
\frac{dU}{dt} \;=\; \frac{\partial U}{\partial x}\frac{dx}{dt} + \frac{\partial U}{\partial y}\frac{dy}{dt} + \frac{\partial U}{\partial z}\frac{dz}{dt} \;=\; \nabla U \cdot \mathbf{v} \;=\; -\mathbf{F} \cdot \mathbf{v}
\]
Thus we have

\[
\frac{dE}{dt} \;=\; \mathbf{F} \cdot \mathbf{v} - \mathbf{F} \cdot \mathbf{v} \;=\; 0
\]
This shows that the total energy of the particle moving in a conservative force field is a constant
during the motion.
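For a projectile moving under gravity alone, the exact trajectory is known, so the constancy of E = T + U can be checked pointwise. The sketch below (Python; illustrative initial state) evaluates E along the analytic solution at several times:

```python
import math

m, g = 1.0, 9.81
x0, z0, vx0, vz0 = 0.0, 10.0, 2.0, 5.0   # illustrative initial state

def state(t):
    # Exact projectile motion under the potential U = m g z
    z = z0 + vz0 * t - 0.5 * g * t * t
    vz = vz0 - g * t
    return z, vx0, vz

def energy(t):
    z, vx, vz = state(t)
    return 0.5 * m * (vx * vx + vz * vz) + m * g * z

E0 = energy(0.0)
# Total mechanical energy is the same at every sampled instant
print(all(abs(energy(t) - E0) < 1e-9 for t in (0.3, 0.7, 1.2, 1.9)))
```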
Writing the force in Cartesian components as

\[
\mathbf{F} \;=\; F_x\, \mathbf{i} + F_y\, \mathbf{j} + F_z\, \mathbf{k}
\]

we then have \mathbf{F} = \nabla \Phi = -\nabla U.
Therefore, we obtain the following relations:

\[
F_x \;=\; \frac{\partial \Phi}{\partial x} \;=\; -\frac{\partial U}{\partial x}, \qquad
F_y \;=\; \frac{\partial \Phi}{\partial y} \;=\; -\frac{\partial U}{\partial y}, \qquad
F_z \;=\; \frac{\partial \Phi}{\partial z} \;=\; -\frac{\partial U}{\partial z}    (C.12)
\]
This shows that the partial derivative of the force potential in a given direction gives the force in that direction. An example of a force that derives from a potential is the gravitational force

\[
\mathbf{F}_g \;=\; -\nabla U
\]

which leads to the equations

\[
m g_x \;=\; -\frac{\partial U}{\partial x}, \qquad
m g_y \;=\; -\frac{\partial U}{\partial y}, \qquad
m g_z \;=\; -\frac{\partial U}{\partial z}    (C.13)
\]

Thus, the negative of the partial derivative of the potential energy in a given direction gives the gravitational force in that direction.
If the gravitational acceleration vector is given by

\[
\mathbf{g} \;=\; g\, (0, 0, -1)
\]

then we have

\[
0 \;=\; -\frac{\partial U}{\partial x}, \qquad
0 \;=\; -\frac{\partial U}{\partial y}, \qquad
-mg \;=\; -\frac{\partial U}{\partial z}    (C.14)
\]

Integrating, we obtain

\[
U \;=\; mgz + f(x, y)
\]

Setting f(x, y) = 0, the potential energy of the particle in the gravitational field is

\[
U \;=\; mgz
\]
where g acts in the negative z direction. The total mechanical energy E is conserved when a particle
moves under the action of the gravitational field.
Bibliography
[1] Clegg, J. C., Calculus of Variations, Oliver and Boyd (1968).
[2] Dacorogna, B., Introduction to the Calculus of Variations, 2nd ed., Imperial College Press (2008).
[3] Elsgolc, L. D., Calculus of Variations, Dover (2007).
[4] Ewing, G. M., Calculus of Variations with Applications, Dover (1985).
[5] Forsyth, A. R., Calculus of Variations, Dover (1960).
[6] Fox, C., An Introduction to the Calculus of Variations, Dover (2010).
[7] Gelfand, I. M. and Fomin S. V., Calculus of Variations, Prentice-Hall (1963).
[8] Gupta, A. S., Calculus of Variations with Applications, Prentice-Hall India (1996).
[9] Jost, J. and Li-Jost, X., Calculus of Variations, Cambridge Univ. Press (1998).
[10] Komzsik, L., Applied Calculus of Variations for Engineers, CRC Press (2009).
[11] Kot, M., A First Course in the Calculus of Variations, American Math Society (2014).
[12] van Brunt, B., The Calculus of Variations, Springer (2004).
[13] Weinstock, R., Calculus of Variations: with Applications to Physics and Engineering, Dover
(1974).