Stochastic Optimal Control


May 2010
Deterministic Optimal Control Problem

Consider the following dynamical system with the state vector x(t) ∈ R^n and
the control vector u(t) ∈ R^m:

ẋ(t) = f(x(t), u(t), t).

Its initial state at the fixed initial time 0 is given:

x(0) = x0 .

The permissible controls over the fixed time interval [0, T] satisfy the following
condition:

u(t) ∈ U for all t ∈ [0, T],

where U is a time-invariant, closed, and convex subset of the control space R^m:

U ⊆ R^m.


Furthermore, consider a cost functional of the following form:


J = K(x(T)) + ∫_0^T L(x(t), u(t), t) dt.

This cost functional should either be minimized or maximized, depending upon
the problem at hand. Consequently, there are two alternative formulations of
the optimal control problem.

The Minimization problem:

Find the control trajectory u* : [0, T] → U ⊆ R^m generating the state
trajectory x* : [0, T] → R^n such that the cost functional J is minimized.


The Maximization problem:

Find the control trajectory u* : [0, T] → U ⊆ R^m generating the state
trajectory x* : [0, T] → R^n such that the cost functional J is maximized.

The Hamiltonian function H : R^n × U × R^n × [0, T] → R associated
with a regular optimal control problem is

H(x(t), u(t), p(t), t) = L(x(t), u(t), t) + p^T(t) f(x(t), u(t), t),

where p(t) ∈ R^n is the so-called costate vector.


The Russian mathematician Pontryagin found the following necessary
conditions for the optimality of a solution:

If u* : [0, T] → U is an optimal control trajectory, the following conditions
are satisfied:

a) Optimal state trajectory:

ẋ*(t) = ∇_p H|_* = f(x*(t), u*(t), t)   for t ∈ [0, T]
x*(0) = x0


b) Optimal costate trajectory: There exists an optimal costate trajectory satisfying

ṗ*(t) = −∇_x H|_* = −∇_x L(x*(t), u*(t), t) − f_x^T(x*(t), u*(t), t) p*(t)   for t ∈ [0, T]
p*(T) = ∇_x K(x*(T)).


c) Global static optimization of the Hamiltonian function:

For the minimization problem: for all t ∈ [0, T], the Hamiltonian is globally minimized with respect to u, i.e.,

H(x*(t), u*(t), p*(t), t) ≤ H(x*(t), u, p*(t), t)   for all u ∈ U.

For the maximization problem: for all t ∈ [0, T], the Hamiltonian is globally maximized with respect to u, i.e.,

H(x*(t), u*(t), p*(t), t) ≥ H(x*(t), u, p*(t), t)   for all u ∈ U.



Example: The det. LQ-Regulator Problem

For the linear time-varying system

ẋ(t) = A(t)x(t) + B(t)u(t)

with the initial state


x(0) = x0
find the unconstrained optimal control u : [0, T] → R^m such that the
quadratic cost functional

J = ½ x^T(T) F x(T) + ∫_0^T ½ ( x^T(t) Q(t) x(t) + u^T(t) R(t) u(t) ) dt

is minimized. Here, the penalty matrix R(t) is symmetric and positive-definite,
and the penalty matrices F and Q(t) are symmetric and positive-semidefinite.


Analysis of the necessary conditions for optimality:

Hamiltonian function:

H(x(t), u(t), p(t), t) = ½ x^T(t) Q(t) x(t) + ½ u^T(t) R(t) u(t) + p^T(t) A(t) x(t) + p^T(t) B(t) u(t)


Pontryagin's necessary conditions for optimality:

ẋ*(t) = A(t) x*(t) + B(t) u*(t)
x*(0) = x0
ṗ*(t) = −Q(t) x*(t) − A^T(t) p*(t)
p*(T) = F x*(T)
u*(t) = arg min_{u ∈ R^m} ( ½ u^T R(t) u + p*^T(t) B(t) u )

H-minimizing control:

u*(t) = −R^{-1}(t) B^T(t) p*(t)


Plugging the H-minimizing control into the differential equations leads to the
following linear two-point boundary value problem:

ẋ*(t) = A(t) x*(t) − B(t) R^{-1}(t) B^T(t) p*(t)
ṗ*(t) = −Q(t) x*(t) − A^T(t) p*(t)
x*(0) = x0
p*(T) = F x*(T).

Considering that p*(T) is linear in x*(T) and that the linear differential
equations are homogeneous leads to the educated guess that p*(t) is linear in
x*(t) at all times, i.e.,

p*(t) = K(t) x*(t).


This leads to the following linear state feedback control:

u*(t) = −R^{-1}(t) B^T(t) K(t) x*(t).

Exploiting the two-point boundary value problem and the proposed linear relation
leads to the following matrix Riccati differential equation for K(t), with a
boundary condition at the final time T:

K̇(t) = −A^T(t) K(t) − K(t) A(t) + K(t) B(t) R^{-1}(t) B^T(t) K(t) − Q(t)
K(T) = F.
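
Numerically, K(t) is obtained by integrating the Riccati equation backward from K(T) = F. Below is a minimal sketch with SciPy; the constant matrices A, B, Q, R, F and the horizon T are illustrative assumptions, not taken from the text:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative data for a double integrator; all values are assumptions.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
F = np.eye(2)
T = 5.0

def riccati_rhs(t, k_flat):
    # Kdot = -A^T K - K A + K B R^{-1} B^T K - Q
    K = k_flat.reshape(2, 2)
    Kdot = -A.T @ K - K @ A + K @ B @ np.linalg.solve(R, B.T) @ K - Q
    return Kdot.ravel()

# Integrate backward in time from K(T) = F to t = 0.
sol = solve_ivp(riccati_rhs, [T, 0.0], F.ravel(), dense_output=True)

def u_star(t, x):
    # Linear state feedback u*(t) = -R^{-1} B^T K(t) x
    K = sol.sol(t).reshape(2, 2)
    return -np.linalg.solve(R, B.T @ K @ x)

print(u_star(0.0, np.array([1.0, 0.0])))
```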



Stochastic Optimal Control Problem

Problem 1. Stochastic optimal control problem:

max_{u(t) ∈ U} E{ K(T, x(T)) + ∫_{t0}^T L(t, x(t), u(t)) dt }

s.t. dx(t) = f(t, x(t), u(t)) dt + g(t, x(t), u(t)) dW(t)        (1)

x(t0) = x0.

We define the value or cost-to-go function J(t, x(t)) as

J(t, x(t)) = max_{u ∈ U} E[ K(T, x(T)) + ∫_t^T L(t, x, u) dt ].        (2)



Hamilton-Jacobi-Bellman I

Idea: divide the problem into two parts:

u(t) = { u(s)   for t ≤ s ≤ t + ∆t
       { u*(s)  for t + ∆t < s ≤ T        (3)

The cost-to-go function can now be split into the two parts and we obtain

J(x, t) = max_{u:[t,T]→U} E[ ∫_t^{t+∆t} L(x(t), u(t), t) dt + K(x(T)) + ∫_{t+∆t}^T L(x(t), u(t), t) dt ],        (4)

where the last two terms inside the expectation together equal J(x, t + ∆t).



Hamilton-Jacobi-Bellman II

Applying Itô's lemma to J results in the following expression for J(x, t + ∆t):

J(x, t + ∆t) = J(x, t) + ∫_t^{t+∆t} ( ∂J(x, t)/∂t + AJ(x, t) ) dt + ∫_t^{t+∆t} J_x(x, t) g(x, u, t) dW,        (5)

where we use the stochastic differential operator

A(·) = f^T(x, u, t) ∂(·)/∂x + ½ tr{ g(x, u, t) g^T(x, u, t) ∂²(·)/∂x² }.
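
As a quick illustration, the operator A can be applied symbolically. The sketch below uses SymPy for a scalar SDE; the particular drift f, diffusion g, and test function J are arbitrary choices for demonstration, not from the slides:

```python
import sympy as sp

x, u, t = sp.symbols('x u t')

# Illustrative scalar drift, diffusion, and test function (all assumptions).
f = -x + u                   # drift f(x, u, t)
g = x / 2                    # diffusion g(x, u, t)
J = x**2                     # test function

# Scalar form of the operator: A J = f * dJ/dx + (1/2) g^2 * d^2J/dx^2
AJ = f * sp.diff(J, x) + sp.Rational(1, 2) * g**2 * sp.diff(J, x, 2)
print(sp.expand(AJ))         # -> 2*u*x - 7*x**2/4
```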



Hamilton-Jacobi-Bellman III

Substituting (5) into (4) and using the fact that the Itô integral has zero expectation gives

J(x, t) = max_{u:[t,t+∆t]→U} E[ ∫_t^{t+∆t} L(x(t), u(t), t) dt + J(x, t + ∆t) ]

        = max_{u:[t,t+∆t]→U} E[ ∫_t^{t+∆t} L(x, u, t) dt + J(x, t)
          + ∫_t^{t+∆t} ( ∂J(x, t)/∂t + AJ(x, t) ) dt + ∫_t^{t+∆t} J_x(x, t) g(x, u, t) dW ]

        = max_{u:[t,t+∆t]→U} E[ ∫_t^{t+∆t} L(x, u, t) dt + J(x, t)
          + ∫_t^{t+∆t} ( ∂J(x, t)/∂t + AJ(x, t) ) dt ].



Hamilton-Jacobi-Bellman IV

Subtracting J(x, t) on both sides yields

0 = max_{u:[t,t+∆t]→U} E[ ∫_t^{t+∆t} ( L(x, u, t) + ∂J(x, t)/∂t + AJ(x, t) ) dt ].        (7)

Interchanging the maximum operator with the integral and setting the integrand to zero gives

max_{u ∈ U} { L(x, u, t) + ∂J(x, t)/∂t + AJ(x, t) } = 0.        (8)

Expanding the differential operator A leads to the following equation.



Stochastic Hamilton-Jacobi-Bellman Equation

∂J(t, x)/∂t + max_{u(t) ∈ U} { L(t, x, u) + ∂J(t, x)/∂x f(t, x, u)
+ ½ tr{ g(t, x, u) g^T(t, x, u) ∂²J(t, x)/∂x² } } = 0        (9)

This is the Hamilton-Jacobi-Bellman equation for a stochastic control
problem as in (1). The maximizing u(t) can now be found in terms of
(t, x(t)), J_x, and J_xx and reinserted into (9). By solving the resulting PDE for the
cost-to-go function J(t, x(t)), the explicit solution for the optimal control u(t) can
be found.



HJB Solution Procedure

1. For a fixed J(t, x), one has to find u = u(t, x, J) such that L(t, x, u) + J_x(t, x) f(t, x, u) + ½ tr{ g(t, x, u) g^T(t, x, u) J_xx(t, x) } is maximal.
2. The function u is put back into the HJB equation, and the partial differential equation J_t(t, x) + L(t, x, u) + J_x(t, x) f(t, x, u) + ½ tr{ g(t, x, u) g^T(t, x, u) J_xx(t, x) } = 0 is solved with terminal condition J(T, x) = K(T, x(T)). The solution J(t, x) is the maximal value functional.
3. The function J(t, x) is put back into the equation for u derived in Step 1. This results in the optimal feedback control law u* = u*(t, x, J(t, x)).
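
As a small worked instance of these three steps, the sketch below carries out Steps 1 and 2 symbolically for a scalar deterministic LQ problem (g = 0, minimization form, cf. the LQG example that follows); the scalars a, b, q, r and the quadratic Ansatz are assumptions for illustration:

```python
import sympy as sp

t, x, u = sp.symbols('t x u')
a, b, q, r = sp.symbols('a b q r', positive=True)
k = sp.Function('k')

# Quadratic Ansatz J(t, x) = (1/2) k(t) x^2 for a scalar LQ problem with g = 0.
J = sp.Rational(1, 2) * k(t) * x**2
# Expression to be minimized over u in the HJB equation:
H_inner = sp.Rational(1, 2) * (q * x**2 + r * u**2) + sp.diff(J, x) * (a * x + b * u)

# Step 1: minimize over u.
u_star = sp.solve(sp.diff(H_inner, u), u)[0]          # -> -b*k(t)*x/r

# Step 2: substitute u* into J_t + (...) = 0 and solve for k'(t).
hjb = sp.diff(J, t) + H_inner.subs(u, u_star)
k_dot = sp.solve(sp.Eq(hjb, 0), sp.Derivative(k(t), t))[0]
print(u_star, sp.simplify(k_dot))   # k' = -2*a*k + b**2*k**2/r - q, the scalar Riccati equation
```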



Stochastic LQG Controller HJB

We consider the following optimal control problem:


min_{u ∈ R^m} E{ ½ ∫_0^T ( x^T(t) Q(t) x(t) + 2 u^T(t) N(t) x(t) + u^T(t) R(t) u(t) ) dt + ½ x^T(T) F x(T) }

s.t. dx(t) = ( A(t) x(t) + B(t) u(t) ) dt + ( C(t) x(t) + D(t) u(t) ) dZ,        (10)

where dZ ∈ R is the increment of a scalar Brownian motion. If the noise enters
only through a time-dependent coefficient (dx(t) = (A(t) x(t) + B(t) u(t)) dt + σ(t) dZ),
the optimal stochastic controller is the same as the deterministic controller
(certainty equivalence), which is proven later.


The corresponding HJB equation reads:

J_t + min_u { ½ ( x^T Q x + 2 u^T N(t) x + u^T R u ) + J_x (A x + B u) + ½ (C x + D u)^T J_xx (C x + D u) } = 0

with terminal condition J(T, x(T)) = ½ x^T(T) F x(T).        (11)

1. step: Calculate the minimum with respect to u(t, x):

N x + R u + D^T J_xx D u + B^T J_x + D^T J_xx C x = 0
u = −( R + D^T J_xx D )^{-1} ( B^T J_x + N x + D^T J_xx C x ).        (12)


2. step: Plug u back into the HJB equation:

J_t + ½ x^T ( Q + C^T J_xx C − (N + D^T J_xx C)^T L (N + D^T J_xx C) ) x
− ½ J_x^T B L B^T J_x + J_x^T ( A − B L (N + D^T J_xx C) ) x = 0        (13)

where L = (R + D^T J_xx D)^{-1} and J(T) = ½ x^T(T) F x(T). We make
a quadratic Ansatz for J: J(t, x) = ½ x^T(t) K(t) x(t).


Inserting the Ansatz yields

x^T [ K̇ + K A + A^T K + C^T K C + Q − ( B^T K + N + D^T K C )^T ( R + D^T K D )^{-1} ( B^T K + N + D^T K C ) ] x = 0.        (14)

This gives the stochastic Riccati equation:

K̇ + K A + A^T K + C^T K C + Q − ( B^T K + N + D^T K C )^T ( R + D^T K D )^{-1} ( B^T K + N + D^T K C ) = 0
K(T) = F.        (15)
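
Equation (15) has the same backward-in-time structure as the deterministic Riccati equation, so it can be integrated numerically in the same way. A minimal sketch, again with illustrative constant matrices (all values are assumptions):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative constant matrices for (15); all values are assumptions.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
C = 0.1 * np.eye(2)           # state-dependent noise coefficient
D = np.array([[0.0], [0.1]])  # control-dependent noise coefficient
Q, R, N = np.eye(2), np.array([[1.0]]), np.zeros((1, 2))
F, T = np.eye(2), 5.0

def stochastic_riccati_rhs(t, k_flat):
    K = k_flat.reshape(2, 2)
    S = B.T @ K + N + D.T @ K @ C                  # B^T K + N + D^T K C
    M = R + D.T @ K @ D                            # R + D^T K D
    # Kdot = -(K A + A^T K + C^T K C + Q - S^T M^{-1} S)
    Kdot = -(K @ A + A.T @ K + C.T @ K @ C + Q - S.T @ np.linalg.solve(M, S))
    return Kdot.ravel()

sol = solve_ivp(stochastic_riccati_rhs, [T, 0.0], F.ravel(), dense_output=True)
K0 = sol.sol(0.0).reshape(2, 2)
gain = np.linalg.solve(R + D.T @ K0 @ D, B.T @ K0 + N + D.T @ K0 @ C)
print(gain)   # u*(0, x) = -gain @ x, as in (16)
```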


3. step: Plug J back into the solution for u:

u*(t, x) = −( R + D^T J_xx D )^{-1} ( B^T J_x + N x + D^T J_xx C x )
         = −( R + D^T K D )^{-1} ( B^T K x + N x + D^T K C x )
         = −( R + D^T K D )^{-1} ( B^T K + N + D^T K C ) x.        (16)

K(t) is the solution of the stochastic Riccati equation (15). Note that in
the stochastic case we may set R = 0 if D^T K(t) D > 0 (this is not
possible in the deterministic case).
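
To see the resulting controller in action, the closed loop (10) under the feedback law (16) can be simulated with an Euler-Maruyama scheme. The sketch below continues the previous snippet (it reuses A, B, C, D, N, R, T and the Riccati solution sol); the initial state is an arbitrary assumption:

```python
import numpy as np

# Continues the previous snippet: A, B, C, D, N, R, T and sol are defined there.
rng = np.random.default_rng(0)
dt = 1e-3
n_steps = int(T / dt)
x = np.array([1.0, 0.0])     # assumed initial state x0

for i in range(n_steps):
    K = sol.sol(i * dt).reshape(2, 2)
    u = -np.linalg.solve(R + D.T @ K @ D,
                         (B.T @ K + N + D.T @ K @ C) @ x)  # feedback law (16)
    dZ = rng.normal(scale=np.sqrt(dt))                     # scalar Brownian increment
    # Euler-Maruyama step for dx = (Ax + Bu) dt + (Cx + Du) dZ, cf. (10)
    x = x + (A @ x + B @ u) * dt + (C @ x + D @ u) * dZ

print(x)   # the state should end up near the origin if the loop is stabilized
```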



Deterministic Pontryagin Maximum Principle

ẋ(t) = f(t, x(t), u(t))

J = K(x(T)) + ∫_0^T L(x(t), u(t), t) dt

Hamiltonian function: H(x(t), u(t), t, p(t)) = L(x(t), u(t), t) + p^T(t) f(x(t), u(t), t)

ẋ*(t) = ∇_p H|_* = f(x*(t), u*(t), t)
x*(0) = x0
ṗ*(t) = −∇_x H|_* = −∇_x L(x*(t), u*(t), t) − f_x^T(x*(t), u*(t), t) p*(t)
p*(T) = ∇_x K(x*(T))

H(x*(t), u*(t), t, p*(t)) ≤ H(x*(t), u, t, p*(t))   for all u ∈ U.



Stochastic Pontryagin's Maximum Principle

• Stochastic Pontryagin's Maximum Principle: a system of forward-backward stochastic differential equations (FBSDE) replaces the HJB partial differential equation.

max_u E{ ∫_0^T L(t, x(t), u(t)) dt + K(x(T), T) }
s.t.
dx = f(t, x, u) dt + g(t, x, u) dW



Adjoint Equations

In the stochastic case, we define the Hamiltonian function

H : [0, T] × R^n × U × R^n × R^{n×n} → R

by

H(t, x, u, p, p_x) = L(t, x, u) + f^T(t, x, u) p + ½ tr{ p_x g(t, x, u) g^T(t, x, u) }.

Before stating the stochastic maximum principle we need to define the adjoint
equations for the stochastic optimal control problem as

p(t) = J_x(t, x)        (17)
p_x(t) = ∂p/∂x = J_xx(t, x).        (18)



Hamiltonian Function

Note that we use the Hamiltonian function to write the known HJB equation as

−J_t = max_{u ∈ U} H(t, x, u, p, p_x).        (19)

In the following we assume there is a known optimal control law u*(t, x, p, p_x)
which solves the optimal control problem, such that

H*(t, x, p, p_x) = H(t, x, u*(t, x, p, p_x), p, p_x)        (20)
                 = L(t, x, u*) + f^T(t, x, u*) p + ½ tr{ p_x g(t, x, u*) g^T(t, x, u*) }        (21)
                 = −J_t.        (22)



Differential Equations for Adjoint Equations

In the next step we write the differential equations for the state and adjoint
equations, dx* and dp* respectively:

dx* = f(t, x, u*) dt + g(t, x, u*) dW
    = H_p(t, x, p, p_x) dt + g(t, x, p, p_x) dW,

derived from the system dynamics and (21). Using Itô's lemma on the
definition of p(t) in equation (17) gives us

dp = J_xt dt + J_xx dx + ½ J_xxx (dx)²        (23)
   = [ J_xt + J_xx f + ½ tr{ J_xxx g g^T } ] dt + J_xx g dW,        (24)

where we define tr{ J_xxx g g^T } = [ tr{ (J_x1)_xx g g^T }, tr{ (J_x2)_xx g g^T }, ..., tr{ (J_xn)_xx g g^T } ]^T.


Then we take the partial derivative of (22) with respect to x to get J_xt:

−J_xt = H_x* + H_p* (∂p/∂x) + H_px* (∂p_x/∂x)
      = H_x* + J_xx f + ½ tr{ J_xxx g g^T }.        (25)

Inserting (25) into (24) leads to

dp = −H_x dt + J_xx g dW        (26)
   = −H_x dt + p_x g dW.        (27)



Pontryagin's Maximum Principle

We can finally write the system of forward-backward stochastic differential
equations (FBSDE), the stochastic maximum principle:

H(t, x, u, p, p_x) = L(t, x, u) + f^T(t, x, u) p + ½ tr{ p_x g(t, x, u) g^T(t, x, u) }

dx* = H_p dt + g dW
dp* = −H_x dt + p_x g dW
x*(0) = x0
p*(T) = K_x(T, x(T))
H*(t, x(t), u(t), p(t), p_x(t)) = max_u H(t, x(t), u, p(t), p_x).        (28)



Conditions for Optimality

The conditions for optimality are similar to the deterministic case:

a) Optimal state trajectory obtained from dx*
b) Optimal costate trajectory obtained from dp*
c) Global static optimization of the Hamiltonian function
- Minimization:
  H(t, x*(t), u*(t), p*(t), p_x*(t)) ≤ H(t, x*(t), u(t), p*(t), p_x*(t))   for t ∈ [0, T]
- Maximization:
  H(t, x*(t), u*(t), p*(t), p_x*(t)) ≥ H(t, x*(t), u(t), p*(t), p_x*(t))   for t ∈ [0, T]



Stochastic LQG Controller Pontryagin

Consider the stochastic LQG problem

J(t, x) = E{ ½ ∫_0^T ( x(t)^T Q(t) x(t) + u(t)^T R(t) u(t) ) dt + ½ x(T)^T M(T) x(T) }
s.t.
dx = [ A(t) x(t) + B(t) u(t) ] dt + σ(t) dW(t)
x(0) = x0,

with R(t) = R^T(t) > 0, M(t) = M^T(t) ≥ 0, and Q(t) = Q^T(t) ≥ 0.


The Hamiltonian function:

H(t, x, u, p, p_x) = ½ ( x^T Q x + u^T R u ) + [ A x + B u ]^T p + ½ tr{ p_x σ(t) σ^T(t) }.

Pontryagin's necessary conditions are

dx*(t) = H_p dt + σ(t) dW = [ A(t) x*(t) + B(t) u*(t) ] dt + σ(t) dW(t)
x*(0) = x0
dp*(t) = −H_x dt + p_x* σ(t) dW = [ −Q(t) x*(t) − A^T(t) p*(t) ] dt + p_x* σ(t) dW(t)
p*(T) = M(T) x*(T)
H(t, x*, u*, p*, p_x*) ≤ H(t, x*, u, p*, p_x*).


From the last statement it follows that

u*(t) = arg min_u ( ½ u^T(t) R(t) u(t) + u^T(t) B^T(t) p*(t) ),

i.e., the optimal control is u*(t) = −R^{-1}(t) B^T(t) p*(t). Putting this back into the system of forward-backward
stochastic differential equations (FBSDE) gives

dx*(t) = [ A(t) x*(t) − B(t) R^{-1}(t) B^T(t) p*(t) ] dt + σ(t) dW(t)
x*(0) = x0
dp*(t) = [ −Q(t) x*(t) − A^T(t) p*(t) ] dt + p_x*(t) σ(t) dW(t)
p*(T) = M(T) x*(T).

Ansatz: p*(t) = K(t) x + φ(t)  ⇒  p_x*(t) = K(t)


Using Itô's lemma we obtain for the stochastic process p*:

dp* = [ K̇ x + φ̇ + K ( A x − B R^{-1} B^T p* ) + ½ · 0 ] dt + K σ dW
    = [ K̇ x + φ̇ + K ( A x − B R^{-1} B^T (K x + φ) ) ] dt + q dW,

with q = K σ. From the FBSDE system we know

dp* = ( −Q x − A^T p ) dt + q dW = −[ Q x + A^T (K x + φ) ] dt + q dW.

Setting the two equations equal:

[ K̇ x + φ̇ + K ( A x − B R^{-1} B^T (K x + φ) ) ] dt + q dW = −[ Q x + A^T (K x + φ) ] dt + q dW
K̇ x + φ̇ + K ( A x − B R^{-1} B^T (K x + φ) ) = −Q x − A^T (K x + φ)


Leading to the two differential equations for K and φ:

K̇ + Q + K A + A^T K − K B R^{-1} B^T K = 0
K(T) = M(T)
φ̇ − K B R^{-1} B^T φ + A^T φ = 0
φ(T) = 0

Since the equation for φ is linear and homogeneous with φ(T) = 0, we get φ(t) ≡ 0, and therefore

u*(t) = −R^{-1}(t) B^T(t) ( K(t) x + φ(t) ) = −R^{-1}(t) B^T(t) K(t) x

Note: K satisfies the same Riccati equation as in the deterministic LQ problem, so the controller is certainty equivalent.



Extended Stochastic LQG Controller HJB

The extended LQG problem with an m-dimensional Brownian motion and
inhomogeneous linear dynamics is defined as follows:

min_u E{ ½ ∫_0^T ( x(t)^T Q(t) x(t) + 2 x(t)^T N(t) u(t) + u(t)^T R u(t) ) dt + ½ x(T)^T M(T) x(T) + g(T)^T x(T) }        (29)

dx(t) = ( A(t) x(t) + B(t) u(t) + b(t) ) dt + Σ_{j=1}^m [ C_j(t) x(t) + D_j(t) u(t) + d_j(t) ] dZ_j,   x(0) = x0        (30)

The solution is derived with the following Ansatz:

J(t, x) = ½ x^T K(t) x(t) + ϕ^T x(t) + ψ(t)

K̇ + K A + A^T K + Σ_{j=1}^m C_j^T K C_j + Q
− ( B^T K + N + Σ_{j=1}^m D_j^T K C_j )^T ( R + Σ_{j=1}^m D_j^T K D_j )^{-1} ( B^T K + N + Σ_{j=1}^m D_j^T K C_j ) = 0,   K(T) = M(T)        (31)

ϕ̇ + [ A − B ( R + Σ_{j=1}^m D_j^T K D_j )^{-1} ( B^T K + N + Σ_{j=1}^m D_j^T K C_j ) ]^T ϕ
+ Σ_{j=1}^m [ C_j − D_j ( R + Σ_{i=1}^m D_i^T K D_i )^{-1} ( B^T K + N + Σ_{i=1}^m D_i^T K C_i ) ]^T K d_j + K b = 0,   ϕ(T) = g(T)        (32)


The optimal control law is computed by solving the two stochastic Riccati
equations (31) and (32) for K(t) and ϕ(t):

u(t, x) = −( R + Σ_{j=1}^m D_j^T K D_j )^{-1} ( B^T K + N + Σ_{j=1}^m D_j^T K C_j ) x(t)
          −( R + Σ_{j=1}^m D_j^T K D_j )^{-1} ( B^T ϕ + Σ_{j=1}^m D_j^T K d_j ).        (33)

In contrast to the previous LQG controller, in (33) we have calculated both the
feed-forward and the feedback part of the solution. This is necessary because of the
terms d_j(t) and b(t), which are purely time-dependent.
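
A possible numerical treatment of (31) and (32) is to stack K and ϕ into one vector and integrate both backward from t = T. The sketch below uses m = 1 and small illustrative matrices; all values are assumptions:

```python
import numpy as np
from scipy.integrate import solve_ivp

n = 2   # state dimension (illustrative)
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
b = np.array([0.1, 0.0])                       # inhomogeneous drift term b(t)
Cs = [0.1 * np.eye(n)]                          # C_j matrices (m = 1)
Ds = [np.array([[0.0], [0.1]])]                 # D_j matrices
ds = [np.array([0.05, 0.0])]                    # d_j vectors
Q, R, N = np.eye(n), np.array([[1.0]]), np.zeros((1, n))
M_T, g_T, T = np.eye(n), np.array([1.0, 0.0]), 5.0

def rhs(t, y):
    K = y[:n * n].reshape(n, n)
    phi = y[n * n:]
    Rbar = R + sum(D.T @ K @ D for D in Ds)            # R + sum_j D_j^T K D_j
    S = B.T @ K + N + sum(D.T @ K @ C for C, D in zip(Cs, Ds))
    Kdot = -(K @ A + A.T @ K + sum(C.T @ K @ C for C in Cs) + Q
             - S.T @ np.linalg.solve(Rbar, S))         # from (31)
    phidot = -((A - B @ np.linalg.solve(Rbar, S)).T @ phi
               + sum((C - D @ np.linalg.solve(Rbar, S)).T @ (K @ d)
                     for C, D, d in zip(Cs, Ds, ds))
               + K @ b)                                # from (32)
    return np.concatenate([Kdot.ravel(), phidot])

y_T = np.concatenate([M_T.ravel(), g_T])
sol = solve_ivp(rhs, [T, 0.0], y_T, dense_output=True)
print(sol.sol(0.0)[:n * n].reshape(n, n))   # K(0); the tail of the vector is phi(0)
```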



Portfolio Models and Stochastic Optimal Control

Statement of the problem:

max_u E{ ∫_0^T e^{−ρt} C(t)^γ dt + π X(T)^γ }
s.t.
dX = [ X ( u^T (b − er) + r ) − C ] dt + X u^T σ dW.

The cost-to-go function is defined by

J(X(t), t) = max_{C,u(t)} E{ ∫_t^T e^{−ρt} C(t)^γ dt + π X(T)^γ }.


With the shorthand ∂J/∂t ≡ J_t, ∂J/∂X ≡ J_x, and ∂²J/∂X² ≡ J_xx, this problem leads to the following HJB equation:

J_t + max_{C(t),u(t)} [ e^{−ρt} C^γ + ( X ( u^T (b − er) + r ) − C ) J_x + ½ X² J_xx u^T Σ u ] = 0

with Σ = Σ(t) = σ^T(t) σ(t).

First step: first-order conditions:

u(t) = − ( J_x / (X J_xx) ) Σ^{-1} (b − er)
C(t) = ( (1/γ) e^{ρt} J_x )^{1/(γ−1)}.
γ



Solution to portfolio model

Second step: Put these values back into the HJB equation:

J_t + e^{ρt/(γ−1)} J_x^{γ/(γ−1)} ( γ^{−γ/(γ−1)} − γ^{−1/(γ−1)} ) − ½ (J_x²/J_xx) (b − er)^T Σ^{-1} (b − er) + X r J_x = 0

For the solution of the PDE we use a separation Ansatz:

J(X, t) = X^γ e^{−ρt} h(t)
∂J/∂t = e^{−ρt} X^γ ( h'(t) − ρ h(t) )
∂J/∂X = e^{−ρt} h(t) γ X^{γ−1}
∂²J/∂X² = e^{−ρt} h(t) γ (γ − 1) X^{γ−2}.


Third step: Plugging these results back into the HJB-PDE yields

e^{−ρt} X^γ [ h'(t) + h(t) ( rγ − ρ + γ (b − er)^T Σ^{-1} (b − er) / (2(1 − γ)) ) + (1 − γ) h(t)^{γ/(γ−1)} ] = 0,

so the proposed Ansatz is valid. To specify h(t) and find an explicit solution to the optimal
control problem, the ordinary differential equation

h'(t) + A h(t) + (1 − γ) h(t)^{γ/(γ−1)} = 0
h(T) = π e^{ρT},

with A = rγ − ρ + γ (b − er)^T Σ^{-1} (b − er) / (2(1 − γ)), remains to be solved.


Inserting this solution for J and its partial derivatives into the equations for C and u yields the
optimal policies for consumption C*(t) and investment strategy u*(t):

u*(t) = −(1/(γ−1)) Σ^{-1} (b − er) = (1/(1−γ)) Σ^{-1} (b − er)
C*(t) = ( h(t) X^{γ−1} )^{1/(γ−1)} = h(t)^{1/(γ−1)} X.
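
As a closing illustration, the ODE for h(t) can be integrated backward from h(T) = π e^{ρT} and the optimal policies evaluated. All parameter values in this sketch are arbitrary assumptions:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative Merton-type parameters (all values are assumptions).
gamma, rho, r, pi_w, T = 0.5, 0.05, 0.02, 1.0, 1.0
b = np.array([0.06])              # expected asset return(s)
e = np.ones(1)                    # vector of ones
Sigma = np.array([[0.04]])        # sigma^T sigma
excess = b - e * r
theta = excess @ np.linalg.solve(Sigma, excess)
A_coef = r * gamma - rho + gamma * theta / (2 * (1 - gamma))

def h_rhs(t, h):
    # h'(t) = -A h - (1 - gamma) h^{gamma/(gamma-1)}
    return -A_coef * h - (1 - gamma) * h ** (gamma / (gamma - 1))

sol = solve_ivp(h_rhs, [T, 0.0], [pi_w * np.exp(rho * T)], dense_output=True)

h0 = sol.sol(0.0)[0]
u_star = np.linalg.solve(Sigma, excess) / (1 - gamma)   # constant fraction in risky assets
X0 = 100.0                                              # assumed initial wealth
C_star = h0 ** (1 / (gamma - 1)) * X0                   # optimal initial consumption rate
print(u_star, C_star)
```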
