Continuous State Dynamic Programming
In this section we solve for the optimal policy of a stochastic growth model
The model is quite specific, but the method is fairly general
You should be able to apply the same ideas to other kinds of models
The lecture is based on section 6.2 of EDTC
The presentation here is very terse: see the book for more details
To learn more about dynamic programming in general see chapter 10 of EDTC
An Optimal Growth Model
At each point in time, the agent receives income $y_t$
Income is split into consumption $c_t$ and savings $k_t$
Output at $t+1$ is

$$y_{t+1} = f(k_t, W_{t+1}), \qquad W_{t+1} \sim \phi$$

where φ is a density function and the shocks $W_t$ are independent draws from φ
The agent's behavior is specified by a policy function σ from S to S, where S is the set of possible income levels
Here σ(y) is interpreted as the agent's choice of savings when income = y, so feasibility requires 0 ≤ σ(y) ≤ y
The set of all such policies will be denoted by Σ
The agent's decision problem is

$$\max_{\sigma \in \Sigma} \, \mathbb{E} \left[ \sum_{t=0}^{\infty} \rho^t \, U(y_t - \sigma(y_t)) \right]$$

where $y_{t+1} = f(\sigma(y_t), W_{t+1})$ and ρ ∈ (0, 1) is a discount factor
The utility function U is assumed to be bounded and continuous, and f is assumed to be continuous
Dynamic Programming
We will solve for the optimal policy using dynamic programming (DP)
We start with principles of DP and then go on to implementation
Basic Principles
The value function is defined by

$$v^*(y) = \sup_{\sigma \in \Sigma} \, \mathbb{E} \left[ \sum_{t=0}^{\infty} \rho^t \, U(y_t - \sigma(y_t)) \right] \qquad (y_0 = y)$$
A policy in Σ is called optimal if it attains this supremum for all y in S
The Bellman equation states that

$$v^*(y) = \max_{0 \le k \le y} \left\{ U(y - k) + \rho \int v^*(f(k, z)) \, \phi(z) \, dz \right\} \qquad (y \in S)$$
Let bcS be the set of bounded continuous real-valued functions on S
Given a function w in bcS, we say that σ in Σ is w-greedy if

$$\sigma(y) \in \operatorname*{argmax}_{0 \le k \le y} \left\{ U(y - k) + \rho \int w(f(k, z)) \, \phi(z) \, dz \right\} \qquad \text{for all } y \in S$$
At least one such policy exists
Moreover, a policy is optimal if and only if it is v*-greedy
Hence to find an optimal policy, we can compute v* and then solve for a v*-greedy policy
The value function v* can be obtained by iterating on the Bellman operator T, a map from bcS into itself defined by

$$Tw(y) = \max_{0 \le k \le y} \left\{ U(y - k) + \rho \int w(f(k, z)) \, \phi(z) \, dz \right\} \qquad (y \in S)$$
A common starting point is to set w = U
Iteration produces the sequence of functions w, Tw, T²w, …, which converges uniformly to v*
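Convergence is geometric: T is a contraction of modulus ρ on bcS with respect to the sup norm (see chapter 10 of EDTC), so the iterates satisfy

$$\| T^n w - v^* \|_{\infty} \le \rho^n \, \| w - v^* \|_{\infty} \qquad \text{where } \| g \|_{\infty} := \sup_{y \in S} | g(y) |$$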
Fitted Value Iteration
In practice, the functions in this sequence typically cannot be stored on a computer
Many people use discretization: replace the continuous state problem with a similar discrete state problem
In general, this is not the most efficient way
Instead, we approximate the iterates at each step, using simple functions which can be stored on a computer
This is called fitted value iteration
For a theoretical discussion of fitted value iteration, see section 6.2.2 of EDTC
Key idea: use an approximate Bellman operator, iterating according to the following algorithm
- Given w, evaluate Tw on a finite grid of points
- Using this information, compute an approximation ATw to Tw
- Set w = ATw and repeat
Which approximation should we use when going from Tw to ATw?
The simplest option is step functions (i.e., piecewise constant functions)
Fitted Value Iteration with Step Functions
We can now construct an approximate Bellman operator
It corresponds to the model with $U(c) = 1 - e^{-\theta c}$ and $f(k, W) = k^{\alpha} W$, where W is lognormal
When iterating, we will use step functions via the StepFun class, as defined in this lecture
One of the problems we need to solve is how to evaluate the integral $\int w(f(k, z)) \, \phi(z) \, dz$ that appears inside the Bellman operator
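Note that this integral is just an expectation over the shock:

$$\int w(f(k, z)) \, \phi(z) \, dz = \mathbb{E} \left[ w(f(k, W)) \right] \qquad \text{where } W \sim \phi$$

This is the form we can evaluate numerically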
Recall that a StepFun instance w has an expectation() method
The call w.expectation(F) gives the expectation of w(Z) when Z has cdf F
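The StepFun class itself is defined in a separate lecture. For completeness, here is a minimal sketch of the interface that the program below relies on; the conventions used here (right-continuous steps, constant beyond the last grid point, and P(Z < X[0]) = 0) are assumptions, and the actual class may differ

import numpy as np

class StepFun:
    "Minimal sketch: w(x) = Y[i] on [X[i], X[i+1]), and Y[-1] for x >= X[-1]."

    def __init__(self, X, Y):
        self.X, self.Y = np.asarray(X), np.asarray(Y)

    def __call__(self, x):
        "Evaluate the step function at x."
        i = np.searchsorted(self.X, x, side='right') - 1
        return self.Y[np.clip(i, 0, len(self.Y) - 1)]

    def expectation(self, F):
        "Computes E[w(Z)] when Z has cdf F, assuming P(Z < X[0]) = 0."
        probs = np.diff(np.append(F(self.X), 1.0))  # mass of each step interval
        return np.sum(self.Y * probs)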
In the program, we find the cdf of f(k, W) and pass it to this method

## Filename: rpd.py
## Author: John Stachurski
import numpy as np
from scipy.optimize import fminbound
from scipy.stats import lognorm
from stepfun import StepFun
theta, alpha, rho = 0.5, 0.8, 0.9 # Parameters
def U(c):
    return 1 - np.exp(- theta * c)  # Utility
L = lognorm(1)
G = L.cdf # Cumulative distribution (cdf) of lognormal shock
def dist(c):
    """
    Creates and returns (as a function) the cdf of c * W, where W is
    lognormal and c is a constant.
    """
    if c == 0.0:
        return np.vectorize(lambda x: 0 if x < 0 else 1)
    return lambda x: G(x / c)
# Next we make a grid. The grid is constructed in a nonlinear way,
# so that there are more points near the bottom of the grid.
# The reason is that the iterates have more curvature at the bottom,
# so we need more data there to approximate well.
gridmax, gridsize = 8, 150
grid = np.linspace(0, gridmax**1e-1, gridsize)**10
def maximum(h, a, b):
    "Computes the max of h on the interval [a, b]."
    return h(fminbound(lambda x: -h(x), a, b))
def bellman(w):
    """
    The approximate Bellman operator.
    Parameters:
        * w is an instance of StepFun with w.X = grid
    Returns: A new instance of StepFun
    """
    Tw = np.empty(gridsize)
    for i, y in enumerate(grid):
        # Current utility plus discounted expected value, where next
        # period income is k**alpha * W
        h = lambda k: U(y - k) + rho * w.expectation(dist(k**alpha))
        Tw[i] = maximum(h, 0, y)
    return StepFun(grid, Tw)
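As a quick check, a single application of the operator might look like this (assuming rpd.py and the StepFun class are importable)

from rpd import *

v0 = StepFun(grid, U(grid))  # step function approximation to U
v1 = bellman(v0)             # one application of the approximate Bellman operator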
Problem:
Generate iterates of the approximate Bellman operator and plot them
Start from the initial condition v = StepFun(grid, U(grid)), the step function approximation to U
Iterate until deviation between current and previous iterate is less than 0.005
Here, the deviation between functions v and w is measured by

$$d(v, w) = \sup_{y \in S} | v(y) - w(y) |$$
When these functions are instances of StepFun, the supremum equals the supremum over the grid points alone
You should produce the following graph
Solution:
from rpd import *
import matplotlib.pyplot as plt

v = StepFun(grid, U(grid))  # Initial condition
tol = 0.005                 # Error tolerance

while True:
    plt.plot(grid, v.Y, 'k-')
    new_v = bellman(v)
    if np.max(np.abs(new_v.Y - v.Y)) < tol:
        break
    v = new_v

plt.show()
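Finally, recall that an optimal policy is obtained by computing a policy that is greedy with respect to v*. The helper below is not part of the lecture; it is a sketch of how one might recover an approximately optimal savings policy from the final iterate, reusing the objects defined in rpd.py

def greedy(v):
    "Computes the (approximately) v-greedy savings choice at each grid point."
    sigma = np.empty(gridsize)
    for i, y in enumerate(grid):
        h = lambda k: U(y - k) + rho * v.expectation(dist(k**alpha))
        sigma[i] = fminbound(lambda k: -h(k), 0, y)  # the argmax rather than the max
    return StepFun(grid, sigma)

sigma_star = greedy(v)  # v is the final iterate from the loop above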