Open Mind Tree: Computational economics lecture 20

Application: Finite Markov Chains

Finite Markov chains are found in many of the workhorse models of macroeconomics, operations research, etc.

Here we study efficient numerical methods for Markov chains

Simulation (generating sample paths)
Computing stationary distributions

Definitions

A Markov matrix (or stochastic kernel) is an N by N square matrix p such that

all elements are nonnegative, and
each row sums to one (for any x we have sum(p[x,:]) = 1)

We study a random process X that evolves on S = range(N) = [0,...,N-1]

S is called the state space

The basic idea is that

Each row p[x,:] of the matrix p can be regarded as a distribution on S
The distribution p[x,:] gives the probabilities for X[t+1] when X[t] = x

More formally, the Markov chain X is constructed as follows:

First, X[0] is drawn from some initial distribution (probability mass function) q
- X[0] = x with probability q[x] for each x in S
At each subsequent time t, the new state X[t+1] is drawn from p[X[t],:]
- Moves to state y with probability p[X[t],y]

In psuedo-Python,

X[0] = a draw from q (a nonnegative array with len(q) = N and sum(q) = 1)
while 1:
    X[t+1] = a draw from p[X[t],:]

Simulation

Let's try to simulate this process.

Problem:

Write a function which takes

a Markov kernel (matrix) p,
an initial condition init which represents X[0], and
a positive integer sample_size

and returns a sample path of length sample_size

Hint: Use the DiscreteRV class from this lecture

To test your solution you can use the small matrix

p = np.array([[.4, .6], [.2, .8]])

For a long series, the fraction of the sample that takes value 0 will be about 0.25

Solution:

## Filename: mc_sample.py
## Author: John Stachurski

import numpy as np
from discreterv import DiscreteRV

def sample_path(p, init=0, sample_size=1000): 
    """
    A function that generates sample paths of a finite Markov chain with 
    kernel p on state space S = [0,...,N-1], starting from state init.

    Parameters: 

        * p is a 2D NumPy array, nonnegative, rows sum to 1
        * init is an integer in S
        * sample_size is an integer

    Returns: A flat NumPy array containing the sample
    """
    N = len(p)
    # Let P[x] be the distribution corresponding to p[x,:]
    P = [DiscreteRV(p[x,:]) for x in range(N)]
    X = np.empty(sample_size, dtype=int)
    X[0] = init
    for t in range(sample_size - 1):
        X[t+1] = P[X[t]].draw()
    return X

Notice here that we don't eliminate all for loops.

Outer loops don't need to be optimized: savings minimal
Eliminate inner loops, that would be repeated many times

As a general rule, don't over-optimize your code

Unnecessary
Less clarity

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. -- Donald Knuth

Stationary Distributions

A distribution q is called stationary if q = q p

Here q is a row vector, postmultiplied by the Markov matrix p
At least one such distribution exists
- See EDTC, theorem 4.3.5
There may be many
- If p is the identity then all q are stationary
- A condition for uniqueness: see EDTC, theorem 4.3.18
  - One sufficient condition: all elements of p are strictly positive

Stationary Distribution by Inversion

Let's suppose that p has a unique stationary distribution

To solve for it we can solve the linear system

for q, where I is the identity, B is the matrix of ones, b is the column vector of ones

See EDTC, section 4.3.1 for details

Problem:

Write a function which takes p as a parameter and returns the stationary distribution, assuming that it is unique

You can test it using the matrix

p = np.array([[.4, .6], [.2, .8]])

The (unique) stationary distribution is (0.25, 0.75)

Solution:

## Filename: mc_stationary1.py
## Author: John Stachurski

import numpy as np

def stationary1(p):
    """
    Parameters: 
    
        * p is a 2D NumPy array, assumed to be stationary

    Returns: A flat array giving the stationary distribution
    """
    N = len(p)                               # p is N x N
    I = np.identity(N)                       # Identity matrix
    B, b = np.ones((N, N)), np.ones((N, 1))  # Matrix and vector of ones
    A = np.transpose(I - p + B) 
    solution = np.linalg.solve(A, b)
    return solution.flatten()                # Return a flat array

Stationary Distribution by Monte Carlo

Problem with this method: when solving a linear system of equations like this one, the number of operations is O(N³), where N is the size of the state space

Slow or infeasible when N is large
- Must use other methods (e.g., iterative or simulation)

Let's discuss a simulation based method

If p is stable with stationary distribution q, then the LLN for Markov chains tells us that

with probability one (see EDTC, section 4.3.4 for details).

That is, fraction of time that X spends at y converges to q[y]
In NumPy terms, np.mean(X == y) is approximately equal to q[y]
- Here X is an array storing the time series

Note that for this LLN, the initial condition does not matter

Summary: To solve for q we can

Sample a time series and store it in X
compute np.mean(X == y) for each y

Problem:

Write a function to compute q this way, assuming stability of p

Use sample_path() (see above) to generate the sample path

Again, you can test it using the matrix

p = np.array([[.4, .6], [.2, .8]])

Results should agree with previous solution

Solution:

## Filename: mc_stationary2.py
## Author: John Stachurski

import numpy as np

# Import the sample_path function defined above
from mc_sample import sample_path

def stationary2(p, n=1000): 
    """ 
    Computes the stationary distribution via the LLN for stable
    Markov chains.

    Parameters: 

        * p is a 2D NumPy array
        * n is a positive integer giving the sample size

    Returns: An array containing the estimated stationary distribution
    """
    N = len(p)
    X = sample_path(p, sample_size=n)    # Sample path starting from X = 0
    q = [np.mean(X == y) for y in range(N)]
    return np.array(q)

A More Efficient Method

Let's suppose that p is stable as before, with unique stationary distribution q

The LLN for stable Markov chains can be used to show that

with probability one (see EDTC, section 6.1.4)

Moreover, this look-ahead estimator

is more efficient than the standard Monte Carlo estimator

introduced above because it uses p, which contains information about the dynamics

Problem:

Modify the function stationary2() above

Providing a keyword argument lae such that
- lae=True returns the look-ahead estimator instead
- lae=False returns the standard Monte Carlo estimator
- The default is False

Again, you can test it using the matrix

p = np.array([[.4, .6], [.2, .8]])

Results should agree with previous solutions

Solution:

## Filename: mc_stationary3.py
## Author: John Stachurski

import numpy as np

# Import the sample_path function defined above
from mc_sample import sample_path

def stationary3(p, n=1000, lae=False): 
    """ 
    Computes the stationary distribution via the LLN for stable
    Markov chains.

    Parameters: 

        * p is a 2D NumPy array
        * n is a positive integer giving the sample size
        * lae is a flag indicating whether to use the look-ahead method

    Returns: An array containing the estimated stationary distribution
    """
    N = len(p)
    X = sample_path(p, sample_size=n)    # Sample path starting from X = 0
    if lae:
        solution = [np.mean(p[X,y]) for y in range(N)]
    else:
        solution = [np.mean(X == y) for y in range(N)]
    return np.array(solution)

Testing

How accurate is the look-ahead estimator compared to standard Monte Carlo?

Let's generate some observations and compare average accuracy

The measure of accuracy will be the d1 distance

The matrix we will use is 2000 x 2000, stored in this file

Problem:

Download the data file
Load the matrix using loadtxt()
Solve for the exact stationary distribution q using the first (i.e. linear algebra) technique
Generate 100 observations of the Monte Carlo estimator (n = 1000) and calculate average d1 distance from q
Generate 100 observations of the look-ahead estimator (n = 1000) and calculate average d1 distance from q

Solution:

## Filename: mc_test.py
## Author: John Stachurski


import numpy as np
from mc_stationary1 import stationary1 
from mc_stationary3 import stationary3

p = np.loadtxt('matrix_dat.txt')


def test_estimator(q, f, replications=100):
    """
    The estimator f returns an estimate of q.  Draw n=replications 
    observations and return average d1 distance

    Parameters:

        * q is a NumPy array representing the exact distribution
        * f is a function that, when called, returns an estimate of q
            as a NumPy array
        * replications is a positive integer giving the sample size

    Returns: A float
    """
    results = np.empty(replications)
    for i in range(replications):
        results[i] = np.sum(np.abs(f() - q))
    return results.mean()


def main():

    q = stationary1(p)  # Exact stationary distribution

    print "Standard MC, average distance:"
    print test_estimator(q, lambda: stationary3(p))

    print "Look-ahead MC, average distance:"
    print test_estimator(q, lambda: stationary3(p, lae=True))

Comment:

The function test_estimator() is written to avoid duplicating code
- Duplicating code is naughty, train yourself to avoid it

Open Mind Tree

Computational economics lecture 20

Application: Finite Markov Chains

Definitions

Simulation

Stationary Distributions

Stationary Distribution by Inversion

Stationary Distribution by Monte Carlo

A More Efficient Method

Testing

0 comments:

Popular Posts

Visitors

Archives

Infolinks In Text Ads

Featured Posts

Blogger Tips