Friday, December 23, 2011

Computational Economics Lecture 12

Basic Plotting with Matplotlib

Matplotlib is a module for generating plots and graphs in Python
  • Probably the most popular
  • Written to mimic MATLAB plotting functionality
Follow the current download and installation instructions
  • Might need to install NumPy separately
  • If you are going to do that, install SciPy
    • Installing SciPy also pulls in NumPy, plus a bunch of other stuff
We will learn more about NumPy and SciPy later

Basic Plots

Here's a simple plot
import pylab  # pylab is Matplotlib's MATLAB-style interface
X = [1, 2, 3]
Y = [1, 4, 9]
pylab.plot(X, Y)
pylab.show()  # needed outside interactive mode; later examples omit it
This opens a window with the following plot

The buttons at bottom left allow you to adjust axes, save, etc.
Here I've saved as a png file:
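The same thing can be done from code rather than with the save button. Here's a minimal sketch, assuming the non-interactive Agg backend (so no window is needed) and a hypothetical file name, simple_plot.png, written to the system's temporary directory:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window opens
import pylab

X = [1, 2, 3]
Y = [1, 4, 9]
pylab.plot(X, Y)

# Hypothetical output location; any writable path works
outfile = os.path.join(tempfile.gettempdir(), "simple_plot.png")
pylab.savefig(outfile)  # the .png extension selects the file format
pylab.close()           # release the figure once it is saved
```

Changing the extension (say, to .pdf) changes the output format.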

Adding another line to the same plot is straightforward:
import pylab     
X = [1, 2, 3]
Y = [1, 4, 9]
Z = [4, 5, 6]
pylab.plot(X, Y)   
pylab.plot(X, Z)

Let's plot the cosine function
import pylab  
X = pylab.linspace(-10, 10, 200)  # A grid on [-10, 10] with 200 points
Y = pylab.cos(X)                  # cos(x) for all x in X
pylab.plot(X, Y)

We can make it a red line if we prefer
import pylab  
X = pylab.linspace(-10, 10, 200)  
Y = pylab.cos(X)                 
pylab.plot(X, Y, 'r-')

For a dashed red line use pylab.plot(X, Y, 'r--')

For yellow dots use pylab.plot(X, Y, 'yo')

We can add titles, axis labels and so on
import pylab  
X = pylab.linspace(-10, 10, 200)  
Y = pylab.cos(X)                 
pylab.plot(X, Y, 'yo')
pylab.xlabel('x values')
pylab.ylabel('y values')
pylab.title('Plot of the cosine function.')

There are many other ways to customize and control the plots
See the user guide at the Matplotlib homepage.


Here's a quick example of how to plot a histogram
import pylab  
data = pylab.randn(500)    # 500 draws from the standard normal distribution
pylab.hist(data, bins=40)

Note that the y-axis gives frequency in the last plot
For a density use pylab.hist(data, bins=40, density=True) (older Matplotlib releases called this option normed=True)
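To see what the density option does to the bar heights, here's a rough sketch in plain Python (no Matplotlib needed). The bin-assignment rule is an assumption made for illustration, not Matplotlib's own code; the point is only the rescaling from counts to a density:

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible
data = [random.gauss(0.0, 1.0) for _ in range(500)]

bins = 40
lo, hi = min(data), max(data)
width = (hi - lo) / bins

# Frequency: number of observations falling in each bin
counts = [0] * bins
for x in data:
    i = min(int((x - lo) / width), bins - 1)  # the maximum goes in the last bin
    counts[i] += 1

# Density: divide by (sample size * bin width) so the bar areas sum to one
density = [c / (len(data) * width) for c in counts]
total_area = sum(d * width for d in density)
```

The counts add up to the sample size, while the density bars enclose a total area of one (up to rounding), which is what makes them comparable to a probability density.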


This file (table.csv in the solution below) contains daily quotes for the Nikkei 225 from Jan 1984 until May 2009, downloaded from Yahoo Finance
Here is the header line; the data rows follow in the same layout
Date,Open,High,Low,Close,Volume,Adj Close
Data is comma separated (csv), with most recent date first
For our price data we will use the last column (Adj Close)
Exercise 1:
Plot the data (i.e., the Adj Close column) as a time series
  • Use the File I/O operations in this lecture to extract the data
    • You might like to use the string method split()
    • Note that there is a module called csv for working with csv files
      • But don't use it this time: I want you to practice basic file I/O
  • Make sure your time series is from earliest (i.e., Jan 84) to latest (i.e., May 2009)
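As a hint for the split() approach, here's a small sketch on made-up lines. The two data rows and their values are hypothetical; only the column layout and the newest-first ordering come from the description above:

```python
# Hypothetical sample lines in the same layout as the file (values are made up)
lines = [
    "Date,Open,High,Low,Close,Volume,Adj Close\n",
    "2009-05-01,102.0,103.0,101.0,102.5,0,102.5\n",
    "2009-04-30,100.0,101.0,99.0,100.5,0,100.5\n",
]

rows = lines[1:]   # skip the header line
rows.reverse()     # the file is newest first, so reverse to get oldest first

dates, prices = [], []
for row in rows:
    fields = row.strip().split(',')
    dates.append(fields[0])           # first column: Date
    prices.append(float(fields[-1]))  # last column: Adj Close
```

With a real file, lines would come from readlines() instead of a hard-coded list.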
Exercise 2:
Write a function that
  • takes a start year and an end year, and
  • plots daily returns (as a percentage)
Daily return = [(today - yesterday) / yesterday] * 100
Exercise 3:
Histogram the daily returns data
If you can, fit a normal density to the data and plot that too
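"Fitting" a normal density here just means plugging the sample mean and standard deviation into the normal density formula. A minimal sketch using only the standard library; the returns list is made up, and normal_pdf is a hypothetical helper name:

```python
import math
import statistics

def normal_pdf(x, mu, sigma):
    """Density of the N(mu, sigma^2) distribution evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical daily returns in percent (made-up values for illustration)
returns = [0.5, -1.2, 0.3, 2.1, -0.7, 0.0, -0.4, 1.1]

mu = statistics.mean(returns)
sigma = statistics.pstdev(returns)  # population standard deviation

# Evaluate the fitted density on an evenly spaced grid over the data range
n = 100
lo, hi = min(returns), max(returns)
grid = [lo + i * (hi - lo) / (n - 1) for i in range(n)]
density = [normal_pdf(x, mu, sigma) for x in grid]
```

Plotting grid against density on top of a density-scaled histogram of the returns gives the picture the exercise asks for.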
Exercise 4:
Repeat Exercise 1, but using monthly data
  • Extract first quote of each month and plot as a time series
  • Note that first observation is not necessarily on the first day of month
    • first day of the month might be the weekend
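One way to pick out the first quote of each month without assuming it falls on the 1st is to watch for the YYYY-MM prefix of the date changing. A sketch on made-up data; first_of_each_month is a hypothetical name, and the dates are assumed to be YYYY-MM-DD strings sorted oldest first:

```python
def first_of_each_month(dates, prices):
    """Keep the first observation of every month; dates must be oldest first."""
    picked_dates, picked_prices = [], []
    previous = None
    for date, price in zip(dates, prices):
        year_month = date[:7]        # 'YYYY-MM' prefix of 'YYYY-MM-DD'
        if year_month != previous:   # first quote seen for a new month
            picked_dates.append(date)
            picked_prices.append(price)
            previous = year_month
    return picked_dates, picked_prices

# Hypothetical sample (made-up prices)
dates = ["2009-01-30", "2009-02-02", "2009-02-03", "2009-03-02"]
prices = [10.0, 11.0, 12.0, 13.0]
d, p = first_of_each_month(dates, prices)
```

Here February's first quote is on the 2nd, and it is still picked up because the comparison uses the month rather than the day.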


Solution to Exercises 1--4
## Author: John Stachurski
## Filename:

from __future__ import division
import pylab

# First let's create some functions 

def percent_change(data):
    """
    Calculates change in percentages from one data point to the next,
    where data is an array of numbers.
    """
    changes = []
    for nxt, current in zip(data[1:], data[:-1]):
        changes.append(100 * (nxt - current) / current)
    return changes

def seriesplot(data):
    """Plots data as a time series."""
    pylab.plot(data)
    pylab.show()

def returnsplot(start_year, end_year, data, dates):
    """
    Plots daily returns from start_year to end_year.
    Parameters: start_year and end_year are integers from 1984 to 2008.  data
    is the price data as a list of floats, and dates is the corresponding list
    of dates.  Each date is a string in the format YYYY-MM-DD.
    """
    plotvals = []
    for value, date in zip(data, dates):
        year = int(date.split('-')[0])  # extract the year
        if start_year <= year <= end_year:
            plotvals.append(value)
    pylab.plot(percent_change(plotvals))
    pylab.show()

def densityplot(data):
    """Plots a histogram of daily returns from data, plus fitted normal density."""
    dailyreturns = percent_change(data)
    pylab.hist(dailyreturns, bins=200, density=True)  # normed=True in older Matplotlib
    m, M = min(dailyreturns), max(dailyreturns)
    mu = pylab.mean(dailyreturns)
    sigma = pylab.std(dailyreturns)
    grid = pylab.linspace(m, M, 100)
    # Normal density N(mu, sigma^2) on the grid (pylab.normpdf was removed in Matplotlib 3.1)
    densityvalues = (pylab.exp(-(grid - mu)**2 / (2 * sigma**2))
                     / (sigma * pylab.sqrt(2 * pylab.pi)))
    pylab.plot(grid, densityvalues, 'r-')
    pylab.show()

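def monthly_returns(data, dates):
    """Plots the first quote of each month as a time series."""
    plotdata = []
    # Append the first data entry for plotting
    plotdata.append(data[0])
    # Get the month corresponding to the first data entry
    month = dates[0].split('-')[1]
    for value, date in zip(data, dates):
        current_month = date.split('-')[1]
        if current_month == month:
            pass  # Do nothing
        else:
            plotdata.append(value)  # first quote of a new month
            month = current_month
    pylab.plot(plotdata)
    pylab.show()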

#  Now we are ready to read in the data and make the plots

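infile = open("table.csv", 'r')
lines = infile.readlines()
infile.close()
del lines[0]     # Remove the first line
lines.reverse()  # Reverse order to start at earliest date

dates = []
values = []
for line in lines:
    elements = line.split(',')
    dates.append(elements[0])           # Date is the first column
    values.append(float(elements[-1]))  # Adj Close is the last column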

# Solutions to the exercises

exercise_number = int(input("Enter the number of the exercise: "))

if exercise_number == 1:
    seriesplot(values)
elif exercise_number == 2:
    sy = int(input("Enter the start year: "))
    ey = int(input("Enter the end year: "))
    returnsplot(sy, ey, values, dates)
elif exercise_number == 3:
    densityplot(values)
elif exercise_number == 4:
    monthly_returns(values, dates)
else:
    print("Dude, there's no exercise number " + str(exercise_number))