Basic Plotting with Matplotlib
Matplotlib is a module for generating plots and graphs in Python
- Probably the most popular plotting library for Python
- Written to mimic MATLAB plotting functionality
Follow the current download and installation instructions on the Matplotlib homepage
- Matplotlib depends on NumPy, which you might need to install separately
- If you are going to do that, consider installing SciPy as well
- SciPy builds on NumPy and adds a bunch of other stuff
We will learn more about NumPy and SciPy later
Basic Plots
Here's a simple plot
import pylab # import the Matplotlib module
X = [1, 2, 3]
Y = [1, 4, 9]
pylab.plot(X, Y)
pylab.show()
This opens a window showing the plot
The buttons at the bottom left of the window allow you to adjust the axes, save the figure (e.g., as a png file), etc.
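The figure can also be written to disk from code rather than through the window buttons. The following is a minimal sketch, not part of the original lecture: it assumes the non-interactive Agg backend (so it runs without a display), imports pylab through the matplotlib package, and uses a hypothetical file name, squares.png, in the system temporary directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no window is opened
from matplotlib import pylab  # exposes the same functions as "import pylab"

X = [1, 2, 3]
Y = [1, 4, 9]
pylab.plot(X, Y)

# Hypothetical output location; any writable path ending in .png would do
out_path = os.path.join(tempfile.gettempdir(), 'squares.png')
pylab.savefig(out_path)  # the file format is inferred from the extension
```

Calling pylab.savefig before pylab.show is the usual order, since the figure may be cleared once the window is closed.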
Adding another line to the same plot is straightforward:
import pylab
X = [1, 2, 3]
Y = [1, 4, 9]
Z = [4, 5, 6]
pylab.plot(X, Y)
pylab.plot(X, Z)
pylab.show()
Let's plot the cosine function
import pylab
X = pylab.linspace(-10, 10, 200) # A grid on [-10, 10] with 200 points
Y = pylab.cos(X) # cos(x) for all x in X
pylab.plot(X, Y)
pylab.show()
We can make it a red line if we prefer
import pylab
X = pylab.linspace(-10, 10, 200)
Y = pylab.cos(X)
pylab.plot(X, Y, 'r-')
pylab.show()
For a dashed red line use
pylab.plot(X, Y, 'r--')
For yellow dots use
pylab.plot(X, Y, 'yo')
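The third argument to plot is a short format string that combines a color letter with a line style or a marker. As a rough sketch (assuming the Agg backend so no window is needed, and importing pylab through the matplotlib package), the line objects returned by plot can be inspected to see how each string was interpreted:

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no window is opened
from matplotlib import pylab  # exposes the same functions as "import pylab"

X = pylab.linspace(-10, 10, 200)
Y = pylab.cos(X)

lines = {}
for fmt in ['r-', 'r--', 'yo']:
    # plot returns a list of line objects; here it holds exactly one
    line, = pylab.plot(X, Y, fmt)
    lines[fmt] = line
    print(fmt, '-> color:', line.get_color(),
          '| linestyle:', line.get_linestyle(),
          '| marker:', line.get_marker())
```

The first character picks the color ('r' for red, 'y' for yellow) and the rest picks either a line style ('-' solid, '--' dashed) or a marker ('o' for dots).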
We can add titles, axis labels and so on
import pylab
X = pylab.linspace(-10, 10, 200)
Y = pylab.cos(X)
pylab.plot(X, Y, 'yo')
pylab.xlabel('x values')
pylab.ylabel('y values')
pylab.title('Plot of the cosine function.')
pylab.show()
There are many other ways to customize and control the plots
See the user guide at the Matplotlib homepage.
Histograms
Here's a quick example of how to plot a histogram
import pylab
data = pylab.randn(500) # 500 draws from the standard normal distribution
pylab.hist(data, bins=40)
pylab.show()
Note that the y-axis gives frequency in the last plot
For a density use the density keyword (called normed in old Matplotlib versions)
pylab.hist(data, bins=40, density=True)
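To see what "density" means here, one can check that the bar areas sum to one. This is only an illustrative sketch: it assumes a Matplotlib version that accepts the density keyword, uses the Agg backend so no window is needed, and imports pylab through the matplotlib package.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend, no window is opened
from matplotlib import pylab  # exposes the same functions as "import pylab"

data = pylab.randn(500)  # 500 draws from the standard normal distribution

# hist returns the bar heights, the bin edges, and the drawn patches
heights, edges, patches = pylab.hist(data, bins=40, density=True)

widths = edges[1:] - edges[:-1]   # width of each of the 40 bins
area = (heights * widths).sum()   # total area under the histogram
print('Total area:', area)
```

With a density the heights are scaled so that the total area is 1 (up to rounding), which is what makes the histogram comparable to a probability density function.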
Exercises
This file contains daily quotes for the Nikkei 225 from Jan 1984 until May 2009, downloaded from Yahoo Finance
Here are the first few lines
Date,Open,High,Low,Close,Volume,Adj Close
2009-05-21,9280.35,9286.35,9189.92,9264.15,133200,9264.15
2009-05-20,9372.72,9399.40,9311.61,9344.64,143200,9344.64
2009-05-19,9172.56,9326.75,9166.97,9290.29,167000,9290.29
2009-05-18,9167.05,9167.82,8997.74,9038.69,147800,9038.69
2009-05-15,9150.21,9272.08,9140.90,9265.02,172000,9265.02
Data is comma separated (csv), with most recent date first
For our price data we will use the last column (Adj Close)
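To make the layout concrete, here is a small sketch of how a few of the lines shown above break apart with the string method split(). The sample text is copied from the listing, and the variable names are only illustrative.

```python
# Header plus three quotes, copied from the listing above
sample = """Date,Open,High,Low,Close,Volume,Adj Close
2009-05-21,9280.35,9286.35,9189.92,9264.15,133200,9264.15
2009-05-20,9372.72,9399.40,9311.61,9344.64,143200,9344.64
2009-05-19,9172.56,9326.75,9166.97,9290.29,167000,9290.29
"""

dates = []
prices = []
for line in sample.splitlines()[1:]:    # skip the header line
    fields = line.strip().split(',')    # one string per column
    dates.append(fields[0])             # first column: the date
    prices.append(float(fields[-1]))    # last column: Adj Close

# The file lists the most recent date first, so reverse to get earliest first
dates.reverse()
prices.reverse()
print(dates)
print(prices)
```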
Exercise 1:
Plot the data (i.e., the Adj Close column) as a time series
- Use the File I/O operations in this lecture to extract the data
  - You might like to use the string method split()
  - Note that there is a module called csv for working with csv files
  - But don't use it this time: I want you to practice basic file I/O
- Make sure your time series is from earliest (i.e., Jan 84) to latest (i.e., May 2009)
Exercise 2:
Write a function that
- takes a start year and an end year, and
- plots daily returns (as a percentage)
Daily return = [(today - yesterday) / yesterday] * 100
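As a quick check of the formula, here is a sketch that applies it to the four most recent Adj Close values from the listing above, ordered from earliest to latest. The function name daily_returns is only illustrative.

```python
def daily_returns(prices):
    """
    Percentage change from each price to the next.
    Assumes prices are ordered from earliest to latest.
    """
    return [100 * (today - yesterday) / yesterday
            for yesterday, today in zip(prices[:-1], prices[1:])]

# Adj Close for 2009-05-18, 2009-05-19, 2009-05-20, 2009-05-21
prices = [9038.69, 9290.29, 9344.64, 9264.15]
returns = daily_returns(prices)
print(returns)
```

Note that n prices give n - 1 returns, and that a fall in price (as on 2009-05-21) gives a negative return.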
Exercise 3:
Histogram the daily returns data
If you can, fit a normal density to the data and plot that too
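"Fitting" a normal density here can be taken to mean: compute the sample mean and standard deviation of the returns, then evaluate the normal density with those parameters. A minimal sketch of the density formula using only the standard library follows; the function name and the sample returns are hypothetical, not taken from the Nikkei data.

```python
import math

def normal_density(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at the point x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical daily returns, in percent
returns = [2.8, 0.6, -0.9, 1.2, -0.4]

mu = sum(returns) / len(returns)
sigma = math.sqrt(sum((r - mu) ** 2 for r in returns) / len(returns))

# Evaluate the fitted density at a few points
for x in [-1.0, 0.0, mu, 1.0]:
    print(x, normal_density(x, mu, sigma))
```

Plotting these density values over a grid spanning the range of the returns, on top of a histogram drawn as a density, gives the comparison the exercise asks for.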
Exercise 4:
Repeat Exercise 1, but using monthly data
- Extract first quote of each month and plot as a time series
- Note that the first observation is not necessarily on the first day of the month
- For example, the first day of the month might fall on a weekend
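One way to pick out the first quote of each month without relying on the day number is to watch for the year-month part of the date to change. The sketch below uses hypothetical dates and prices, ordered from earliest to latest, and an illustrative function name.

```python
def first_of_month(dates, prices):
    """
    Returns (date, price) pairs for the first observation in each month.
    Assumes dates are strings in the format YYYY-MM-DD, earliest first.
    """
    picked = []
    last_key = None
    for date, price in zip(dates, prices):
        key = date[:7]              # 'YYYY-MM'
        if key != last_key:         # a new month has started
            picked.append((date, price))
            last_key = key
    return picked

# Hypothetical data: no quote on 2009-03-01, so March starts on 2009-03-02
dates = ['2009-02-26', '2009-02-27', '2009-03-02', '2009-03-03', '2009-04-01']
prices = [100.0, 101.0, 99.0, 98.5, 102.0]

monthly = first_of_month(dates, prices)
print(monthly)
```

Comparing the year and month together, rather than the month alone, also behaves sensibly if the data ever skips a whole year.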
Solutions
Solution to Exercises 1--4
## Author: John Stachurski
## Filename: nikkei_plot.py
from __future__ import division
import pylab
# First let's create some functions
def percent_change(data):
    """
    Calculates change in percentages from one data point to the next,
    where data is an array of numbers.
    """
    changes = []
    for following, current in zip(data[1:], data[:-1]):
        changes.append(100 * (following - current) / current)
    return changes
def seriesplot(data):
    pylab.plot(data)
    pylab.show()
def returnsplot(start_year, end_year, data, dates):
    """
    Plots daily returns from start_year to end_year.
    Parameters: start_year and end_year are integers from 1984 to 2008. data
    is the price data as a list of floats, and dates is the corresponding list
    of dates. Each date is a string in the format YYYY-MM-DD.
    """
    plotvals = []
    for value, date in zip(data, dates):
        year = int(date.split('-')[0]) # extract the year
        if start_year <= year <= end_year:
            plotvals.append(value)
    seriesplot(percent_change(plotvals))
def densityplot(data):
    """
    Plots a histogram of daily returns from data, plus fitted normal density.
    """
    dailyreturns = percent_change(data)
    pylab.hist(dailyreturns, bins=200, density=True)
    m, M = min(dailyreturns), max(dailyreturns)
    mu = pylab.mean(dailyreturns)
    sigma = pylab.std(dailyreturns)
    grid = pylab.linspace(m, M, 100)
    # Normal density on the grid (pylab.normpdf is not in recent Matplotlib)
    densityvalues = pylab.exp(-(grid - mu)**2 / (2 * sigma**2)) / (sigma * pylab.sqrt(2 * pylab.pi))
    pylab.plot(grid, densityvalues, 'r-')
    pylab.show()
def monthly_returns(data, dates):
    plotdata = []
    # Append the first data entry for plotting
    plotdata.append(data[0])
    # Get the month corresponding to the first data entry
    month = dates[0].split('-')[1]
    for value, date in zip(data, dates):
        current_month = date.split('-')[1]
        if current_month != month:
            # A new month has started, so record its first quote
            plotdata.append(value)
            month = current_month
    seriesplot(plotdata)
# Now we are ready to read in the data and make the plots
infile = open("table.csv", 'r')
lines = infile.readlines()
infile.close()
del lines[0] # Remove the first line
lines.reverse() # Reverse order to start at earliest date
dates = []
values = []
for line in lines:
    elements = line.split(',')
    dates.append(elements[0])
    values.append(float(elements[-1]))
# Solutions to the exercises (Python 3 syntax: input() and print())
exercise_number = int(input("Enter the number of the exercise: "))
if exercise_number == 1:
    seriesplot(values)
elif exercise_number == 2:
    sy = int(input("Enter the start year: "))
    ey = int(input("Enter the end year: "))
    returnsplot(sy, ey, values, dates)
elif exercise_number == 3:
    densityplot(values)
elif exercise_number == 4:
    monthly_returns(values, dates)
else:
    print("Dude, there's no exercise number " + str(exercise_number))