Friday, December 23, 2011

Computational economics Lecture 9


Iterators are a uniform interface to stepping through elements in a collection
  • One of the (many) nice features of the Python language...
In this lecture we'll talk about using iterators
In a later lecture we'll learn how to build our own


First we define iterators and iterables


An iterator is an object with a next() method
For example, file objects (which we met in an earlier lecture) are iterators
Recall that we had a file test.txt with contents
Foo foo
Bar bar
Let's create a file object linked to this file
>>> f = open('test.txt', 'r')
This object has a next() method:
>>> f.next()
'Foo foo\n'
>>> f.next()
'Bar bar\n'
Calling f.next() is essentially the same as calling f.readline()
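Here is a minimal sketch of the next() protocol. To keep it self-contained it uses io.StringIO as an in-memory stand-in for test.txt (an assumption, not the file object used above). It also uses the built-in function next(), which calls the object's next() method in Python 2.6+ and its __next__() method in Python 3.

```python
import io

# In-memory stand-in for test.txt (assumption: same two lines as above)
f = io.StringIO(u'Foo foo\nBar bar\n')

first = next(f)     # same as f.next() in Python 2
second = next(f)

# A third call has nothing left to return, so StopIteration is raised
try:
    next(f)
    exhausted = False
except StopIteration:
    exhausted = True

print(repr(first))
print(repr(second))
print(exhausted)
```

Note that readline() differs at the end of the file: it returns an empty string rather than raising StopIteration.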
Other examples are
  • enumerate objects
>>> e = enumerate(['foo', 'bar'])
>>> e.next()
(0, 'foo')
>>> e.next()
(1, 'bar')
  • reader objects from the csv module (which is used to manipulate CSV files)
>>> from csv import reader
>>> nikkei_data = reader(open('table.csv'))  # The reader() function is passed a file object
>>> nikkei_data.next()
['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']
>>> nikkei_data.next()
['2008-05-19', '14294.52', '14343.19', '14219.08', '14269.61', '133800', '14269.61']
  • objects returned by urllib.urlopen()
>>> import urllib
>>> webpage = urllib.urlopen("")
>>> webpage.next()
'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN""' # etc
>>> webpage.next()
'<meta http-equiv="refresh" content="1800;url=?refresh=1">\n'
>>> webpage.next()
'<meta name="Description" content=" delivers the latest breaking news and information..' # etc


The built-in function iter() can be used for creating iterators from certain objects
An object is said to be iterable if it can be passed to iter()
A good example is a list:
>>> X = ['foo', 'bar']
>>> type(X)
<type 'list'>
>>> Y = iter(X)
>>> type(Y)
<type 'listiterator'>
>>> Y.next()
'foo'
>>> Y.next()
'bar'
>>> Y.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
Another example is a dictionary
>>> d = {'name': 'godzilla', 'height in meters': 10}
>>> d = iter(d)
>>> type(d)
<type 'dictionary-keyiterator'>
>>> d.next()
'height in meters'
The next() method steps through the keys of the dictionary
  • The keys are not ordered, so no notion of "first", "second", etc.
Incidentally, a dictionary d can give us iterators directly
  • d.iterkeys() returns an iterator equivalent to iter(d.keys()) or iter(d)
  • d.itervalues() returns an iterator equivalent to iter(d.values())
  • d.iteritems() returns an iterator equivalent to iter(d.items())
Of course, not all objects are iterable
>>> iter(42)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
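Since "iterable" just means "can be passed to iter()", we can test for it by trying the call and catching the TypeError. The following is a minimal sketch; the function name is_iterable is a hypothetical helper, not something built in to Python.

```python
def is_iterable(obj):
    """Return True if iter(obj) succeeds (hypothetical helper)."""
    try:
        iter(obj)
        return True
    except TypeError:
        return False

# A list, a dictionary and a string are iterable; an integer is not
results = [is_iterable(x) for x in (['foo', 'bar'], {'name': 'godzilla'}, 'abc', 42)]
print(results)
```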

Using Iterators

Let's look at some different ways we can use iterators

Iterators in For Loops

A very common use of iterators is in for loops
In fact this is how the for loop works!
for x in iterator:
    <code block>
This is what happens:
  • Interpreter calls iterator.next() and binds x to the result
  • Executes code block
  • Repeats until a StopIteration exception is raised
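The steps above can be written out by hand as a while loop. This is only a sketch of the idea, not the interpreter's actual implementation; manual_for is a hypothetical name, and the built-in next(iterator) stands in for iterator.next() so that the code runs on both Python 2.6+ and Python 3.

```python
def manual_for(iterable, action):
    """Mimic `for x in iterable: action(x)` (hypothetical helper)."""
    iterator = iter(iterable)      # returns the object itself if it is already an iterator
    while True:
        try:
            x = next(iterator)     # iterator.next() in Python 2
        except StopIteration:
            break                  # the loop ends quietly, no error is shown
        action(x)                  # the "code block"

collected = []
manual_for(['a', 'b'], collected.append)
print(collected)
```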
Recall that in an earlier lecture we introduced the syntax
f = open('somefile.txt')
for line in f:
    # do something
Now you know how it works:
  • f is bound to an iterator
    • A file object, which implements a next() method
  • Interpreter
    • Calls f.next() and binds line to the return value
    • Executes body of loop
    • Repeats until a StopIteration exception is raised
Another example
for i, x in enumerate(X):
    # do something
Again, enumerate(X) is an iterator
What about this example?
X = ['a', 'b']
for x in X:
    print x
Here X is a list (an iterable), not an iterator
Internally, Python calls iter(X) to make an iterator
More generally,
  • for loops work on either iterators or iterables
  • In the second case, the iterable is converted into an iterator
    • iter(iterable)
Here's another example
d = {'name': 'godzilla', 'height in meters': 10}
for key in d:
    # do something
Now you know how this works
Internally, the iterable d is passed to iter()
The resulting iterator steps through the keys of d

Iterators and built-ins

Some built-in functions that act on sequences also work with iterables
  • max(), min(), sum(), all(), any()
>>> X = [10, -10]
>>> max(X)
10
>>> Y = iter(X)
>>> type(Y)
<type 'listiterator'>
>>> max(Y)
10
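A short sketch showing the other built-ins from the list above acting on iterators. A fresh iterator is created for each call, and the variable names are only illustrative.

```python
X = [10, -10]

largest = max(iter(X))
smallest = min(iter(X))
total = sum(iter(X))
any_negative = any(x < 0 for x in iter(X))   # True if at least one element is negative
all_positive = all(x > 0 for x in iter(X))   # True only if every element is positive

print(largest)
print(smallest)
print(total)
print(any_negative)
print(all_positive)
```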

Use and reuse

A major difference between iterators and iterables such as lists is that iterators are depleted by use
>>> X = [10, -10]
>>> Y = iter(X)
>>> max(Y)
10
>>> max(Y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: max() arg is an empty sequence
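The same effect can be seen with list(), which consumes whatever is left in an iterator. A minimal sketch (variable names are illustrative):

```python
X = [10, -10]

Y = iter(X)
first_pass = list(Y)     # consumes the iterator
second_pass = list(Y)    # nothing left, so we get an empty list

# The list itself is not depleted: each call to iter(X) starts from the beginning
fresh = list(iter(X))

print(first_pass)
print(second_pass)
print(fresh)
```

If the data has to be traversed more than once, keep the underlying list (or another iterable) and create a new iterator for each pass.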

Application: Web Data

The application involves downloading data with the module urllib
URL stands for uniform resource locator
Some URLs have a query string
The query string is the part after (but not including) the ?
It is passed to the server as an argument
We can obtain stock price data from Yahoo Finance using query strings such as the one built in the script below
The query string is a collection of field/value pairs, separated by &
The meanings of the main fields are
  • a: start month, base zero (e.g., jan = 0, feb = 1, etc.)
  • b: start day
  • c: start year
  • d: end month, base zero
  • e: end day
  • f: end year
  • g: period (in this case, d = daily)
  • s: ticker symbol for the stock (in this case, Google)
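To see the field/value structure, here is a sketch that pulls a query string apart by hand. The URL is hypothetical (example.com is a placeholder, not the Yahoo Finance address), and the code ignores percent-escaping, so it is an illustration of the format rather than a robust parser.

```python
# Hypothetical URL, for illustration only; not the real Yahoo Finance address
url = 'http://example.com/table.csv?s=GOOG&a=00&b=01&c=2005&g=d'

# Everything after the first ? is the query string
address, query_string = url.split('?', 1)

# Pairs are separated by &, and each pair has the form field=value
fields = {}
for pair in query_string.split('&'):
    field, value = pair.split('=', 1)
    fields[field] = value

print(query_string)
print(fields['s'])
```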
Here is an example of usage
import urllib

base_url = ''

request_data = {'s': 'GOOG',          # Ticker symbol for Google
                'a': '00',            # Start month, base zero
                'b': '01',            # Start day
                'c': '2005',          # Start year
                'd': '05',            # End month, base zero
                'e': '03',            # End day
                'f': '2009',          # End year
                'g': 'd',             # Daily
                'ignore': '.csv'}     # Data type

encoded = urllib.urlencode(request_data)  # Formats the query string
response = urllib.urlopen(base_url + '?' + encoded)
After running this script, we can get successive lines of the data as follows
>>> response.next()
'Date,Open,High,Low,Close,Volume,Adj Close\n'
We see that Google's share price opened at 426.00 on the 3rd of June 2009, etc.
Note: If you have problems running this, your internet connection might be using a proxy server
Try googling for some help with urllib and proxy servers
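Write a program to print out the percentage change in value since the start of the year for all of the stocks listed in a portfolio file (the solution below reads portfolio.txt, with one "ticker, company name" pair per line)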
  • Change is from Jan 1st until the most recent price available
  • Use the last column (i.e., Adj Close) as the price
  • Stock prices should be downloaded at runtime from Yahoo Finance
  • If you can, print returns in order, from largest to smallest
    • Hint: use the sorted() function
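A sketch of the sorting step on its own, using made-up company names and returns (purely for illustration, not real data). sorted() accepts any iterable; key=itemgetter(1) sorts the (name, change) pairs by their second element, and reverse=True puts the largest first.

```python
from operator import itemgetter

# Made-up returns, purely for illustration
percent_change = {'Alpha': 2.5, 'Beta': -1.0, 'Gamma': 7.25}

ranked = sorted(percent_change.items(), key=itemgetter(1), reverse=True)

for name, change in ranked:
    print('%-12s %10.2f' % (name, change))
```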
A hint: if
line = '2009-06-01,418.73,429.60,418.53,426.56,3322400,426.56\n'
then line.split(',') returns the elements as a list of strings
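A sketch of the hint in action. The starting price of 400.0 is made up for illustration, and pct_change is a hypothetical helper name.

```python
line = '2009-06-01,418.73,429.60,418.53,426.56,3322400,426.56\n'

fields = line.split(',')
adj_close = float(fields[-1])    # float() ignores the trailing newline

def pct_change(old_price, new_price):
    """Percentage change from old_price to new_price (hypothetical helper)."""
    return 100.0 * (new_price - old_price) / old_price

# Made-up starting price of 400.0, purely for illustration
change = pct_change(400.0, adj_close)

print(fields[0])
print(adj_close)
print(round(change, 2))
```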


## Filename:
## Author: John Stachurski

from urllib import urlopen, urlencode
from datetime import date
from operator import itemgetter

# Record current day and month as strings, month is base zero
today = date.today()
mm = str(today.month - 1)
dd = str(today.day)

base_url = ''

request_data = {'a': '00',            # Start month, base zero
                'b': '01',            # Start day
                'c': str(today.year), # Start year (current year)
                'd': mm,              # End month, base zero
                'e': dd,              # End day
                'f': str(today.year), # End year (current year)
                'g': 'd',             # Daily
                'ignore': '.csv'}     # Data type

# Main loop

portfolio = open('portfolio.txt')  
percent_change = {}
for line in portfolio:
    ticker, company_name = [item.strip() for item in line.split(',')]
    request_data['s'] = ticker
    response = urlopen(base_url + '?' + urlencode(request_data))
    response.next()                                     # Skip the first line (column headers)
    prices = [row.split(',')[-1] for row in response]   # Adj Close column, as strings
    # The first row is the most recent price, the last row is the oldest
    old_price, new_price = float(prices[-1]), float(prices[0])
    percent_change[company_name] = 100 * (new_price - old_price) / old_price

items = percent_change.items()

for name, change in sorted(items, key=itemgetter(1), reverse=True):
    print '%-12s %10.2f' % (name, change)