Generators
A generator is a kind of iterator (i.e., it implements a next() method)
We will study two ways to build generators
- Generator expressions
- Generator functions
Generator Expressions
The easiest way to build generators is using generator expressions
Just like a list comprehension, but with round brackets
Here is the list comprehension:
>>> singular = ('dog', 'cat', 'bird')
>>> type(singular)
<type 'tuple'>
>>> plural = [string + 's' for string in singular] # Creates a list
>>> plural
['dogs', 'cats', 'birds']
>>> type(plural)
<type 'list'>
Here is the generator expression:
>>> singular = ('dog', 'cat', 'bird')
>>> plural = (string + 's' for string in singular) # Creates a generator
>>> type(plural)
<type 'generator'>
>>> plural.next()
'dogs'
>>> plural.next()
'cats'
>>> plural.next()
'birds'
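One consequence worth noting is that a generator can be traversed only once, whereas a list can be reused. The following is a minimal sketch of that difference. It uses the built-in next(plural_gen), which behaves like plural_gen.next() in Python 2.6+ and also works in Python 3; the names plural_list and plural_gen are chosen here for illustration.

```python
singular = ('dog', 'cat', 'bird')

plural_list = [string + 's' for string in singular]  # all items built now
plural_gen = (string + 's' for string in singular)   # items built on demand

first = next(plural_gen)     # 'dogs'
rest = list(plural_gen)      # ['cats', 'birds']: consumes what is left
again = list(plural_gen)     # []: the generator is now exhausted

reused = list(plural_list)   # a list can be traversed as often as we like
```

To traverse the items a second time, build a new generator from the same expression.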
Since sum() can be called on iterators, we can do this
>>> sum((x * x for x in range(10)))
285
sum() calls next() to get the items, adds successive terms
In fact, we can omit the outer brackets in this case
>>> sum(x * x for x in range(10))
285
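To make the description of sum() concrete, here is a rough sketch of what it does with an iterator. The function name my_sum is hypothetical. The real built-in is written in C and accepts any iterable, so this only imitates its behaviour for an iterator of numbers.

```python
def my_sum(iterator):
    """Add up the items of an iterator by calling next() repeatedly."""
    total = 0
    while True:
        try:
            item = next(iterator)   # same as iterator.next() in Python 2
        except StopIteration:       # raised when no items are left
            return total
        total = total + item

result = my_sum(x * x for x in range(10))   # 285, as with sum()
```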
Generator Functions
The most flexible way to create generator objects
(Note that this section is technical, and you can probably get by without it)
Here's an example
Example 1
def f():
    yield 'start'
    yield 'middle'
    yield 'end'
f() is called a generator function
Looks like a function, uses new keyword yield
Let's see how it works
john@c246:~/sync_dir/teaching/kyoto_08$ python -i temp.py
>>> type(f) # f itself is a function
<type 'function'>
>>> gen = f() # Creates a generator object
>>> gen
<generator object at 0xb7cf31ac>
>>> gen.next()
'start'
>>> gen.next()
'middle'
>>> gen.next()
'end'
>>> gen.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>>
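The same behaviour can be observed in a script. The sketch below is a variant of f() that records its progress in a list (the names f_logged and log are made up for illustration). It suggests that the body does not start running when the generator is created, only when next() is first called, and that each later call resumes just after the previous yield.

```python
log = []

def f_logged():
    log.append('body started')
    yield 'start'
    log.append('resumed after start')
    yield 'middle'
    log.append('resumed after middle')
    yield 'end'

gen = f_logged()              # creates the generator; the body has not run yet
log_before = list(log)        # []

first = next(gen)             # runs the body up to the first yield
log_after_first = list(log)   # ['body started']

second = next(gen)            # resumes just after the first yield
log_after_second = list(log)  # ['body started', 'resumed after start']
```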
f() is used to create generator objects (in this case gen)
Generators are iterators, because they support a next() method
The first call to gen.next()
- Executes code in the body of f() until it meets a yield statement
- Returns that value to the caller of gen.next()
The second call to gen.next()
- Starts executing from the next line
def f():
    yield 'start'
    yield 'middle' # This line!
    yield 'end'
- Continues until the next yield statement
- Returns that value to the caller of gen.next()
- Etc.
When the body of f() finishes, the generator throws a StopIteration error
Example 2
Our next example receives an argument x from the caller
def g(x):
    while x < 100:
        yield x
        x = x * x
john@c246:~$ python -i test.py
>>> g
<function g at 0xb7d6b25c>
>>> gen = g(2) # Call generator function to make a generator
>>> type(gen) # gen is an object of type generator
<type 'generator'>
>>> gen.next() # Generators are iterators
2
>>> gen.next()
4
>>> gen.next()
16
>>> gen.next()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
>>>
gen = g(2) binds gen to a generator
Inside the generator, the name x is bound to 2
When we call gen.next()
- The body of g() executes until the line yield x
- The value of x is returned
Note that x is retained inside the generator
When we call gen.next() again, execution continues from where it left off
def g(x):
    while x < 100:
        yield x
        x = x * x # execution continues from here
Execution runs until the next yield x, returns the value of x, and repeats
When x < 100 fails, the generator throws a StopIteration error
Here's the generator used with for
gen = g(2)
for v in gen:
    print v
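Roughly speaking, the for loop above is shorthand for repeated calls to next() that stop quietly when StopIteration is raised. Here is a sketch of an equivalent while loop. It collects the items in a list (values, a name chosen here) instead of printing them, and uses the built-in next(gen), which matches gen.next() in Python 2.6+ and also works in Python 3.

```python
def g(x):
    while x < 100:
        yield x
        x = x * x

gen = g(2)
values = []
while True:
    try:
        v = next(gen)        # gen.next() in Python 2
    except StopIteration:    # a for loop catches this for us
        break
    values.append(v)

same = [v for v in g(2)]     # the for-style version yields the same items
```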
A generator can also produce an unending sequence
def g(x):
    while 1:
        yield x
        x = x * x
>>> gen = g(3)
>>> gen.next()
3
>>> gen.next()
9
>>> gen.next()
81
>>> gen.next()
6561
>>> gen.next()
43046721
>>> gen.next()
1853020188851841L
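Because this generator never terminates, it should not be passed to list() or used in a for loop without a break. One way to take a finite number of items is itertools.islice from the standard library. A short sketch:

```python
from itertools import islice

def g(x):
    while True:
        yield x
        x = x * x

# Take the first 5 items, then stop asking the generator for more
first_five = list(islice(g(3), 5))
```

Here first_five holds 3, 9, 81, 6561 and 43046721, matching the first five values in the session above.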
Advantages of Iterators
What's the advantage of using an iterator here?
Suppose we want to sample a binomial(n,0.5)
One way to do it is as follows
>>> import random
>>> n = 10000000
>>> draws = [random.uniform(0, 1) < 0.5 for i in range(n)]
>>> sum(draws)
But we are creating two huge lists here, range(n) and draws
If I make n even bigger then my computer refuses to allocate the memory
>>> n = 1000000000
>>> draws = [random.uniform(0, 1) < 0.5 for i in range(n)]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError
Here is the generator function:
import random

def f(n):
    i = 1
    while i <= n:
        yield random.uniform(0, 1) < 0.5
        i += 1
john@c246:~/sync_dir/teaching/kyoto_08$ python -i temp.py
>>> n = 10000000
>>> draws = f(n)
>>> draws
<generator object at 0xb7d8b2cc>
>>> sum(draws)
4999141
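The same saving is available from a generator expression. The sketch below assumes a smaller n so that it runs quickly, and the name heads is chosen here. Note that in Python 2 range(n) still builds a list, so xrange(n) would be the memory-friendly choice there, while in Python 3 range(n) is already lazy.

```python
import random

n = 100000
# No list of draws is ever stored: sum() consumes one draw at a time
heads = sum(random.uniform(0, 1) < 0.5 for i in range(n))
```

The value of heads varies from run to run, but it always lies between 0 and n.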
- Iterables avoid the need to create big lists/tuples
- Provide a uniform interface to iteration
- Can be used transparently in for loops
Exercises
Exercise 1
Write a generator which yields a time series for the quadratic map x_{t+1} = 4 (1 - x_t) x_t
Inputs to the generator are x0 and n, the length of the series
Plot a series with Matplotlib
Exercise 2
Complete the following code, and test it using this file
def column_iterator(target_file, column_number):
    """A generator function for CSV files.
    When called with a file name target_file (string) and column number
    column_number (integer), the generator function returns a generator
    which steps through the elements of column column_number in file
    target_file.
    """
    # put your code here

dates = column_iterator('table.csv', 1)
for date in dates:
    print date
Solutions
Solution to Exercise 1:
## Filename: quadmap.py
## Author: John Stachurski
import pylab

def qm(x, n):
    i = 0
    while i < n:
        yield x
        x = 4 * (1 - x) * x
        i += 1

h = qm(0.1, 200)
time_series = [x for x in h]
pylab.plot(time_series)
pylab.show()
Solution to Exercise 2:
def column_iterator(target_file, column_number):
    """A generator function for CSV files.
    When called with a file name target_file (string) and column number
    column_number (integer), the generator function returns a generator
    which steps through the elements of column column_number in file
    target_file.
    """
    f = open(target_file, 'r')
    for line in f:
        # Columns are numbered from 1, list indices from 0
        yield line.split(',')[column_number - 1]
    f.close()

dates = column_iterator('table.csv', 1)
for date in dates:
    print date