Wednesday, July 22, 2015

Generators in Python

I was trying to construct a two-dimensional list and ran into the following problem:

>>> l = list()
>>> l.append([] for i in range(5))
>>> l[0].append(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'generator' object has no attribute 'append'

Hmm, I thought I was creating an empty list and appending five empty lists to it. However:

>>> l
[<generator object <genexpr> at 0x102077fc0>]

Recalling a conversation with a friend the day before about generators, I decided to dig a little deeper.

It turns out I was using a generator expression: when a generator expression is the only argument to a function call, its own parentheses can be omitted. So instead of appending five empty lists, I appended a single generator object.
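
A minimal sketch of what happened, assuming Python 3. The names rows, gen and produced are only for illustration and are not from the original session:

```python
import types

rows = []
# A generator expression that is the sole argument of a call needs no
# parentheses of its own, so this appends ONE generator object,
# not five empty lists.
rows.append([] for i in range(5))

gen = rows[0]
print(len(rows))                              # 1
print(isinstance(gen, types.GeneratorType))   # True

# The five empty lists only come into existence when the generator is consumed.
produced = list(gen)
print(produced)                               # [[], [], [], [], []]
```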

So what is a generator function?
In Python, generator functions allow for more efficient memory use. Suppose we need the integers from 0 up to a very large number (in Python 3, integers have no maximum value), and we need to operate on each of them (e.g., print them out). We can either build a list and then operate on it, which keeps every element in memory at once, or we can generate each number and operate on it as we go. For example, we can generate a number, print it, and remember only the current number so that we can produce the next one. Take a look at the following class:

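class firstn(object):
    """Iterator that produces 0, 1, ..., n-1 without building a list."""

    def __init__(self, n):
        self.n = n      # upper bound (exclusive)
        self.num = 0    # next value to hand out

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    # Python 3 compatibility
    def __next__(self):
        return self.next()

    # Python 2 iterator protocol
    def next(self):
        if self.num < self.n:
            cur, self.num = self.num, self.num + 1
            return cur
        else:
            raise StopIteration()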

The class constructs an iterator, but it does not store every element in memory. In fact, it stores only the limit n and the next number to return. When operating on this iterator, all we need to do is take the current element, operate on it, and generate the next one. This saves a lot of memory.
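
As a rough illustration of the difference, here is a sketch that sums the same numbers lazily and from a full list. CountUpTo is a hypothetical Python 3 stand-in for the class above, and the value of n is arbitrary:

```python
import sys


class CountUpTo:
    """Hypothetical stand-in for the iterator class above (Python 3 only)."""

    def __init__(self, n):
        self.n = n
        self.num = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.num >= self.n:
            raise StopIteration
        cur = self.num
        self.num += 1
        return cur


n = 100000
total_lazy = sum(CountUpTo(n))   # values are produced one at a time
as_list = list(range(n))         # every value is held in memory at once
total_list = sum(as_list)

print(total_lazy == total_list)  # True
# getsizeof counts only the list's own pointer array, not the int objects,
# so the real footprint of the list is larger than this number.
print(sys.getsizeof(as_list))
```

The iterator keeps two integers no matter how large n gets, while the list grows with n.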

However, writing it this way is somewhat cumbersome. Python provides an easier way to do it, the yield keyword:

>>> def firstn(n):
...     num = 0
...     while num < n:
...         yield num
...         num += 1
... 
>>> firstn(100)
<generator object firstn at 0x10207e0d8>

Here firstn(n) is a generator function: calling it does not run the body but returns a generator object, which produces the values one at a time as it is iterated.
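
To see the pausing and resuming, one can step through the generator by hand with the built-in next(). This is only a sketch; the variable names are illustrative:

```python
def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1


gen = firstn(3)
first = next(gen)    # runs the body up to the first yield
second = next(gen)   # resumes after the yield, loops, and yields again
third = next(gen)
print(first, second, third)   # 0 1 2

try:
    next(gen)        # the while loop ends, so the generator is finished
    finished = False
except StopIteration:
    finished = True
print(finished)      # True
```

A for loop does the same thing behind the scenes and stops quietly when StopIteration is raised.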

We can even skip writing a function by using a generator expression:

>>> (n for n in range(5))
<generator object <genexpr> at 0x10207e048>
>>> [n for n in range(5)]
[0, 1, 2, 3, 4]

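As you can see, the only syntactic difference between a generator expression and a list comprehension is () versus []. And this is the reason I ran into the problem mentioned at the beginning of the post: the parentheses of the append() call doubled as the parentheses of a generator expression.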

However, even though generators have lots of perks, do remember that such an iterator (a generator is still an iterator) can only be iterated once. Because the elements are not saved in memory, they are gone once they have been consumed. You have to create a new generator (or call the generator function again) if you want to perform the operations again.
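
A short sketch of the single-use behaviour. The names squares and make_squares are made up for this example:

```python
squares = (n * n for n in range(4))

first_pass = list(squares)    # consumes the generator
second_pass = list(squares)   # nothing left to produce

print(first_pass)    # [0, 1, 4, 9]
print(second_pass)   # []


def make_squares():
    # Wrapping the expression in a function gives a fresh generator per call.
    return (n * n for n in range(4))


again = list(make_squares())
print(again)         # [0, 1, 4, 9]
```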

One last thing: how do we solve the problem from the beginning of the post?

Not this way, if that is what you were thinking:

>>> l = list()
>>> l.append([[] for i in range(5)])
>>> l
[[[], [], [], [], []]]

append() adds its argument (the list of lists you have just constructed) to l as a single element, which results in a list whose only element is a list of lists...

Either extend() or += should work:

>>> l = list()
>>> l.extend([[] for i in range(5)])
>>> l
[[], [], [], [], []]
>>> l += [[] for i in range(5)]
>>> l
[[], [], [], [], [], [], [], [], [], []]
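
If the list starts out empty anyway, another option is to build it directly with a list comprehension. The sketch below also shows a shortcut that looks similar but behaves differently; grid and shared are illustrative names:

```python
# Each pass through the comprehension creates a new, independent inner list.
grid = [[] for _ in range(5)]
grid[0].append(1)
print(grid)     # [[1], [], [], [], []]

# Multiplying a list repeats references to the SAME inner list,
# so a change made through one slot shows up in all of them.
shared = [[]] * 5
shared[0].append(1)
print(shared)   # [[1], [1], [1], [1], [1]]
```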

References:
[1] https://wiki.python.org/moin/Generators
[2] http://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do-in-python
[3] http://stackoverflow.com/questions/5164642/python-print-a-generator-expression
