Closure Problem in Python

I’ve now seen several posts from people complaining about the following Python code:

>>> funcs = []
>>> for i in range(11):
...     def func(x):
...         return x + i
...     funcs.append(func)
...
>>> funcs[1](5)
15

Most people looking at this code for the first time would expect funcs[1](5) to be 6, which it clearly is not. I’ve found many confusing or wrong explanations for why this is so, so I wanted to clarify my own understanding of the issue and hopefully provide simple, clear reasoning for others.

The surprise is the result of two non-obvious Python features:

For loops do not create their own isolated scope

This is clearly demonstrated with:

>>> for i in range(10):
...     j = 5
...
>>> j
5
>>> i
9

Here you can see that not only does the loop variable i survive after the for loop, but j does as well. This seems like a somewhat questionable design decision to me, but perhaps that’s because I’m primarily a C developer.
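For contrast, a function body does get its own scope, even though a for loop inside it does not add another one. A minimal sketch; the names loop_in_function, k and last_seen are made up for this illustration:

```python
def loop_in_function():
    # The loop runs inside the function's local scope.
    for k in range(3):
        last_seen = k
    # Both names are still visible here: the loop added no scope of its own.
    return k, last_seen

result = loop_in_function()
print(result)  # (2, 2)

# The function body is a real scope: k never leaked out to module level.
leaked = "k" in globals()
print(leaked)  # False
```

So names bound in a loop live in whatever scope encloses the loop, whether that is a function or the module.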

Expressions within the body of a function are evaluated when the function is called, not when it is defined

This is also clearly demonstrated with a simple bit of code:

>>> def func(x):
...     return x + i
...
>>> func(5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in func
NameError: global name 'i' is not defined
>>> i = 3
>>> func(5)
8
>>> i = 4
>>> func(5)
9

So you can see that even though i didn’t exist when the function was defined, the interpreter happily allowed us to use it in the body of the function, and only attempted to evaluate it when the function was actually called.

Putting these two things together, it’s clear why Python evaluates our first code example the way it does. The name i inside the body of the function has nothing to do with the loop variable i until the moment one of the functions in the list is actually called, at which point Python looks up the name i and finds the one that has ‘leaked out’ of the for loop. By then the loop has finished, so i is 10 and every function in the list returns x + 10.
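One way to convince yourself of this is to rebind i after the loop has finished and call the stored functions again. A minimal sketch, assuming the code runs at module level (as in the interactive session above), so that i is an ordinary global; the names before and after are just for illustration:

```python
funcs = []
for i in range(11):
    def func(x):
        return x + i  # i is looked up when func is called, not here
    funcs.append(func)

before = [f(5) for f in funcs[:3]]
print(before)  # [15, 15, 15]: every function sees i == 10

i = 100  # rebind the leaked loop variable
after = [f(5) for f in funcs[:3]]
print(after)  # [105, 105, 105]
```

All of the functions change their answer together, because they all share the single name i and none of them holds a value of its own.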

Solutions to this issue usually involve forcing evaluation of the variable so that the function holds the value instead of looking up the name. My favorite version adds the variable as the default value of an extra parameter, since default values are evaluated when the function is defined. For example:

>>> funcs = []
>>> for i in range(11):
...     def func(x, inc=i):
...         return x + inc
...     funcs.append(func)
...
>>> funcs[1](5)
6

In this version, i is evaluated and its value stored as the default for inc each time func is defined.
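The default-argument trick is not the only way to capture the value. Two other common approaches are sketched here: a factory function that gives each function its own scope, and functools.partial from the standard library, which binds the value when the partial object is created. The names make_adder, add, factory_funcs and partial_funcs are invented for this example.

```python
from functools import partial

def make_adder(inc):
    # Each call to make_adder creates a new scope with its own inc.
    def func(x):
        return x + inc
    return func

def add(x, inc):
    return x + inc

factory_funcs = [make_adder(i) for i in range(11)]
# partial stores the current value of i as the keyword argument inc.
partial_funcs = [partial(add, inc=i) for i in range(11)]

print(factory_funcs[1](5))  # 6
print(partial_funcs[1](5))  # 6
```

One possible reason to prefer the factory is that it adds no extra parameter for a caller to override by accident. With the default-argument version, funcs[1](5, 100) returns 105.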
