Python unintuitive member variable behaviour

This script:

class testa():
    a = []

class testb():
    def __init__(self):
        self.a = []

ta1 = testa(); ta1.a.append(1); ta2 = testa(); ta2.a.append(2)
tb1 = testb(); tb1.a.append(1); tb2 = testb(); tb2.a.append(2)

print(ta1.a, ta2.a, tb1.a, tb2.a)

produces this output:

[1, 2] [1, 2] [1] [2]

but I expected

[1] [2] [1] [2]

Why was I wrong? The definitions of testa and testb seem equivalent to me, so why should behavior change so drastically?!

EDIT: This seems unintuitive because it is different from how other types like int and str behave. For some reason lists are created as class variables when not initialized in init, but ints and strs are created as object variables no matter what.

In testa the variable a is a class variable and is shared between all instances. ta1.a and ta2.a refer to the same list.

In testb the variable a is an object variable. Each instance has its own value.

See Class and Object Variables for more details.

One is a class variable, the other is an instance variable.
Class variables are shared between all instances of the class; instance variables are unique to each instance.
http://docs.python.org/tutorial/classes.html

It helps to remember that the class statement in Python is much closer to any other block statement than is the case for languages like C++.

In Python, a class statement contains a bunch of statements to be executed, just like def, if, or while. The difference is just in what the interpreter does with the block. In the case of flow control block statements like if and while, the interpreter executes the block as specified by the meaning of the flow control statement. In a def, the interpreter saves the block and executes it whenever the function object is called.

In the case of a class block, Python executes the block immediately, in a new scope, and then uses whatever is left in that scope after execution finishes as the contents of the class.
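As a rough illustration of "executed immediately", here is a minimal sketch. The names Demo, log and x are made up for this example and are not from the question. The class body runs once, at definition time, before any instance exists:

```python
log = []  # hypothetical list used only to record the order of events

class Demo:
    # These statements run right now, while the class is being defined.
    log.append("class body running")
    x = 1 + 1

log.append("class defined")

# No instance has been created, yet the body has already run and the
# name x left over in its scope has become an attribute of the class.
print(log)     # ['class body running', 'class defined']
print(Demo.x)  # 2
```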

So for this:

class testa():
    a = []

Python executes the block a = []. Then at the end, the scope contains a bound to an empty list object. So that's what is in your class. It is not in any particular instance of the class; it is in the class itself.

It inherits a do-nothing constructor from object, so that when you instantiate the class with ta1 = testa(), you get an empty instance. Then when you ask for ta1.a, Python finds no instance variable named a (because ta1 has no instance variables at all), and so it looks for a in the class testa. This of course it finds; but it finds the same one for every instance of testa, and so you get the behaviour you observed.
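You can check this lookup order yourself. The sketch below reuses testa from the question and inspects it with the built-in vars(), which returns an object's attribute dictionary:

```python
class testa:
    a = []

ta1 = testa()
ta2 = testa()

print(vars(ta1))                  # {}  (ta1 has no attribute a of its own)
print("a" in vars(testa))         # True (the list is stored on the class)
print(ta1.a is ta2.a is testa.a)  # True (every lookup finds the same object)

ta1.a.append(1)
ta2.a.append(2)
print(testa.a)                    # [1, 2]
```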

On the other hand this:

class testb():
    def __init__(self):
        self.a = []

is completely different. Here once the class block has been executed the contents of the class scope is again a single name, but this time it's __init__ bound to a function. That becomes the contents of the class.

Now when you instantiate this class with testb(), Python finds __init__ in the class and calls that function with the new instance. The execution of that function creates an a instance variable in the new instance. So every instance of testb() gets its own variable, and you get the behaviour observed.
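The same kind of inspection on testb, again just a sketch using vars(), shows the list living in each instance rather than in the class:

```python
class testb:
    def __init__(self):
        self.a = []

tb1 = testb()
tb2 = testb()
tb1.a.append(1)
tb2.a.append(2)

print("a" in vars(testb))  # False (the class only holds __init__)
print(vars(tb1))           # {'a': [1]}
print(vars(tb2))           # {'a': [2]}
print(tb1.a is tb2.a)      # False (two separate lists)
```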

Take home message: a class block in Python is not just a set of declarations of things that are contained in instances of that class, unlike in traditional OO-ish languages like C++ and Java. It is actual code that is actually executed to define the contents of the class. This can be really handy: you can use if statements, the results of function execution, and anything you would use in any other context, inside your class body to decide what to define in your class.
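For instance, something along these lines works. DEBUG, Service, describe and banner are hypothetical names invented for this sketch:

```python
DEBUG = True  # hypothetical module-level flag

class Service:
    # An ordinary if statement decides which method the class ends up with.
    if DEBUG:
        def describe(self):
            return "debug build"
    else:
        def describe(self):
            return "release build"

    # The result of a call, computed once while the class body runs.
    banner = "service".upper()

print(Service().describe())  # debug build
print(Service.banner)        # SERVICE
```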

(NOTE: I lied for simplicity earlier. Even instances of testa will have some instance variables, because there are some that are automatically created by default for all instances, but you don't see them as much in day-to-day Python).

The definitions of testa and testb seem equivalent to me, so why should behavior change so drastically?!

Python's behaviour is completely logical - more so than that of Java or C++ or C#, in that everything that you declare within the class scope belongs to the class.

In Python, a class is logically a template for the objects, but it isn't really physically one. The closest you can get to specifying "instances have exactly this set of members" is to inherit from object, list them in the class's __slots__ and initialize them in __init__. However, Python won't force you to set all the values in __init__; the ones you don't set don't get any default value (attempting to access them results in an AttributeError), and you can even explicitly remove attributes from instances later (via del). All that __init__ does is run upon each new instance (which is a convenient and idiomatic way to set initial values for attributes). All that __slots__ does is prevent adding other attributes.
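A small sketch of those three points. Point, x, y and z are hypothetical names chosen for the example, and events is just a list used to record what happened:

```python
class Point(object):
    __slots__ = ("x", "y")

    def __init__(self, x):
        self.x = x  # y is deliberately left unset

events = []
p = Point(1)

try:
    p.y
except AttributeError:
    events.append("y was never set, so it has no value")

del p.x  # attributes can be removed again
try:
    p.x
except AttributeError:
    events.append("x is gone after del")

try:
    p.z = 3  # z is not listed in __slots__
except AttributeError:
    events.append("__slots__ prevents adding z")

for line in events:
    print(line)
```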

This seems unintuitive because it is different from how other types like int and str behave. For some reason lists are created as class variables when not initialized in init, but ints and strs are created as object variables no matter what.

This is not the case.

If you put an int or str there, it is still a member of the class. If you could somehow modify an int or str, then those changes would be reflected in every object, just as they are with the list, because looking up the attribute in the object would find it in the class. (Roughly speaking, attribute access checks the object first, then the class.)

However, you can't actually do that. What you think is an equivalent test isn't equivalent at all, because your test with an int or str causes an assignment to the object's attribute, which hides the class attribute. ta1.a.append calls a method of the list, which causes the list's internals to change. This is impossible for Python's strings and ints. Even if you write something like ta1.n += 1, this translates to ta1.n = ta1.n + 1, where the attribute is looked up (and found in the class), the sum calculated and then assigned back to ta1.n (assignment will always update the object, unless explicitly forbidden by the class's __slots__, in which case an exception is raised).
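Here is a sketch of the int case. Counter and n are hypothetical names for this example. The augmented assignment reads the class attribute but writes to the instance:

```python
class Counter:
    n = 0

c1 = Counter()
c2 = Counter()

c1.n += 1  # reads Counter.n (0), then assigns 1 to c1's own attribute

print(c1.n)       # 1
print(c2.n)       # 0
print(Counter.n)  # 0
print(vars(c1))   # {'n': 1}  (the instance attribute now hides the class one)
print(vars(c2))   # {}
```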

You can trivially demonstrate that the value belongs to the class by accessing it directly from the class, e.g. testa.a.