9

While investigating another question, I found the following:

>>> class A:
...   def m(self): return 42
... 
>>> a = A()

This was expected:

>>> A.m == A.m
True
>>> a.m == a.m
True

But this I did not expect:

>>> a.m is a.m
False

And especially not this:

>>> A.m is A.m
False

Python seems to create a new object for each method access. Why am I seeing this behavior? That is, why can't it reuse one object per class and one per instance?

Krumelur

2 Answers

14

Yes, Python creates new method objects for each access, because it builds a wrapper object to pass in self. This is called a bound method.

Python uses descriptors to do this; function objects have a __get__ method that is called when the function is looked up as an attribute through a class or an instance:

>>> A.__dict__['m'].__get__(A(), A)
<bound method A.m of <__main__.A object at 0x10c29bc10>>
>>> A().m
<bound method A.m of <__main__.A object at 0x10c3af450>>

Note that Python cannot reuse A().m; Python is a highly dynamic language and the very act of accessing .m could trigger more code, which could alter what A().m returns the next time it is accessed.
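As a rough illustration of that last point, here is a minimal sketch in which merely accessing the attribute runs code that replaces it. The SwappingDescriptor class and the replacement function are hypothetical names invented for this example; nothing like them exists in the standard library.

```python
class SwappingDescriptor:
    """Hypothetical descriptor: every access rebinds 'm' on the owning class."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        # Side effect of the lookup itself: install a different function
        # under the same name, so the *next* access finds something else.
        def replacement(obj):
            return "replaced"

        setattr(owner, "m", replacement)
        # Still bind and return the original function for this access.
        return self.func.__get__(instance, owner)


class A:
    @SwappingDescriptor
    def m(self):
        return 42


a = A()
first = a.m()   # lookup goes through SwappingDescriptor.__get__
second = a.m()  # lookup now finds the plain replacement function
print(first, second)
```

Had Python cached the first bound method on the instance, the second call would silently keep using the old function.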

The @classmethod and @staticmethod decorators make use of this mechanism to return a method object bound to the class instead, and a plain unbound function, respectively:

>>> class Foo:
...     @classmethod
...     def bar(cls): pass
...     @staticmethod
...     def baz(): pass
... 
>>> Foo.__dict__['bar'].__get__(Foo(), Foo)
<bound method type.bar of <class '__main__.Foo'>>
>>> Foo.__dict__['baz'].__get__(Foo(), Foo)
<function Foo.baz at 0x10c2a1f80>
>>> Foo().bar
<bound method type.bar of <class '__main__.Foo'>>
>>> Foo().baz
<function Foo.baz at 0x10c2a1f80>

See the Python descriptor howto for more detail.

However, Python 3.7 adds a new LOAD_METHOD - CALL_METHOD opcode pair that replaces the LOAD_ATTR - CALL_FUNCTION opcode pair for method calls, precisely to avoid creating a new method object each time. This optimisation transforms the execution path for instance.foo() from type(instance).__dict__['foo'].__get__(instance, type(instance))() to type(instance).__dict__['foo'](instance), so 'manually' passing in the instance directly to the function object. The optimisation falls back to the normal attribute access path (including binding descriptors) if the attribute found is not a pure-Python function object.
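The two execution paths can be spelled out by hand to see that they produce the same result. This is only a sketch of the effect, not of how the interpreter implements the opcodes; the Greeter class is a made-up example, and the type(instance).__dict__ lookup assumes foo is defined directly on the class rather than inherited.

```python
class Greeter:
    def foo(self):
        return "hello"


instance = Greeter()

# The plain function as stored in the class namespace.
func = type(instance).__dict__["foo"]

# Regular path: bind through the descriptor protocol, then call.
via_descriptor = func.__get__(instance, type(instance))()

# Optimised path: skip the bound method and pass the instance directly.
direct = func(instance)

print(via_descriptor, direct)
```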

Martijn Pieters
  • I see, and this was actually what I was investigating. However, I would have expected it to at least reuse the object for each static call. – Krumelur Jan 08 '14 at 17:23
  • @Krumelur: Why would that be? Python is a dynamic language, the mere act of accessing `a.m` could have triggered code that *replaced* `m` for the next access. – Martijn Pieters Jan 08 '14 at 17:23
  • When put like this, it makes a lot of sense. But it went against my gut feeling. – Krumelur Jan 08 '14 at 17:37
7

Because that's the most convenient, least magical and most space efficient way of implementing bound methods.

In case you're not aware, "bound methods" refers to being able to do something like this:

f = obj.m
# ... in another place, at another time
f(args, but, not, self)

Functions are descriptors. Descriptors are general objects which can behave differently when accessed as an attribute of a class or object. They are used to implement property, classmethod, staticmethod, and several other things. The specific operation of function descriptors is that they return themselves for class access, and return a fresh bound method object for instance access. (Actually, this is only true for Python 3; Python 2 is more complicated in this regard, as it has "unbound methods" which are basically functions but not quite.)
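The behaviour described above can be checked directly. This is a small sketch that assumes Python 3 (under Python 2 the class-level access would produce an unbound method instead), using the same class A as in the question:

```python
class A:
    def m(self):
        return 42


a = A()
raw = A.__dict__["m"]  # the plain function stored on the class

# Class access: the function descriptor returns the function itself.
class_access_is_raw = A.m is raw

# Instance access: each lookup builds a new bound method object.
m1 = a.m
m2 = a.m
distinct_objects = m1 is not m2
compare_equal = m1 == m2  # same __self__ and same __func__

# A bound method just remembers the instance and the function.
wraps_same_function = m1.__func__ is raw
remembers_instance = m1.__self__ is a

# It can be stored and called later without passing self.
result = m1()
print(class_access_is_raw, distinct_objects, compare_equal, result)
```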

The reason a new object is created on each access is one of simplicity and efficiency: Creating a bound method up-front for every method of every instance takes time and space. Creating them on demand and never freeing them is a potential memory leak (although CPython does something similar for other built-in types) and slightly slower in some cases. Complicated weakref-based caching schemes for method objects aren't free either, and are significantly more complicated (historically, bound methods predate weakrefs by far).

  • This definitely makes sense, and I expected an answer along these lines. I would have expected that method calls needed to be really quick and object creation would be too slow. But then iterative method call performance is not one of Python's strengths. – Krumelur Jan 08 '14 at 17:30
  • And I specifically like the "least magical". I wish others would consider this solution more often :) – Krumelur Jan 08 '14 at 17:32
  • @Krumelur You're right, method calls are slower than in most languages, but only partially due to bound method objects. Both attribute lookup and ordinary function calls are already (comparatively) expensive, and in at least CPython and PyPy a "method cache" reduces the cost. –  Jan 08 '14 at 17:34