WTF is this?

or, When is is what you think is is?

Also, rabbit hole alert...


In [1]:
%%HTML
<img src="https://imgs.xkcd.com/comics/bun_alert.png" width=500></img>


The Problem


In [2]:
%%HTML
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Pay no mind.... <a href="https://t.co/mnIPHJXE1h">pic.twitter.com/mnIPHJXE1h</a></p>&mdash; David Beazley (@dabeaz) <a href="https://twitter.com/dabeaz/status/890634046958477312">July 27, 2017</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>



In [3]:
# let's reproduce it
class A():
    pass
A.__dict__ is A.__dict__


Out[3]:
False

In [4]:
# ... and more robustly...
a = A()
a.__class__.__dict__ is a.__class__.__dict__


Out[4]:
False

Our path...

The code in question involves class objects and instances, the is operator, and attribute access via the dot notation. Let's explore how those objects and operations work.

Out of scope:

  • we're not gonna talk about properties (by name)
  • we're not gonna talk about descriptors (by name)
  • we're not gonna talk about slots

...but you will run into these concepts if you investigate beyond this tutorial.

1) Python class construction


In [5]:
class B():
    pass

In [6]:
C = type('C',(),dict())

In [7]:
D = type('C',(),dict())

In [8]:
D


Out[8]:
__main__.C

Takeaways:

  • two forms of class definition
  • variables point to objects

Reminder:

  • all objects in Python 3 are instances of object, including objects that are class definitions
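A minimal sketch of the two takeaways. The names Point, PointToo and Alias are made up for illustration and are not part of the notebook. It shows that calling type with three arguments builds the same kind of object as a class statement, and that the name stored on a class is independent of the variable bound to it.

```python
# Form 1: the class statement
class Point:
    dims = 2

# Form 2: calling type(name, bases, namespace) directly
PointToo = type('Point', (), {'dims': 2})

# A variable is just a label; a second label does not create a second object
Alias = Point

same_object = Alias is Point                # True: two names, one object
distinct_classes = PointToo is not Point    # True: built separately
shared_name = Point.__name__ == PointToo.__name__ == 'Point'  # True

print(same_object, distinct_classes, shared_name)
```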

2) Python class comparison

How do these class definitions compare?


In [9]:
# Start with the equality operator (==)
# --> remember that this is normally defined by the ".__eq__()" method of the operand on the left
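To make the role of __eq__ concrete, here is a small sketch with two hypothetical classes (AlwaysEqual and Plain, not from the notebook). One overrides __eq__; the other inherits the default from object, which falls back to identity.

```python
class AlwaysEqual:
    def __eq__(self, other):
        # Deliberately silly: every comparison succeeds
        return True

class Plain:
    pass

x, y = AlwaysEqual(), AlwaysEqual()
custom_eq = (x == y)     # True: decided by AlwaysEqual.__eq__
identity = (x is y)      # False: still two separate objects

p, q = Plain(), Plain()
default_eq = (p == q)    # False: the default comparison is based on identity
self_eq = (p == p)       # True

print(custom_eq, identity, default_eq, self_eq)
```

Classes like B, C and D do not customize equality, which is why B == B is the only comparison above that can come out True.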

In [10]:
B == B


Out[10]:
True

In [11]:
B == C


Out[11]:
False

In [12]:
C == D


Out[12]:
False

In [15]:
B == D


Out[15]:
False

In [16]:
# check the mapping of the object's own attributes (more about this later)

vars(B)


Out[16]:
mappingproxy({'__dict__': <attribute '__dict__' of 'B' objects>,
              '__doc__': None,
              '__module__': '__main__',
              '__weakref__': <attribute '__weakref__' of 'B' objects>})

In [17]:
vars(B) == vars(B)


Out[17]:
True

In [18]:
vars(B) == vars(C)


Out[18]:
False

In [19]:
vars(C) == vars(D)


Out[19]:
False

In [20]:
# let's cast it to a real 'dict'
dict(vars(D))


Out[20]:
{'__dict__': <attribute '__dict__' of 'C' objects>,
 '__doc__': None,
 '__module__': '__main__',
 '__weakref__': <attribute '__weakref__' of 'C' objects>}

In [21]:
dict(vars(B)) == dict(vars(C))


Out[21]:
False

In [22]:
dict(vars(C)) == dict(vars(D))


Out[22]:
False

In [23]:
# check the directory of attributes (more about this later)

dir(B)


Out[23]:
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__']

In [24]:
dir(B) == dir(B)


Out[24]:
True

In [25]:
dir(B) == dir(C)


Out[25]:
True

In [26]:
dir(C) == dir(D)


Out[26]:
True

Takeaways:

  • Class definitions are objects with attributes
  • Class descriptions (vars, dir, etc.) compare equal when a class is compared with itself
  • For separately-constructed classes, only the lists of attribute names (dir) compare equal
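Why do the vars() results differ when the names match? A quick sketch, assuming two empty classes built with type (so the exact set of keys may vary between Python versions), that picks out the keys whose values are not equal:

```python
B = type('B', (), {})
C = type('C', (), {})

# dir() returns a sorted list of names, and the names are the same
same_names = dir(B) == dir(C)

# The values stored under some of those names are per-class objects
differing = sorted(k for k in vars(B) if vars(B)[k] != vars(C).get(k))

print(same_names)
print(differing)
```

In CPython the differing keys include '__dict__' and '__weakref__': each class gets its own attribute objects for these, and those objects compare by identity.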

3) Python class identity

What are these objects?


In [27]:
# instance and type

isinstance(B,type)


Out[27]:
True

In [28]:
isinstance(B,object)


Out[28]:
True

In [29]:
type(B)


Out[29]:
type

In [30]:
B.__class__


Out[30]:
type

In [31]:
B.__base__


Out[31]:
object

In [32]:
B.__bases__


Out[32]:
(object,)

In [33]:
id(B)


Out[33]:
140626250575432

In [34]:
# the 'is' operator tests object identity; for two objects that are alive at the same time,
# that is the same as comparing the results of the 'id' function

B is B


Out[34]:
True

In [35]:
id(B) == id(B)


Out[35]:
True

In [36]:
# now use B's callability to create an instance of it
b = B()

In [37]:
isinstance(b,B)


Out[37]:
True

In [38]:
type(b).__bases__


Out[38]:
(object,)

In [39]:
# FWIW
type(type)


Out[39]:
type

In [40]:
type.__bases__


Out[40]:
(object,)

Takeaways:

  • class objects are instances of the type 'type'
  • class objects are classes that inherit from 'object'

WTF?
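The circular-looking relationship between type and object can be checked directly. A small sketch that re-declares an empty B so it runs on its own:

```python
class B:
    pass

checks = {
    'B is an instance of type': isinstance(B, type),
    'B is a subclass of object': issubclass(B, object),
    'type is an instance of object': isinstance(type, object),
    'object is an instance of type': isinstance(object, type),
    'type is an instance of itself': isinstance(type, type),
    'type is a subclass of object': issubclass(type, object),
}

for label, result in checks.items():
    print('{} : {}'.format(label, result))
```

Every line prints True: "instance of" and "inherits from" are two separate relationships, and both type and object take part in each.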

4) Object attributes

In addition to various notions of identity, we also need to investigate attribute access.

Apart from the problem we're investigating, Python places a lot of importance on interfaces, in which an object is described and classified in terms of its function and attributes, rather than its identity or inheritance properties.


In [42]:
# set some attributes of some objects
setattr(b,'an_instance_attr',1)
setattr(B,'a_class_attr',2)
setattr(B,'a_class_method',lambda x: 3)

In [43]:
vars(b)


Out[43]:
{'an_instance_attr': 1}

In [44]:
b.__dict__


Out[44]:
{'an_instance_attr': 1}

In [45]:
vars(B)


Out[45]:
mappingproxy({'__dict__': <attribute '__dict__' of 'B' objects>,
              '__doc__': None,
              '__module__': '__main__',
              '__weakref__': <attribute '__weakref__' of 'B' objects>,
              'a_class_attr': 2,
              'a_class_method': <function __main__.<lambda>>})

Conclusion: __dict__ / vars() returns the attributes stored directly on an object, not the ones it inherits.

Let's iterate through b's inheritance tree, and look at the instance attributes.


In [46]:
vars(type)


Out[46]:
mappingproxy({'__abstractmethods__': <attribute '__abstractmethods__' of 'type' objects>,
              '__base__': <member '__base__' of 'type' objects>,
              '__bases__': <attribute '__bases__' of 'type' objects>,
              '__basicsize__': <member '__basicsize__' of 'type' objects>,
              '__call__': <slot wrapper '__call__' of 'type' objects>,
              '__delattr__': <slot wrapper '__delattr__' of 'type' objects>,
              '__dict__': <attribute '__dict__' of 'type' objects>,
              '__dictoffset__': <member '__dictoffset__' of 'type' objects>,
              '__dir__': <method '__dir__' of 'type' objects>,
              '__doc__': <attribute '__doc__' of 'type' objects>,
              '__flags__': <member '__flags__' of 'type' objects>,
              '__getattribute__': <slot wrapper '__getattribute__' of 'type' objects>,
              '__init__': <slot wrapper '__init__' of 'type' objects>,
              '__instancecheck__': <method '__instancecheck__' of 'type' objects>,
              '__itemsize__': <member '__itemsize__' of 'type' objects>,
              '__module__': <attribute '__module__' of 'type' objects>,
              '__mro__': <member '__mro__' of 'type' objects>,
              '__name__': <attribute '__name__' of 'type' objects>,
              '__new__': <function type.__new__>,
              '__prepare__': <method '__prepare__' of 'type' objects>,
              '__qualname__': <attribute '__qualname__' of 'type' objects>,
              '__repr__': <slot wrapper '__repr__' of 'type' objects>,
              '__setattr__': <slot wrapper '__setattr__' of 'type' objects>,
              '__sizeof__': <method '__sizeof__' of 'type' objects>,
              '__subclasscheck__': <method '__subclasscheck__' of 'type' objects>,
              '__subclasses__': <method '__subclasses__' of 'type' objects>,
              '__text_signature__': <attribute '__text_signature__' of 'type' objects>,
              '__weakrefoffset__': <member '__weakrefoffset__' of 'type' objects>,
              'mro': <method 'mro' of 'type' objects>})

In [47]:
vars(object)


Out[47]:
mappingproxy({'__class__': <attribute '__class__' of 'object' objects>,
              '__delattr__': <slot wrapper '__delattr__' of 'object' objects>,
              '__dir__': <method '__dir__' of 'object' objects>,
              '__doc__': 'The most base type',
              '__eq__': <slot wrapper '__eq__' of 'object' objects>,
              '__format__': <method '__format__' of 'object' objects>,
              '__ge__': <slot wrapper '__ge__' of 'object' objects>,
              '__getattribute__': <slot wrapper '__getattribute__' of 'object' objects>,
              '__gt__': <slot wrapper '__gt__' of 'object' objects>,
              '__hash__': <slot wrapper '__hash__' of 'object' objects>,
              '__init__': <slot wrapper '__init__' of 'object' objects>,
              '__le__': <slot wrapper '__le__' of 'object' objects>,
              '__lt__': <slot wrapper '__lt__' of 'object' objects>,
              '__ne__': <slot wrapper '__ne__' of 'object' objects>,
              '__new__': <function object.__new__>,
              '__reduce__': <method '__reduce__' of 'object' objects>,
              '__reduce_ex__': <method '__reduce_ex__' of 'object' objects>,
              '__repr__': <slot wrapper '__repr__' of 'object' objects>,
              '__setattr__': <slot wrapper '__setattr__' of 'object' objects>,
              '__sizeof__': <method '__sizeof__' of 'object' objects>,
              '__str__': <slot wrapper '__str__' of 'object' objects>,
              '__subclasshook__': <method '__subclasshook__' of 'object' objects>})

In [48]:
# collect all the instance attributes of the inheritance tree (don't include type)

attribute_keys = set( list(vars(b).keys()) + list(vars(B).keys()) + list(vars(object).keys()))

In [50]:
for attribute_key in attribute_keys:
    print('{} : {}'.format(attribute_key,getattr(b,attribute_key)))


__hash__ : <method-wrapper '__hash__' of B object at 0x1050980f0>
__lt__ : <method-wrapper '__lt__' of B object at 0x1050980f0>
__reduce_ex__ : <built-in method __reduce_ex__ of B object at 0x1050980f0>
__ne__ : <method-wrapper '__ne__' of B object at 0x1050980f0>
__dir__ : <built-in method __dir__ of B object at 0x1050980f0>
__ge__ : <method-wrapper '__ge__' of B object at 0x1050980f0>
__new__ : <built-in method __new__ of type object at 0x1034f45e0>
__gt__ : <method-wrapper '__gt__' of B object at 0x1050980f0>
a_class_attr : 2
__reduce__ : <built-in method __reduce__ of B object at 0x1050980f0>
__le__ : <method-wrapper '__le__' of B object at 0x1050980f0>
__class__ : <class '__main__.B'>
__doc__ : None
__getattribute__ : <method-wrapper '__getattribute__' of B object at 0x1050980f0>
__str__ : <method-wrapper '__str__' of B object at 0x1050980f0>
a_class_method : <bound method <lambda> of <__main__.B object at 0x1050980f0>>
__format__ : <built-in method __format__ of B object at 0x1050980f0>
__eq__ : <method-wrapper '__eq__' of B object at 0x1050980f0>
__sizeof__ : <built-in method __sizeof__ of B object at 0x1050980f0>
__repr__ : <method-wrapper '__repr__' of B object at 0x1050980f0>
__module__ : __main__
__subclasshook__ : <built-in method __subclasshook__ of type object at 0x7fe619b62e48>
__dict__ : {'an_instance_attr': 1}
__init__ : <method-wrapper '__init__' of B object at 0x1050980f0>
__weakref__ : None
__setattr__ : <method-wrapper '__setattr__' of B object at 0x1050980f0>
an_instance_attr : 1
__delattr__ : <method-wrapper '__delattr__' of B object at 0x1050980f0>

In [53]:
# our manual attributes collection should match that from 'dir'
attribute_keys - set(dir(b))


Out[53]:
set()

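NOTE: dir is not always reliable. It is meant as an interactive convenience, and an object can customize what it reports by defining __dir__.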
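One way dir can mislead, sketched with a hypothetical class Shy (not from the notebook) that overrides __dir__ to report only part of its namespace:

```python
class Shy:
    visible = 1
    hidden = 2

    def __dir__(self):
        # Report only a hand-picked subset of names
        return ['visible']

s = Shy()
listed = dir(s)            # ['visible']
still_there = s.hidden     # 2: the attribute exists even though dir() omits it

print(listed, still_there)
```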

Take-aways:

  • The __dict__ attribute lists the instance attributes of an object

5) Instance and class attributes


In [54]:
b.an_instance_attr


Out[54]:
1

In [55]:
B.an_instance_attr


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-55-862e618be79b> in <module>()
----> 1 B.an_instance_attr

AttributeError: type object 'B' has no attribute 'an_instance_attr'

In [56]:
B.a_class_attr


Out[56]:
2

In [57]:
b.a_class_attr


Out[57]:
2

In [58]:
b.a_class_method


Out[58]:
<bound method <lambda> of <__main__.B object at 0x1050980f0>>

In [59]:
b.a_class_method()


Out[59]:
3

Take-aways:

  • instance attributes do not affect the associated class attribute set
  • class attributes are available for lookup by an instance
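The two take-aways combine into shadowing: an instance attribute with the same name as a class attribute hides it for that one instance only. A minimal sketch with a hypothetical Config class:

```python
class Config:
    level = 'class default'

c1, c2 = Config(), Config()
c1.level = 'instance override'   # stored in c1.__dict__ only

c1_sees = c1.level           # 'instance override'
c2_sees = c2.level           # 'class default'
class_sees = Config.level    # 'class default'

del c1.level                 # removing the instance entry uncovers the class attribute
after_delete = c1.level      # 'class default'

print(c1_sees, c2_sees, class_sees, after_delete, sep=' | ')
```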

Out of scope:

  • how do instance attributes get added at construction?

6) Attribute access

The dot notation searches through the attributes of the instance, then the class, then through parent classes, to find an attribute of the requested name.

The method resolution order defines how complex inheritance structures are traversed.


In [60]:
B.mro()


Out[60]:
[__main__.B, object]

In [61]:
# Python's MRO uses the C3 linearization algorithm, which produces a consistent order under multiple inheritance
# https://en.wikipedia.org/wiki/C3_linearization

class X():
    a = 1
class Y():
    b = 2
class Z(X,Y):
    c = 3
Z.mro()


Out[61]:
[__main__.Z, __main__.X, __main__.Y, object]

In [62]:
Z.c


Out[62]:
3

In [63]:
Z.b


Out[63]:
2

In [64]:
Z.a


Out[64]:
1

In [65]:
# get an attribute defined only by the base class
Z.__repr__


Out[65]:
<slot wrapper '__repr__' of 'object' objects>

To locate the attribute named my_attr, Python:

  • searches the __dict__ attribute of the instance for the key my_attr
  • searches the __dict__ attributes of the classes in the MRO of the instance's type
  • if nothing was found, looks for a __getattr__ method on those classes and calls it with 'my_attr'
  • ...other things...
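The steps listed above can be sketched in plain Python. This is a deliberately simplified model, not the interpreter's real algorithm: simple_lookup is a made-up helper, and it ignores the "other things" (for instance, the real lookup consults certain class-level attributes before the instance __dict__, and it binds functions into methods).

```python
_MISSING = object()

def simple_lookup(obj, name):
    """Simplified attribute lookup; ignores the out-of-scope machinery."""
    # 1) the instance's own __dict__
    inst_dict = getattr(obj, '__dict__', {})
    if name in inst_dict:
        return inst_dict[name]
    # 2) the __dict__ of each class in the MRO, in order
    for klass in type(obj).__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    # 3) a __getattr__ fallback, if any class in the MRO defines one
    for klass in type(obj).__mro__:
        hook = vars(klass).get('__getattr__', _MISSING)
        if hook is not _MISSING:
            return hook(obj, name)
    raise AttributeError(name)

class Base:
    shared = 'from Base'
    def __getattr__(self, name):
        return 'fallback for ' + name

class Child(Base):
    pass

c = Child()
c.own = 'from instance'

found_own = simple_lookup(c, 'own')            # 'from instance'
found_shared = simple_lookup(c, 'shared')      # 'from Base'
found_fallback = simple_lookup(c, 'nope')      # 'fallback for nope'
print(found_own, found_shared, found_fallback, sep=' | ')
```

For these plain data attributes the sketch agrees with getattr(c, name).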

Take-aways:

  • the method resolution order manages the order and sources for object attribute lookup
  • attribute lookup is potentially complicated
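A diamond-shaped hierarchy shows why the order matters. In this sketch (hypothetical classes Top, Left, Right, Bottom), the shared ancestor is visited only after both of its subclasses, so the override on Right wins even though Left is listed first.

```python
class Top:
    who = 'Top'

class Left(Top):
    pass

class Right(Top):
    who = 'Right'

class Bottom(Left, Right):
    pass

order = [k.__name__ for k in Bottom.__mro__]
# ['Bottom', 'Left', 'Right', 'Top', 'object']
winner = Bottom.who    # 'Right'

print(order, winner)
```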

7) An optimization

Because attribute lookup is common and potentially complicated, the Python authors decided to enforce some simplifications to the process. Most important for our problem here: class-level attributes and methods must be referenced with strings.


In [66]:
# let's start with the instance-level attribute dictionary

b.__dict__['an_attr'] = 'value'
b.__dict__


Out[66]:
{'an_attr': 'value', 'an_instance_attr': 1}

In [67]:
# I don't know why anyone would want to do this, but we'll allow it at the level of instance objects. 
# Any hashable object can be a key in an ordinary dictionary.

b.__dict__[1] = [3,4]

In [93]:
vars(b)[1]


Out[93]:
[3, 4]

In [69]:
# what happens if we do the same to `b`'s class?

b.__class__.__dict__[1] = [3,4]


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-69-1a9fc67e7197> in <module>()
      1 # what happens if we do the same to `b`'s class?
      2 
----> 3 b.__class__.__dict__[1] = [3,4]

TypeError: 'mappingproxy' object does not support item assignment

In [70]:
# right, we've seen this "mappingproxy" before
b.__class__.__dict__


Out[70]:
mappingproxy({'__dict__': <attribute '__dict__' of 'B' objects>,
              '__doc__': None,
              '__module__': '__main__',
              '__weakref__': <attribute '__weakref__' of 'B' objects>,
              'a_class_attr': 2,
              'a_class_method': <function __main__.<lambda>>})

In [71]:
# also equivalent
B.__dict__


Out[71]:
mappingproxy({'__dict__': <attribute '__dict__' of 'B' objects>,
              '__doc__': None,
              '__module__': '__main__',
              '__weakref__': <attribute '__weakref__' of 'B' objects>,
              'a_class_attr': 2,
              'a_class_method': <function __main__.<lambda>>})

The MappingProxyType type is a read-only view of a mapping (dictionary). So we can't set class attributes via this attribute. This requires that attributes be set with setattr, which calls __setattr__.
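MappingProxyType is available from the types module, so the read-only behavior can be tried on an ordinary dictionary. A short sketch (backing and view are illustrative names) showing that writes through the view fail, while changes to the underlying mapping stay visible through it:

```python
from types import MappingProxyType

backing = {'a': 1}
view = MappingProxyType(backing)

try:
    view['b'] = 2
    write_failed = False
except TypeError:
    write_failed = True      # the view does not support item assignment

backing['b'] = 2                        # change the underlying dict...
view_sees_change = view['b'] == 2       # ...and the view reflects it

class B:
    pass

is_proxy = isinstance(vars(B), MappingProxyType)
setattr(B, 'a_class_attr', 2)           # the supported way to write to a class
class_view_updated = vars(B)['a_class_attr'] == 2

print(write_failed, view_sees_change, is_proxy, class_view_updated)
```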


In [72]:
# turns out, it's a method of 'object'
B.__setattr__


Out[72]:
<slot wrapper '__setattr__' of 'object' objects>

In [73]:
setattr(B,1,2)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-73-26a59c65562d> in <module>()
----> 1 setattr(B,1,2)

TypeError: attribute name must be string, not 'int'

Take-away:

  • class attributes are required to be referenced by strings: setattr rejects non-string names with a TypeError, which lets attribute lookup assume string keys.
  • the class-level attribute mapping is returned by a read-only mappingproxy object
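A sketch of what the string requirement means in practice. An integer key can be smuggled into an instance's __dict__, but neither getattr nor setattr will accept it as an attribute name. raises_type_error is a small helper written for this example.

```python
class B:
    pass

b = B()
b.__dict__[1] = [3, 4]       # allowed: an instance __dict__ is an ordinary dict
stored = vars(b)[1]          # [3, 4]

def raises_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False

getattr_rejects_int = raises_type_error(getattr, b, 1)
setattr_rejects_int_on_instance = raises_type_error(setattr, b, 1, 2)
setattr_rejects_int_on_class = raises_type_error(setattr, B, 1, 2)

print(stored, getattr_rejects_int,
      setattr_rejects_int_on_instance, setattr_rejects_int_on_class)
```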

8) Tying it together

Now we know why a class's __dict__ attribute returns a read-only mappingproxy object. Let's return to the Tweet and address the question of the mappingproxy object's identity.


In [74]:
%%HTML
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">Pay no mind.... <a href="https://t.co/mnIPHJXE1h">pic.twitter.com/mnIPHJXE1h</a></p>&mdash; David Beazley (@dabeaz) <a href="https://twitter.com/dabeaz/status/890634046958477312">July 27, 2017</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>



In [75]:
# the example
A.__dict__ is A.__dict__


Out[75]:
False

In [81]:
# run this a few times
id(A.__dict__)


Out[81]:
4379070184

Takeaway: a new mappingproxy object is created every time a class's __dict__ attribute is accessed. In A.__dict__ is A.__dict__, both proxies are alive at the moment of comparison, and since two live objects can't share the same memory address, this form of comparison will never be true. The reason that a new mappingproxy is created for each access is, unfortunately, out of scope.
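A sketch of the practical consequences, assuming CPython behavior (the fresh-proxy-per-access detail is an implementation matter, not a language guarantee). Compare class namespaces with ==, not is. Instance dictionaries behave differently from class ones.

```python
class A:
    pass

a = A()

fresh_each_time = A.__dict__ is not A.__dict__   # True in CPython: two proxies
equal_contents = A.__dict__ == A.__dict__        # True: same underlying mapping
instance_dict_stable = a.__dict__ is a.__dict__  # True: a real dict, stored on the instance

# A proxy kept in a variable stays a live view of the class namespace
cached = A.__dict__
A.extra = 1
cached_sees_update = cached['extra'] == 1        # True

print(fresh_each_time, equal_contents, instance_dict_stable, cached_sees_update)
```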

Bonus questions below:


In [86]:
# what about this?
id(A.__dict__) == id(A.__dict__)


Out[86]:
True

In [87]:
# or this?
x = id(A.__dict__)
y = id(A.__dict__)
x == y


Out[87]:
True

In [88]:
# or this?
x = A.__dict__
y = A.__dict__
id(x) == id(y)


Out[88]:
False

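Remember: the return value of the id builtin function "is an integer which is guaranteed to be unique and constant for this object during its lifetime." The documentation adds that two objects with non-overlapping lifetimes may have the same id() value, which is what can happen when a temporary mappingproxy is discarded before the next one is created.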
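The bonus questions come down to object lifetimes. A sketch, again assuming CPython: when both proxies are kept alive their ids must differ, but when each proxy is thrown away right after id() returns, the next one may land at the same address.

```python
class A:
    pass

# Both proxies are bound to names, so both are alive at the same time
x = A.__dict__
y = A.__dict__
ids_differ_while_alive = id(x) != id(y)    # True: two live objects

# Each temporary proxy is discarded as soon as id() returns
first_id = id(A.__dict__)
second_id = id(A.__dict__)
ids_may_repeat = first_id == second_id     # often True in CPython, never guaranteed

print(ids_differ_while_alive, ids_may_repeat)
```

This is why comparing id() values is only meaningful while the objects being compared are still referenced.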

