Introduction to Python and Natural Language Technologies

Lecture 03, Week 04

Object oriented programming

27 September 2017

Introduction

  • Python has been object oriented since its first version
  • basically everything is an object including
    • class definitions
    • functions
    • modules
  • PEP8 defines style guidelines for classes as well

Defining classes

  • class keyword
  • instance explicitly bound to the first parameter of each method
    • named self by convention
  • __init__ is called after the instance is created
    • not exactly a constructor because the instance already exists
    • not mandatory

In [1]:
class ClassWithInit:
    def __init__(self):
        pass
    
class ClassWithoutInit:
    pass

Class attributes

  1. data attributes: these correspond to data members in C++
  2. methods: these correspond to member functions in C++

Both

  • are created upon assignment
  • can be assigned anywhere (not just in __init__)

In [2]:
class A:
    def __init__(self):
        self.attr1 = 42
        
    def method(self):
        self.attr2 = 43
        
a = A()
print(a.attr1)
# print(a.attr2)  # raises AttributeError
a.method()
print(a.attr2)


42
43

Attributes can be added to instances


In [3]:
a.attr3 = 11
print(a.attr3)


11

this will not affect other instances


In [4]:
a2 = A()
# a2.attr3  # raises AttributeError

__init__ may have arguments


In [5]:
class InitWithArguments:
    def __init__(self, value, value_with_default=42):
        self.attr = value
        self.solution_of_the_world = value_with_default
        
class InitWithVariableNumberOfArguments:
    def __init__(self, *args, **kwargs):
        self.val1 = args[0]
        self.val2 = kwargs.get('important_param', 42)

In [6]:
obj1 = InitWithArguments(41)
obj2 = InitWithVariableNumberOfArguments(1, 2, 3, param4="apple", important_param=23)
print(obj1.attr, obj1.solution_of_the_world, 
      obj2.val1, obj2.val2)


41 42 1 23

Method attributes

  • functions inside the class definition
  • explicitly take the instance as first parameter

In [7]:
class A:
    def foo(self):
        print("foo called")
    
    def bar(self, param):
        print("bar called with parameter {}".format(param))

Calling methods

  1. instance.method(param)
  2. class.method(instance, param)

In [8]:
c = A()
c.foo()
c.bar(42)
A.foo(c)
A.bar(c, 43)


foo called
bar called with parameter 42
foo called
bar called with parameter 43

Special attributes

  • every object has a number of special attributes
  • double underscore or dunder notation: __attribute__
  • automatically created
  • advanced OOP features are implemented using these

In [9]:
', '.join(A.__dict__)


Out[9]:
'__module__, foo, bar, __dict__, __weakref__, __doc__'

Data hiding with name mangling

  • by default every attribute is public
  • private attributes can be defined through name mangling
    • inside a class definition, every attribute name with at least two leading underscores and at most one trailing underscore is replaced with a mangled name
    • emulates private behavior
    • mangled name: _classname__attrname

In [10]:
class A:
    def __init__(self):
        self.__private_attr = 42
        
    def foo(self):
        self.__private_attr += 1
        
a = A()
a.foo()
# print(a.__private_attr)  # raises AttributeError
a.__dict__
print(a._A__private_attr)  # name mangled
a.__dict__


43
Out[10]:
{'_A__private_attr': 43}
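
Besides hiding, mangling keeps a base class and its subclasses from accidentally overwriting each other's attributes. A minimal sketch of this, using the hypothetical classes Base and Child (not part of the lecture code):

```python
class Base:
    def __init__(self):
        self.__token = "base"      # stored as _Base__token

    def base_token(self):
        return self.__token        # resolved as self._Base__token


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"     # stored as _Child__token, no clash

    def child_token(self):
        return self.__token        # resolved as self._Child__token


obj = Child()
print(sorted(obj.__dict__))
print(obj.base_token(), obj.child_token())
```

Both attributes live side by side in the instance's namespace because each one is mangled with the name of the class in which it was written.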

Class attributes

  • class attributes are class-global attributes
  • roughly the same as static attributes in C++

In [11]:
class A:
    class_attr = 42

Accessing class attributes via instances


In [12]:
a1 = A()
a1.class_attr


Out[12]:
42

Accessing class attributes via the class object


In [13]:
A.class_attr


Out[13]:
42

Setting the class attribute via the class


In [14]:
a1 = A()
a2 = A()

print(a1.class_attr, a2.class_attr)
A.class_attr = 43
a1.class_attr,  a2.class_attr


42 42
Out[14]:
(43, 43)

Cannot set via an instance


In [15]:
a1 = A()
a2 = A()
a1.class_attr = 11
a2.class_attr


Out[15]:
43

because this assignment creates a new attribute in the instance's namespace.


In [16]:
a1.__dict__


Out[16]:
{'class_attr': 11}

Each object has a __class__ magic attribute that refers to its class object. We can use this to access the class attribute even when it is shadowed:


In [17]:
a1.__class__.class_attr


Out[17]:
43

a2 has not shadowed class_attr, so we can access it through the instance


In [18]:
a2.__dict__, a2.class_attr


Out[18]:
({}, 43)
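
Shadowing can be undone by deleting the instance attribute, after which lookup falls back to the class again. A small sketch, assuming a hypothetical class Counter with a class attribute limit:

```python
class Counter:
    limit = 10                 # class attribute


c = Counter()
c.limit = 3                    # creates an instance attribute that shadows the class attribute
shadowed = c.limit
del c.limit                    # removes only the instance attribute
restored = c.limit             # lookup falls back to the class again
print(shadowed, restored, Counter.limit)
```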

Inheritance

  • Python supports inheritance and multiple inheritance

In [19]:
class A:
    pass

class B(A):
    pass

a = A()
b = B()
print(isinstance(a, B))
print(isinstance(b, A))
print(issubclass(B, A))
print(issubclass(A, B))


False
True
True
False

New style vs. old style classes

Python 2

  • Python 2.2 introduced a new inheritance mechanism
  • new style classes vs. old style classes
  • a class is new style if it subclasses object or one of its ancestors subclasses object
  • wide range of previously unavailable functionality
  • old style classes are the default in Python 2

Python 3

  • only supports new style classes
  • every class implicitly subclasses object

The differences between old style and new style classes are listed here: https://wiki.python.org/moin/NewClassVsClassicClass


In [20]:
%%python2

class OldStyleClass:
    pass

class NewStyleClass(object):
    pass

class ThisIsAlsoNewStyleClass(NewStyleClass):
    pass

Python 3 implicitly subclasses object


In [21]:
class A: pass
class B(object): pass

print(issubclass(A, object))
print(issubclass(B, object))


True
True

Method inheritance

Methods are inherited and overridden in the usual way


In [22]:
class A(object):
    def foo(self):
        print("A.foo was called")
        
    def bar(self):
        print("A.bar was called")
        
class B(A):
    def foo(self):
        print("B.foo was called")
        
b = B()
b.foo()
b.bar()


B.foo was called
A.bar was called

Since data attributes can be created anywhere, an instance of a subclass only has them if the base class's method that creates them is actually called.


In [23]:
class A(object):
    
    def foo(self):
        self.value = 42
        
class B(A):
    pass

b = B()
print(b.__dict__)
a = A()
print(a.__dict__)
a.foo()
print(a.__dict__)


{}
{}
{'value': 42}

Calling the base class's constructor

  • since __init__ is not a constructor, the base class's __init__ is not called automatically if the subclass overrides it

In [24]:
class A(object):
    def __init__(self):
        print("A.__init__ called")        
class B(A):
    def __init__(self):
        print("B.__init__ called")        
class C(A): pass
        
b = B()
c = C()


B.__init__ called
A.__init__ called

The base class's methods can be called in at least two ways:

  1. explicitly via the class name
  2. using the super function

In [25]:
class A(object):
    def __init__(self):
        print("A.__init__ called")
        
        
class B(A):
    def __init__(self):
        A.__init__(self)
        print("B.__init__ called")
        
class C(B):
    def __init__(self):
        super().__init__()
        print("C.__init__ called")
        
print("Instantiating B")
b = B()
print("Instantiating C")
c = C()


Instantiating B
A.__init__ called
B.__init__ called
Instantiating C
A.__init__ called
B.__init__ called
C.__init__ called

super's usage was more complicated in Python 2


In [26]:
%%python2

class A(object):
    def __init__(self):
        print("A.__init__ called")
        
        
class B(A):
    def __init__(self):
        A.__init__(self)
        print("B.__init__ called")
        
class C(A):
    def __init__(self):
        super(C, self).__init__()
        print("C.__init__ called")
        
print("Instantiating B")
b = B()
print("Instantiating C")
c = C()


Instantiating B
A.__init__ called
B.__init__ called
Instantiating C
A.__init__ called
C.__init__ called

A complete example using super in the subclass's init:


In [27]:
class Person(object):
    
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def __str__(self):
        return "{0}, age {1}".format(self.name, self.age)
        
class Employee(Person):
    
    def __init__(self, name, age, position, salary):
        self.position = position
        self.salary = salary
        super().__init__(name, age)
        
    def __str__(self):
        return "{0}, position: {1}, salary: {2}".format(super().__str__(), self.position, self.salary)
    
    
e = Employee("Jakab Gipsz", 33, "manager", 450000)
print(e)
print(Person(e.name, e.age))


Jakab Gipsz, age 33, position: manager, salary: 450000
Jakab Gipsz, age 33

Duck typing and interfaces

  • no dedicated language construct for interfaces
  • the Abstract Base Classes (abc) module implements interface-like features
  • duck typing is generally preferred, so abc is not used extensively

"In computer programming, duck typing is an application of the duck test in type safety. It requires that type checking be deferred to runtime, and is implemented by means of dynamic typing or reflection." -- Wikipedia

"If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck." -- Wikipedia

  • allows polymorphism without abstract base classes

In [28]:
class Cat(object):
    
    def make_sound(self):
        self.mieuw()
        
    def mieuw(self):
        print("Mieuw")
        
        
class Dog(object):
    
    def make_sound(self):
        self.bark()
        
    def bark(self):
        print("Vau")
        

animals = [Cat(), Dog()]
for animal in animals:
    # animal must have a make_sound method
    animal.make_sound()


Mieuw
Vau

NotImplementedError

  • emulating C++'s pure virtual function

In [29]:
class A(object):
    def foo(self):
        raise NotImplementedError()
        
class B(A):
    def foo(self):
        print("Yay.")
        
class C(A): pass
b = B()
b.foo()
c = C()
# c.foo()  # NotImplementedError why does this happen?


Yay.

  • we can still instantiate A

In [30]:
a = A()
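
The abc module mentioned earlier closes this gap: a class with an abstract method cannot be instantiated until a subclass overrides it. A minimal sketch, with the hypothetical classes Animal and Duck:

```python
from abc import ABC, abstractmethod


class Animal(ABC):
    @abstractmethod
    def make_sound(self):
        """Subclasses must override this method."""


class Duck(Animal):
    def make_sound(self):
        return "Quack"


try:
    Animal()                   # abstract class: instantiation is refused
    instantiated = True
except TypeError:
    instantiated = False

print(instantiated, Duck().make_sound())
```

Unlike the NotImplementedError approach, the error appears at instantiation time rather than when the missing method is first called.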

Magic methods

  • mechanism to implement advanced OO features
  • dunder methods

__str__ method

  • returns the string representation of the object
  • Python 2 has two separate methods __str__ and __unicode__ for bytestrings and unicode strings

In [31]:
class ClassWithoutStr(object):
    def __init__(self, value=42):
        self.param = value
        
        
class ClassWithStr(object):
    def __init__(self, value=42):
        self.param = value
        
    def __str__(self):
        return "My id is {0} and my parameter is {1}".format(
            id(self), self.param)
    
    
print("Printing a class that does not define __str__: {}".format(ClassWithoutStr(345)))
print("Printing a class that defines __str__: {}".format(ClassWithStr(345)))


Printing a class that does not define __str__: <__main__.ClassWithoutStr object at 0x7f986c96b828>
Printing a class that defines __str__: My id is 140292633572240 and my parameter is 345

Operator overloading

  • operators are mapped to magic functions
  • defining these functions defines/overrides operators
  • a comprehensive list of operator methods is given in the Data model chapter of the Python language reference
  • some built-in functions are included as well
    • __len__: defines the behavior of len(obj)
    • __abs__: defines the behavior of abs(obj)

In [32]:
class Complex(object):
    def __init__(self, real=0.0, imag=0.0):
        self.real = real
        self.imag = imag
        
    def __abs__(self):
        return (self.real**2 + self.imag**2) ** 0.5
    
    def __eq__(self, other):  # right hand side
        return self.real == other.real and self.imag == other.imag
    
    def __gt__(self, other):
        return abs(self) > abs(other)
    
c1 = Complex()
c2 = Complex(1, 1)

abs(c2), c1 == c2


Out[32]:
(1.4142135623730951, False)

How can we define comparison between different types?

Let's define a comparison between Complex and strings. We can check for the right operand's type:


In [33]:
class Complex(object):
    def __init__(self, real=0.0, imag=0.0):
        self.real = real
        self.imag = imag
        
    def __abs__(self):
        return (self.real**2 + self.imag**2) ** 0.5
    
    def __eq__(self, other):  # right hand side
        return self.real == other.real and self.imag == other.imag
    
    def __gt__(self, other):
        if isinstance(other, str):
            return abs(self) > len(other)
        return abs(self) > abs(other)
    
c1 = Complex()
c2 = Complex(1, 1)

abs(c2), c1 == c2, c2 > "a", c2 > "ab"


Out[33]:
(1.4142135623730951, False, True, False)

If the left operand is a built-in type that does not define the comparison against Complex, Python falls back to the reflected method of the right operand, so "a" < c2 is evaluated as c2.__gt__("a"):


In [34]:
"a" < c2


Out[34]:
True
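
The fallback is triggered when a comparison method returns the special value NotImplemented. User-defined classes can use the same signal for operand types they do not support. A minimal sketch, assuming a hypothetical Length class:

```python
class Length:
    """Hypothetical wrapper that compares by a stored number."""

    def __init__(self, value):
        self.value = value

    def __gt__(self, other):
        if isinstance(other, Length):
            return self.value > other.value
        if isinstance(other, (int, float)):
            return self.value > other
        return NotImplemented      # let Python try the other operand


five = Length(5)
print(3 < five)     # int.__lt__ gives up, so Python calls five.__gt__(3)
try:
    five > "abc"    # neither operand can handle the comparison
except TypeError as err:
    print("TypeError:", err)
```

Returning NotImplemented instead of raising lets Python try the other operand first and raise TypeError only if both sides give up.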

Defining __gt__ does not automatically define __lt__:


In [35]:
# "a" > c2  # raises TypeError
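
When a full set of comparisons is needed, the standard library's functools.total_ordering decorator can derive the missing ones from __eq__ and one ordering method. A sketch with a hypothetical Magnitude class that is ordered by absolute value:

```python
from functools import total_ordering


@total_ordering
class Magnitude:
    """Hypothetical class ordered by absolute value."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Magnitude):
            return NotImplemented
        return abs(self.value) == abs(other.value)

    def __lt__(self, other):
        if not isinstance(other, Magnitude):
            return NotImplemented
        return abs(self.value) < abs(other.value)


small, big = Magnitude(-1), Magnitude(5)
# __le__, __gt__ and __ge__ are supplied by the decorator
print(small < big, small <= big, big > small, big >= small)
```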

Assignment operator

  • the assignment operator (=) cannot be overridden
  • it performs reference binding instead of copying
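  • tightly bound to the garbage collector: rebinding a name can drop the last reference to the old object, which then becomes eligible for collection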

Other useful overloads

Attributes can be set, read and deleted. Four magic methods govern these operations:

  1. __setattr__: called when an attribute is set,
  2. __delattr__: called when an attribute is deleted using del or delattr,
  3. __getattribute__: called on every attribute access,
  4. __getattr__: called when the 'usual' attribute lookup fails (for example, when the attribute is not present in the object's namespace).

In [36]:
class Noisy(object):
    def __setattr__(self, attr, value):
        print("Setting [{}] to value [{}]".format(attr, value))
        super().__setattr__(attr, value)
        
    def __getattr__(self, attr):
        print("Getting (getattr) [{}]".format(attr))
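        raise AttributeError(attr)  # object defines no __getattr__ to delegate to
        
    def __getattribute__(self, attr):
        print("Getting (getattribute) [{}]".format(attr))
        return super().__getattribute__(attr)  # return the value, otherwise every lookup yields None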
        
    def __delattr__(self, attr):
        print("You wish")

Getting an attribute that does not exist yet

  1. calls __getattribute__ first, which calls the base class's __getattribute__, which fails with AttributeError,
  2. then calls __getattr__ as a fallback.

In [37]:
a = Noisy()
try:
    a.dog
except AttributeError:
    print("AttributeError raised")


Getting (getattribute) [dog]
Getting (getattr) [dog]
AttributeError raised

setting an attribute


In [38]:
a.dog = "vau"  # equivalent to setattr(a, "dog", "vau")


Setting [dog] to value [vau]

getting an existing attribute


In [39]:
a.dog  # equivalent to getattr(a, "dog")


Getting (getattribute) [dog]

modifying an attribute also calls __setattr__


In [40]:
a.dog = "Vau"  # equivalent to setattr(a, "dog", "Vau")


Setting [dog] to value [Vau]

deleting an attribute


In [41]:
del a.dog  # equivalent to delattr(a, "dog")


You wish

Dictionary-like behavior can be achieved by overloading []

We also define __iter__ to support iteration.


In [42]:
class DictLike(object):
    def __init__(self):
        self.d = {}
        
    def __setitem__(self, item, value):
        print("Setting {} to {}".format(item, value))
        self.d[item] = value
        
    def __getitem__(self, item):
        print("Getting {}".format(item))
        return self.d.get(item, None)
    
    def __iter__(self):
        return iter(self.d)
    
d = DictLike()
d["a"] = 1
d["b"] = 2

for k in d:
    print(k)


Setting a to 1
Setting b to 2
a
b
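
For a fuller dictionary interface, one option is to subclass collections.abc.MutableMapping, which derives methods such as update, items and the in operator from five core methods. A sketch with a hypothetical LowerCaseDict class:

```python
from collections.abc import MutableMapping


class LowerCaseDict(MutableMapping):
    """Hypothetical mapping that stores every key in lower case."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


fruits = LowerCaseDict()
fruits["Apple"] = 1
fruits.update(BANANA=2)        # update() comes from the mixin
print(fruits["APPLE"], "banana" in fruits, len(fruits), list(fruits.items()))
```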

Shallow copy vs. deep copy

There are 3 types of assignment and copying:

  1. the assignment operator (=) creates a new reference to the same object,
  2. copy performs shallow copy,
  3. deepcopy recursively deepcopies everything.

The difference between shallow and deep copy is only relevant for compound objects.

Assignment operator


In [43]:
l1 = [[1, 2], [3, 4, 5]]
l2 = l1
id(l1[0]) == id(l2[0])


Out[43]:
True

In [44]:
l1[0][0] = 10
l2


Out[44]:
[[10, 2], [3, 4, 5]]

Shallow copy


In [45]:
from copy import copy

l1 = [[1, 2], [3, 4, 5]]
l2 = copy(l1)
id(l1) == id(l2), id(l1[0]) == id(l2[0])


Out[45]:
(False, True)

In [46]:
l1[0][0] = 10
l2


Out[46]:
[[10, 2], [3, 4, 5]]

Deep copy


In [47]:
from copy import deepcopy

l1 = [[1, 2], [3, 4, 5]]
l2 = deepcopy(l1)
id(l1) == id(l2), id(l1[0]) == id(l2[0])


Out[47]:
(False, False)

In [48]:
l1[0][0] = 10
l2


Out[48]:
[[1, 2], [3, 4, 5]]

Both can be defined via magic methods

  • note that these implementations do not check for infinite loops

In [49]:
from copy import copy, deepcopy

class ListOfLists(object):
    def __init__(self, lists):
        self.lists = lists
        self.list_lengths = [len(l) for l in self.lists]
        
    def __copy__(self):
        print("ListOfLists copy called")
        return ListOfLists(self.lists)
        
    def __deepcopy__(self, memo):
        print("ListOfLists deepcopy called")
        return ListOfLists(deepcopy(self.lists))
        
l1 = ListOfLists([[1, 2], [3, 4, 5]])
l2 = copy(l1)
l1.lists[0][0] = 12
print(l2.lists)
l3 = deepcopy(l1)


ListOfLists copy called
[[12, 2], [3, 4, 5]]
ListOfLists deepcopy called

However, these are very far from complete implementations. A complete implementation would also have to prevent infinite recursion on self-referencing objects and support pickling (the pickle serialization module).
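
The memo argument of __deepcopy__ is what makes cycle handling possible: it maps the id of each original object to its copy. A minimal sketch of one way to use it, with a hypothetical Node class whose instances reference each other:

```python
from copy import deepcopy


class Node:
    """Hypothetical graph node that may take part in reference cycles."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []

    def __deepcopy__(self, memo):
        clone = Node(self.name)
        # Register the clone before recursing, so that a cycle leading
        # back to self reuses it instead of recursing forever.
        memo[id(self)] = clone
        clone.neighbors = deepcopy(self.neighbors, memo)
        return clone


a = Node("a")
b = Node("b")
a.neighbors.append(b)
b.neighbors.append(a)          # cycle: a -> b -> a

a2 = deepcopy(a)
print(a2 is not a, a2.neighbors[0].neighbors[0] is a2)
```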

Object creation and destruction: the __new__ and the __del__ method

The __new__ method is called to create a new instance of a class. __new__ is a static method that takes the class object as a first parameter.

Typical implementations create a new instance of the class by invoking the superclass’s __new__() method using super(currentclass, cls).__new__(cls[, ...]) with appropriate arguments and then modifying the newly-created instance as necessary before returning it.

__new__ has to return an instance of cls; __init__ is then called on that instance. If __new__ returns anything else, __init__ is not called.

The __del__ method is called when an object is about to be destroyed. Although technically a destructor, it is handled by the garbage collector. It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits.


In [50]:
class A(object):
    
    def __new__(cls, *args, **kwargs):  # implicitly a static method, no decorator needed
        instance = super().__new__(cls)
        print("A.__new__ called")
        return instance
    
    def __init__(self):
        print("A.__init__ called")
        
    def __del__(self):
        print("A.__del__ called")
        try:
            super(A, self).__del__()
        except AttributeError:
            print("parent class does not have a __del__ method")
        
        
a = A()
del a


A.__new__ called
A.__init__ called
A.__del__ called
parent class does not have a __del__ method
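
A common reason to override __new__ is to control whether a new object is created at all. The following sketch, built around a hypothetical Config class, always returns the same instance:

```python
class Config:
    """Hypothetical class of which at most one instance is ever created."""

    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self, name="default"):
        # __init__ still runs on every call, even when __new__
        # returned the already existing instance.
        self.name = name


first = Config("first")
second = Config("second")
print(first is second, first.name)
```

Note that the second call reinitializes the shared instance, which is why first.name ends up as "second".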

Object introspection

  • support for full object introspection
  • dir lists every attribute of an object

In [51]:
class A(object):
    var = 12
    def __init__(self, value):
        self.value = value
        
    def foo(self):
        print("bar")
  
", ".join(dir(A))


Out[51]:
'__class__, __delattr__, __dict__, __dir__, __doc__, __eq__, __format__, __ge__, __getattribute__, __gt__, __hash__, __init__, __init_subclass__, __le__, __lt__, __module__, __ne__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__, __weakref__, foo, var'

Class A does not have a value attribute, since value is bound to an instance. However, it does have the class-level var attribute.

An instance of A has both:


In [52]:
", ".join(dir(A(12)))


Out[52]:
'__class__, __delattr__, __dict__, __dir__, __doc__, __eq__, __format__, __ge__, __getattribute__, __gt__, __hash__, __init__, __init_subclass__, __le__, __lt__, __module__, __ne__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__, __weakref__, foo, value, var'

isinstance, issubclass


In [53]:
class A(object):
    pass

class B(A):
    pass

b = B()
a = A()

print(isinstance(a, A))
print(isinstance(a, B))
print(isinstance(b, A))
print(isinstance(b, object))


True
False
True
True

Every function object has a __code__ attribute, which holds the compiled code of the function along with related metadata.


In [54]:
def evaluate(x):
    a = 12
    b = 3
    return a*x + b
    
print(evaluate.__code__)
#dir(evaluate.__code__)


<code object evaluate at 0x7f986c152540, file "<ipython-input-54-e5dc2d2bcdd5>", line 1>

In [55]:
evaluate.__code__.co_varnames, evaluate.__code__.co_freevars, evaluate.__code__.co_stacksize


Out[55]:
(('x', 'a', 'b'), (), 2)

The inspect module provides further code introspection tools, including the getsourcelines function, which returns the source code itself.


In [56]:
from inspect import getsourcelines

getsourcelines(evaluate)


Out[56]:
(['def evaluate(x):\n', '    a = 12\n', '    b = 3\n', '    return a*x + b\n'],
 1)
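
The same module can also describe how a callable expects to be called. A short sketch using inspect.signature on a hypothetical function scale:

```python
from inspect import signature


def scale(x, factor=2, *, offset=0):
    """Hypothetical function used only to have something to inspect."""
    return x * factor + offset


sig = signature(scale)
print(sig)
for name, param in sig.parameters.items():
    # kind tells positional from keyword-only, empty marks a missing default
    print(name, param.kind.name, param.default is param.empty)
```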

Class decorators

Many OO features are achieved via syntactic sugar called decorators. We will talk about decorators in detail later.

The most common features are:

  1. staticmethod,
  2. classmethod,
  3. property.

Static methods

  • defined inside a class but not bound to an instance (no self parameter)
  • analogous to C++'s static methods

In [57]:
class A(object):
    instance_count = 0
    
    def __init__(self, value=42):
        self.value = value
        A.increase_instance_count()
        
    @staticmethod
    def increase_instance_count():
        A.instance_count += 1
        
        
a1 = A()
print(A.instance_count)
a2 = A()
print(A.instance_count)


1
2

Class methods

  • bound to the class instead of an instance of the class
  • first argument is the class object itself
    • called cls by convention
  • typical usage: factory methods for the class

Let's create a Complex class that can be initialized with either a string such as "5+j6" or with two numbers.


In [58]:
class Complex(object):
    def __init__(self, real, imag):
        self.real = real
        self.imag = imag
        
    def __str__(self):
        return '{0}+j{1}'.format(self.real, self.imag)
    
    @classmethod
    def from_str(cls, complex_str):
        real, imag = complex_str.split('+')
        imag = imag.lstrip('ij')
        print("Instantiating {}".format(cls.__name__))
        return cls(float(real), float(imag))

class ChildComplex(Complex): pass

c1 = Complex.from_str("3.45+j2")
print(c1)
c2 = Complex(3, 4)
print(c2)
c1 = ChildComplex.from_str("3.45+j2")


Instantiating Complex
3.45+j2.0
3+j4
Instantiating ChildComplex

Properties

  • attributes with getters, setters and deleters

property can be used both as a built-in function and as a decorator; the setter and deleter are attached with separate decorators.


In [59]:
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    @property
    def age(self):
        return self._age
    
    @age.setter
    def age(self, age):
        try:
            if 0 <= age <= 150:
                self._age = age
        except TypeError:
            pass
            
    def __str__(self):
        return "Name: {0}, age: {1}".format(self.name, self.age)
            

p = Person("John", 12)
print(p)
p.age = "abc"
print(p)
p.age = 85
print(p)


Name: John, age: 12
Name: John, age: 12
Name: John, age: 85

In [60]:
p = Person("Pete", 17)
",".join(dir(p))


Out[60]:
'__class__,__delattr__,__dict__,__dir__,__doc__,__eq__,__format__,__ge__,__getattribute__,__gt__,__hash__,__init__,__init_subclass__,__le__,__lt__,__module__,__ne__,__new__,__reduce__,__reduce_ex__,__repr__,__setattr__,__sizeof__,__str__,__subclasshook__,__weakref__,_age,age,name'
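
For comparison, here is a sketch of the built-in function form of property, using a hypothetical Temperature class. Unlike the Person example, this setter raises an error on invalid input instead of silently ignoring it:

```python
class Temperature:
    """Hypothetical class storing a temperature in Celsius."""

    def __init__(self, celsius):
        self.celsius = celsius          # goes through _set_celsius

    def _get_celsius(self):
        return self._celsius

    def _set_celsius(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

    def _del_celsius(self):
        del self._celsius

    # property(fget, fset, fdel, doc) builds the same kind of descriptor
    # that the decorator syntax creates step by step.
    celsius = property(_get_celsius, _set_celsius, _del_celsius,
                       "Temperature in degrees Celsius.")


t = Temperature(21.5)
t.celsius = 30
print(t.celsius)
try:
    t.celsius = -300
except ValueError as err:
    print("rejected:", err)
```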

Multiple inheritance

  • no interface inheritance in Python
  • since every class subclasses object, the diamond problem is present
  • method resolution order (MRO) defines the way methods are inherited
    • very different between old and new style classes

In [61]:
class A(object):
    def __init__(self, value):
        print("A init called")
        self.value = value
        
class B(object):
    def __init__(self):
        print("B init called")

class C(A, B):
    def __init__(self, value1, value2):
        print("C init called")
        self.value2 = value2
        super(C, self).__init__(value1)
        
class D(B, A): pass
        
print("Instantiating C")
c = C(1, 2)
print("Instantiating D")
d = D()


Instantiating C
C init called
A init called
Instantiating D
B init called
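
The resolution order of a class can be inspected through its __mro__ attribute. The sketch below uses a hypothetical diamond (Base, Left, Right, Bottom) in which every __init__ calls super(), so each class is initialized exactly once, in MRO order:

```python
class Base:
    def __init__(self):
        print("Base init")


class Left(Base):
    def __init__(self):
        print("Left init")
        super().__init__()     # next class in the MRO, not necessarily Base


class Right(Base):
    def __init__(self):
        print("Right init")
        super().__init__()


class Bottom(Left, Right):
    def __init__(self):
        print("Bottom init")
        super().__init__()


print([cls.__name__ for cls in Bottom.__mro__])
bottom = Bottom()
```

In Left, super() resolves to Right rather than Base when the instance is a Bottom, because super follows the MRO of the instance's class.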