Classes

One of the main features of the Python programming language is its object-oriented structure. Thus, besides procedural programming (scripting), Python can also be used for object-oriented programming (OOP).
In a nutshell, everything in Python is an object and can be understood as an instance of a specific class. A class is like a blueprint that describes how an object is structured and how it should behave. With that in mind, learning to write your own custom classes means being able to implement more or less any functionality in Python that you can think of. Nowadays, there is hardly anything in the field of information science that is not implemented in Python.

Let's have a look at some of the objects.


In [ ]:
f = open('afile.txt', 'w')
print(f)
print(f.__class__)
print(type(f))
print(f.readline)
f.close()

The file object is already implemented in Python, just like thousands of other classes, so we do not have to implement reading and writing files ourselves. Now let's have a look at defining our own classes.
A class is defined using the class statement followed by a class name, which is very similar to def for functions. Everything inside the class namespace is part of that class. The shortest possible class defines nothing inside its namespace and therefore has no attributes or functionality of its own. Nevertheless, it can be instantiated, and a reference to the instance can be assigned to a variable.


In [ ]:
# define class
class Car:
    pass

# create two instances
vw = Car()
audi = Car()

print('vw: ', type(vw), 'audi: ', type(audi))
print('vw: ', vw.__class__, 'audi: ', audi.__class__)
print('vw: ', str(vw), 'audi: ', str(audi))

Methods

The Car class shown above is not very useful yet. But we can define functions inside the class namespace. These functions are called methods. To be precise, they are called instance methods and should not be confused with class methods, which will not be covered here.
Although we have not defined any methods so far, Car already has some methods that Python created for us. These very generic methods determine, for example, what str or repr return when invoked on a Car instance.
We will first focus on a special method, __init__. This method is already defined, but does not do anything useful for us yet. We can redefine it and fill it with our own code. It is called when an object is instantiated. This way we can set default values and define what a Car instance should look like after creation.

Let's define an actual speed and maximum speed for our car, because this is what a car needs.


In [ ]:
# redefine class
class Car:
    def __init__(self):
        self.speed = 0
        self.max_speed = 100

# create two instances
vw = Car()
audi = Car()
print('vw: speed: %d max speed: %d' % (vw.speed, vw.max_speed))
print('audi: speed: %d max speed: %d' % (audi.speed, audi.max_speed))

audi.max_speed = 250
audi.speed = 260
vw.speed = -50.4

print('vw: speed: %d max speed: %d' % (vw.speed, vw.max_speed))
print('audi: speed: %d max speed: %d' % (audi.speed, audi.max_speed))

This is better, but still somehow wrong. A car should not be allowed to drive faster than its maximum speed. A Volkswagen might not be the best car in the world, but it can definitely do better than negative speeds. A better approach is to define methods for accelerating and decelerating the car.
Define two methods, accelerate and decelerate, that accept a value and set the new speed of the car. Prevent negative speeds and do not let the car exceed its maximum speed.


In [ ]:
# redefine class
class Car:
    pass

vw = Car()
print(vw.speed)
vw.accelerate(60)
print(vw.speed)
vw.accelerate(45)
print(vw.speed)
vw.decelerate(10)
print(vw.speed)
vw.decelerate(2000)
print(vw.speed)
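
One possible solution to this exercise is sketched below. It is not the only way to do it: the sketch assumes the default values from the previous cell (speed 0, maximum speed 100) and simply clamps every new speed into the range from 0 to max_speed.

```python
class Car:
    def __init__(self):
        self.speed = 0
        self.max_speed = 100

    def accelerate(self, value):
        # Clamp the new speed into the range [0, max_speed].
        self.speed = max(0, min(self.speed + value, self.max_speed))

    def decelerate(self, value):
        # Same clamping, so the car never drives at a negative speed.
        self.speed = max(0, min(self.speed - value, self.max_speed))


vw = Car()
vw.accelerate(60)
print(vw.speed)      # 60
vw.accelerate(45)
print(vw.speed)      # 100, capped at max_speed
vw.decelerate(10)
print(vw.speed)      # 90
vw.decelerate(2000)
print(vw.speed)      # 0, never negative
```

Clamping in both methods also covers the case where a negative value is passed to accelerate.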

Magic Methods

You may have noticed the two leading and trailing underscores in the __init__ method. A defined set of method names following this pattern are called magic methods in Python (also known as dunder methods, for 'double underscore'). Python calls them implicitly, so they influence the behaviour of an object as if by magic. Besides __init__, two other very important magic methods are __repr__ and __str__.
The return value of __str__ defines the string representation of the object instance. This way you can define what is returned whenever str is called on an instance. The __repr__ method is very similar, but returns the object representation. Whenever possible, the object should be recoverable from this returned string. However, with most custom classes this is not easily possible, and __repr__ should then return a one-line string that clearly identifies the object instance. This is really useful for debugging your code.


In [ ]:
print('str(vw) old:', str(vw))

class Car:
    pass
    

vw = Car()
vw.accelerate(45)
print('str(vw) new:', str(vw))
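
As a minimal sketch of the two methods described above, the class below defines both __str__ and __repr__. The exact wording of the returned strings is an arbitrary choice made for this illustration.

```python
class Car:
    def __init__(self):
        self.speed = 0
        self.max_speed = 100

    def accelerate(self, value):
        self.speed = max(0, min(self.speed + value, self.max_speed))

    def __str__(self):
        # Human-readable text, used by str() and print().
        return 'Car at speed %d' % self.speed

    def __repr__(self):
        # One-line identification for debugging, used by repr().
        return '<Car speed=%d max_speed=%d>' % (self.speed, self.max_speed)


vw = Car()
vw.accelerate(45)
print(str(vw))    # Car at speed 45
print(repr(vw))   # <Car speed=45 max_speed=100>
```

If only __repr__ is defined, str falls back to it, so defining __repr__ first is usually the more useful step.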

Using these methods, almost any behaviour of a Car instance can be influenced. Imagine you use instances in a conditional statement and test two of them for equality, or check whether one instance is bigger than the other. What should that mean?

  • Are two variables equal only if they reference exactly the same instance?
  • Are they equal if they are of the same model?
  • Is one instance bigger if it is currently faster?
  • Or if it has the higher maximum speed?

Let's define a new attribute model, which __init__ takes as an argument. Then the magic method __eq__ can be used to compare the models of two instances.
The __eq__ method has the signature __eq__(self, other) and returns either True or False.


In [ ]:
class Car:
    pass

vw = Car('vw')
vw2 = Car('vw')
audi = Car('audi')

print('vw equals vw2?   ', vw == vw2)
print('vw equals vw?    ', vw == vw)
print('vw equals audi?  ', vw == audi)
print('is vw exactly 9? ', vw == 9)
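
The comparison by model could be sketched as follows. This is one possible implementation, not the only one. Returning NotImplemented for objects that are not cars is an addition that goes beyond the plain True/False described above: it tells Python that this class cannot decide, and Python then falls back to its default comparison.

```python
class Car:
    def __init__(self, model):
        self.model = model
        self.speed = 0
        self.max_speed = 100

    def __eq__(self, other):
        # Not a Car: let Python fall back to its default comparison.
        if not isinstance(other, Car):
            return NotImplemented
        # Two cars are considered equal if they are of the same model.
        return self.model == other.model


vw = Car('vw')
vw2 = Car('vw')
audi = Car('audi')

print(vw == vw2)    # True
print(vw == audi)   # False
print(vw == 9)      # False
```

Note that a class defining __eq__ without also defining __hash__ is no longer hashable, so such instances cannot be used as dictionary keys or set members.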

Private Methods and Attributes

The Car class has two methods which are meant to be used for manipulating the actual speed. Nevertheless, one could directly assign new values, even of types other than integers, to the speed and max_speed attributes. Such attributes are called public attributes, just like accelerate and decelerate are public methods. This tells other developers: 'It's ok to use these attributes and methods directly, that's why I put them there.'


In [ ]:
vw = Car('vw')
print('Speed: ', vw.speed)
vw.speed = 900
print('Speed: ', vw.speed)
vw.speed = -11023048282
print('Speed: ', vw.speed)
vw.speed = Car('vw')
print('Speed: ', vw.speed)

Consequently, we want to protect this attribute from access from outside the class itself. Other languages use the keyword private to achieve this. Python is not very explicit here, as it does not define a keyword or statement for this. Instead, you have to prefix the attribute or method name with two underscores. After renaming speed to __speed, the assignments shown above no longer change the actual speed of the car.

As users and other developers can no longer access the speed directly, we have to offer a new interface for this attribute. We could either define a method getSpeed that returns the actual speed, or implement a so-called property, which will be introduced in a later example.
Note: Python does not truly forbid access. The double underscore triggers name mangling: inside the class, __speed is stored as _Car__speed, so reading vw.__speed from outside raises an AttributeError. The attribute can still be reached as vw._Car__speed, in a Jupyter notebook just as in the Python console.


In [ ]:
class Car:
    pass


vw = Car('vw')
vw.accelerate(45)
print(vw)
vw.decelerate(20)
print(vw)
print(vw.getSpeed())
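
A minimal sketch of such a class is shown below, assuming the same defaults as before (speed 0, maximum speed 100). The format of the __str__ output is an arbitrary choice for this illustration.

```python
class Car:
    def __init__(self, model):
        self.model = model
        self.__speed = 0          # stored on the instance as _Car__speed
        self.max_speed = 100

    def accelerate(self, value):
        self.__speed = max(0, min(self.__speed + value, self.max_speed))

    def decelerate(self, value):
        self.__speed = max(0, min(self.__speed - value, self.max_speed))

    def getSpeed(self):
        # Read-only access to the private attribute.
        return self.__speed

    def __str__(self):
        return '%s at speed %d' % (self.model, self.__speed)


vw = Car('vw')
vw.accelerate(45)
print(vw)              # vw at speed 45
vw.decelerate(20)
print(vw.getSpeed())   # 25

try:
    vw.__speed         # outside the class the name is not mangled
except AttributeError as error:
    print('AttributeError:', error)

print(vw._Car__speed)  # 25, the mangled name is still reachable
```

The last line shows that the double underscore is a convention backed by name mangling rather than a hard access restriction.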

Class Attributes

All attributes and methods defined so far have one thing in common: they are bound to the instance. That means you can only access or invoke them using a reference to this instance. In most cases this is exactly what you want and would expect, as altering one instance won't influence other instances of the class. But in some cases influencing all instances is exactly the desired behaviour. A typical example is counting object instances. For our Car class this would mean an attribute storing the current number of instantiated cars. It is not possible to implement this using instance attributes and methods.
One (bad) solution would be to move the declaration of Car from the global namespace into a function returning a new car instance. The function could then increment a global variable. The downside is that destroyed car instances won't decrement this global variable. A function like this is, by the way, commonly called a factory function.
The second (much better) solution is to use a class attribute. These attributes are bound to the class, not to an instance of that class. That means all instances operate on the same variable. In the field of data analysis, one would implement a counter like this, for example, for counting the instances of a class that handles large amounts of data, like a raster image. The number of instances could then be limited.


In [ ]:
class Car:
    pass


vw = Car('vw')
print(vw.count)
audi = Car('audi')
print(audi.count)


bmw = Car('bmw')
print('BMW:', bmw.max_speed)
print('VW:', vw.max_speed)
print('Audi:', audi.max_speed)
print(vw.count)
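
The counter described above could be sketched like this. The sketch covers only the counting part of the exercise. Note that it increments Car.count on the class itself: assigning to self.count would create a separate instance attribute instead of changing the shared one.

```python
class Car:
    count = 0   # class attribute, shared by all instances

    def __init__(self, model):
        self.model = model
        # Increment the attribute on the class, not on the instance.
        Car.count += 1


vw = Car('vw')
print(vw.count)     # 1
audi = Car('audi')
print(audi.count)   # 2
print(vw.count)     # 2, both instances see the same value
print(Car.count)    # 2, also accessible without any instance
```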

Inheritance

As a proper OOP language, Python also implements inheritance. This means that one can define a class which inherits the attributes and methods of another class. You can list other classes in the parentheses of the class statement, and the new class will inherit from these classes. The new class is called a child class, and the class it inherits from a parent class. Each child class can of course have as many children of its own as needed. These children then inherit from their parent and all of its parents.
If a method or attribute is redefined in a child class, it overrides the method or attribute of the parent class.
A real-world example of this concept is a class that can read different file formats and transform the content into an application-specific internal format. You could first write a class that does the transformation. Next, another class is defined that inherits from this base class. This class can read all text files on a very generic level. From here, different classes can be defined, each one capable of reading exactly one specific text-based format, like a CSV or JSON reader. Each of these specific classes knows all the methods of all its parent classes, and the transformation does not have to be redefined on each level. The second advantage is that at a later point in time one could decide to implement a generic database reader as well. Then readers for specific database engines could be defined, which again inherit all the transformation logic.

Here, we will use this concept to write two inheriting classes, VW and Audi, which both just set the model into a protected attribute.
How could this concept be extended?


In [ ]:
class VW(Car):
    def __init__(self):
        super().__init__('vw')

class Audi(Car):
    def __init__(self):
        super().__init__('audi')
        
vw = VW()
audi = Audi()

vw.accelerate(40)
audi.accelerate(400)
print(vw)
print(audi)
print(vw == audi)
print(isinstance(vw, VW))
print(isinstance(vw, Car))
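
One way the concept could be extended is sketched below: each child class may override attributes of the parent, here the maximum speed. The sketch uses a simplified Car class of its own so that it runs without the earlier cells, and the speed values are made up for illustration.

```python
class Car:
    max_speed = 100   # default value, may be overridden by child classes

    def __init__(self, model):
        self._model = model
        self.speed = 0

    def accelerate(self, value):
        # self.max_speed resolves to the child's value if it defines one.
        self.speed = max(0, min(self.speed + value, self.max_speed))

    def __str__(self):
        return '%s at speed %d' % (self._model, self.speed)


class VW(Car):
    def __init__(self):
        super().__init__('vw')


class Audi(Car):
    max_speed = 250   # overrides the value inherited from Car

    def __init__(self):
        super().__init__('audi')


vw = VW()
audi = Audi()
vw.accelerate(400)
audi.accelerate(400)
print(vw)                      # vw at speed 100
print(audi)                    # audi at speed 250
print(isinstance(audi, Car))   # True
```

The accelerate method is written only once in Car, yet it respects the limit of whichever child class the instance belongs to.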

Property

Sometimes it would be really handy if an attribute could be altered or calculated before returning it to the user. Or even better: if one could make a function behave like an attribute. That's exactly what a property does. A property is a method that takes no argument other than self and is therefore accessed without parentheses. Using a property like this enables us to reimplement the speed attribute as a read-only view of the private __speed.
When used as a decorator, the built-in property function receives the method as its only argument and returns a property object that wraps it and adds the attribute-like behaviour. In information science, a function that expects another function, alters it and returns the result for further use is called a decorator. Decorating functions is particularly easy in Python, as you can apply a decorator with the @ syntax, which resembles annotations in Java.


In [ ]:
class MyInt(int):
    def as_string(self):
        return 'The value is %s' % self

i = MyInt(5)
print(i.as_string())

class MyInt(int):
    @property
    def as_string(self):
        return 'The value is %s' % self
    
x = MyInt(7)
print(x.as_string)

In [ ]:
class Car:
    pass

class VW(Car):
    def __init__(self):
        super().__init__('vw')
        
vw = VW()
vw.accelerate(60)
print(vw.speed)
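
A possible way to fill in this cell is sketched below, assuming the private __speed attribute and the clamping accelerate method from the earlier sections. Since the property defines no setter, assigning to it raises an AttributeError.

```python
class Car:
    def __init__(self, model):
        self.model = model
        self.__speed = 0
        self.max_speed = 100

    def accelerate(self, value):
        self.__speed = max(0, min(self.__speed + value, self.max_speed))

    @property
    def speed(self):
        # Read-only view of the private attribute.
        return self.__speed


class VW(Car):
    def __init__(self):
        super().__init__('vw')


vw = VW()
vw.accelerate(60)
print(vw.speed)   # 60, accessed without parentheses

try:
    vw.speed = 900
except AttributeError:
    print('speed is read-only')
```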

Property.setter

Obviously, the protected __speed attribute cannot be changed from outside, and since the speed property only defines a getter, it cannot be assigned to either. For the Car this absolutely makes sense, but setting a property is possible as well. To do so, the property method is defined a second time, now accepting one additional positional argument, which receives the assigned value. The decorator for this redefinition is @<name>.setter, where <name> is the name of the property, for example @model.setter.


In [ ]:
class Model(object):
    def __init__(self, name):
        self.__model = self.check_model(name)
        
    def check_model(self, name):
        if name.lower() not in ('vw', 'audi'):
            return 'VW'
        else:
            return name.upper()
    
    @property
    def model(self):
        return self.__model
    
    @model.setter
    def model(self, value):
        self.__model = self.check_model(value)
        
car = Model('audi')
print(car.model)
car.model = 'vw'
print(car.model)
car.model = 'mercedes'
print(car.model)
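# setattr with the literal name '__model' is not name-mangled: it creates a new,
# unrelated attribute and leaves the stored model (_Model__model) untouched.
setattr(car, '__model', 'mercedes')
print(car.model)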