Object-Oriented Programming

What is an Object?

First, some terminology:

  • An object is essentially a container which holds some data, and crucially some associated methods for working with that data.
  • We define objects, and their behaviours, using something called a class.
  • We create objects by instantiating classes, so, objects are instances of classes.

Note that objects are very similar to the structures (structs) found in languages such as C, but with associated functions attached.

Why do we need objects?

This is all very nice, but why bother with the overhead and confusion of objects and classes? People have been writing procedural programs for decades and they seem to work!

A few core ideas:

  • Modularity
  • Separation of concerns
  • Abstraction over complex mechanisms

We've used a lot of objects already!

Most of the code we've been using has already made heavy use of object-oriented programming:

  • NumPy arrays are objects (with attributes like shape and methods like mean())
  • Iris cubes are objects
  • CIS datasets are objects
  • Matplotlib axes/figures/lines etc. are all objects
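
The same attribute-and-method pattern shows up in Python's built-in types, so it can be tried without any third-party library. A small sketch using a built-in complex number and a list (chosen here purely for illustration; they are not part of the libraries listed above):

```python
# Built-in values are objects too: they carry data (attributes)
# and functions that work on that data (methods).
z = complex(3, 4)

# Attributes are accessed with a dot and no parentheses.
print(z.real, z.imag)

# Methods are called with a dot and parentheses.
print(z.conjugate())

# Lists are objects as well; append() modifies the list's own data.
values = [1, 2, 3]
values.append(4)
print(values)
```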

Object-Oriented Programming in Python

In many languages we're forced into using classes and objects for everything (e.g. Java and C#), while others have no built-in support for classes at all (e.g. C and Fortran 77).

In Python we have (in my opinion) a nice halfway house: a full OO implementation when we need it (including multiple inheritance, abstract classes, etc.), but we can write plain procedural code when that is more desirable.

Defining a class in Python is easy:


In [1]:
class A(object):
    pass

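Note the reference to object: this means that our new class inherits from object. We won't be going into too much detail about inheritance. In Python 3 every class inherits from object whether or not you write it, so class A: is equivalent. In Python 2 the explicit form is needed to get a new-style class, so writing it is always safe.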
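
A quick way to check what a class inherits from is the built-in issubclass function. The sketch below assumes Python 3, where the explicit (object) can be left out; the class names are made up for illustration:

```python
class WithObject(object):
    pass


class WithoutObject:
    pass


# On Python 3 both classes inherit from object.
print(issubclass(WithObject, object))
print(issubclass(WithoutObject, object))
```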

Once a class is defined you can create an instance of that class, which is an object. In Python we do this by calling the class name as if it were a function:


In [2]:
a_object = A()
print(type(a_object))


<class '__main__.A'>

A class can store some data (after all, an empty class isn't very interesting!):


In [3]:
class B(object):
    
    value = 1

We can access variables stored in a class by writing the name of the instance followed by a dot and then the name of the variable:


In [4]:
b_object = B()
print(b_object.value)


1
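
Because value is defined in the class body, it lives on the class itself and is shared by every instance until an instance is given its own value. A small sketch of that behaviour (the instance names first and second are only for illustration):

```python
class B(object):

    value = 1


first = B()
second = B()

# Both instances see the value stored on the class.
print(first.value, second.value, B.value)

# Assigning through an instance creates an attribute on that instance only;
# the class and the other instance are left unchanged.
first.value = 999
print(first.value, second.value, B.value)
```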

Classes can also contain functions. Functions attached to classes are called methods:


In [5]:
class B(object):
    
    value = 1
    
    def show_value(self):
        print('self.value is {}'.format(self.value))

The first argument to every method automatically refers to the object we're calling the method on. By convention we call that argument self.


In [6]:
b1 = B()
b1.show_value()
b1.value = 999
b1.show_value()


self.value is 1
self.value is 999

Notice that we don't have to pass the self argument ourselves: Python's object system does this for us.
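
One way to see what Python is doing is to call the method through the class and pass the instance by hand. This sketch repeats the class B with a hypothetical describe method, which returns the string instead of printing it so that the two call styles can be compared:

```python
class B(object):

    value = 1

    def describe(self):
        return 'self.value is {}'.format(self.value)


b1 = B()

# Usual form: Python passes b1 as self for us.
implicit = b1.describe()

# Equivalent form: look the function up on the class and pass self by hand.
explicit = B.describe(b1)

print(implicit)
print(explicit == implicit)
```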

Some methods are called special methods. Their names start and end with a double underscore. A particularly useful special method is __init__, which initializes an object.


In [7]:
class C(object):
    
    def __init__(self, value):
        self.var = value

The __init__ method is called when we create an instance of a class. Now when we call the class name we can pass the arguments required by __init__:


In [8]:
c1 = C("Python!")
c2 = C("Hello")
print(c1.var)
print(c2.var)


Python!
Hello
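
Other special methods hook into different parts of the language in the same way. As a sketch, __repr__ controls how an object is displayed and __eq__ controls what == means. The Point class here is invented for illustration:

```python
class Point(object):

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Called by repr() and when the object is shown at the prompt.
        return 'Point({!r}, {!r})'.format(self.x, self.y)

    def __eq__(self, other):
        # Called when two objects are compared with ==.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
print(repr(p))
print(p == Point(1, 2))
print(p == Point(3, 4))
```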

Methods on an object have access to the variables defined on the object:


In [9]:
class Counter(object):
    
    def __init__(self, start=0):
        self.value = start

    def increment(self):
        self.value += 1

In [10]:
counter1 = Counter()
print(counter1.value)
counter1.increment()
print(counter1.value)


0
1

In [11]:
counter2 = Counter(start=10)
counter2.increment()
counter2.increment()
print(counter2.value)


12

An example use case: EOF analysis in Python with eofs

The process of computing and analysing EOFs and related structures is non-trivial and highly error-prone.

EOF analysis is typically implemented as separate procedures, one for each required output (e.g. EOFs, associated time series, projecting a field onto EOFs). This approach has two drawbacks:

  • The procedures cannot automatically ensure the internal consistency of the analysis outputs
  • The user is responsible for keeping track of the integrity of the analysis (e.g. did you remember to remove weights after reconstruction?)

eofs resolves these problems by taking advantage of object-oriented design:

  • A solver object: encapsulates core information about the dataset being decomposed (input data + weights)
  • Method calls compute required quantities (e.g. EOFs, PC time series, projected fields)

This allows a user to produce a self-consistent decomposition of a dataset. This is not only convenient for the programmer, as it removes a lot of tedious bookkeeping, but also helps ensure the correctness of the resulting quantities.
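
The design can be sketched with a toy class. This is not the eofs API: WeightedSolver and its methods are hypothetical names, and the "analysis" is reduced to applying and removing weights so that the pattern stays visible.

```python
class WeightedSolver(object):
    """Toy solver: NOT the eofs API, just the same design idea."""

    def __init__(self, data, weights):
        if len(data) != len(weights):
            raise ValueError('data and weights must have the same length')
        # Store the inputs once so every method works from the same state.
        # Assumes all weights are non-zero.
        self._weights = list(weights)
        self._weighted = [d * w for d, w in zip(data, weights)]

    def weighted_data(self):
        """Return the weighted values used internally."""
        return list(self._weighted)

    def reconstruct(self):
        """Return the data in its original units, with the weights removed."""
        return [v / w for v, w in zip(self._weighted, self._weights)]


solver = WeightedSolver([2.0, 4.0, 6.0], weights=[0.5, 1.0, 2.0])
print(solver.weighted_data())
print(solver.reconstruct())
```

Because the weights live inside the object, the caller never has to remember to remove them: every method works from the same stored inputs.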

Summary

Object-oriented programming is a useful technique when dealing with complex data structures. In particular it can be used to hide complexity by grouping data and operations on the data together.

Furthermore, it can make your code more understandable and extensible.

We've only covered the very basics here. You may hear about "inheritance" and "polymorphism" in the context of object-oriented programming. These are interesting topics that may be of use to you, but don't let a lack of knowledge about them hold you back from making use of objects. They are not essential to your use of object-oriented programming (at least not in Python).