Object-oriented programming may seem like some big scary thing that only professional programmers do. While it can be complicated, it doesn't have to be. Understanding objects is key to being able to write well-organized code (and read others' code!), especially in Python.
OOP is a programming paradigm, that is, an entire approach to programming. Although Python is multi-paradigm, it is heavily object-oriented (we will see later that has some elements of functional programming as well, which is in some respects the antithesis of OOP). Other languages like Java are entirely object-oriented.
What is an object? An object combines state and behavior. In Python, an object is something that you can put a dot after and then do something specific to that object. In Python, everything is an object.
Consider a list. A list has state, its contents. And it has behavior, for example you can append to it.
In [ ]:
class NothingObject:
pass
Not very interesting? Let's still see what we can do with it.
In [ ]:
first_one = NothingObject()
second_one = NothingObject()
first_one == second_one
In [ ]:
another_name_for_the_first_one = first_one
first_one == another_name_for_the_first_one
In [ ]:
first_one is another_name_for_the_first_one
In [ ]:
first_one = NothingObject()
first_one is another_name_for_the_first_one
OK, not very interesting. We can make them, and we can check if they're the same.
Let's try something more interesting. Imagine we're doing a physics simulation of a rocket ship.
In [ ]:
class Rocket:
"""
Simulation of a rocket.
"""
def __init__(self, x0):
self.x = x0
self.y = 0 # rockets start on the ground
The first thing you usually do with a class is write an __init__()
method.
This is the the method that gets automatically called when you make a new object.
We've defined the class Rocket
, but to actually make a rocket we have to call it:
In [ ]:
my_rocket = Rocket()
Oops.
In [ ]:
my_rocket = Rocket(0)
What we see here is that self
is always the first argument of a method.
In the case of __init__()
, the other arguments are what is required to make a new object.
In this case we need to provide x0
every time we make a new Rocket
.
So far our rocket only has two pieces of data, which we can access with the dot:
In [ ]:
print(my_rocket.x, my_rocket.y)
We can also change them:
In [ ]:
my_rocket.x = 3
print(my_rocket.x, my_rocket.y)
Let's pause for some terminology.
An object attribute is a piece of data associated with an object.
Right now our rocket has attributes x
and y
.
An object method is a function associated with the object.
Right now we just have __init__()
.
Remember I said that everything in Python is an object?
Rocket
is just as much an object as my_rocket
is.
To distinguish them we can call Rocket
a class and my_rocket
an instance object.
Let's see:
In [ ]:
type(my_rocket)
(__main__
is a stand-in module name for things that didn't come from a module but rather from the command-line or notebook.
In short, it's the root namespace.)
In [ ]:
type(Rocket)
Rocket
is itself a type!
That's a little weird.
But you can think of classes as defining new types.
Rocket
is in the same category of things as str
and list
:
In [ ]:
type(str)
So far our Rocket
has state (x
and y
) but no behavior. Let's change that.
In [ ]:
class Rocket:
def __init__(self, x0):
self.x = x0
self.y = 0
def move_up(self, distance=1):
self.y += distance
In [ ]:
# We need to re-instantiate my_rocket
# since it was created with the old class
my_rocket = Rocket(0)
my_rocket.move_up()
my_rocket.move_up(3)
print(my_rocket.y)
We can create as many instances of Rocket
as we want; they will be completely separate.
In [ ]:
fleet = [Rocket(x) for x in range(-2, 3)]
Let's prove that they're attributes are separate. First let's make a function for displaying the fleet position.
In [ ]:
def print_fleet_position(rocket_fleet):
for idx, rocket in enumerate(rocket_fleet):
print('Rocket {i}: ({x:2}, {y})'.format(
i=idx, x=rocket.x, y=rocket.y))
In [ ]:
print_fleet_position(fleet)
Now we can see what happens if we change just one of them.
In [ ]:
fleet[3].move_up(2)
print_fleet_position(fleet)
In [ ]:
import random
for t in range(1, 6):
print('t = {}'.format(t))
print('-----------------')
# Choose at random one rocket to move
random.choice(fleet).move_up()
print_fleet_position(fleet)
print()
Exercise: Add a method to Rocket
that takes one input, another Rocket
, and returns True
if the rockets are at the same location.
I've made a template.
In [ ]:
class Rocket:
def __init__(self, x0):
self.x = x0
self.y = 0
def move_up(self, distance=1):
self.y += distance
def is_colliding(self, other_rocket):
pass
Scroll down for a solution...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
In [ ]:
class Rocket:
def __init__(self, x0):
self.x = x0
self.y = 0
def move_up(self, distance=1):
self.y += distance
def is_colliding(self, other_rocket):
return (self.x, self.y) == (other_rocket.x, other_rocket.y)
# Just as good:
# self.x == other_rocket.x and self.y == other.rocket.y
In [ ]:
class SpaceShuttle(Rocket):
def __init__(self, x0, trips_completed=0):
super().__init__(x0)
self.trips_completed = trips_completed
def complete_trip(self):
self.y = 0
self.trips_completed += 1
In this example, Rocket
is the superclass or parent class of SpaceShuttle
.
Here we override the __init__()
method from Rocket
.
That means we replace it with our own method.
Note the call to super()
though.
That returns the superclass.
So, here we call the __init__()
method of the superclass, before making our own modifications.
In general, when we override the __init__()
method, we usually call super().__init__()
.
Let's try our shuttle:
In [ ]:
my_shuttle = SpaceShuttle(0)
my_shuttle.move_up(10)
print('({}, {:2})'.format(my_shuttle.x, my_shuttle.y))
print('trips: {}'.format(my_shuttle.trips_completed))
my_shuttle.complete_trip()
print('({}, {:2})'.format(my_shuttle.x, my_shuttle.y))
print('trips: {}'.format(my_shuttle.trips_completed))
Can our shuttle interact with regular old rocket?
In [ ]:
my_rocket = Rocket(0)
my_shuttle.is_colliding(my_rocket)
And just to be sure, it works the other way as well:
In [ ]:
my_rocket.is_colliding(my_shuttle)
In [ ]:
my_rocket.move_up()
my_rocket.is_colliding(my_shuttle)
Remember that the buit-in types are sort of like classes too?
We can inherit from them as well.
This is how a lot of the objects in collections
work, like Counter
that we saw earlier.
Other useful objects there are defaultdict
and OrderedDict
which inherit from dict
, and deque
which inherits from list
.
Let's make our own type of dict
:
In [ ]:
class HackedDict(dict):
def get(self, key, default=None):
return 'HACKED!'
my_dict = HackedDict({'a': 1, 'b': 2})
my_dict.get('a')
Magic methods are methods that start and end with two underscores.
They're magic because Python will call them for us when they're needed and we rarely call them directly.
The most common magic method is __init__()
.
Note that we defined it, and it was definitely run, but we never called it directly.
Magic methods tell Python how it should handle your object in certain situations.
Let's see what happens when we print our Rocket
s:
In [ ]:
for rocket in fleet:
print(rocket)
That's ugly! It just gives us the type name and the memory address.
We can spruce it up with a new magic method __str__()
, which tells Python what the string representation of the object should be.
In [ ]:
class Rocket:
def __init__(self, x0):
self.x = x0
self.y = 0
def move_up(self, distance=1):
self.y += distance
def is_colliding(self, other_rocket):
return (self.x, self.y) == (other_rocket.x, other_rocket.y)
def __str__(self):
return 'Rocket at ({}, {})'.format(self.x, self.y)
In [ ]:
new_rocket = Rocket(1)
print(new_rocket)
I like to think of magic methods as providing affordance (or maybe effectivities...).
What does the object afford?
So far rockets afford instantiating (with __init__()
) and displaying (with __str__()
).
What if we want to add rockets?
In [ ]:
new_rocket + 2
Fair enough, Python doesn't like to guess. But we can be explicit. What should we have it do when we add to it?
In [ ]:
class Rocket:
def __init__(self, x0):
self.x = x0
self.y = 0
def move_up(self, distance=1):
self.y += distance
def is_colliding(self, other_rocket):
return (self.x, self.y) == (other_rocket.x, other_rocket.y)
def __str__(self):
return 'Rocket at ({}, {})'.format(self.x, self.y)
def __add__(self, other):
new_rocket = Rocket(self.x)
new_rocket.y = self.y + other
return new_rocket
In [ ]:
new_rocket = Rocket(0)
print(new_rocket)
print(new_rocket + 2)
Notice how in __add__
we create a new Rocket
from scratch.
That's because rocket + 2
shouldn't necessarily change the original rocket
(cf. 1 + 2
).
If we wanted to change the original we might do rocket += 2
:
In [ ]:
new_rocket += 2
print(new_rocket)
Python figured this out because it expanded the operation to
new_rocket = new_rocket + 2
and it knew what to do with the right hand side. But it's still creating a new rocket somewhere in there:
In [ ]:
rocket = Rocket(0)
same_rocket = rocket
rocket is same_rocket
In [ ]:
rocket += 2
rocket is same_rocket
In [ ]:
print(rocket)
print(same_rocket)
We didn't actually change the original rocket at all! We just made a knew one and renamed it with the old variable name. This would screw up our shuttles for example, because they would lose their count of trips completed.
We could get around this with another magic method, __iadd__()
:
In [ ]:
class Rocket:
def __init__(self, x0):
self.x = x0
self.y = 0
def move_up(self, distance=1):
self.y += distance
def is_colliding(self, other_rocket):
return (self.x, self.y) == (other_rocket.x, other_rocket.y)
def __str__(self):
return 'Rocket at ({}, {})'.format(self.x, self.y)
def __iadd__(self, other):
self.y += other
return self
In [ ]:
rocket = Rocket(10)
print(rocket)
rocket += 1
print(rocket)
That's so simple we might decide the confusion with the original __add__()
method is not worth keeping it around.
This is what we really wanted to do all along
Are single numbers all we might want to add? What if we want to move the rocket in two dimensions?
In [ ]:
rocket += (2, 2)
Let's make fixing that an exercise... edit the next cell and use the one after to try it out.
This one's a bit tricky.
Hint: What you want to do depends on what sort of thing other
is.
You can use isinstance
to test this out. Look up its signature first:
In [ ]:
isinstance?
In [ ]:
class Rocket:
def __init__(self, x0):
self.x = x0
self.y = 0
def move_up(self, distance=1):
self.y += distance
def is_colliding(self, other_rocket):
return (self.x, self.y) == (other_rocket.x, other_rocket.y)
def __str__(self):
return 'Rocket at ({}, {})'.format(self.x, self.y)
def __iadd__(self, other):
self.y += other
return self
In [ ]:
rocket = Rocket(0)
print(rocket)
rocket += (2, 2)
print(rocket)
rocket += 3
print(rocket)
You know the drill...
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
In [ ]:
class Rocket:
def __init__(self, x0):
self.x = x0
self.y = 0
def move_up(self, distance=1):
self.y += distance
def is_colliding(self, other_rocket):
return (self.x, self.y) == (other_rocket.x, other_rocket.y)
def __str__(self):
return 'Rocket at ({}, {})'.format(self.x, self.y)
def __iadd__(self, other):
if isinstance(other, (int, float)):
self.y += other
elif isinstance(other, (list, tuple)) and len(other) == 2:
self.x += other[0]
self.y += other[1]
else:
raise TypeError("Can't add Rocket to {}".format(other))
return self
Awesome. Now our rockets afford adding. There are lots of other magic methods. You will use them more than you will write your own, but it's always nice to know what's going on behind the scenes.
Remember how Counter
did something sort of unexpected when we subracted a dict from it?
We can thank __sub__()
for that.
Same with the operations we saw last week on sets.
There are lots of other magic methods.
For example, if you define the methods __len__()
, __getitem__()
, __setitem__()
, and __delitem__()
, you can create an object that you can use indexing with (i.e. the square braces []
) like lists and dicts.
Create a Stack
class.
Let's define a stack as a last-in-first-out data structure.
Imagine a stack of cafeteria trays.
The first one in is going to be the last one out.
The last one put on top is going to be the first one taken out.
*items
in __init__()
so the Stack
can be initialized with any number of items.push()
method that adds an item to the top of the stack.pop()
method that removes an item from the top of the stack and returns it.peek()
method that returns the top item without removing it.isempty()
method that returns True
if the stack is empty.__str__()
method that prints the items.I almost forgot.
What's composition?
It's an alternative to inheritance.
It's a simple pattern where one data structure is an attribute of another data structure.
In this exercise, you might be tempted to create a subclass of list
.
But it usually easier and more flexible to use composition.
Your Stack
should have an attribute, named items
perhaps, which is a list of the items.
The idea is that the user doesn't interact with items
directly but instead through the methods defined above.
In [ ]:
# Write your class here
In [ ]:
# Test it here
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
In [ ]:
class Stack:
def __init__(self, *items):
self.items = list(items) # Otherwise we have an immutable tuple
def push(self, item):
self.items.append(item)
def pop(self):
return self.items.pop()
def peek(self):
return self.items[-1]
def isempty(self):
return len(self.items) == 0
def __str__(self):
return str(self.items)
In [ ]:
stack = Stack(1, 2, 3)
stack.pop()
In [ ]:
stack.push('asd')
stack.peek()
In [ ]:
stack.pop()
In [ ]:
while not stack.isempty():
print(stack)
print(stack.pop())
print(stack)
Create a list
subclass that lets you use any integer to grab an item, using modular arithmetic.
For example, imagine we have a list of length 3.
[3]
would usually cause an error, because [2]
gets the last element.
But let's say we would want [3]
to get the first element, and [6]
, [9]
, etc.
Hint: override the methods __getitem__
, __setitem__
, and __delitem__
.
Since there will be some similar logic in all of these, it maybe useful to make a function or method that converts the index before calling the super()
version of the method.
In [ ]:
class ModulusList(list):
def __getitem__(self, item):
result = super().__getitem__(item)
return result
def __setitem__(self, item, value):
super().__setitem__(item, value)
def __delitem__(self, item):
super().__delitem__(item)
In [ ]:
# Test getitem
mlist = ModulusList([1, 2, 3])
print(mlist[0])
print(mlist[3])
In [ ]:
# Test delitem
mlist = ModulusList([1, 2, 3])
print(mlist)
del mlist[5]
print(mlist)
In [ ]:
# Test setitem
mlist = ModulusList([1, 2, 3])
mlist[0] = 10
print(mlist)
mlist[1325126] = 25
print(mlist)
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
In [ ]:
class ModulusList(list):
def _convert_idx(self, idx):
return idx % len(self)
def __getitem__(self, item):
result = super().__getitem__(self._convert_idx(item))
return result
def __setitem__(self, item, value):
super().__setitem__(self._convert_idx(item), value)
def __delitem__(self, item):
super().__delitem__(self._convert_idx(item))
Thanks to modular arithmetic, this works out of the box with negative indices.
In [ ]:
mlist = ModulusList([1, 2, 3])
print(mlist[-1])
print(mlist[-23616])
It won't however, work for slices:
In [ ]:
mlist[:2]
That would require another method to convert slices, and some clauses like if isinstance(item, slice)
to decide which converter to use.
Also note above that I named the method _convert_idx()
starting with an underscore.
That's Python convention for "this is an private method.
I need it inside my class but if you're interacting with this object from the outside you shouldn't be using it."
Unlike in other languages, this isn't enforced. The Pythonism is "we're all consenting adults." If someone wants to really mess around with the inner guts of my class, they're welcome to. The underscore is just a warning.
Food for thought: Why is inheritance easier than composition in this example?
Note that in general, we should almost always choose composition.
http://en.wikipedia.org/wiki/Autoregressive_model
$$ X_t = \sum_{i=1}^p a_i X_{t-i} + \epsilon(t) $$where $a_1,\cdots,a_p$ are the model parameters, and $\epsilon(t)$ is white noise with mean 0 and standard deviation $\sigma$.
(Each lagged element gets multiplied by a parameter)
In [ ]:
from collections import deque
from numpy.random import randn # Random from normal distribution
class AR:
def __init__(self, a, x_init=None, sigma=1):
self.p = len(a)
self.a = a
self.sigma = sigma
self.x = deque(x_init or [0], maxlen=self.p)
# Fill up self.x if necessary.
while len(self.x) < self.p:
self.x.append(self.x[-1])
def __iter__(self):
return self
def __next__(self):
noise = randn() * self.sigma
deterministic = sum(a*x for a, x in zip(self.a, reversed(self.x)))
self.x.append(deterministic + noise) # Record the history.
return self.x[-1]
In [ ]:
import itertools
import matplotlib.pyplot as plt
%matplotlib inline
def simulate_AR(n, *args, **kwargs):
model = AR(*args, **kwargs)
x = list(model.x) # Initialize with existing history.
while len(x) < n:
x.append(next(model))
return x
plt.plot(simulate_AR(100, [0.5]))