Interfaces in Python

References

This tutorial is primarily inspired by Chapters 9, 11, and 12 of Luciano Ramalho's book "Fluent Python".


In [174]:
# imports
import math,decimal,random

Introduction

...or, a very brief overview of object-oriented programming and type systems.

Definition: objects are “a location in memory having a value and possibly referenced by an identifier.”

Objects have a type, which is a classification scheme that reduces the probability of errors.

In the programming language context:

  • variables refer to objects
  • a class is a template for creating objects
  • instances of classes are often used to represent objects

There are important differences between types and classes, but we'll use the terms interchangeably here.

In Python, every object is an instance of a class.

  • Python classes are objects too: each class is itself an instance of a class (by default, type)
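
A quick check of the point above, as a minimal sketch. The class name Example is made up for illustration; the builtin type function reports the class of any object, including a class itself.

```python
class Example:
    """An empty class, used only for illustration."""
    pass

obj = Example()

# the instance's class is Example
print(type(obj) is Example)      # True
# the class is itself an object, and its class is `type`
print(type(Example) is type)     # True
# even `type` is an instance of `type`
print(isinstance(type, type))    # True
```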

An object's class enumerates the object’s properties:

  • Identity (inheritance)
  • Attributes
  • Methods of interaction

Definition: an interface (or protocol) is an agreed-upon set of rules by which unrelated objects interact.

The key question, in which Python offers an interesting choice: do we interact with objects according to their identity, or (some subset of) their attributes?

For efficient Python programming, it's important to understand the costs and benefits of your answer to this question.

"Traditional" Inheritance

Paradigm: an object’s capabilities are defined by its identity, that is, its own attributes plus its parents’ attributes. Sets of related attributes are assigned to an object via inheritance.


In [101]:
class Animal:
    """ 
    the Animal class can be used to describe something that has a well-defined number of legs
    """
    n_legs = -1

In [102]:
a = Animal()
a.n_legs


Out[102]:
-1

At this point, I'm not too worried about the mechanism by which object attributes are set. However, if the thing represented by an Animal instance truly always has a well-defined number of legs, and that number doesn't change (no starfish, no amputation), then we should set this at object creation time.


In [103]:
class Animal:
    def __init__(self,n_legs=-1):
        """use the constructor's keyword argument 'n_legs' to set the number of legs"""
        self.n_legs = n_legs

In [104]:
cow = Animal(4)
cow.n_legs


Out[104]:
4

In [105]:
snake = Animal(0)
snake.n_legs


Out[105]:
0

To represent a more nuanced set of identities, we need to provide more classes. For example, pets have names, but are also animals.


In [106]:
class Pet(Animal):
    def __init__(self,name=None,n_legs=-1):
        self.name = name
        super().__init__(n_legs)

In [107]:
fido = Pet(name="Fido",n_legs=4)
print("The pet's name is " + fido.name + '.')
print("It has " + str(fido.n_legs) + " legs.")


The pet's name is Fido.
It has 4 legs.

We're interested in more than simple, variable attributes. What about interaction? Remember that encapsulation of data provides a more robust framework for abstracting operations.


In [108]:
class Cat(Pet):
    def make_a_sound(self):
        return "Meow"
class Dog(Pet):
    def make_a_sound(self):
        """return a random sound"""
        sounds = ['Arf','Grrrrrr']
        # random.choice picks one element uniformly at random
        return random.choice(sounds)

In [110]:
pets = []
pets.append(Cat(name="Kitty"))
pets.append(Dog(name="Buddy"))
for pet in pets:
    print(pet.name + ' says "' + pet.make_a_sound() + '"')


Kitty says "Meow"
Buddy says "Arf"

A more realistic example

Add functionality via subclassing.


In [111]:
class ListOfThings:
    def __init__(self,x):
        self.things = x
    def get_the_things(self):
        return self.things
    
class OrderedListOfThings(ListOfThings):
    def get_the_things(self):
        return sorted(self.things)

In [113]:
a_list = ListOfThings([1,3,4,2])
a_list.get_the_things()


Out[113]:
[1, 3, 4, 2]

In [114]:
an_ordered_list = OrderedListOfThings([1,3,4,2])
an_ordered_list.get_the_things()


Out[114]:
[1, 2, 3, 4]

Problems

Problems with defining use solely by inheritance:

  • Multiple inheritance is hard. What is the method resolution order?
  • Rigid/brittle structure...what if a base class definition changes?
  • There are Python-specific issues with inheriting from builtin classes
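
To make the method resolution order question concrete, here is a small sketch of a diamond-shaped hierarchy. The class names Base, Left, Right, and Child are hypothetical and used only for illustration.

```python
class Base:
    def describe(self):
        return "Base"

class Left(Base):
    def describe(self):
        return "Left -> " + super().describe()

class Right(Base):
    def describe(self):
        return "Right -> " + super().describe()

class Child(Left, Right):
    pass

# the method resolution order is stored on the class
print([cls.__name__ for cls in Child.__mro__])
# ['Child', 'Left', 'Right', 'Base', 'object']

# inside Left, super() hands off to Right, not directly to Base
print(Child().describe())
# Left -> Right -> Base
```

Note that what super() means inside Left depends on the class of the instance, which is one reason deep hierarchies can be hard to reason about.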

Duck Typing

"Don’t check whether it is-a duck: check whether it quacks-like-a duck, walks-like-a duck, etc, etc, depending on exactly what subset of duck-like behavior you need to play your language-games with. (comp.lang.python, Jul. 26, 2000) — Alex Martelli"

The paradigm: classify and interact with objects according to their attributes, not according to their identity.

While Python is very much an object-oriented programming language (i.e. objects have identity, sometimes more than one), it broadly uses protocols rather than object identity to implement functionality.

Another way of making the contrast: in the traditional inheritance model, we enable an object to do a useful thing by specifying its identity. In Python, we start with the useful thing, and define how objects must behave to do that thing.
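
As a minimal sketch of this contrast, the function below starts from the useful thing (collecting sounds) and works with any object that provides a make_a_sound method. The Duck and Robot classes and the chorus function are hypothetical names chosen for illustration.

```python
class Duck:
    def make_a_sound(self):
        return "Quack"

class Robot:
    """Not an animal or a pet; shares no base class with Duck."""
    def make_a_sound(self):
        return "Beep"

def chorus(things):
    """Collect sounds from anything that has a make_a_sound method."""
    return [thing.make_a_sound() for thing in things]

print(chorus([Duck(), Robot()]))   # ['Quack', 'Beep']
```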


In [115]:
# simple example: make two objects

# this object has a clear sense of length
x = [4,3,2,1]

# what would the length of an integer be?
y = 3

In [116]:
len(x)


Out[116]:
4

In [117]:
len(y)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-117-cd3288d06d3d> in <module>()
----> 1 len(y)

TypeError: object of type 'int' has no len()

In the previous example, we see that some objects follow the length protocol, and some don't. Specifically, the length protocol defines a global function len, and the method by which it interfaces with objects...namely, their __len__ method.

With reference to the duck metaphor, the length protocol says (two different versions):

  • "If you are a thing that has size or length, then you should implement a __len__ method, so that unrelated objects know how to interact with you".
  • "If it acts like a thing with size or length, i.e. implements a __len__ method, then I know how to get its length."
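
A minimal sketch of a user-defined class that follows the length protocol. Playlist is a hypothetical container invented for this example.

```python
class Playlist:
    """Hypothetical container; it does not inherit from list."""
    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # len(playlist) ends up calling this method
        return len(self._songs)

playlist = Playlist(["song a", "song b", "song c"])
print(len(playlist))          # 3

# with __len__ and no __bool__, an empty object is treated as False
print(bool(Playlist([])))     # False
```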

In [118]:
# can we _force_ something to follow a protocol?

def my_identity_function(x):
    return x

my_identity_function('three')


Out[118]:
'three'

In [119]:
# now explicitly set the value of the __len__ attribute

setattr(my_identity_function,'__len__','my length!')
dir(my_identity_function)


Out[119]:
['__annotations__',
 '__call__',
 '__class__',
 '__closure__',
 '__code__',
 '__defaults__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__get__',
 '__getattribute__',
 '__globals__',
 '__gt__',
 '__hash__',
 '__init__',
 '__kwdefaults__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__name__',
 '__ne__',
 '__new__',
 '__qualname__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']

In [120]:
len(my_identity_function)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-120-25909769e695> in <module>()
----> 1 len(my_identity_function)

TypeError: object of type 'function' has no len()

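[sad trombone]...setting the attribute changed the function object's namespace dictionary, but len() looks up __len__ on the object's type, not on the instance, so the new attribute is ignored. (The value we set is also a string rather than a method.)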
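
One detail behind this failure is that len() looks for __len__ on the object's type rather than on the instance. The sketch below shows this with an ordinary user-defined class; Plain is a hypothetical name.

```python
class Plain:
    pass

p = Plain()

# attach __len__ to the instance only: len() does not see it
p.__len__ = lambda: 5
try:
    len(p)
except TypeError as err:
    print("instance attribute ignored:", err)

# attach __len__ to the class instead: now len() works
Plain.__len__ = lambda self: 5
print(len(p))   # 5
```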

NOTE:

Be careful with modifying or subclassing Python's builtin types: str, int, float, list, dict, and the like, as well as functions, class definition objects, and other such objects. Let's spend a minute seeing how this fails, then we'll hop back to our attempt to make a modifiable integer.


In [122]:
# make a dict that replaces the value with a pair of the value

class DoppelDict(dict):
    def __setitem__(self, key, value):
        """__setitem__ is called by the [] operator"""
        super().__setitem__(key, [value] * 2)

# set one k,v pair via the constructor
dd = DoppelDict(one=1)
# set another k,v pair with the square bracket operators
dd['two'] = 2
dd


Out[122]:
{'one': 1, 'two': [2, 2]}

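The constructor ignored our __setitem__: the value for 'one' was not doubled. Builtin classes like dict are implemented in C, and their own methods generally do not call special methods overridden in a subclass.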
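
One workaround, described in "Fluent Python", is to subclass collections.UserDict, which is written in Python and is designed to be subclassed. A minimal sketch follows; the name DoppelDict2 comes from the book and the update call is added here for illustration.

```python
from collections import UserDict

class DoppelDict2(UserDict):
    def __setitem__(self, key, value):
        # same override as before, now on a pure-Python base class
        super().__setitem__(key, [value] * 2)

dd2 = DoppelDict2(one=1)   # constructor goes through __setitem__
dd2['two'] = 2             # so does the [] operator
dd2.update(three=3)        # and so does update
print(dd2)
# {'one': [1, 1], 'two': [2, 2], 'three': [3, 3]}
```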

Let's now try to define a length-y decimal object. As a base class, let's use Decimal from Python's decimal module, which represents decimal numbers.


In [156]:
y = decimal.Decimal(5)
y


Out[156]:
Decimal('5')

In [157]:
len(y)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-157-cd3288d06d3d> in <module>()
----> 1 len(y)

TypeError: object of type 'decimal.Decimal' has no len()

Yay! It didn't work!


In [158]:
# now for our decimal with length

# arbitrarily define length as the number of digits to the left of the decimal point
class LengthyDecimal(decimal.Decimal):
    def __len__(self):
        return math.floor(math.log10(self)) + 1

In [159]:
y = LengthyDecimal(5)
y


Out[159]:
Decimal('5')

In [160]:
# length is floor(log10(y)) + 1
len(y)


Out[160]:
1

In [161]:
y = LengthyDecimal(555.44)
len(y)


Out[161]:
3

Yay!

Now let's try an integer that follows the length protocol.


In [162]:
# let the Decimal class manage construction and all the other attributes
# enforce integer qualities only when __len__ is called

# arbitrarily define length as the truncated log10 of the integer representation of the Decimal
class LengthyInteger(decimal.Decimal):
    def __len__(self):
        # math.log10 is more accurate than math.log(x, 10) at exact powers of ten
        return int(math.log10(int(self)))
y = LengthyInteger(6)
print(len(y))
y = LengthyInteger(16)
print(len(y))


0
1

Nice! We made our object quack like a duck without defining it to be a duck. LengthyInteger/LengthyDecimal do not inherit from a class that provides the needed functionality.

You've seen an example of taking an object that does a thing, and modifying it to conform to an interface. But before you go off and start thinking about defining new interfaces, let's step back...

Interfaces and the Python data model

General idea: cooperate with essential protocols as much as possible.

Essential protocols

Often defined in terms of global functions (len, repr) acting on correspondingly named special methods (__len__, __repr__). print, for example, relies on __str__, which falls back to __repr__.

Other examples:

  • callability: implement __call__

  • iteration: iterables and sequences are related protocols (implement __iter__, or __len__ and __getitem__)

  • a "file-like" object: implements the functions needed to read / write bytes-like data.
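
Two of these protocols can be sketched briefly. Multiplier and ListWriter are hypothetical classes invented for illustration; the only library behavior relied on is that print accepts any file argument that has a write method.

```python
import io

class Multiplier:
    """Callable object: instances can be used like functions."""
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return self.factor * x

class ListWriter:
    """Minimal 'file-like' object: all it offers is a write method."""
    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)
        return len(text)

double = Multiplier(2)
print(callable(double), double(21))   # True 42

# print works with a real text buffer and with our stand-in alike
buffer = io.StringIO()
writer = ListWriter()
for target in (buffer, writer):
    print("hello", file=target)

print(repr(buffer.getvalue()))                        # 'hello\n'
print("".join(writer.chunks) == buffer.getvalue())    # True
```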

A Pythonic object

This example implements many common Python interfaces.


In [163]:
# copied directly from "Fluent Python", pp. 298-300

from array import array 
import reprlib
import math
import numbers
import functools
import operator
import itertools

class Vector: 
    typecode = 'd'
    def __init__(self, components):
        self._components = array(self.typecode, components)
    def __iter__(self):
        return iter(self._components)
    def __repr__(self):
        components = reprlib.repr(self._components) 
        components = components[components.find('['):-1] 
        return 'Vector({})'.format(components)
    def __str__(self):
        return str(tuple(self))
    def __bytes__(self):
        return (bytes([ord(self.typecode)]) +
            bytes(self._components))
    def __eq__(self, other):
        return (len(self) == len(other) and
                all(a == b for a, b in zip(self, other)))
    def __hash__(self):
        hashes = (hash(x) for x in self)
        return functools.reduce(operator.xor, hashes, 0)
    def __abs__(self):
        return math.sqrt(sum(x * x for x in self))
    def __bool__(self):
        return bool(abs(self))
    def __len__(self):
        return len(self._components)
    def __getitem__(self, index): 
        cls = type(self)
        if isinstance(index, slice):
            return cls(self._components[index])
        elif isinstance(index, numbers.Integral):
            return self._components[index]
        else:
            msg = '{.__name__} indices must be integers' 
            raise TypeError(msg.format(cls))
    
    shortcut_names = 'xyzt'

    def __getattr__(self, name): 
        cls = type(self)
        if len(name) == 1:
            pos = cls.shortcut_names.find(name) 
            if 0 <= pos < len(self._components):
                return self._components[pos]
        msg = '{.__name__!r} object has no attribute {!r}' 
        raise AttributeError(msg.format(cls, name))
    
    def angle(self, n):
        r = math.sqrt(sum(x * x for x in self[n:])) 
        a = math.atan2(r, self[n-1])
        if (n == len(self) - 1) and (self[-1] < 0):
            return math.pi * 2 - a 
        else:
            return a
    
    def angles(self):
        return (self.angle(n) for n in range(1, len(self)))
    
    def __format__(self, fmt_spec=''):
        if fmt_spec.endswith('h'): # hyperspherical coordinates
            fmt_spec = fmt_spec[:-1]
            coords = itertools.chain([abs(self)],
                                     self.angles())
            outer_fmt = '<{}>' 
        else:
            coords = self
            outer_fmt = '({})'
        components = (format(c, fmt_spec) for c in coords) 
        return outer_fmt.format(', '.join(components))
    
    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])
        memv = memoryview(octets[1:]).cast(typecode) 
        return cls(memv)
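
The cell above only defines the class. To see how builtins reach such special methods, here is a sketch using MiniVector, a hypothetical cut-down class written for this example (it is not from the book and keeps only a few of the methods).

```python
import math

class MiniVector:
    """Reduced, hypothetical stand-in for the Vector class."""
    def __init__(self, components):
        self._components = list(components)

    def __iter__(self):
        return iter(self._components)

    def __len__(self):
        return len(self._components)

    def __repr__(self):
        return 'MiniVector({!r})'.format(self._components)

    def __abs__(self):
        return math.sqrt(sum(x * x for x in self))

    def __bool__(self):
        return bool(abs(self))

    def __eq__(self, other):
        return (len(self) == len(other) and
                all(a == b for a, b in zip(self, other)))

v = MiniVector([3, 4])
print(repr(v))                      # MiniVector([3, 4])
print(len(v))                       # 2
print(abs(v))                       # 5.0
print(bool(MiniVector([0, 0])))     # False
print(v == MiniVector([3, 4]))      # True
x, y = v                            # unpacking works through __iter__
print(x, y)                         # 3 4
```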

So what should I do?

Don't...

  • Don't implement every possible interface for every object you build. Just do enough that it works.
  • Don't define new interfaces. The Python data model includes a very robust set of interface definitions.
  • Don't define new abstract base classes (unless you're building a brand new framework).

Type Tests

Don't test an object's type. Test its conformity to the protocol that matters. Use try blocks.
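
As a minimal sketch of that advice, the hypothetical helper safe_len below tries the length protocol directly and handles the failure, instead of checking the argument's type first.

```python
def safe_len(obj):
    """Return len(obj), or None if obj does not follow the length protocol."""
    try:
        return len(obj)
    except TypeError:
        return None

print(safe_len([4, 3, 2, 1]))   # 4
print(safe_len("three"))        # 5
print(safe_len(3))              # None
```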


In [164]:
def object_to_str(obj):
    try:
        return str(obj)
    except TypeError:
        return "Don't know how to represent argument as a string"

In [165]:
object_to_str({'a':1,'b':[5]})


Out[165]:
"{'a': 1, 'b': [5]}"

In [166]:
object_to_str(open('tmp.txt','w'))


Out[166]:
"<_io.TextIOWrapper name='tmp.txt' mode='w' encoding='UTF-8'>"

Type Tests II

Do test an object's interface with an abstract base class (ABC).


In [175]:
from collections import abc
my_dict = {}
isinstance(my_dict, abc.Mapping)


Out[175]:
True
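
Some of the simpler ABCs in collections.abc recognize any class that implements the right special methods, with no inheritance required. A sketch, using a hypothetical class Bag:

```python
from collections import abc

class Bag:
    """Hypothetical class; it does not inherit from any ABC."""
    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)

bag = Bag([1, 2, 3])
print(isinstance(bag, abc.Sized))      # True
print(isinstance(bag, abc.Iterable))   # True
# Mapping needs explicit subclassing or registration, so this is False
print(isinstance(bag, abc.Mapping))    # False
```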

Object building

Construct an object so that it follows the protocols that define the desired functionality.
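
For instance, implementing just __len__ and __getitem__ is enough for an object to work with much of Python's sequence machinery. The Hand class below is a hypothetical sketch in the spirit of the card deck example in "Fluent Python".

```python
import random

class Hand:
    """Hypothetical card hand that follows the sequence protocol."""
    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        return len(self._cards)

    def __getitem__(self, position):
        return self._cards[position]

hand = Hand(["7H", "KS", "2D"])

print(len(hand))                      # 3
print(hand[0], hand[-1])              # 7H 2D
print("KS" in hand)                   # True
print(list(reversed(hand)))           # ['2D', 'KS', '7H']
print(sorted(hand))                   # ['2D', '7H', 'KS']
print(random.choice(hand) in hand)    # True
```

None of iteration, membership tests, reversed, sorted, or random.choice were written by hand here; they come from cooperating with protocols that already exist.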