SQLAlchemy is an object relational mapper (ORM) for Python. An object relational mapper maps objects (defined by classes) to relational storage (i.e. a relational database management system like SQLite, MySQL or PostgreSQL). There is an excellent tutorial for SQLAlchemy. In Anaconda SQLAlchemy can be installed or updated with the command:
conda install sqlalchemy
We'll use SQLite as our database so ensure that that is also installed with:
conda install sqlite
SQLAlchemy broadly consists of three components:
Tools to model the types of objects you want to store
A query layer to express SQL operations in Python-type abstractions
A database engine to connect to relational database management systems
Let's start by importing the module and making a connection to the database, just to check that all is well:
In [1]:
import sqlalchemy
print sqlalchemy.__version__
In [2]:
from sqlalchemy import create_engine
engine = create_engine('sqlite:///:memory:', echo=True)
In [3]:
print engine
If you saw Engine(sqlite:///:memory:)
then you have made a connection to an in-memory SQLite database. The SQLAlchemy documentation has a guide to database URLs that explains how to connect to different types of database, but for now our in-memory DB is good enough.
The next component to explore is the way SQLAlchemy models are built. First, however, consider a SQL database structure, for example one storing a simple address book. It might contains entries for people and addresses. Since more than one person can share an address, a database that stores a many to one relationship between people and addresses can use two tables:
CREATE TABLE addresses (id INTEGER PRIMARY KEY,
number INTEGER NOT NULL,
street VARCHAR(100) NOT NULL,
suburb VARCHAR(100) NOT NULL,
postal_code INTEGER NOT NULL);
CREATE TABLE persons (id INTEGER PRIMARY KEY,
firstname VARCHAR(40) NOT NULL,
surname VARCHAR(60) NOT NULL,
address_id INTEGER NOT NULL,
FOREIGN KEY(address_id) REFERENCES addresses(id));
These tables naturally map to classes. The rows in a table are each instances, for example:
ID First Name Surname
----------------------------------------------------------------------------
1 Peter van Heusden
2 Alan Christoffels
and the columns are attributes of each instance. So let's create some classes to represent our database structure.
The classes with which SQLAlchemy models a database are all subclasses of a special class called the declarative base. Let's start with the Address class:
In [4]:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Integer, String, Column
Base = declarative_base()
class Address(Base):
__tablename__ = 'addresses'
id = Column(Integer, primary_key=True)
number = Column(Integer, nullable=False)
street = Column(String(100), nullable=False)
suburb = Column(String(100), nullable=False)
postal_code = Column(Integer, nullable=False)
def __repr__(self):
return "<Address(number={}, street='{}', suburb='{}', postal_code={})>".format(
self.number, self.street, self.suburb, self.postal_code)
Note a few things here:
There is no __init__
method. That gets inherited from the Base
class.
A __repr__
method is provided to provide a string representation of the object (aka. instance). The difference between the use of the __repr__
method and the __str__
method is explained in this StackOverflow answer.
The __tablename__
attribute tells SQLAlchemy what table to store objects of this type in.
Now let's use our new class.
In [5]:
address1 = Address(number=42, street="Robert Sobukwe Road",
suburb="Bellville", postal_code=7535)
print address1
print address1.street
print type(address1.street)
print address1.id
The Address
class is a model that describes an address and its attributes in a way that can be mapped to a table, but it is not yet connected to the database. Now we create the other part of our address book, the Person
class. This uses some extra functions to create a foreign key relationship between Person
and Address
.
In [6]:
from sqlalchemy import ForeignKey
from sqlalchemy.orm import relationship, backref
class Person(Base):
__tablename__ = 'persons'
id = Column(Integer, primary_key=True)
firstname = Column(String(40), nullable=False)
lastname = Column(String(60), nullable=False)
address_id = Column(Integer, ForeignKey('addresses.id'),
nullable=False)
address = relationship('Address',
backref=backref('inhabitants', order_by=id))
def __repr__(self):
return "<Person(firstname='{}', lastname='{}', address='{}')>".format(
self.firstname, self.lastname, self.address)
The ForeignKey
constraint ensures that a person must be associated with an address before they can be stored in the phone book and the relationship
function allows the relationship between the addresses
and persons
tables is reflected in a relationship between objects of the Address
and Person
class.
In [15]:
SANBI = Address(number=42, street='Robert Sobukwe Road',
suburb='Bellville', postal_code=7535)
person1 = Person(firstname='Peter', lastname='van Heusden',
address=SANBI)
person2 = Person(firstname='Alan', lastname='Christoffels',
address=SANBI)
address2 = Address(number=64, street='Anton Fransch Street',
suburb='Makhaya', postal_code='5335')
person3 = Person(firstname='Titus', lastname='Brown', address=address2)
print "inhabitants of SANBI:", SANBI.inhabitants
None of these classes or objects are connected to database storage yet. To make the connections, first we need to create the tables in the database, using the database engine. To refresh, we need an engine to connect to the database. Then we use this together with our Base
class to create the tables we need. With echo=True
specified we will get a lot of feedback on what is happening in the database.
In [16]:
engine = create_engine('sqlite:///:memory:', echo=True)
Base.metadata.create_all(engine)
While the engine provides the mechanism for connecting to our database, to interact with the database we use a session. The sessionmaker
function takes an engine and creates a new class (in our example called Session
). The Session
class can then create database sessions on request.
In [17]:
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=engine)
In [18]:
session = Session()
print session
Using the our newly created session we can now add objects to the database. This doesn't actually do anything yet, but it sets things up for a future transaction. A transaction is a unit of work in a database management system. For example, a new person and their associated address can be added to the session in a single transaction so that either all or none of the data is added to the database. This ensures that the database moves from one consistent state to another consistent state and is never inconsistent.
In [19]:
session.add(SANBI)
session.add(person1)
session.add(person2)
session.add(person3)
session.add(address2)
The commit()
operation commits a transaction to the database. Note that in the generated SQL there is a BEGIN
statement, a series of INSERT
statements and finally a COMMIT
statement. The BEGIN
and COMMIT
statements denote the beginning and the end of the transaction. Only when you issue the commit()
are changes actually made to the database.
In [20]:
session.commit()
Also note that the id
field of the objects is now filled in. This field is computed at INSERT
time - it is a so-called autoincrement column in the database.
In [22]:
print "show SANBI id:", SANBI.id
Below we start a transaction by creating an object in the Address
class and add that to our session with session.add()
. We decide to abandon the transaction and instead of issuing a commit()
we issue a rollback()
. This effectively clears the current transaction.
In [23]:
address3 = Address(number='64', street='Owen Wilson Road',
suburb='Bellville', postal_code=7535)
In [24]:
session.add(address3)
In [25]:
session.rollback()
The example below creates an invalid Address
. It is missing the suburb
attribute (recall that this was defined with nullable=False
in the class definition). No errors occur until we try to commit()
this transaction to the database.
In [26]:
address4 = Address(number=53, street='Albertyn Road',
postal_code=7535)
In [27]:
session.add(address4)
In [28]:
session.commit()
Our session now contains invalid data (it has an IntegrityError
). We need to abandon this transaction with session.rollback()
before we can use the session object again.
In [30]:
session.rollback()
The session
object is also to query the database. The query is analogous to the SELECT
status in SQL and must specify what class of object we want returned (or, as we'll see below, what columns we are interested in). Used by itself query()
returns an iterator which we can use in a for
loop to loop through our results.
In [33]:
for person in session.query(Person):
print "PERSON:", person.firstname, person.lastname
As with SELECT
statements, we can use order_by
to define the order in which data is returned from our query.
In [35]:
for person in session.query(Person).order_by(Person.lastname):
print "PERSON:",person.firstname, person.lastname
Instead of getting an iterator back we can use .all()
to get a list of all results.
In [36]:
persons = session.query(Person).all()
In [37]:
print persons
And instead of asking for a complete object, we can specify columns we are interested in. The result then becomes a tuple of values (the values in the columns of interest) instead of an object.
In [38]:
for surname in session.query(Person.lastname).order_by(Person.lastname):
print "LAST NAME:", surname
The filter_by()
method allows simple filter operations on query results. In the example below we use an established Address
object to limit the list of Person
objects returned from the database.
In [49]:
SANBI_people = session.query(Person).filter_by(address=SANBI).all()
for person in SANBI_people:
print "PERSON:", person.firstname, person.lastname
The filter_by()
methods can be chained, resulting in an AND
logic. The example below also illustrates the use of one()
. If a query returns a single value then one()
returns the result as a single object (instead of a list). If the query does not result in a single object, however, one()
will raise an exception: effectively one()
imposes a constract that there must be a single result.
In [50]:
director = session.query(Person).filter_by(address=SANBI
).filter_by(
lastname='Christoffels').one()
The filter()
method is more flexible filtering method which effectively allows a WHERE
clause to be composed. The argument for filter()
is a Boolean expression. For more information, see the SQLAlchemy tutorial.
In [51]:
for person in session.query(Person).filter(
Person.lastname < "van Heusden"):
print "PERSON", person.firstname, person.lastname
In [60]:
from sqlalchemy import and_
director = session.query(Person).filter(and_(
Person.address == SANBI,
Person.lastname == "Christoffels")).one()
print director.firstname
The text()
function even allows literal text to be included in the SQL expression.
In [54]:
from sqlalchemy import text
person = session.query(Person).filter(text('id < 2')).all()
SQLAlchemy queries can also execute SQL joins. SQL joins are well illustrated on Jeff Atwood's coding horror blog. Thus the following example produces the inner join between the persons
and addresses
table.
In [57]:
details = session.query(Person.lastname,
Address.suburb).filter(
Person.address_id == Address.id).all()
for detail in details:
print detail
A final note: relationship
definitions in SQLAlchemy models allow SQLAlchemy to executed queries on demand to extract related objects. The different options for when related objects are retrieved from the database are explained in the documentation on relationship loading techniques.
In [59]:
location = session.query(Address).filter(
Address.street.like('%Sobukwe%')).one()
for person in location.inhabitants:
print "PERSON:",person.firstname, person.lastname
Enjoy using SQLAlchemy and don't forget to Google.