Using SQLAlchemy to Talk to a Database

SqlAlchemy helps you use a database to store and retrieve information from python. It abstracts the specific storage engine from te way you use it - so it doesn't care if you end up using MySQL, SQLite, or whatever else. In addition, you can use core and the object-relational mapper (ORM) to avoid writing any SQL at all. The SQLAlchemy homepage has lots of good examples and full documentation.


In [2]:
from sqlalchemy import *
import datetime

Basic SQL Generation

The core library generates SQL for you. Read more about it on their expression language tutorial page. Below are some basic examples.

Creating a Table

Read more about defining and creating tables.


In [3]:
# add `echo=True` to see log statements of all the SQL that is generated
engine = create_engine('sqlite:///:memory:',echo=True) # just save the db in memory for now (ie. not on disk)
metadata = MetaData()
# define a table to use
queries = Table('query', metadata,
    Column('id', Integer, primary_key=True),
    Column('keywords', String(400), nullable=False),
    Column('timestamp', DateTime, default=datetime.datetime.now),
)
metadata.create_all(engine) # and create the tables in the database


2016-12-08 03:51:50,429 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2016-12-08 03:51:50,429 INFO sqlalchemy.engine.base.Engine ()
2016-12-08 03:51:50,430 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2016-12-08 03:51:50,431 INFO sqlalchemy.engine.base.Engine ()
2016-12-08 03:51:50,433 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("query")
2016-12-08 03:51:50,434 INFO sqlalchemy.engine.base.Engine ()
2016-12-08 03:51:50,435 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE "query" (
	id INTEGER NOT NULL, 
	keywords VARCHAR(400) NOT NULL, 
	timestamp DATETIME, 
	PRIMARY KEY (id)
)


2016-12-08 03:51:50,436 INFO sqlalchemy.engine.base.Engine ()
2016-12-08 03:51:50,437 INFO sqlalchemy.engine.base.Engine COMMIT

Inserting Data

Read more about generating SQL insert statements.


In [5]:
insert_stmt = queries.insert()
str(insert_stmt) # see an example of what this will do


Out[5]:
'INSERT INTO query (id, keywords, timestamp) VALUES (:id, :keywords, :timestamp)'

In [4]:
insert_stmt = queries.insert().values(keywords="puppies")
str(insert_stmt)


Out[4]:
'INSERT INTO queries (keywords, timestamp) VALUES (:keywords, :timestamp)'

In [5]:
db_conn = engine.connect()
result = db_conn.execute(insert_stmt)
result.inserted_primary_key # print out the primary key it was assigned


2016-12-04 18:07:27,708 INFO sqlalchemy.engine.base.Engine INSERT INTO queries (keywords, timestamp) VALUES (?, ?)
2016-12-04 18:07:27,709 INFO sqlalchemy.engine.base.Engine ('puppies', '2016-12-04 18:07:27.707960')
2016-12-04 18:07:27,710 INFO sqlalchemy.engine.base.Engine COMMIT
Out[5]:
[1]

In [6]:
insert_stmt = queries.insert().values(keywords="kittens")
result = db_conn.execute(insert_stmt)
result.inserted_primary_key # print out the primary key it was assigned


2016-12-04 18:07:28,915 INFO sqlalchemy.engine.base.Engine INSERT INTO queries (keywords, timestamp) VALUES (?, ?)
2016-12-04 18:07:28,916 INFO sqlalchemy.engine.base.Engine ('kittens', '2016-12-04 18:07:28.915375')
2016-12-04 18:07:28,917 INFO sqlalchemy.engine.base.Engine COMMIT
Out[6]:
[2]

Retrieving Data

Read more about using SQL select statments.


In [7]:
from sqlalchemy.sql import select
select_stmt = select([queries])
results = db_conn.execute(select_stmt)
for row in results:
    print row


2016-12-04 18:07:31,338 INFO sqlalchemy.engine.base.Engine SELECT queries.id, queries.keywords, queries.timestamp 
FROM queries
2016-12-04 18:07:31,339 INFO sqlalchemy.engine.base.Engine ()
(1, u'puppies', datetime.datetime(2016, 12, 4, 18, 7, 27, 707960))
(2, u'kittens', datetime.datetime(2016, 12, 4, 18, 7, 28, 915375))

In [8]:
select_stmt = select([queries]).where(queries.c.id==1)
for row in db_conn.execute(select_stmt):
    print row


2016-12-04 18:07:33,123 INFO sqlalchemy.engine.base.Engine SELECT queries.id, queries.keywords, queries.timestamp 
FROM queries 
WHERE queries.id = ?
2016-12-04 18:07:33,124 INFO sqlalchemy.engine.base.Engine (1,)
(1, u'puppies', datetime.datetime(2016, 12, 4, 18, 7, 27, 707960))

In [9]:
select_stmt = select([queries]).where(queries.c.keywords.like('p%'))
for row in db_conn.execute(select_stmt):
    print row


2016-12-04 18:07:33,859 INFO sqlalchemy.engine.base.Engine SELECT queries.id, queries.keywords, queries.timestamp 
FROM queries 
WHERE queries.keywords LIKE ?
2016-12-04 18:07:33,860 INFO sqlalchemy.engine.base.Engine ('p%',)
(1, u'puppies', datetime.datetime(2016, 12, 4, 18, 7, 27, 707960))

ORM

You can use their ORM library to handle the translation into full-fledged python objects. This can help you build the Model for you MVC solution.


In [10]:
import datetime
from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
Base = declarative_base()

Creating a class mapping

Read more about creating a mapping.


In [11]:
class Query(Base):
    __tablename__ = 'queries'
    id = Column(Integer, primary_key=True)
    keywords = Column(String(400))
    timestamp = Column(DateTime,default=datetime.datetime.now)
    def __repr__(self):
        return "<Query(keywords='%s')>" % (self.keywords)
Query.__table__


Out[11]:
Table('queries', MetaData(bind=None), Column('id', Integer(), table=<queries>, primary_key=True, nullable=False), Column('keywords', String(length=400), table=<queries>), Column('timestamp', DateTime(), table=<queries>, default=ColumnDefault(<function <lambda> at 0x1065d4aa0>)), schema=None)

Creating a connection and session

Read more about creating this stuff.


In [12]:
engine = create_engine('sqlite:///:memory:') # just save the db in memory for now (ie. not on disk)
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
my_session = Session()

Inserting Data

Read more about inserting data with an ORM.


In [13]:
query = Query(keywords="iguana")
query.keywords


Out[13]:
'iguana'

In [14]:
my_session.add(query)
my_session.commit()
query.id


Out[14]:
1

Retrieving Data

Read more about retrieving data from the db via an ORM class.


In [15]:
for q in my_session.query(Query).order_by(Query.timestamp):
    print q


<Query(keywords='iguana')>

In [16]:
query1 = Query(keywords="robot")
query2 = Query(keywords="puppy")
my_session.add_all([query1,query2])
my_session.commit()

In [17]:
for q in my_session.query(Query).order_by(Query.timestamp):
    print q


<Query(keywords='iguana')>
<Query(keywords='robot')>
<Query(keywords='puppy')>

In [18]:
for q in my_session.query(Query).filter(Query.keywords.like('r%')):
    print q


<Query(keywords='robot')>

In [21]:
import mediacloud, datetime
mc = mediacloud.api.MediaCloud('48d9fd877527e6c8c5ff25553c124b1634b8b2926310e2d259d6518221e2a5f2')
agreement = mc.sentenceCount('( colombia AND peace AND agreement)', solr_filter=[mc.publish_date_query( datetime.date( 2015, 10, 1), datetime.date( 2016, 12, 2) ), 'tags_id_media:1' ])
print agreement['count']


396