SQLite with Python (Basics)

The standard Python distribution ships with a basic SQLite3 inteface.

Connect to a database

Import the sqlite3 module, create a connection and open a cursor to operate on the database.


In [1]:
import sqlite3
import os

conn = sqlite3.connect("sqlite-python-basics.sqlite")
cur = conn.cursor()

After using the database make sure you close the connection to avoid locking yourself out of the database.


In [2]:
conn.close()
os.remove("sqlite-python-basics.sqlite")

Create a database

As soon as you connect to a database, if it doesn't exist it will create it.


In [3]:
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

Let's create two tables representing an event log (events) and the project (projects) they belong to.


In [4]:
cur.execute("""
CREATE TABLE IF NOT EXISTS projects (
    id INTEGER PRIMARY KEY,
    created DATE NOT NULL,
    name VARCHAR(50) NOT NULL UNIQUE
)
""")
cur.execute("""
CREATE TABLE IF NOT EXISTS events (
    id INTEGER PRIMARY KEY,
    date DATE,
    project_id INTEGER,
    comments TEXT,
    FOREIGN KEY (project_id) REFERENCES projects(id)
)
""");

We can verify they exist by querying the sqlite_master table.


In [5]:
cur.execute("SELECT name FROM sqlite_master WHERE type='table'")
cur.fetchall()


Out[5]:
[('projects',), ('events',)]

To add a column you use ALTER TABLE. Documentation: https://www.sqlite.org/lang_altertable.html


In [6]:
cur.execute("ALTER TABLE events ADD COLUMN effort REAL");

Note: Check the Helpers section to avoid attempting to add a column that already exists.

Operate on a database

Documentation: https://www.sqlite.org/fullsql.html

SQLite implements most of the SQL standard and extends it with a few optional features like JSON.

Most operations are done as part of a transaction and will be kept in a journal until they are commited with conn.commit(). This means that if you insert a row into a table, you will not see it in a query until you commit it.

Let's insert a couple of projects:


In [7]:
cur.execute("INSERT INTO projects (created, name) VALUES (date('now'), 'project A')");
cur.execute("INSERT INTO projects (created, name) VALUES (date('now'), 'project B')");

In [8]:
cur.execute("SELECT * FROM projects")
cur.fetchall()


Out[8]:
[(1, '2018-03-31', 'project A'), (2, '2018-03-31', 'project B')]

And a few events:


In [9]:
events = [("2018-03-12", 1, 1.5, "A stuff"),
          ("2018-03-13", 1, 0.5, "More A stuff"),
          ("2018-03-13", 2, 1.0, "B stuff")]
for event in events:
    cur.execute("INSERT INTO events (date, project_id, effort, comments) VALUES (?, ?, ?, ?)", event)

In [10]:
cur.execute("SELECT * FROM events")
cur.fetchall()


Out[10]:
[(1, '2018-03-12', 1, 'A stuff', 1.5),
 (2, '2018-03-13', 1, 'More A stuff', 0.5),
 (3, '2018-03-13', 2, 'B stuff', 1.0)]

In [11]:
cur.execute("""
SELECT e.id, e.date, p.name AS project, e.effort
FROM events AS e
INNER JOIN projects AS p ON e.project_id = p.id
""")
cur.fetchall()


Out[11]:
[(1, '2018-03-12', 'project A', 1.5),
 (2, '2018-03-13', 'project A', 0.5),
 (3, '2018-03-13', 'project B', 1.0)]

Let's insert a few more records, this time in CSV.


In [12]:
import csv

with open('events.csv', 'r') as events:
    rows = [(r['date'], r['project_id'], r['comments'], r['effort']) for r in csv.DictReader(events)]
    cur.executemany("""
    INSERT INTO events (date, project_id, comments, effort) VALUES (?, ?, ?, ?)
    """, rows)
    conn.commit()
    
cur.execute("""
SELECT e.id, e.date, p.name AS project, e.effort
FROM events AS e
INNER JOIN projects AS p ON e.project_id = p.id
""")
cur.fetchall()


Out[12]:
[(1, '2018-03-12', 'project A', 1.5),
 (2, '2018-03-13', 'project A', 0.5),
 (3, '2018-03-13', 'project B', 1.0),
 (4, '2018-03-01', 'project A', 2.5),
 (5, '2018-03-01', 'project B', 6.0),
 (6, '2018-03-02', 'project B', 8.0),
 (7, '2018-03-03', 'project A', 7.5),
 (8, '2018-03-04', 'project A', 7.5),
 (9, '2018-03-05', 'project B', 2.0)]

And let's change the name of project B to C:


In [13]:
cur.execute("UPDATE projects SET name = ('project C') WHERE id = 2")
cur.execute("SELECT * FROM projects")
cur.fetchall()


Out[13]:
[(1, '2018-03-31', 'project A'), (2, '2018-03-31', 'project C')]

Indexes


In [14]:
cur.execute("CREATE INDEX IF NOT EXISTS event_dates ON events (date)");

In [15]:
cur.execute("SELECT date, effort FROM events GROUP BY date")
cur.fetchall()


Out[15]:
[('2018-03-01', 6.0),
 ('2018-03-02', 8.0),
 ('2018-03-03', 7.5),
 ('2018-03-04', 7.5),
 ('2018-03-05', 2.0),
 ('2018-03-12', 1.5),
 ('2018-03-13', 1.0)]

In [16]:
cur.execute("DROP INDEX IF EXISTS event_dates");

Types

Documentation: https://www.sqlite.org/datatype3.html

  • NULL. The value is a NULL value.
  • INTEGER. The value is a signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
  • REAL. The value is a floating point value, stored as an 8-byte IEEE floating point number.
  • TEXT. The value is a text string, stored using the database encoding (UTF-8, UTF-16BE or UTF-16LE).
  • BLOB. The value is a blob of data, stored exactly as it was input.

SQLite does not have a separate Boolean storage class. Boolean values are stored as integers 0 and 1. It doesn't have a date or datetime type. Instead, you use date and time functions to store them as TEXT, REAL or INTEGER.

Helpers

To check for the table info, you can use the PRAGMA command. Documentation: http://www.sqlite.org/pragma.html#pragma_table_info

The following function could be used instead of the try/except block back in the first example of altering the events table.


In [17]:
def column_exists(table, column):
    res = cur.execute("PRAGMA table_info({})".format(table))
    return len ([x for (_, x, _, _, _, _) in res if x == column]) != 0

column_exists('events', 'effort')


Out[17]:
True

Disconnect from a database


In [18]:
conn.commit()
conn.close()