SQL

Accessing data stored in databases is a routine exercise. I demonstrate a few helpful methods in the Jupyter Notebook.

pip install ipython-sql


In [1]:
%load_ext sql


/Users/jbw/anaconda/lib/python2.7/site-packages/IPython/config.py:13: ShimWarning: The `IPython.config` package has been deprecated. You should import from traitlets.config instead.
  "You should import from traitlets.config instead.", ShimWarning)
/Users/jbw/anaconda/lib/python2.7/site-packages/IPython/utils/traitlets.py:5: UserWarning: IPython.utils.traitlets has moved to a top-level traitlets package.
  warn("IPython.utils.traitlets has moved to a top-level traitlets package.")

In [2]:
%config SqlMagic.autopandas=True

In [3]:
import pandas as pd
import sqlite3

In [4]:
%%sql sqlite://
CREATE TABLE presidents (first_name, last_name, year_of_birth);
INSERT INTO presidents VALUES ('George', 'Washington', 1732);
INSERT INTO presidents VALUES ('John', 'Adams', 1735);
INSERT INTO presidents VALUES ('Thomas', 'Jefferson', 1743);
INSERT INTO presidents VALUES ('James', 'Madison', 1751);
INSERT INTO presidents VALUES ('James', 'Monroe', 1758);
INSERT INTO presidents VALUES ('Zachary', 'Taylor', 1784);
INSERT INTO presidents VALUES ('Abraham', 'Lincoln', 1809);
INSERT INTO presidents VALUES ('Theodore', 'Roosevelt', 1858);
INSERT INTO presidents VALUES ('Richard', 'Nixon', 1913);
INSERT INTO presidents VALUES ('Barack', 'Obama', 1961);


Done.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
Out[4]:

In [5]:
later_presidents = %sql SELECT * FROM presidents WHERE year_of_birth > 1825
later_presidents


Done.
Out[5]:
first_name last_name year_of_birth
0 Theodore Roosevelt 1858
1 Richard Nixon 1913
2 Barack Obama 1961

In [6]:
type(later_presidents)


Out[6]:
pandas.core.frame.DataFrame

In [7]:
con = sqlite3.connect("presidents.sqlite")
later_presidents.to_sql("presidents", con, if_exists='replace')

Through pandas directly


In [8]:
con = sqlite3.connect("presidents.sqlite")
cur = con.cursor()

new_dataframe = pd.read_sql("SELECT * \
                 FROM presidents", 
                 con=con)

con.close()

In [9]:
new_dataframe


Out[9]:
index first_name last_name year_of_birth
0 0 Theodore Roosevelt 1858
1 1 Richard Nixon 1913
2 2 Barack Obama 1961

In [10]:
type(new_dataframe)


Out[10]:
pandas.core.frame.DataFrame

In [ ]: