Why use a database?

  • Many scientists and engineers happy to use Ms Excel, CSV files, or even text files

  • Nothing wrong with that. It works.

    Until it doesn't.

    Example: Current project. We are working with a 5GB csv file. This grew over time, till it has become unmanageable. It can't be opened in Ms Office (OpenOffice), any text editor, or any other utility.

But databases routinely handle Gigabytes of data without any problems

Download the file: http://download.cms.gov/nppes/NPI_Files.html

Remembe to use 7zip to unzip it!

Real Reason (TM) programmers don't use a database: They are scared of the weird syntax of SQL, which looks a lot like it was a language invented for business types in the 60s-70s (it was!). But it's not that scary.


In [ ]:


In [22]:
import sqlite3

In [24]:
conn = sqlite3.connect(":memory:")
c = conn.cursor()

In [25]:
c.execute("CREATE TABLE my_table (Name TEXT NOT NULL, Salary INTEGER NOT NULL DEFAULT 0)")


Out[25]:
<sqlite3.Cursor at 0x42c93b0>

In [26]:
conn.commit()

In [28]:
c.execute("INSERT INTO my_table VALUES ('Joe', 3000)")
conn.commit()

In [29]:
names = [('James', 2000), ('Sally', 5000)]

c.executemany("INSERT INTO my_table VALUES (?,?)", names)
conn.commit()

In [30]:
c.execute("SELECT * FROM my_table WHERE name = 'Joe'")
result = c.fetchall()
print(result)


[('Joe', 3000)]

In [31]:
c.execute("SELECT * FROM my_table WHERE name = 'MrT'")
result = c.fetchall()
print(result)


[]

In [32]:
c.execute("SELECT * FROM my_table WHERE salary > 2000")
result = c.fetchall()
print(result)


[('Joe', 3000), ('Sally', 5000)]

In [33]:
conn.commit()
conn.close()

In [ ]: