PyData Boston, July 2013
Continuum Analytics </span>
Clone my tutorial notes from github: https://github.com/ijstokes/sql-analysis-with-django-orm.git
Using MySQL Employees Test DB for trial data, with modified employees.sql
to work with SQLite3.
Don't download this now -- 25 MB compressed, 250 MB uncompressed
Unzip the Employees test database into the GitHub directory
employees.sql
, but some manual search-and-replace is still required.%s/),/);^M INSERT INTO `replace_with_table_name` VALUES/g
id
field to all tables except employees
.add_idx.py load_TABLENAME.dump
on all dump tables except load_employees.dump
sqlite3 -init employees.sql employees.db
This will create 6 tables. The main one is:
emp_no
birth_date
first_name
last_name
gender
hire_date
The other 5 describe:
Slurping this data in takes a little while (about 30 minutes for me).
You need to have Django installed. A few of these options will work if you don't have it already:
pip install django
conda install django
Now initialize the Django project:
django-admin.py startproject datasnoop
mv datasnoop/datasnoop/* datasnoop
mv datasnoop/manage.py .
rm -Rf datasnoop/datasnoop
This sets up the basic Django pieces. Now we need to add the "app" that will be the specific container for the Employee data:
pushd datasnoop
./manage.py startapp employees
popd
Edit datasnoop/settings.py
to point to employee.db
:
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.sqlite3',
'NAME': os.path.join(BASE_DIR, '../employees_db/employees.db'),
}
}
And also modify datasnoop/settings.py
to include datasnoop.employees
in INSTALLED_APPS
:
INSTALLED_APPS = (
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'datasnoop.employees'
)
Now we can use inspectdb
to extract models from the DB:
./manage.py inspectdb > datasnoop/employees/models.py
Let's take a look at what this has given us in datasnoop/employees/models.py
:
class Employees(models.Model):
emp_no = models.IntegerField(primary_key=True)
birth_date = models.DateField()
first_name = models.CharField(max_length=14)
last_name = models.CharField(max_length=16)
gender = models.CharField(max_length=1)
hire_date = models.DateField()
class Meta:
managed = False
db_table = 'employees'
Nice! We now have an ORM definition for interacting with the DB.
For our whizzy Django Admin interface to work, we need to register the auto-generated models using Django's admin class format.
We need to create a file datasnoop/employees/admin.py
that contains:
from django.contrib import admin
from datasnoop.employees.models import Departments, DeptEmp, DeptManager
from datasnoop.employees.models import Employees, Salaries, Titles
for cls in (Departments, DeptEmp, DeptManager, Employees, Salaries, Titles):
admin.site.register(cls)
We'll see shortly that a bit more work will be needed but this is a good start.
Take a look at our current tables:
$ sqlite3 employees_db/employees.db .tables
departments dept_manager salaries
dept_emp employees titles
Django has some of its own administrative tables it needs, so we'll create these:
./manage.py syncdb
When prompted, add an admin user account with a valid email address and password, then look at the tables that now exist:
$ sqlite3 employees_db/employees.db .tables
auth_group dept_manager
auth_group_permissions django_admin_log
auth_permission django_content_type
auth_user django_session
auth_user_groups employees
auth_user_user_permissions salaries
departments titles
dept_emp
We'll startup a simple webserver to connect to the admin interface of our Django website:
./manage.py runserver
Now connect to the admin interface via http://localhost:8000/admin/
with the username and password you set earlier.
Well, not bad, but the aggregated lists of objects aren't very informative. We'll fix that next.
Take a look at datasnoop/employees/admin_basic.py
, which is what we started with:
from datasnoop.employees.models import Departments, DeptEmp, DeptManager, Employees, Salaries, Titles
for cls in (Departments, DeptEmp, DeptManager, Employees, Salaries, Titles):
admin.site.register(cls)
We're just taking advantage of the automatic admin interface. Instead, we need to specify exactly which fields we want to display in our list view, which we do in datasnoop/employees/admin_list.py
:
class EmployeesAdmin(admin.ModelAdmin):
list_display = ('emp_no', 'last_name', 'first_name', 'gender', 'birth_date', 'hire_date')
admin.site.register(Employees, EmployeesAdmin)
Now let's see what this give us through our Admin web interface: http://localhost:8000/admin/
If we click on the column headers, we can see it will sort the results by that column. Django Admin also provides mechanisms to facilitate search and filter. Let's add those in, by looking at datasnoop/employees/admin_filter.py
:
class EmployeesAdmin(admin.ModelAdmin):
list_display = ('emp_no', 'last_name', 'first_name', 'gender', 'birth_date', 'hire_date')
list_filter = ('gender', 'birth_date', 'hire_date')
search_fields = ['last_name']
date_hierarchy = 'birth_date'
admin.site.register(Employees, EmployeesAdmin)
If you didn't perform the SQL data transformations described earlier, then the auto-generated models from inspectdb
won't identify foreign keys, so unfortunately we can't reference through.
As a challenge, see if you can modify datasnoop/employees/models.py
to reference ForeignKeys correctly -- there are some catches because of how Django wants to auto-index all keys.
Without model.ForeignKey('Foo')
references we can't grab data from referenced fields.