Copyright Jana Schaich Borg/Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

MySQL Exercise 7: Joining Tables with Inner Joins

Before completing these exercises, I strongly recommend that you watch the video called "What are Joins?" that describe what joins are, and how different types of joins work.

As one of the last building blocks we need to address our Dognition analysis questions, in this lesson we will learn how to combine tables using inner joins.

1. Inner Joins between 2 tables

To begin, load the sql library, connect to the Dognition database, and set the Dognition database as the default.


In [55]:
%load_ext sql
%sql mysql://studentuser:studentpw@mysqlserver/dognitiondb
%sql USE dognitiondb

%config SqlMagic.displaylimit=25


The sql extension is already loaded. To reload it, use:
  %reload_ext sql
(_mysql_exceptions.OperationalError) (2006, 'MySQL server has gone away') [SQL: 'USE dognitiondb']

Recall that tables in relational databases are linked through primary keys and sometimes other fields that are common to multiple tables (as is the case with our Dognition data set). Our goal when we execute a JOIN or make a joined table is to use those common columns to let the database figure out which rows in one table match up to which rows in another table. Once that mapping is established using at least one common field or column, the database can pull any columns you want out of the mapped, or joined, tables and output the matched data to one common table.

An inner join is a join that outputs only rows that have an exact match in both tables being joined:

To illustrate how this works, let's find out whether dog owners that are particularly surprised by their dog's performance on Dognition tests tend to own similar breeds (or breed types, or breed groups) of dogs. There are many ways to address this question, but let's start by focusing on the dog owners who provided at least 10 ratings for one or more of their dogs in the ratings table. Of these owners, which 200 owners reported the highest average amount of surprise at their dog's performance, and what was the breed, breed_type, and breed_group of each of these owner's dog?

The surprise ratings are stored in the reviews table. The dog breed information is provided in the dogs table. There are two columns that are common to both tables: user_guid and dog_guid. How do we use the common columns to combine information from the two tables?

To join the tables, you can use a WHERE clause and add a couple of details to the FROM clause so that the database knows from what table each field in your SELECT clause comes.

First, start by adding all the columns we want to examine to the SELECT statement:

SELECT dog_guid AS DogID, user_guid AS UserID, AVG(rating) AS AvgRating, 
       COUNT(rating) AS NumRatings, breed, breed_group, breed_type

then list all the tables from which the fields we are interested in come, separated by commas (with no comma at the end of the list):

FROM dogs, reviews

then add the other restrictions:

GROUP BY user_guid
HAVING NumRatings >= 10
ORDER BY AvgRating DESC
LIMIT 200

Try running this query and see what happens:


In [2]:
%%sql
SELECT d.dog_guid AS DogID, d.user_guid AS UserID, AVG(r.rating) AS AvgRating,
    COUNT(r.rating) AS NumRatings, d.breed, d.breed_group, d.breed_type
FROM dogs d, reviews r
WHERE d.dog_guid=r.dog_guid AND d.user_guid=r.user_guid
GROUP BY d.user_guid
HAVING NumRatings >= 10
ORDER BY AvgRating DESC
LIMIT 200


38 rows affected.
Out[2]:
DogID UserID AvgRating NumRatings breed breed_group breed_type
fdbf39f8-7144-11e5-ba71-058fbc01cf0b ce987914-7144-11e5-ba71-058fbc01cf0b 8.0000 12 Canaan Dog Herding Pure Breed
fdc09a82-7144-11e5-ba71-058fbc01cf0b ce99bb12-7144-11e5-ba71-058fbc01cf0b 5.0000 10 Golden Doodle None Popular Hybrid
fdbef330-7144-11e5-ba71-058fbc01cf0b ce984d2c-7144-11e5-ba71-058fbc01cf0b 4.5385 13 Havanese Toy Pure Breed
fdc0518a-7144-11e5-ba71-058fbc01cf0b ce99661c-7144-11e5-ba71-058fbc01cf0b 3.7333 15 Mixed None Mixed Breed/ Other/ I Don't Know
fd684cf6-7144-11e5-ba71-058fbc01cf0b ce473856-7144-11e5-ba71-058fbc01cf0b 3.7222 18 Labrador Retriever Sporting Pure Breed
fdbf66e4-7144-11e5-ba71-058fbc01cf0b ce98b9b0-7144-11e5-ba71-058fbc01cf0b 3.5000 14 Golden Retriever Sporting Pure Breed
fdc09b0e-7144-11e5-ba71-058fbc01cf0b ce99bb76-7144-11e5-ba71-058fbc01cf0b 3.2308 13 Golden Doodle None Popular Hybrid
fdbeedc2-7144-11e5-ba71-058fbc01cf0b ce9847dc-7144-11e5-ba71-058fbc01cf0b 3.1538 13 Eurasier None Pure Breed
fdc180be-7144-11e5-ba71-058fbc01cf0b ce9a7110-7144-11e5-ba71-058fbc01cf0b 3.0625 16 Chihuahua-Dachshund Mix None Cross Breed
fd6aabd6-7144-11e5-ba71-058fbc01cf0b ce6cd804-7144-11e5-ba71-058fbc01cf0b 3.0000 13 American Pit Bull Terrier None Pure Breed
fdbfc0b2-7144-11e5-ba71-058fbc01cf0b ce98f3bc-7144-11e5-ba71-058fbc01cf0b 2.6923 13 Mixed None Mixed Breed/ Other/ I Don't Know
fdbf96dc-7144-11e5-ba71-058fbc01cf0b ce98ce0a-7144-11e5-ba71-058fbc01cf0b 2.5714 14 American Pit Bull Terrier-American Staffordshire Terrier Mix None Cross Breed
fdbedef4-7144-11e5-ba71-058fbc01cf0b ce982e46-7144-11e5-ba71-058fbc01cf0b 2.4615 13 Mixed None Mixed Breed/ Other/ I Don't Know
fdbec8ce-7144-11e5-ba71-058fbc01cf0b ce980e3e-7144-11e5-ba71-058fbc01cf0b 2.4545 11 Basset Hound Hound Pure Breed
fdbeff06-7144-11e5-ba71-058fbc01cf0b ce9856f0-7144-11e5-ba71-058fbc01cf0b 2.4000 10 Bearded Collie Herding Pure Breed
fdbed7ce-7144-11e5-ba71-058fbc01cf0b ce981dc0-7144-11e5-ba71-058fbc01cf0b 2.3333 18 Havanese Toy Pure Breed
fdc1c31c-7144-11e5-ba71-058fbc01cf0b ce9ac674-7144-11e5-ba71-058fbc01cf0b 2.3077 13 Labrador Retriever Sporting Pure Breed
fdc1a224-7144-11e5-ba71-058fbc01cf0b ce9ab1f2-7144-11e5-ba71-058fbc01cf0b 2.2857 14 Other None Mixed Breed/ Other/ I Don't Know
fdc0c49e-7144-11e5-ba71-058fbc01cf0b ce99d674-7144-11e5-ba71-058fbc01cf0b 2.2778 18 Other None Mixed Breed/ Other/ I Don't Know
fdc19b30-7144-11e5-ba71-058fbc01cf0b ce9aab30-7144-11e5-ba71-058fbc01cf0b 2.2222 18 Brittany Sporting Pure Breed
fdc0c502-7144-11e5-ba71-058fbc01cf0b ce99d732-7144-11e5-ba71-058fbc01cf0b 2.0000 15 Golden Retriever Sporting Pure Breed
fdbedf94-7144-11e5-ba71-058fbc01cf0b ce982ed2-7144-11e5-ba71-058fbc01cf0b 2.0000 11 Mixed None Mixed Breed/ Other/ I Don't Know
fd7104ae-7144-11e5-ba71-058fbc01cf0b ce7212a6-7144-11e5-ba71-058fbc01cf0b 1.9000 10 Mixed None Mixed Breed/ Other/ I Don't Know
fdbfddfe-7144-11e5-ba71-058fbc01cf0b ce99049c-7144-11e5-ba71-058fbc01cf0b 1.8000 40 Irish Water Spaniel Sporting Pure Breed
fdc198d8-7144-11e5-ba71-058fbc01cf0b ce9a9500-7144-11e5-ba71-058fbc01cf0b 1.8000 10 German Shepherd Dog Herding Pure Breed
38 rows, truncated to displaylimit of 25

You should receive an error message stating that the identity of dog_guid and user_guid in the field list is ambiguous. The reason is that the column title exists in both tables, and MySQL doesn't know which one we want. We have to specify the table name before stating the field name, and separate the two names by a period (NOTE: read this entire section before deciding whether you want to execute this query)<mark>:

SELECT dogs.dog_guid AS DogID, dogs.user_guid AS UserID, AVG(reviews.rating) AS AvgRating,     
       COUNT(reviews.rating) AS NumRatings, dogs.breed, dogs.breed_group, dogs.breed_type
FROM dogs, reviews
GROUP BY dogs.user_guid
HAVING NumRatings >= 10
ORDER BY AvgRating DESC
LIMIT 200

You can also take advantage of aliases so that you don't have to write out the name of the tables each time. Here I will introduce another syntax for aliases that omits the AS completely. In this syntax, the alias is whatever word (or phrase, if you use quotation marks) follows immediately after the field or table name, separated by a space. So we could write:

SELECT d.dog_guid AS DogID, d.user_guid AS UserID, AVG(r.rating) AS AvgRating, 
       COUNT(r.rating) AS NumRatings, d.breed, d.breed_group, d.breed_type
FROM dogs d, reviews r
GROUP BY d.user_guid
HAVING NumRatings >= 10
ORDER BY AvgRating DESC
LIMIT 200

I am tempted to tell you to run this query so that you will see what happens, but instead, I will explain what will happen and let you decide if you want to see what the output looks...and feels...like.

There is nothing built into the database table definitions that can instruct the server how to combine the tables on its own (remember, this is how relational databases save space and remain flexible). Further, the query as written does not tell the database how the two tables are related. As a consequence, rather than match up the two tables according to the values in the user_id and/or dog_id column, the database will do the only thing it knows how to do which is output every single combination of the records in the dogs table with the records in the reviews table. In other words, every single row of the dogs table will get paired with every single row of the reviews table. This is known as a Cartesian product. Not only will it be a heavy burden on the database to output a table that has the full length of one table multiplied times the full length of another (and frustrating to you, because the query would take a very long time to run), the output would be close to useless.

To prevent this from happening, tell the database how to relate the tables in the WHERE clause:

SELECT d.dog_guid AS DogID, d.user_guid AS UserID, AVG(r.rating) AS AvgRating, 
       COUNT(r.rating) AS NumRatings, d.breed, d.breed_group, d.breed_type
FROM dogs d, reviews r
WHERE d.dog_guid=r.dog_guid
GROUP BY d.user_guid
HAVING NumRatings >= 10
ORDER BY AvgRating DESC
LIMIT 200

To be very careful and exclude any incorrect dog_guid or user_guid entries, you can include both shared columns in the WHERE clause:

SELECT d.dog_guid AS DogID, d.user_guid AS UserID, AVG(r.rating) AS AvgRating, 
       COUNT(r.rating) AS NumRatings, d.breed, d.breed_group, d.breed_type
FROM dogs d, reviews r
WHERE d.dog_guid=r.dog_guid AND d.user_guid=r.user_guid
GROUP BY d.user_guid
HAVING NumRatings >= 10
ORDER BY AvgRating DESC
LIMIT 200

Try running this query now:


In [3]:
%%sql
SELECT d.dog_guid AS DogID, d.user_guid AS UserID, AVG(r.rating) AS AvgRating, 
       COUNT(r.rating) AS NumRatings, d.breed, d.breed_group, d.breed_type
FROM dogs d, reviews r
WHERE d.dog_guid=r.dog_guid AND d.user_guid=r.user_guid
GROUP BY d.user_guid
HAVING NumRatings >= 10
ORDER BY AvgRating DESC
LIMIT 200


38 rows affected.
Out[3]:
DogID UserID AvgRating NumRatings breed breed_group breed_type
fdbf39f8-7144-11e5-ba71-058fbc01cf0b ce987914-7144-11e5-ba71-058fbc01cf0b 8.0000 12 Canaan Dog Herding Pure Breed
fdc09a82-7144-11e5-ba71-058fbc01cf0b ce99bb12-7144-11e5-ba71-058fbc01cf0b 5.0000 10 Golden Doodle None Popular Hybrid
fdbef330-7144-11e5-ba71-058fbc01cf0b ce984d2c-7144-11e5-ba71-058fbc01cf0b 4.5385 13 Havanese Toy Pure Breed
fdc0518a-7144-11e5-ba71-058fbc01cf0b ce99661c-7144-11e5-ba71-058fbc01cf0b 3.7333 15 Mixed None Mixed Breed/ Other/ I Don't Know
fd684cf6-7144-11e5-ba71-058fbc01cf0b ce473856-7144-11e5-ba71-058fbc01cf0b 3.7222 18 Labrador Retriever Sporting Pure Breed
fdbf66e4-7144-11e5-ba71-058fbc01cf0b ce98b9b0-7144-11e5-ba71-058fbc01cf0b 3.5000 14 Golden Retriever Sporting Pure Breed
fdc09b0e-7144-11e5-ba71-058fbc01cf0b ce99bb76-7144-11e5-ba71-058fbc01cf0b 3.2308 13 Golden Doodle None Popular Hybrid
fdbeedc2-7144-11e5-ba71-058fbc01cf0b ce9847dc-7144-11e5-ba71-058fbc01cf0b 3.1538 13 Eurasier None Pure Breed
fdc180be-7144-11e5-ba71-058fbc01cf0b ce9a7110-7144-11e5-ba71-058fbc01cf0b 3.0625 16 Chihuahua-Dachshund Mix None Cross Breed
fd6aabd6-7144-11e5-ba71-058fbc01cf0b ce6cd804-7144-11e5-ba71-058fbc01cf0b 3.0000 13 American Pit Bull Terrier None Pure Breed
fdbfc0b2-7144-11e5-ba71-058fbc01cf0b ce98f3bc-7144-11e5-ba71-058fbc01cf0b 2.6923 13 Mixed None Mixed Breed/ Other/ I Don't Know
fdbf96dc-7144-11e5-ba71-058fbc01cf0b ce98ce0a-7144-11e5-ba71-058fbc01cf0b 2.5714 14 American Pit Bull Terrier-American Staffordshire Terrier Mix None Cross Breed
fdbedef4-7144-11e5-ba71-058fbc01cf0b ce982e46-7144-11e5-ba71-058fbc01cf0b 2.4615 13 Mixed None Mixed Breed/ Other/ I Don't Know
fdbec8ce-7144-11e5-ba71-058fbc01cf0b ce980e3e-7144-11e5-ba71-058fbc01cf0b 2.4545 11 Basset Hound Hound Pure Breed
fdbeff06-7144-11e5-ba71-058fbc01cf0b ce9856f0-7144-11e5-ba71-058fbc01cf0b 2.4000 10 Bearded Collie Herding Pure Breed
fdbed7ce-7144-11e5-ba71-058fbc01cf0b ce981dc0-7144-11e5-ba71-058fbc01cf0b 2.3333 18 Havanese Toy Pure Breed
fdc1c31c-7144-11e5-ba71-058fbc01cf0b ce9ac674-7144-11e5-ba71-058fbc01cf0b 2.3077 13 Labrador Retriever Sporting Pure Breed
fdc1a224-7144-11e5-ba71-058fbc01cf0b ce9ab1f2-7144-11e5-ba71-058fbc01cf0b 2.2857 14 Other None Mixed Breed/ Other/ I Don't Know
fdc0c49e-7144-11e5-ba71-058fbc01cf0b ce99d674-7144-11e5-ba71-058fbc01cf0b 2.2778 18 Other None Mixed Breed/ Other/ I Don't Know
fdc19b30-7144-11e5-ba71-058fbc01cf0b ce9aab30-7144-11e5-ba71-058fbc01cf0b 2.2222 18 Brittany Sporting Pure Breed
fdbedf94-7144-11e5-ba71-058fbc01cf0b ce982ed2-7144-11e5-ba71-058fbc01cf0b 2.0000 11 Mixed None Mixed Breed/ Other/ I Don't Know
fdc0c502-7144-11e5-ba71-058fbc01cf0b ce99d732-7144-11e5-ba71-058fbc01cf0b 2.0000 15 Golden Retriever Sporting Pure Breed
fd7104ae-7144-11e5-ba71-058fbc01cf0b ce7212a6-7144-11e5-ba71-058fbc01cf0b 1.9000 10 Mixed None Mixed Breed/ Other/ I Don't Know
fdbfddfe-7144-11e5-ba71-058fbc01cf0b ce99049c-7144-11e5-ba71-058fbc01cf0b 1.8000 40 Irish Water Spaniel Sporting Pure Breed
fdc198d8-7144-11e5-ba71-058fbc01cf0b ce9a9500-7144-11e5-ba71-058fbc01cf0b 1.8000 10 German Shepherd Dog Herding Pure Breed
38 rows, truncated to displaylimit of 25

The query should execute quickly. This would NOT have been the case if you did not include the WHERE clause to combine the two tables. If you accidentally request a Cartesian product from datasets with billions of rows, you could be waiting for your query output for days (and will probably get in trouble with your database administrator). So always remember to tell the database how to join your tables!

Let's examine our joined table a bit further. The joined table outputted by the query above should have 38 rows, despite the fact that we set our LIMIT at 200. The reason for this is that it turns out that a relatively small number of customers provided 10 or more reviews. If you remove the HAVING and LIMIT BY clause from the query, you should end up with 389 rows. Go ahead and try it:


In [4]:
%%sql
SELECT d.dog_guid AS DogID, d.user_guid AS UserID, AVG(r.rating) AS AvgRating, 
       COUNT(r.rating) AS NumRatings, d.breed, d.breed_group, d.breed_type
FROM dogs d, reviews r
WHERE d.dog_guid=r.dog_guid AND d.user_guid=r.user_guid
GROUP BY d.user_guid
ORDER BY AvgRating DESC


389 rows affected.
Out[4]:
DogID UserID AvgRating NumRatings breed breed_group breed_type
fdc19554-7144-11e5-ba71-058fbc01cf0b ce9a8a24-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Miniature Pinscher Toy Pure Breed
fdc147ac-7144-11e5-ba71-058fbc01cf0b ce9a545a-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Tibetan Terrier Non-Sporting Pure Breed
fdc121dc-7144-11e5-ba71-058fbc01cf0b ce9a310a-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdbedd14-7144-11e5-ba71-058fbc01cf0b ce975994-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdc1c880-7144-11e5-ba71-058fbc01cf0b ce96c768-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Maltese Toy Pure Breed
fdbf178e-7144-11e5-ba71-058fbc01cf0b ce986546-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Airedale Terrier Terrier Pure Breed
fdbf94de-7144-11e5-ba71-058fbc01cf0b ce98cc34-7144-11e5-ba71-058fbc01cf0b 9.0000 2 Pembroke Welsh Corgi Herding Pure Breed
fdc019c2-7144-11e5-ba71-058fbc01cf0b ce99489e-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdc057ac-7144-11e5-ba71-058fbc01cf0b ce996b9e-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdc059fa-7144-11e5-ba71-058fbc01cf0b ce996dc4-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Irish Water Spaniel Sporting Pure Breed
fdbf39f8-7144-11e5-ba71-058fbc01cf0b ce987914-7144-11e5-ba71-058fbc01cf0b 8.0000 12 Canaan Dog Herding Pure Breed
fd41bca8-7144-11e5-ba71-058fbc01cf0b ce7b055a-7144-11e5-ba71-058fbc01cf0b 8.0000 1 Labrador Retriever Sporting Pure Breed
fdc1e1a8-7144-11e5-ba71-058fbc01cf0b ce9ad934-7144-11e5-ba71-058fbc01cf0b 7.5000 2 Mixed None Mixed Breed/ Other/ I Don't Know
fdbf940c-7144-11e5-ba71-058fbc01cf0b ce98cb80-7144-11e5-ba71-058fbc01cf0b 7.0000 1 Soft Coated Wheaten Terrier Terrier Pure Breed
fdc1baca-7144-11e5-ba71-058fbc01cf0b ce9ac2fa-7144-11e5-ba71-058fbc01cf0b 7.0000 1 Poodle-Miniature Schnauzer Mix None Cross Breed
fdbe7a40-7144-11e5-ba71-058fbc01cf0b ce97df9a-7144-11e5-ba71-058fbc01cf0b 7.0000 6 Mixed None Mixed Breed/ Other/ I Don't Know
fdbe8774-7144-11e5-ba71-058fbc01cf0b ce97ebfc-7144-11e5-ba71-058fbc01cf0b 7.0000 4 Labrador Retriever Sporting Pure Breed
fdc2417a-7144-11e5-ba71-058fbc01cf0b ce9b35e6-7144-11e5-ba71-058fbc01cf0b 7.0000 1 Bichon Frise Non-Sporting Pure Breed
fdc1706a-7144-11e5-ba71-058fbc01cf0b ce9a6846-7144-11e5-ba71-058fbc01cf0b 6.8000 5 Mixed None Mixed Breed/ Other/ I Don't Know
fdbf8598-7144-11e5-ba71-058fbc01cf0b ce98c46e-7144-11e5-ba71-058fbc01cf0b 6.5000 2 Labrador Retriever Sporting Pure Breed
fdc0c2aa-7144-11e5-ba71-058fbc01cf0b ce99d322-7144-11e5-ba71-058fbc01cf0b 6.5000 2 Border Collie Herding Pure Breed
fdc0827c-7144-11e5-ba71-058fbc01cf0b ce99af32-7144-11e5-ba71-058fbc01cf0b 6.0000 1 Irish Setter Sporting Pure Breed
fdc0345c-7144-11e5-ba71-058fbc01cf0b ce995f78-7144-11e5-ba71-058fbc01cf0b 6.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdc0d75e-7144-11e5-ba71-058fbc01cf0b ce99e466-7144-11e5-ba71-058fbc01cf0b 6.0000 1 Dachshund Hound Pure Breed
fdc1ab52-7144-11e5-ba71-058fbc01cf0b ce9ab8f0-7144-11e5-ba71-058fbc01cf0b 6.0000 1 Golden Retriever Sporting Pure Breed
389 rows, truncated to displaylimit of 25

It's clear from looking at this output that (A) not many customers provided ratings, and (B) when they did, they usually were not very surprised by their dog's performance. Therefore, these ratings are probably not going to provide a lot of instructive insight into how to improve Dognition's completion rate. However, the ratings table still provides a great opportunity to illustrate the results of different types of joins.

To help prepare us for this:

Questions 1-4: How many unique dog_guids and user_guids are there in the reviews and dogs table independently?


In [5]:
%%sql
SELECT COUNT(DISTINCT dog_guid) AS uniq_dog_guid, COUNT(DISTINCT user_guid) AS uniq_user_guid
FROM reviews


1 rows affected.
Out[5]:
uniq_dog_guid uniq_user_guid
5991 5586

In [6]:
%%sql
SELECT COUNT(DISTINCT dog_guid) AS uniq_dog_guid, COUNT(DISTINCT user_guid) AS uniq_user_guid
FROM dogs


1 rows affected.
Out[6]:
uniq_dog_guid uniq_user_guid
35050 30967

In [ ]:


In [ ]:

These counts indicate some important things:

  • Many customers in both the reviews and the dogs table have multiple dogs
  • There are many more unique dog_guids and user_guids in the dogs table than the reviews table
  • There are many more unique dog_guids and user_guids in the reviews table than in the output of our inner join

Let's test one more thing.

Try the inner join query once with just the dog_guid or once with just the user_guid clause in the WHERE statement:


In [7]:
%%sql
SELECT d.dog_guid AS DogID, d.user_guid AS UserID, AVG(r.rating) AS AvgRating, 
       COUNT(r.rating) AS NumRatings, d.breed, d.breed_group, d.breed_type
FROM dogs d, reviews r
WHERE d.dog_guid=r.dog_guid
GROUP BY d.user_guid
ORDER BY AvgRating DESC


389 rows affected.
Out[7]:
DogID UserID AvgRating NumRatings breed breed_group breed_type
fdc121dc-7144-11e5-ba71-058fbc01cf0b ce9a310a-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdbedd14-7144-11e5-ba71-058fbc01cf0b ce975994-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdc1c880-7144-11e5-ba71-058fbc01cf0b ce96c768-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Maltese Toy Pure Breed
fdbf178e-7144-11e5-ba71-058fbc01cf0b ce986546-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Airedale Terrier Terrier Pure Breed
fdbf94de-7144-11e5-ba71-058fbc01cf0b ce98cc34-7144-11e5-ba71-058fbc01cf0b 9.0000 2 Pembroke Welsh Corgi Herding Pure Breed
fdc019c2-7144-11e5-ba71-058fbc01cf0b ce99489e-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdc057ac-7144-11e5-ba71-058fbc01cf0b ce996b9e-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdc059fa-7144-11e5-ba71-058fbc01cf0b ce996dc4-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Irish Water Spaniel Sporting Pure Breed
fdc19554-7144-11e5-ba71-058fbc01cf0b ce9a8a24-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Miniature Pinscher Toy Pure Breed
fdc147ac-7144-11e5-ba71-058fbc01cf0b ce9a545a-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Tibetan Terrier Non-Sporting Pure Breed
fd41bca8-7144-11e5-ba71-058fbc01cf0b ce7b055a-7144-11e5-ba71-058fbc01cf0b 8.0000 1 Labrador Retriever Sporting Pure Breed
fdbf39f8-7144-11e5-ba71-058fbc01cf0b ce987914-7144-11e5-ba71-058fbc01cf0b 8.0000 12 Canaan Dog Herding Pure Breed
fdc1e1a8-7144-11e5-ba71-058fbc01cf0b ce9ad934-7144-11e5-ba71-058fbc01cf0b 7.5000 2 Mixed None Mixed Breed/ Other/ I Don't Know
fdbf940c-7144-11e5-ba71-058fbc01cf0b ce98cb80-7144-11e5-ba71-058fbc01cf0b 7.0000 1 Soft Coated Wheaten Terrier Terrier Pure Breed
fdc1baca-7144-11e5-ba71-058fbc01cf0b ce9ac2fa-7144-11e5-ba71-058fbc01cf0b 7.0000 1 Poodle-Miniature Schnauzer Mix None Cross Breed
fdbe7a40-7144-11e5-ba71-058fbc01cf0b ce97df9a-7144-11e5-ba71-058fbc01cf0b 7.0000 6 Mixed None Mixed Breed/ Other/ I Don't Know
fdbe8774-7144-11e5-ba71-058fbc01cf0b ce97ebfc-7144-11e5-ba71-058fbc01cf0b 7.0000 4 Labrador Retriever Sporting Pure Breed
fdc2417a-7144-11e5-ba71-058fbc01cf0b ce9b35e6-7144-11e5-ba71-058fbc01cf0b 7.0000 1 Bichon Frise Non-Sporting Pure Breed
fdc1706a-7144-11e5-ba71-058fbc01cf0b ce9a6846-7144-11e5-ba71-058fbc01cf0b 6.8000 5 Mixed None Mixed Breed/ Other/ I Don't Know
fdbf8598-7144-11e5-ba71-058fbc01cf0b ce98c46e-7144-11e5-ba71-058fbc01cf0b 6.5000 2 Labrador Retriever Sporting Pure Breed
fdc0c2aa-7144-11e5-ba71-058fbc01cf0b ce99d322-7144-11e5-ba71-058fbc01cf0b 6.5000 2 Border Collie Herding Pure Breed
fdc07f66-7144-11e5-ba71-058fbc01cf0b ce99ac9e-7144-11e5-ba71-058fbc01cf0b 6.0000 2 Pug Toy Pure Breed
fdc0ef00-7144-11e5-ba71-058fbc01cf0b ce99fe4c-7144-11e5-ba71-058fbc01cf0b 6.0000 1 I Don't Know None Mixed Breed/ Other/ I Don't Know
fdbee278-7144-11e5-ba71-058fbc01cf0b ce9782c0-7144-11e5-ba71-058fbc01cf0b 6.0000 1 Miniature Pinscher Toy Pure Breed
fdc0cebc-7144-11e5-ba71-058fbc01cf0b ce99ddd6-7144-11e5-ba71-058fbc01cf0b 6.0000 2 Mixed None Mixed Breed/ Other/ I Don't Know
389 rows, truncated to displaylimit of 25

In [8]:
%%sql
SELECT d.dog_guid AS DogID, d.user_guid AS UserID, AVG(r.rating) AS AvgRating, 
       COUNT(r.rating) AS NumRatings, d.breed, d.breed_group, d.breed_type
FROM dogs d, reviews r
WHERE d.user_guid=r.user_guid
GROUP BY d.user_guid
ORDER BY AvgRating DESC


5586 rows affected.
Out[8]:
DogID UserID AvgRating NumRatings breed breed_group breed_type
fdb5a064-7144-11e5-ba71-058fbc01cf0b ce90d574-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Dachshund Hound Pure Breed
fd6dfdd6-7144-11e5-ba71-058fbc01cf0b ce6f6754-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdc19554-7144-11e5-ba71-058fbc01cf0b ce9a8a24-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Miniature Pinscher Toy Pure Breed
fd74bebe-7144-11e5-ba71-058fbc01cf0b ce75182a-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdb98882-7144-11e5-ba71-058fbc01cf0b ce948d0e-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Labradoodle None Popular Hybrid
fdb07be8-7144-11e5-ba71-058fbc01cf0b ce872114-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Soft Coated Wheaten Terrier Terrier Pure Breed
fd6f6e50-7144-11e5-ba71-058fbc01cf0b ce70a56a-7144-11e5-ba71-058fbc01cf0b 9.0000 2 Mixed None Mixed Breed/ Other/ I Don't Know
fdb967c6-7144-11e5-ba71-058fbc01cf0b ce946d1a-7144-11e5-ba71-058fbc01cf0b 9.0000 2 Jack Russell Terrier Terrier Pure Breed
fdad6188-7144-11e5-ba71-058fbc01cf0b ce83a5c0-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Dalmatian Non-Sporting Pure Breed
fdb338b0-7144-11e5-ba71-058fbc01cf0b ce898bde-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Maltese Toy Pure Breed
fd7e6856-7144-11e5-ba71-058fbc01cf0b ce7c7a20-7144-11e5-ba71-058fbc01cf0b 9.0000 2 Mixed None Mixed Breed/ Other/ I Don't Know
fdc147ac-7144-11e5-ba71-058fbc01cf0b ce9a545a-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Tibetan Terrier Non-Sporting Pure Breed
fdbbbc60-7144-11e5-ba71-058fbc01cf0b ce967cfe-7144-11e5-ba71-058fbc01cf0b 9.0000 1 German Shepherd Dog-Border Collie Mix None Cross Breed
fdb1683c-7144-11e5-ba71-058fbc01cf0b ce880700-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Shih Tzu Toy Pure Breed
fd6f91d2-7144-11e5-ba71-058fbc01cf0b ce70c298-7144-11e5-ba71-058fbc01cf0b 9.0000 1 French Bulldog Non-Sporting Pure Breed
fd6d535e-7144-11e5-ba71-058fbc01cf0b ce6ecd80-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Vizsla Sporting Pure Breed
fdc019c2-7144-11e5-ba71-058fbc01cf0b ce99489e-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Mixed None Mixed Breed/ Other/ I Don't Know
fdba5780-7144-11e5-ba71-058fbc01cf0b ce956d78-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Maltese-Bichon Frise Mix None Cross Breed
fd6ee764-7144-11e5-ba71-058fbc01cf0b ce6fe8c8-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Pyrenean Shepherd Herding Pure Breed
fd89a3b0-7144-11e5-ba71-058fbc01cf0b ce7fc45a-7144-11e5-ba71-058fbc01cf0b 9.0000 1 I Don't Know None Mixed Breed/ Other/ I Don't Know
fd7537ea-7144-11e5-ba71-058fbc01cf0b ce757f9a-7144-11e5-ba71-058fbc01cf0b 9.0000 2 German Shepherd Dog Herding Pure Breed
fd6e8076-7144-11e5-ba71-058fbc01cf0b ce6fcdb6-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Labrador Retriever Sporting Pure Breed
fd51c72e-7144-11e5-ba71-058fbc01cf0b ce32fd00-7144-11e5-ba71-058fbc01cf0b 9.0000 1 Miniature Schnauzer Terrier Pure Breed
fdb67b74-7144-11e5-ba71-058fbc01cf0b ce91bfd4-7144-11e5-ba71-058fbc01cf0b 9.0000 2 American Pit Bull Terrier-Vizsla Mix None Cross Breed
fd6ac47c-7144-11e5-ba71-058fbc01cf0b ce6d4c80-7144-11e5-ba71-058fbc01cf0b 9.0000 1 German Shepherd Dog-Boxer Mix None Cross Breed
5586 rows, truncated to displaylimit of 25

When you run the query by joining on the dog_guid only, you still get 389 rows in your output. When you run the query by joining on the user_guid only, you get 5586 rows in your output. This means that:

  • All of the user_guids in the reviews table are in the dogs table
  • Only 389 of the over 5000 dog_guids in the reviews table are in the dogs table

Perhaps most importantly for our current purposes, these COUNT queries show you that *inner joins only output the data from rows that have equivalent values in both tables being joined*. If you wanted to include all the dog_guids or user_guids in one or both of the tables, you would have to use an outer join, which we will practice in the next lesson.

Try an inner join on your own.

Question 5: How would you extract the user_guid, dog_guid, breed, breed_type, and breed_group for all animals who completed the "Yawn Warm-up" game (you should get 20,845 rows if you join on dog_guid only)?


In [12]:
%%sql
SELECT d.user_guid, d.dog_guid, d.breed, d.breed_type, d.breed_group
FROM dogs d, complete_tests t
WHERE d.dog_guid = t.dog_guid AND t.test_name='Yawn Warm-up'


20845 rows affected.
Out[12]:
user_guid dog_guid breed breed_type breed_group
ce135e14-7144-11e5-ba71-058fbc01cf0b fd27b86c-7144-11e5-ba71-058fbc01cf0b Shih Tzu Pure Breed Toy
ce135f2c-7144-11e5-ba71-058fbc01cf0b fd27bbbe-7144-11e5-ba71-058fbc01cf0b Mixed Mixed Breed/ Other/ I Don't Know None
ce135e14-7144-11e5-ba71-058fbc01cf0b fd27ba1a-7144-11e5-ba71-058fbc01cf0b Shih Tzu Pure Breed Toy
ce136ac6-7144-11e5-ba71-058fbc01cf0b fd27c5be-7144-11e5-ba71-058fbc01cf0b Shih Tzu-Poodle Mix Cross Breed None
ce13615c-7144-11e5-ba71-058fbc01cf0b fd27b948-7144-11e5-ba71-058fbc01cf0b Siberian Husky Pure Breed Working
ce1353d8-7144-11e5-ba71-058fbc01cf0b fd27b5ba-7144-11e5-ba71-058fbc01cf0b Shetland Sheepdog Pure Breed Herding
ce136ee0-7144-11e5-ba71-058fbc01cf0b fd27c852-7144-11e5-ba71-058fbc01cf0b Pug Pure Breed Toy
ce136f94-7144-11e5-ba71-058fbc01cf0b fd27c8d4-7144-11e5-ba71-058fbc01cf0b Boxer Pure Breed Working
ce13750c-7144-11e5-ba71-058fbc01cf0b fd27cf28-7144-11e5-ba71-058fbc01cf0b Chesapeake Bay Retriever Pure Breed Sporting
ce136e36-7144-11e5-ba71-058fbc01cf0b fd27c7d0-7144-11e5-ba71-058fbc01cf0b Vizsla Pure Breed Sporting
ce1373ae-7144-11e5-ba71-058fbc01cf0b fd27cea6-7144-11e5-ba71-058fbc01cf0b Mixed Mixed Breed/ Other/ I Don't Know None
ce136c24-7144-11e5-ba71-058fbc01cf0b fd27c74e-7144-11e5-ba71-058fbc01cf0b German Shepherd Dog-Pembroke Welsh Corgi Mix Cross Breed None
ce1377b4-7144-11e5-ba71-058fbc01cf0b fd27d02c-7144-11e5-ba71-058fbc01cf0b Belgian Malinois Pure Breed Herding
ce138312-7144-11e5-ba71-058fbc01cf0b fd27dd38-7144-11e5-ba71-058fbc01cf0b Parson Russell Terrier-Beagle Mix Cross Breed None
ce13807e-7144-11e5-ba71-058fbc01cf0b fd27db08-7144-11e5-ba71-058fbc01cf0b German Shepherd Dog Pure Breed Herding
ce135e14-7144-11e5-ba71-058fbc01cf0b fd27ed46-7144-11e5-ba71-058fbc01cf0b Shih Tzu Pure Breed Toy
ce135194-7144-11e5-ba71-058fbc01cf0b fd27e026-7144-11e5-ba71-058fbc01cf0b Dalmatian Pure Breed Non-Sporting
ce135194-7144-11e5-ba71-058fbc01cf0b fd27e0d0-7144-11e5-ba71-058fbc01cf0b I Don't Know Mixed Breed/ Other/ I Don't Know None
ce138722-7144-11e5-ba71-058fbc01cf0b fd27eae4-7144-11e5-ba71-058fbc01cf0b Australian Terrier Pure Breed Terrier
ce1375b6-7144-11e5-ba71-058fbc01cf0b fd27cfaa-7144-11e5-ba71-058fbc01cf0b Border Collie Pure Breed Herding
ce1389d4-7144-11e5-ba71-058fbc01cf0b fd27efb2-7144-11e5-ba71-058fbc01cf0b Golden Retriever Pure Breed Sporting
ce1390f0-7144-11e5-ba71-058fbc01cf0b fd27f868-7144-11e5-ba71-058fbc01cf0b Mixed Mixed Breed/ Other/ I Don't Know None
ce1387cc-7144-11e5-ba71-058fbc01cf0b fd27f110-7144-11e5-ba71-058fbc01cf0b Bernese Mountain Dog Pure Breed Working
ce136a1c-7144-11e5-ba71-058fbc01cf0b fd27c1c2-7144-11e5-ba71-058fbc01cf0b Labrador Retriever Pure Breed Sporting
ce137700-7144-11e5-ba71-058fbc01cf0b fd27d0b8-7144-11e5-ba71-058fbc01cf0b Australian Shepherd-German Shepherd Dog Mix Cross Breed None
20845 rows, truncated to displaylimit of 25

In [9]:
%%sql
show tables


6 rows affected.
Out[9]:
Tables_in_dognitiondb
complete_tests
dogs
exam_answers
reviews
site_activities
users

In [10]:
%%sql
Describe complete_tests


6 rows affected.
Out[10]:
Field Type Null Key Default Extra
created_at datetime NO None
updated_at datetime NO None
user_guid varchar(60) YES MUL None
dog_guid varchar(60) YES MUL None
test_name varchar(60) YES None
subcategory_name varchar(60) YES None

2. Joining More than 2 Tables

In theory, you can join as many tables together as you want or need. To join multiple tables you take the same approach as we took when we were joining two tables together: list all the fields you want to extract in the SELECT statement, specify which table they came from in the SELECT statement, list all the tables from which you will need to extract the fields in the FROM statement, and then tell the database how to connect the tables in the WHERE statement.

To extract the user_guid, user's state of residence, user's zip code, dog_guid, breed, breed_type, and breed_group for all animals who completed the "Yawn Warm-up" game, you might be tempted to query:

SELECT c.user_guid AS UserID, u.state, u.zip, d.dog_guid AS DogID, d.breed, d.breed_type, d.breed_group
FROM dogs d, complete_tests c, users u
WHERE d.dog_guid=c.dog_guid 
   AND c.user_guid=u.user_guid
   AND c.test_name="Yawn Warm-up";

This query focuses the relationships primarily on the complete_tests table. However, it turns out that our Dognition dataset has only NULL values in the user_guid column of the complete_tests table. If you were to execute the query above, you would not get an error message, but your output would have 0 rows. However, the power of relational databases will come in handy here. You can use the dogs table to link the complete_tests and users table (pay attention to the difference between the WHERE statement in this query vs. the WHERE statement in the query above):

SELECT d.user_guid AS UserID, u.state, u.zip, d.dog_guid AS DogID, d.breed, d.breed_type, d.breed_group
FROM dogs d, complete_tests c, users u
WHERE d.dog_guid=c.dog_guid 
   AND d.user_guid=u.user_guid
   AND c.test_name="Yawn Warm-up";

Of note, joins are very resource intensive, so try not to join unnecessarily. In general, the more joins you have to execute, the slower your query performance will be.

Question 6: How would you extract the user_guid, membership_type, and dog_guid of all the golden retrievers who completed at least 1 Dognition test (you should get 711 rows)?


In [39]:
%%sql
SELECT DISTINCT u.user_guid, u.membership_type, d.dog_guid
FROM users u, dogs d, complete_tests t
WHERE
    d.dog_guid = t.dog_guid
    AND u.user_guid = d.user_guid
    AND d.breed = 'Golden Retriever'


711 rows affected.
Out[39]:
user_guid membership_type dog_guid
ce135ab8-7144-11e5-ba71-058fbc01cf0b 1 fd27b6b4-7144-11e5-ba71-058fbc01cf0b
ce13507c-7144-11e5-ba71-058fbc01cf0b 1 fd27b79a-7144-11e5-ba71-058fbc01cf0b
ce1389d4-7144-11e5-ba71-058fbc01cf0b 1 fd27efb2-7144-11e5-ba71-058fbc01cf0b
ce21f122-7144-11e5-ba71-058fbc01cf0b 2 fd3d03fc-7144-11e5-ba71-058fbc01cf0b
ce220bb2-7144-11e5-ba71-058fbc01cf0b 2 fd3d10cc-7144-11e5-ba71-058fbc01cf0b
ce2237f4-7144-11e5-ba71-058fbc01cf0b 2 fd3d3b24-7144-11e5-ba71-058fbc01cf0b
ce2243e8-7144-11e5-ba71-058fbc01cf0b 2 fd3d4b8c-7144-11e5-ba71-058fbc01cf0b
ce240b24-7144-11e5-ba71-058fbc01cf0b 2 fd3fe96e-7144-11e5-ba71-058fbc01cf0b
ce24476a-7144-11e5-ba71-058fbc01cf0b 2 fd404ff8-7144-11e5-ba71-058fbc01cf0b
ce245cdc-7144-11e5-ba71-058fbc01cf0b 2 fd4059e4-7144-11e5-ba71-058fbc01cf0b
ce246be6-7144-11e5-ba71-058fbc01cf0b 2 fd40738e-7144-11e5-ba71-058fbc01cf0b
ce249774-7144-11e5-ba71-058fbc01cf0b 1 fd408ed2-7144-11e5-ba71-058fbc01cf0b
ce249c6a-7144-11e5-ba71-058fbc01cf0b 2 fd40945e-7144-11e5-ba71-058fbc01cf0b
ce24c1c2-7144-11e5-ba71-058fbc01cf0b 2 fd40ee72-7144-11e5-ba71-058fbc01cf0b
ce24a106-7144-11e5-ba71-058fbc01cf0b 2 fd40f822-7144-11e5-ba71-058fbc01cf0b
ce24d572-7144-11e5-ba71-058fbc01cf0b 2 fd4107cc-7144-11e5-ba71-058fbc01cf0b
ce24d914-7144-11e5-ba71-058fbc01cf0b 2 fd410cb8-7144-11e5-ba71-058fbc01cf0b
ce24de50-7144-11e5-ba71-058fbc01cf0b 1 fd411690-7144-11e5-ba71-058fbc01cf0b
ce24e116-7144-11e5-ba71-058fbc01cf0b 1 fd4118e8-7144-11e5-ba71-058fbc01cf0b
ce24e684-7144-11e5-ba71-058fbc01cf0b 2 fd411b0e-7144-11e5-ba71-058fbc01cf0b
ce253a6c-7144-11e5-ba71-058fbc01cf0b 2 fd418b02-7144-11e5-ba71-058fbc01cf0b
ce2556a0-7144-11e5-ba71-058fbc01cf0b 1 fd41c5f4-7144-11e5-ba71-058fbc01cf0b
ce255c54-7144-11e5-ba71-058fbc01cf0b 2 fd41c84c-7144-11e5-ba71-058fbc01cf0b
ce257fc2-7144-11e5-ba71-058fbc01cf0b 2 fd41db34-7144-11e5-ba71-058fbc01cf0b
ce2585a8-7144-11e5-ba71-058fbc01cf0b 2 fd41e232-7144-11e5-ba71-058fbc01cf0b
711 rows, truncated to displaylimit of 25

In [6]:
%%sql
show tables


6 rows affected.
Out[6]:
Tables_in_dognitiondb
complete_tests
dogs
exam_answers
reviews
site_activities
users

In [7]:
%%sql
Describe complete_tests


6 rows affected.
Out[7]:
Field Type Null Key Default Extra
created_at datetime NO None
updated_at datetime NO None
user_guid varchar(60) YES MUL None
dog_guid varchar(60) YES MUL None
test_name varchar(60) YES None
subcategory_name varchar(60) YES None

In [10]:
%%sql
Describe users


16 rows affected.
Out[10]:
Field Type Null Key Default Extra
sign_in_count int(11) YES 0
created_at datetime NO None
updated_at datetime NO None
max_dogs int(11) YES 0
membership_id int(11) YES None
subscribed tinyint(1) YES 0
exclude tinyint(1) YES None
free_start_user tinyint(1) YES None
last_active_at datetime YES None
membership_type int(11) YES None
user_guid text YES MUL None
city varchar(255) YES None
state varchar(255) YES None
zip varchar(255) YES None
country varchar(255) YES None
utc_correction varchar(255) YES None

In [18]:
%%sql
Describe dogs


21 rows affected.
Out[18]:
Field Type Null Key Default Extra
gender varchar(255) YES None
birthday varchar(255) YES None
breed varchar(255) YES None
weight int(11) YES None
dog_fixed tinyint(1) YES None
dna_tested tinyint(1) YES None
created_at datetime NO None
updated_at datetime NO None
dimension varchar(255) YES None
exclude tinyint(1) YES None
breed_type varchar(255) YES None
breed_group varchar(255) YES None
dog_guid varchar(60) YES MUL None
user_guid varchar(60) YES MUL None
total_tests_completed varchar(255) YES None
mean_iti_days varchar(255) YES None
mean_iti_minutes varchar(255) YES None
median_iti_days varchar(255) YES None
median_iti_minutes varchar(255) YES None
time_diff_between_first_and_last_game_days varchar(255) YES None
time_diff_between_first_and_last_game_minutes varchar(255) YES None

In [17]:
%%sql
SELECT count(breed)
from dogs
where breed = 'Golden Retriever'
limit 5


1 rows affected.
Out[17]:
count(breed)
1296

In [36]:
%%sql
SELECT  distinct d.dog_guid
FROM dogs d, complete_tests t
WHERE
    d.dog_guid = t.dog_guid
    AND d.breed = 'Golden Retriever'


711 rows affected.
Out[36]:
dog_guid
fd27b6b4-7144-11e5-ba71-058fbc01cf0b
fd27b79a-7144-11e5-ba71-058fbc01cf0b
fd27efb2-7144-11e5-ba71-058fbc01cf0b
fd3d03fc-7144-11e5-ba71-058fbc01cf0b
fd3d10cc-7144-11e5-ba71-058fbc01cf0b
fd3d3b24-7144-11e5-ba71-058fbc01cf0b
fd3d4b8c-7144-11e5-ba71-058fbc01cf0b
fd3fe96e-7144-11e5-ba71-058fbc01cf0b
fd404ff8-7144-11e5-ba71-058fbc01cf0b
fd4059e4-7144-11e5-ba71-058fbc01cf0b
fd40738e-7144-11e5-ba71-058fbc01cf0b
fd408ed2-7144-11e5-ba71-058fbc01cf0b
fd40945e-7144-11e5-ba71-058fbc01cf0b
fd40ee72-7144-11e5-ba71-058fbc01cf0b
fd40f822-7144-11e5-ba71-058fbc01cf0b
fd4107cc-7144-11e5-ba71-058fbc01cf0b
fd410cb8-7144-11e5-ba71-058fbc01cf0b
fd411690-7144-11e5-ba71-058fbc01cf0b
fd4118e8-7144-11e5-ba71-058fbc01cf0b
fd411b0e-7144-11e5-ba71-058fbc01cf0b
fd418b02-7144-11e5-ba71-058fbc01cf0b
fd41c5f4-7144-11e5-ba71-058fbc01cf0b
fd41c84c-7144-11e5-ba71-058fbc01cf0b
fd41db34-7144-11e5-ba71-058fbc01cf0b
fd41e232-7144-11e5-ba71-058fbc01cf0b
711 rows, truncated to displaylimit of 25

Practice inner joining your own tables!

Question 7: How many unique Golden Retrievers who live in North Carolina are there in the Dognition database (you should get 30)?


In [43]:
%%sql
SELECT COUNT(distinct d.dog_guid)
FROM users u, dogs d
WHERE
    u.user_guid = d.user_guid
    AND d.breed = 'Golden Retriever'
    AND u.state = 'NC'


1 rows affected.
Out[43]:
COUNT(distinct d.dog_guid)
30

In [44]:
%%sql
SELECT COUNT(distinct d.dog_guid)
FROM users u, dogs d
WHERE
    u.user_guid = d.user_guid
    AND d.breed = 'Golden Retriever'
GROUP BY u.state
HAVING u.state='NC'


1 rows affected.
Out[44]:
COUNT(distinct d.dog_guid)
30

Question 8: How many unique customers within each membership type provided reviews (there should be 3208 in the membership type with the greatest number of customers, and 18 in the membership type with the fewest number of customers)?


In [49]:
%%sql
SELECT COUNT(DISTINCT u.user_guid)
FROM users u, reviews r
WHERE u.user_guid = r.user_guid
GROUP BY u.membership_type
LIMIT 5;


5 rows affected.
Out[49]:
COUNT(DISTINCT u.user_guid)
3208
1226
259
875
18

Question 9: For which 3 dog breeds do we have the greatest amount of site_activity data, (as defined by non-NULL values in script_detail_id)(your answers should be "Mixed", "Labrador Retriever", and "Labrador Retriever-Golden Retriever Mix"?


In [56]:
%%sql
SELECT d.breed, COUNT(s.script_detail_id) AS site_activity_amount
FROM dogs d, site_activities s
WHERE d.dog_guid = s.dog_guid
    AND s.script_detail_id IS NOT NULL
GROUP BY d.breed
ORDER BY site_activity_amount DESC
LIMIT 0, 5


5 rows affected.
Out[56]:
breed site_activity_amount
Mixed 93415
Labrador Retriever 38804
Labrador Retriever-Golden Retriever Mix 27498
Golden Retriever 19802
Poodle 14938

In [51]:
%%sql
Describe site_activities


11 rows affected.
Out[51]:
Field Type Null Key Default Extra
activity_type varchar(150) YES MUL None
description text YES None
membership_id int(11) YES None
category_id int(11) YES None
script_id int(11) YES None
created_at datetime NO None
updated_at datetime NO None
user_guid varchar(255) YES MUL None
script_detail_id int(11) YES None
test_name varchar(255) YES None
dog_guid varchar(255) YES MUL None

Practice any other inner joins you would like to try here!


In [ ]: