MySQL Exercise 9: Subqueries and Derived Tables

Now that you understand how joins work, in this lesson we are going to learn how to incorporate subqueries and derived tables into our queries.

Subqueries, which are also sometimes called inner queries or nested queries, are queries that are embedded within the context of another query. The output of a subquery is incorporated into the queries that surround it. Subqueries can be used in SELECT, WHERE, and FROM clauses. When they are used in FROM clauses they create what are called derived tables.

The main reasons to use subqueries are:

Sometimes they are the most logical way to retrieve the information you want
They can be used to isolate each logical part of a statement, which can be helpful for troubleshooting long and complicated queries
Sometimes they run faster than joins

Some people find subqueries easier to read than joins. However, that is often a result of not feeling comfortable with the concepts behind joins in the first place (I prefer join syntax, so admittedly, that is my preference).

Subqueries must be enclosed in parentheses. Subqueries have a couple of rules that joins don't:

ORDER BY phrases cannot be used in subqueries (although ORDER BY phrases can still be used in outer queries that contain subqueries).
Subqueries in SELECT or WHERE clauses that return more than one row must be used in combination with operators that are explicitly designed to handle multiple values, such as the IN operator. Otherwise, subqueries in SELECT or WHERE statements can output no more than 1 row.

So why would you use subqueries?

Let's look at some examples.

Start by loading the sql library and database, and making the Dognition database your default database:



In [2]:

    
%load_ext sql
%sql mysql://studentuser:studentpw@mysqlserver/dognitiondb
%sql USE dognitiondb

%config SqlMagic.displaylimit=25









    



0 rows affected.

1) "On the fly calculations" (or, doing calculations as you need them)

One of the main uses of subqueries is to calculate values as you need them. This allows you to use a summary calculation in your query without having to enter the value outputted by the calculation explicitly. A situation when this capability would be useful is if you wanted to see all the records that were greater than the average value of a subset of your data.

Recall one of the queries we wrote in "MySQL Exercise 4: Summarizing your Data" to calculate the average amount of time it took customers to complete all of the tests in the exam_answers table (we had to exclude negative durations from the calculation due to some abnormalities in the data):

SELECT AVG(TIMESTAMPDIFF(minute,start_time,end_time)) AS AvgDuration
FROM exam_answers
WHERE TIMESTAMPDIFF(minute,start_time,end_time)>0;

What if we wanted to look at just the data from rows whose durations were greater than the average, so that we could determine whether there are any features that seem to correlate with dogs taking a longer time to finish their tests? We could use a subquery to calculate the average duration, and then indicate in our SELECT and WHERE clauses that we only wanted to retrieve the rows whose durations were greater than the average. Here's what the query would look like:

SELECT *
FROM exam_answers 
WHERE TIMESTAMPDIFF(minute,start_time,end_time) >
    (SELECT AVG(TIMESTAMPDIFF(minute,start_time,end_time)) AS AvgDuration
     FROM exam_answers
     WHERE TIMESTAMPDIFF(minute,start_time,end_time)>0);

You can see that TIMESTAMPDIFF gets compared to the singular average value outputted by the subquery surrounded by parentheses. You can also see that it's easier to read the query as a whole if you indent and align all the clauses associated with the subquery, relative to the main query.

Question 1: How could you use a subquery to extract all the data from exam_answers that had test durations that were greater than the average duration for the "Yawn Warm-Up" game? Start by writing the query that gives you the average duration for the "Yawn Warm-Up" game by itself (and don't forget to exclude negative values; your average duration should be about 9934):



In [4]:

    
%%sql
SELECT AVG(TIMESTAMPDIFF(minute,start_time,end_time)) AS AvgDuration_YWU
FROM exam_answers
WHERE TIMESTAMPDIFF(minute,start_time,end_time)>0 
    AND test_name='Yawn Warm-Up'









    



1 rows affected.






    Out[4]:





    
        AvgDuration_YWU
    
    
        9933.5197



In [ ]:

Question 2: Once you've verified that your subquery is written correctly on its own, incorporate it into a main query to extract all the data from exam_answers that had test durations that were greater than the average duration for the "Yawn Warm-Up" game (you will get 11059 rows):



In [7]:

    
%%sql
SELECT test_name, TIMESTAMPDIFF(minute,start_time,end_time)
FROM exam_answers
WHERE TIMESTAMPDIFF(minute,start_time,end_time)>
    (SELECT AVG(TIMESTAMPDIFF(minute,start_time,end_time)) AS AvgDuration_YWU
        FROM exam_answers
        WHERE TIMESTAMPDIFF(minute,start_time,end_time)>0 
            AND test_name='Yawn Warm-Up')









    



11059 rows affected.






    Out[7]:





    
        test_name
        TIMESTAMPDIFF(minute,start_time,end_time)
    
    
        Sociability
        345139
    
    
        Emotions
        345139
    
    
        Shy/Boldness
        345139
    
    
        Perception
        345138
    
    
        Recall
        345138
    
    
        Attachment
        345138
    
    
        Puzzles
        345138
    
    
        Shy/Boldness
        345138
    
    
        Shy/Boldness
        345137
    
    
        Partnership
        345137
    
    
        Emotions
        345137
    
    
        Perception
        345137
    
    
        Obedience
        345137
    
    
        Attachment
        345136
    
    
        Attachment
        345136
    
    
        Puzzles
        345136
    
    
        Recall
        345135
    
    
        Obedience
        345135
    
    
        Perception
        345134
    
    
        Sociability
        345134
    
    
        Sociability
        345134
    
    
        Sociability
        345134
    
    
        Puzzles
        345133
    
    
        Excitability
        345132
    
    
        Attachment
        345132
    

11059 rows, truncated to displaylimit of 25

Now double check the results you just retrieved by replacing the subquery with "9934"; you should get the same results. It is helpful to get into the habit of including these kinds of quality checks into your query-writing process.

This example shows you how subqueries allow you retrieve information dynamically, rather than having to hard code in specific numbers or names. This capability is particularly useful when you need to build the output of your queries into reports or dashboards that are supposed to display real-time information.

2) Testing membership

Subqueries can also be useful for assessing whether groups of rows are members of other groups of rows. To use them in this capacity, we need to know about and practice the IN, NOT IN, EXISTS, and NOT EXISTS operators.

Recall from MySQL Exercise 2: Selecting Data Subsets Using WHERE that the IN operator allows you to use a WHERE clause to say how you want your results to relate to a list of multiple values. It's basically a condensed way of writing a sequence of OR statements. The following query would select all the users who live in the state of North Carolina (abbreviated "NC") or New York (abbreviated "NY"):

SELECT * 
FROM users
WHERE state IN ('NC','NY');

Notice the quotation marks around the members of the list referred to by the IN statement. These quotation marks are required since the state names are strings of text.

A query that would give an equivalent result would be:

SELECT * 
FROM users
WHERE state ='NC' OR state ='NY';

A query that would select all the users who do NOT live in the state of North Carolina or New York would be:

SELECT * 
FROM users
WHERE state NOT IN ('NC','NY');

Question 3: Use an IN operator to determine how many entries in the exam_answers tables are from the "Puzzles", "Numerosity", or "Bark Game" tests. You should get a count of 163022.



In [10]:

    
%%sql
SELECT COUNT(*)
FROM exam_answers
WHERE subcategory_name IN ('Puzzles', 'Numerosity', 'Bark Game');









    



1 rows affected.






    Out[10]:





    
        COUNT(*)
    
    
        163022

Question 4: Use a NOT IN operator to determine how many unique dogs in the dog table are NOT in the "Working", "Sporting", or "Herding" breeding groups. You should get an answer of 7961.



In [11]:

    
%%sql
SELECT COUNT(*)
FROM dogs
WHERE breed_group NOT IN ('Working', 'Sporting', 'Herding');









    



1 rows affected.






    Out[11]:





    
        COUNT(*)
    
    
        7961

EXISTS and NOT EXISTS perform similar functions to IN and NOT IN, but EXISTS and NOT EXISTS can only be used in subqueries. The syntax for EXISTS and NOT EXISTS statements is a little different than that of IN statements because EXISTS is not preceded by a column name or any other expression. The most important difference between EXISTS/NOT EXISTS and IN/NOT IN statements, though, is that unlike IN/NOT IN statements, EXISTS/NOT EXISTS are logical statements. Rather than returning raw data, per se, EXISTS/NOT EXISTS statements return a value of TRUE or FALSE. As a practical consequence, EXISTS statements are often written using an asterisk after the SELECT clause rather than explicit column names. The asterisk is faster to write, and since the output is just going to be a logical true/false either way, it does not matter whether you use an asterisk or explicit column names.

We can use EXISTS and a subquery to compare the users who are in the users table and dogs table, similar to what we practiced previously using joins. If we wanted to retrieve a list of all the users in the users table who were also in the dogs table, we could write:

SELECT DISTINCT u.user_guid AS uUserID
FROM users u
WHERE EXISTS (SELECT d.user_guid
              FROM dogs d 
              WHERE u.user_guid =d.user_guid);

You would get the same result if you wrote:

SELECT DISTINCT u.user_guid AS uUserID
FROM users u
WHERE EXISTS (SELECT *
              FROM dogs d 
              WHERE u.user_guid =d.user_guid);

Essentially, both of these queries say give me all the distinct user_guids from the users table that have a value of "TRUE" in my EXISTS clause. The results would be equivalent to an inner join with GROUP BY query. Now...

Question 5: How could you determine the number of unique users in the users table who were NOT in the dogs table using a NOT EXISTS clause? You should get the 2226, the same result as you got in Question 10 of MySQL Exercise 8: Joining Tables with Outer Joins.



In [30]:

    
%%sql
SELECT COUNT(DISTINCT user_guid)
FROM users u
WHERE NOT EXISTS (SELECT *
                  FROM dogs d
                  WHERE u.user_guid=d.user_guid
                )









    



1 rows affected.






    Out[30]:





    
        COUNT(DISTINCT user_guid)
    
    
        2226

Similarly, the query above can get result much faster then the query below:



In [ ]:

    
%%sql
SELECT COUNT(u.user_guid), d.user_guid
FROM users u LEFT JOIN dogs d on u.user_guid = d.user_guid
GROUP BY d.user_guid
HAVING d.user_guid IS NULL

Why the query below one can be much faster than the other one?



In [18]:

    
%%sql
DESCRIBE
SELECT DISTINCT u.user_guid AS uUserID
FROM users u
WHERE EXISTS (SELECT *
              FROM dogs d 
              WHERE u.user_guid =d.user_guid);









    



2 rows affected.






    Out[18]:





    
        id
        select_type
        table
        type
        possible_keys
        key
        key_len
        ref
        rows
        Extra
    
    
        1
        PRIMARY
        u
        ALL
        None
        None
        None
        None
        35182
        Using where; Using temporary
    
    
        2
        DEPENDENT SUBQUERY
        d
        ref
        idx_dogs_user_guid
        idx_dogs_user_guid
        183
        dognitiondb.u.user_guid
        1
        Using where; Using index



In [17]:

    
%%sql
DESCRIBE
SELECT DISTINCT u.user_guid AS uUserID
FROM users u JOIN dogs d ON u.user_guid=d.user_guid;









    



2 rows affected.






    Out[17]:





    
        id
        select_type
        table
        type
        possible_keys
        key
        key_len
        ref
        rows
        Extra
    
    
        1
        SIMPLE
        d
        index
        idx_dogs_user_guid
        idx_dogs_user_guid
        183
        None
        34859
        Using index; Using temporary
    
    
        1
        SIMPLE
        u
        ref
        unq_users_guid
        unq_users_guid
        111
        dognitiondb.d.user_guid
        1
        Using where



In [21]:

    
%%sql
SELECT DISTINCT u.user_guid AS uUserID
FROM users u
WHERE EXISTS (SELECT *
              FROM dogs d 
              WHERE u.user_guid =d.user_guid);









    



30967 rows affected.






    Out[21]:





    
        uUserID
    
    
        ce134e42-7144-11e5-ba71-058fbc01cf0b
    
    
        ce1353d8-7144-11e5-ba71-058fbc01cf0b
    
    
        ce135ab8-7144-11e5-ba71-058fbc01cf0b
    
    
        ce13507c-7144-11e5-ba71-058fbc01cf0b
    
    
        ce135e14-7144-11e5-ba71-058fbc01cf0b
    
    
        ce13615c-7144-11e5-ba71-058fbc01cf0b
    
    
        ce135f2c-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136a1c-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136ac6-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136c24-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136e36-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136ee0-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136f94-7144-11e5-ba71-058fbc01cf0b
    
    
        ce134be0-7144-11e5-ba71-058fbc01cf0b
    
    
        ce1371a6-7144-11e5-ba71-058fbc01cf0b
    
    
        ce1373ae-7144-11e5-ba71-058fbc01cf0b
    
    
        ce13750c-7144-11e5-ba71-058fbc01cf0b
    
    
        ce1375b6-7144-11e5-ba71-058fbc01cf0b
    
    
        ce1377b4-7144-11e5-ba71-058fbc01cf0b
    
    
        ce137700-7144-11e5-ba71-058fbc01cf0b
    
    
        ce137868-7144-11e5-ba71-058fbc01cf0b
    
    
        ce137912-7144-11e5-ba71-058fbc01cf0b
    
    
        ce1379c6-7144-11e5-ba71-058fbc01cf0b
    
    
        ce137a7a-7144-11e5-ba71-058fbc01cf0b
    
    
        ce137034-7144-11e5-ba71-058fbc01cf0b
    

30967 rows, truncated to displaylimit of 25



In [20]:

    
%%sql
SELECT DISTINCT u.user_guid AS uUserID
FROM users u JOIN dogs d ON u.user_guid=d.user_guid;









    



30967 rows affected.






    Out[20]:





    
        uUserID
    
    
        ce134492-7144-11e5-ba71-058fbc01cf0b
    
    
        ce134a78-7144-11e5-ba71-058fbc01cf0b
    
    
        ce134be0-7144-11e5-ba71-058fbc01cf0b
    
    
        ce134d16-7144-11e5-ba71-058fbc01cf0b
    
    
        ce134e42-7144-11e5-ba71-058fbc01cf0b
    
    
        ce13507c-7144-11e5-ba71-058fbc01cf0b
    
    
        ce135194-7144-11e5-ba71-058fbc01cf0b
    
    
        ce1352ac-7144-11e5-ba71-058fbc01cf0b
    
    
        ce1353d8-7144-11e5-ba71-058fbc01cf0b
    
    
        ce135766-7144-11e5-ba71-058fbc01cf0b
    
    
        ce135ab8-7144-11e5-ba71-058fbc01cf0b
    
    
        ce135bd0-7144-11e5-ba71-058fbc01cf0b
    
    
        ce135cf2-7144-11e5-ba71-058fbc01cf0b
    
    
        ce135e14-7144-11e5-ba71-058fbc01cf0b
    
    
        ce135f2c-7144-11e5-ba71-058fbc01cf0b
    
    
        ce13615c-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136210-7144-11e5-ba71-058fbc01cf0b
    
    
        ce1362ba-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136378-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136580-7144-11e5-ba71-058fbc01cf0b
    
    
        ce1366e8-7144-11e5-ba71-058fbc01cf0b
    
    
        ce13697c-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136a1c-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136ac6-7144-11e5-ba71-058fbc01cf0b
    
    
        ce136c24-7144-11e5-ba71-058fbc01cf0b
    

30967 rows, truncated to displaylimit of 25

On SQL Server, this will list all the indexes for a specified table:



In [23]:

    
%%sql
select * from sys.indexes
where object_id = (select object_id from sys.objects where name = 'users')









    



(_mysql_exceptions.OperationalError) (1142, "SELECT command denied to user 'studentuser'@'172.17.0.26' for table 'indexes'") [SQL: "select * from sys.indexes\nwhere object_id = (select object_id from sys.objects where name = 'users')"]

3) Accurate logical representations of desired output and Derived Tables

A third situation in which subqueries can be useful is when they simply represent the logic of what you want better than joins.

We saw an example of this in our last MySQL Exercise. We wanted a list of each dog a user in the users table owns, with its accompanying breed information whenever possible. To achieve this, we wrote this query in Question 6:

SELECT u.user_guid AS uUserID, d.user_guid AS dUserID, d.dog_guid AS dDogID, d.breed
FROM users u LEFT JOIN dogs d
   ON u.user_guid=d.user_guid

Once we saw the "exploding rows" phenomenon due to duplicate rows, we wrote a follow-up query in Question 7 to assess how many rows would be outputted per user_id when we left joined the users table on the dogs table:

SELECT u.user_guid AS uUserID, d.user_guid AS dUserID, count(*) AS numrows
FROM users u LEFT JOIN dogs d
   ON u.user_guid=d.user_guid
GROUP BY u.user_guid
ORDER BY numrows DESC

This same general query without the COUNT function could have been used to output a complete list of all the distinct users in the users table, their dogs, and their dogs' breed information. However, the method we used to arrive at this was not very pretty or logically satisfying. Rather than joining many duplicated rows and fixing the results later with the GROUP BY clause, it would be much more elegant if we could simply join the distinct UserIDs in the first place. There is no way to do that with join syntax, on its own. However, you can use subqueries in combination with joins to achieve this goal.

To complete the join on ONLY distinct UserIDs from the users table, we could write:

SELECT DistinctUUsersID.user_guid AS uUserID, d.user_guid AS dUserID, count(*) AS numrows
FROM (SELECT DISTINCT u.user_guid 
      FROM users u) AS DistinctUUsersID 
LEFT JOIN dogs d
   ON DistinctUUsersID.user_guid=d.user_guid
GROUP BY DistinctUUsersID.user_guid
ORDER BY numrows DESC

Try it yourself:



In [6]:

    
%%sql
SELECT DistinctUUsersID.user_guid AS uUserID, d.user_guid AS dUserID, count(*) AS numrows
FROM (SELECT DISTINCT u.user_guid 
      FROM users u) AS DistinctUUsersID 
LEFT JOIN dogs d
   ON DistinctUUsersID.user_guid=d.user_guid
GROUP BY DistinctUUsersID.user_guid
ORDER BY numrows DESC









    



33193 rows affected.






    Out[6]:





    
        uUserID
        dUserID
        numrows
    
    
        ce7b75bc-7144-11e5-ba71-058fbc01cf0b
        ce7b75bc-7144-11e5-ba71-058fbc01cf0b
        1819
    
    
        ce225842-7144-11e5-ba71-058fbc01cf0b
        ce225842-7144-11e5-ba71-058fbc01cf0b
        26
    
    
        ce2258a6-7144-11e5-ba71-058fbc01cf0b
        ce2258a6-7144-11e5-ba71-058fbc01cf0b
        20
    
    
        ce135e14-7144-11e5-ba71-058fbc01cf0b
        ce135e14-7144-11e5-ba71-058fbc01cf0b
        13
    
    
        ce29675e-7144-11e5-ba71-058fbc01cf0b
        ce29675e-7144-11e5-ba71-058fbc01cf0b
        11
    
    
        ce134492-7144-11e5-ba71-058fbc01cf0b
        ce134492-7144-11e5-ba71-058fbc01cf0b
        9
    
    
        ce6676d0-7144-11e5-ba71-058fbc01cf0b
        ce6676d0-7144-11e5-ba71-058fbc01cf0b
        8
    
    
        ce83d2ca-7144-11e5-ba71-058fbc01cf0b
        ce83d2ca-7144-11e5-ba71-058fbc01cf0b
        8
    
    
        ce32305a-7144-11e5-ba71-058fbc01cf0b
        ce32305a-7144-11e5-ba71-058fbc01cf0b
        7
    
    
        ce7adeea-7144-11e5-ba71-058fbc01cf0b
        ce7adeea-7144-11e5-ba71-058fbc01cf0b
        7
    
    
        ce135766-7144-11e5-ba71-058fbc01cf0b
        ce135766-7144-11e5-ba71-058fbc01cf0b
        6
    
    
        ce47264a-7144-11e5-ba71-058fbc01cf0b
        ce47264a-7144-11e5-ba71-058fbc01cf0b
        6
    
    
        ce66713a-7144-11e5-ba71-058fbc01cf0b
        ce66713a-7144-11e5-ba71-058fbc01cf0b
        6
    
    
        ce66b9b0-7144-11e5-ba71-058fbc01cf0b
        ce66b9b0-7144-11e5-ba71-058fbc01cf0b
        6
    
    
        ce8c2d08-7144-11e5-ba71-058fbc01cf0b
        ce8c2d08-7144-11e5-ba71-058fbc01cf0b
        6
    
    
        ce964888-7144-11e5-ba71-058fbc01cf0b
        ce964888-7144-11e5-ba71-058fbc01cf0b
        6
    
    
        ce9a381c-7144-11e5-ba71-058fbc01cf0b
        ce9a381c-7144-11e5-ba71-058fbc01cf0b
        6
    
    
        ce262364-7144-11e5-ba71-058fbc01cf0b
        ce262364-7144-11e5-ba71-058fbc01cf0b
        5
    
    
        ce26b266-7144-11e5-ba71-058fbc01cf0b
        ce26b266-7144-11e5-ba71-058fbc01cf0b
        5
    
    
        ce26e790-7144-11e5-ba71-058fbc01cf0b
        ce26e790-7144-11e5-ba71-058fbc01cf0b
        5
    
    
        ce6e67e6-7144-11e5-ba71-058fbc01cf0b
        ce6e67e6-7144-11e5-ba71-058fbc01cf0b
        5
    
    
        ce728bf0-7144-11e5-ba71-058fbc01cf0b
        ce728bf0-7144-11e5-ba71-058fbc01cf0b
        5
    
    
        ce739220-7144-11e5-ba71-058fbc01cf0b
        ce739220-7144-11e5-ba71-058fbc01cf0b
        5
    
    
        ce7c2a66-7144-11e5-ba71-058fbc01cf0b
        ce7c2a66-7144-11e5-ba71-058fbc01cf0b
        5
    
    
        ce7da332-7144-11e5-ba71-058fbc01cf0b
        ce7da332-7144-11e5-ba71-058fbc01cf0b
        5
    

33193 rows, truncated to displaylimit of 25

Queries that include subqueries always run the innermost subquery first, and then run subsequent queries sequentially in order from the innermost query to the outermost query.

Therefore, the query we just wrote extracts the distinct user_guids from the users table first, and then left joins that reduced subset of user_guids on the dogs table. As mentioned at the beginning of the lesson, since the subquery is in the FROM statement, it actually creates a temporary table, called a derived table, that is then incorporated into the rest of the query.

There are several important points to notice about the syntax of this subquery. First, an alias of "DistinctUUsersID" is used to name the results of the subquery. We are required to give an alias to any derived table we create in subqueries within FROM statements. Otherwise there would be no way for the database to refer to the multiple columns within the temporary results we create.

Second, we need to use this alias every time we want to execute a function that uses the derived table. Remember that the results in which we are interested require a join between the dogs table and the temporary table, not the dogs table and the original users table with duplicates. That means we need to make sure we reference the temporary table alias in the ON, GROUP BY, and SELECT clauses.

Third, relatedly, aliases used within subqueries can refer to tables outside of the subqueries. However, outer queries cannot refer to aliases created within subqueries unless those aliases are explicitly part of the subquery output. In other words, if you wrote the first line of the query above as:

SELECT u.user_guid AS uUserID, d.user_guid AS dUserID, count(*) AS numrows
...

the query would not execute because the alias "u" is contained inside the subquery, but is not included in the output. Go ahead and try it to see what the error message looks like:



In [7]:

    
%%sql
SELECT u.user_guid AS uUserID, d.user_guid AS dUserID, count(*) AS numrows
FROM (SELECT DISTINCT u.user_guid 
      FROM users u) AS DistinctUUsersID 
LEFT JOIN dogs d
   ON DistinctUUsersID.user_guid=d.user_guid
GROUP BY DistinctUUsersID.user_guid
ORDER BY numrows DESC









    



(_mysql_exceptions.OperationalError) (1054, "Unknown column 'u.user_guid' in 'field list'") [SQL: 'SELECT u.user_guid AS uUserID, d.user_guid AS dUserID, count(*) AS numrows\nFROM (SELECT DISTINCT u.user_guid \n      FROM users u) AS DistinctUUsersID \nLEFT JOIN dogs d\n   ON DistinctUUsersID.user_guid=d.user_guid\nGROUP BY DistinctUUsersID.user_guid\nORDER BY numrows DESC']

A similar thing would happen if you tried to use the alias u in the GROUP BY statement.

Another thing to take note of is that when you use subqueries in FROM statements, the temporary table you create can have multiple columns in the output (unlike when you use subqueries in outside SELECT statements). But for that same reason, subqueries in FROM statements can be very computationally intensive. Therefore, it's a good idea to use them sparingly, especially when you have very large data sets.

Overall, subqueries and joins can often be used interchangeably. Some people strongly prefer one approach over another, but there is no consensus about which approach is best. When you are analyzing very large datasets, it's a good idea to test which approach will likely be faster or easier to troubleshoot for your particular application.

Let's practice some more subqueries!

Question 6: Write a query using an IN clause and equijoin syntax that outputs the dog_guid, breed group, state of the owner, and zip of the owner for each distinct dog in the Working, Sporting, and Herding breed groups. (You should get 10,254 rows; the query will be a little slower than some of the others we have practiced)



In [8]:

    
%%sql
SELECT DISTINCT d.dog_guid, d.breed_group, u.state, u.zip
FROM dogs d, users u
WHERE breed_group IN ('Working','Sporting','Herding') AND d.user_guid=u.user_guid;









    



10254 rows affected.






    Out[8]:





    
        dog_guid
        breed_group
        state
        zip
    
    
        fd27b272-7144-11e5-ba71-058fbc01cf0b
        Sporting
        ND
        58201
    
    
        fd27b5ba-7144-11e5-ba71-058fbc01cf0b
        Herding
        MA
        1005
    
    
        fd3fb0f2-7144-11e5-ba71-058fbc01cf0b
        Herding
        MA
        1005
    
    
        fd27b6b4-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CT
        6820
    
    
        fd27b79a-7144-11e5-ba71-058fbc01cf0b
        Sporting
        IL
        60093
    
    
        fd27b948-7144-11e5-ba71-058fbc01cf0b
        Working
        WA
        98001
    
    
        fd27c1c2-7144-11e5-ba71-058fbc01cf0b
        Sporting
        WA
        98117
    
    
        fd27c0fa-7144-11e5-ba71-058fbc01cf0b
        Sporting
        WA
        98117
    
    
        fd27c7d0-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        95003
    
    
        fd27c8d4-7144-11e5-ba71-058fbc01cf0b
        Working
        VA
        22903
    
    
        fd27cf28-7144-11e5-ba71-058fbc01cf0b
        Sporting
        WY
        82401
    
    
        fd27cfaa-7144-11e5-ba71-058fbc01cf0b
        Herding
        IL
        60030
    
    
        fd27d02c-7144-11e5-ba71-058fbc01cf0b
        Herding
        VA
        22044
    
    
        fd27d2ca-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        90045
    
    
        fd27d34c-7144-11e5-ba71-058fbc01cf0b
        Herding
        A
        68128
    
    
        fd27d3d8-7144-11e5-ba71-058fbc01cf0b
        Herding
        A
        68128
    
    
        fd27d45a-7144-11e5-ba71-058fbc01cf0b
        Sporting
        AZ
        85635
    
    
        fd27d770-7144-11e5-ba71-058fbc01cf0b
        Herding
        NC
        27713
    
    
        fd27db08-7144-11e5-ba71-058fbc01cf0b
        Herding
        CO
        80304
    
    
        fd27e1e8-7144-11e5-ba71-058fbc01cf0b
        Herding
        83
        6230
    
    
        fd27e31e-7144-11e5-ba71-058fbc01cf0b
        Herding
        83
        6230
    
    
        fd27e580-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        94952
    
    
        fd27ec1a-7144-11e5-ba71-058fbc01cf0b
        Sporting
        NC
        27371
    
    
        fd27efb2-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        92116
    
    
        fd27f110-7144-11e5-ba71-058fbc01cf0b
        Working
        WA
        98225
    

10254 rows, truncated to displaylimit of 25



In [13]:

    
%%sql
SELECT DISTINCT dd.dog_guid, dd.breed_group, u.state, u.zip
FROM users u, (
    SELECT d.user_guid, d.dog_guid, d.breed_group
    FROM dogs d
    WHERE breed_group IN ('Working', 'Sporting', 'Herding')
    ) AS dd
WHERE dd.user_guid=u.user_guid









    



10254 rows affected.






    Out[13]:





    
        dog_guid
        breed_group
        state
        zip
    
    
        fd27b272-7144-11e5-ba71-058fbc01cf0b
        Sporting
        ND
        58201
    
    
        fd27b5ba-7144-11e5-ba71-058fbc01cf0b
        Herding
        MA
        1005
    
    
        fd27b6b4-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CT
        6820
    
    
        fd27b79a-7144-11e5-ba71-058fbc01cf0b
        Sporting
        IL
        60093
    
    
        fd27b948-7144-11e5-ba71-058fbc01cf0b
        Working
        WA
        98001
    
    
        fd27c1c2-7144-11e5-ba71-058fbc01cf0b
        Sporting
        WA
        98117
    
    
        fd27c7d0-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        95003
    
    
        fd27c8d4-7144-11e5-ba71-058fbc01cf0b
        Working
        VA
        22903
    
    
        fd27cf28-7144-11e5-ba71-058fbc01cf0b
        Sporting
        WY
        82401
    
    
        fd27cfaa-7144-11e5-ba71-058fbc01cf0b
        Herding
        IL
        60030
    
    
        fd27d02c-7144-11e5-ba71-058fbc01cf0b
        Herding
        VA
        22044
    
    
        fd27d2ca-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        90045
    
    
        fd27d34c-7144-11e5-ba71-058fbc01cf0b
        Herding
        A
        68128
    
    
        fd27d3d8-7144-11e5-ba71-058fbc01cf0b
        Herding
        A
        68128
    
    
        fd27d45a-7144-11e5-ba71-058fbc01cf0b
        Sporting
        AZ
        85635
    
    
        fd27d770-7144-11e5-ba71-058fbc01cf0b
        Herding
        NC
        27713
    
    
        fd27db08-7144-11e5-ba71-058fbc01cf0b
        Herding
        CO
        80304
    
    
        fd27e1e8-7144-11e5-ba71-058fbc01cf0b
        Herding
        83
        6230
    
    
        fd27e31e-7144-11e5-ba71-058fbc01cf0b
        Herding
        83
        6230
    
    
        fd27e580-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        94952
    
    
        fd27efb2-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        92116
    
    
        fd27f110-7144-11e5-ba71-058fbc01cf0b
        Working
        WA
        98225
    
    
        fd27f4c6-7144-11e5-ba71-058fbc01cf0b
        Herding
        1
        249081
    
    
        fd27f732-7144-11e5-ba71-058fbc01cf0b
        Herding
        1
        249081
    
    
        fd280236-7144-11e5-ba71-058fbc01cf0b
        Working
        OH
        45220
    

10254 rows, truncated to displaylimit of 25



In [ ]:

    
%%sql
SELECT DISTINCT d.dog_guid
FROM dogs d
WHERE breed_group IN ('Working', 'Sporting', 'Herding')

Question 7: Write the same query as in Question 6 using traditional join syntax.



In [12]:

    
%%sql
SELECT DISTINCT d.dog_guid, d.breed_group, u.state, u.zip
FROM dogs d JOIN users u ON d.user_guid=u.user_guid
WHERE d.breed_group IN ('Working','Sporting','Herding');









    



10254 rows affected.






    Out[12]:





    
        dog_guid
        breed_group
        state
        zip
    
    
        fd27b272-7144-11e5-ba71-058fbc01cf0b
        Sporting
        ND
        58201
    
    
        fd27b5ba-7144-11e5-ba71-058fbc01cf0b
        Herding
        MA
        1005
    
    
        fd3fb0f2-7144-11e5-ba71-058fbc01cf0b
        Herding
        MA
        1005
    
    
        fd27b6b4-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CT
        6820
    
    
        fd27b79a-7144-11e5-ba71-058fbc01cf0b
        Sporting
        IL
        60093
    
    
        fd27b948-7144-11e5-ba71-058fbc01cf0b
        Working
        WA
        98001
    
    
        fd27c1c2-7144-11e5-ba71-058fbc01cf0b
        Sporting
        WA
        98117
    
    
        fd27c0fa-7144-11e5-ba71-058fbc01cf0b
        Sporting
        WA
        98117
    
    
        fd27c7d0-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        95003
    
    
        fd27c8d4-7144-11e5-ba71-058fbc01cf0b
        Working
        VA
        22903
    
    
        fd27cf28-7144-11e5-ba71-058fbc01cf0b
        Sporting
        WY
        82401
    
    
        fd27cfaa-7144-11e5-ba71-058fbc01cf0b
        Herding
        IL
        60030
    
    
        fd27d02c-7144-11e5-ba71-058fbc01cf0b
        Herding
        VA
        22044
    
    
        fd27d2ca-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        90045
    
    
        fd27d34c-7144-11e5-ba71-058fbc01cf0b
        Herding
        A
        68128
    
    
        fd27d3d8-7144-11e5-ba71-058fbc01cf0b
        Herding
        A
        68128
    
    
        fd27d45a-7144-11e5-ba71-058fbc01cf0b
        Sporting
        AZ
        85635
    
    
        fd27d770-7144-11e5-ba71-058fbc01cf0b
        Herding
        NC
        27713
    
    
        fd27db08-7144-11e5-ba71-058fbc01cf0b
        Herding
        CO
        80304
    
    
        fd27e1e8-7144-11e5-ba71-058fbc01cf0b
        Herding
        83
        6230
    
    
        fd27e31e-7144-11e5-ba71-058fbc01cf0b
        Herding
        83
        6230
    
    
        fd27e580-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        94952
    
    
        fd27ec1a-7144-11e5-ba71-058fbc01cf0b
        Sporting
        NC
        27371
    
    
        fd27efb2-7144-11e5-ba71-058fbc01cf0b
        Sporting
        CA
        92116
    
    
        fd27f110-7144-11e5-ba71-058fbc01cf0b
        Working
        WA
        98225
    

10254 rows, truncated to displaylimit of 25

Question 8: Earlier we examined unique users in the users table who were NOT in the dogs table. Use a NOT EXISTS clause to examine all the users in the dogs table that are not in the users table (you should get 2 rows in your output).



In [2]:

    
%%sql
SELECT d.user_guid
FROM dogs d
WHERE NOT EXISTS (SELECT * 
                  FROM users u
                  WHERE u.user_guid=d.user_guid
                 )









    



2 rows affected.






    Out[2]:





    
        user_guid
    
    
        None
    
    
        None

Question 9: We saw earlier that user_guid 'ce7b75bc-7144-11e5-ba71-058fbc01cf0b' still ends up with 1819 rows of output after a left outer join with the dogs table. If you investigate why, you'll find out that's because there are duplicate user_guids in the dogs table as well. How would you adapt the query we wrote earlier (copied below) to only join unique UserIDs from the users table with unique UserIDs from the dog table?

Join we wrote earlier:

SELECT DistinctUUsersID.user_guid AS uUserID, d.user_guid AS dUserID, count(*) AS numrows
FROM (SELECT DISTINCT u.user_guid 
      FROM users u) AS DistinctUUsersID 
LEFT JOIN dogs d
  ON DistinctUUsersID.user_guid=d.user_guid
GROUP BY DistinctUUsersID.user_guid
ORDER BY numrows DESC;

Let's build our way up to the correct query. To troubleshoot, let's only examine the rows related to user_guid 'ce7b75bc-7144-11e5-ba71-058fbc01cf0b', since that's the userID that is causing most of the trouble. Rewrite the query above to only LEFT JOIN distinct user(s) from the user table whose user_guid='ce7b75bc-7144-11e5-ba71-058fbc01cf0b'. The first two output columns should have matching user_guids, and the numrows column should have one row with a value of 1819:



In [2]:

    
%%sql
SELECT DistinctUUsersID.user_guid AS uUserID, d.user_guid AS dUserID, count(*) AS numrows
FROM (SELECT DISTINCT u.user_guid 
      FROM users u
      WHERE u.user_guid='ce7b75bc-7144-11e5-ba71-058fbc01cf0b'
     ) AS DistinctUUsersID 
LEFT JOIN dogs d
  ON DistinctUUsersID.user_guid=d.user_guid
GROUP BY DistinctUUsersID.user_guid
ORDER BY numrows DESC;









    



1 rows affected.






    Out[2]:





    
        uUserID
        dUserID
        numrows
    
    
        ce7b75bc-7144-11e5-ba71-058fbc01cf0b
        ce7b75bc-7144-11e5-ba71-058fbc01cf0b
        1819

Question 10: Now let's prepare and test the inner query for the right half of the join. Give the dogs table an alias, and write a query that would select the distinct user_guids from the dogs table (we will use this query as a inner subquery in subsequent questions, so you will need an alias to differentiate the user_guid column of the dogs table from the user_guid column of the users table).



In [ ]:

    
%%sql
SELECT DISTINCT dall.user_guid 
FROM dogs dall

Question 11: Now insert the query you wrote in Question 10 as a subquery on the right part of the join you wrote in question 9. The output should return columns that should have matching user_guids, and 1 row in the numrows column with a value of 1. If you are getting errors, make sure you have given an alias to the derived table you made to extract the distinct user_guids from the dogs table, and double-check that your aliases are referenced correctly in the SELECT and ON statements.



In [4]:

    
%%sql
SELECT DistinctUUsersID.user_guid AS uUserID, d.user_guid AS dUserID, count(*) AS numrows
FROM (SELECT DISTINCT u.user_guid 
      FROM users u
      WHERE u.user_guid='ce7b75bc-7144-11e5-ba71-058fbc01cf0b'
     ) AS DistinctUUsersID 
LEFT JOIN (SELECT DISTINCT dall.user_guid 
            FROM dogs dall) AS d
  ON DistinctUUsersID.user_guid=d.user_guid
GROUP BY DistinctUUsersID.user_guid
ORDER BY numrows DESC;









    



1 rows affected.






    Out[4]:





    
        uUserID
        dUserID
        numrows
    
    
        ce7b75bc-7144-11e5-ba71-058fbc01cf0b
        ce7b75bc-7144-11e5-ba71-058fbc01cf0b
        1

Question 12: Adapt the query from Question 10 so that, in theory, you would retrieve a full list of all the DogIDs a user in the users table owns, with its accompagnying breed information whenever possible. HOWEVER, BEFORE YOU RUN THE QUERY MAKE SURE TO LIMIT YOUR OUTPUT TO 100 ROWS WITHIN THE SUBQUERY TO THE LEFT OF YOUR JOIN. If you run the query without imposing limits it will take a very long time. If you try to limit the output by just putting a limit clause at the end of the outermost query, the database will still have to hold the entire derived tables in memory and join each row of the derived tables before limiting the output. If you put the limit clause in the subquery to the left of the join, the database will only have to join 100 rows of data.



In [7]:

    
%%sql
SELECT DistinctUUsersID.user_guid AS uUserID, d.user_guid AS dUserID, d.breed
FROM (SELECT DISTINCT u.user_guid 
      FROM users u
      LIMIT 100
     ) AS DistinctUUsersID 
LEFT JOIN (SELECT DISTINCT dall.user_guid, dall.breed
            FROM dogs dall) AS d
  ON DistinctUUsersID.user_guid=d.user_guid
GROUP BY DistinctUUsersID.user_guid
;









    



100 rows affected.






    Out[7]:





    
        uUserID
        dUserID
        breed
    
    
        ce134a78-7144-11e5-ba71-058fbc01cf0b
        ce134a78-7144-11e5-ba71-058fbc01cf0b
        Shih Tzu
    
    
        ce134be0-7144-11e5-ba71-058fbc01cf0b
        ce134be0-7144-11e5-ba71-058fbc01cf0b
        German Shepherd Dog-Nova Scotia Duck Tolling Retriever Mix
    
    
        ce134e42-7144-11e5-ba71-058fbc01cf0b
        ce134e42-7144-11e5-ba71-058fbc01cf0b
        Labrador Retriever
    
    
        ce13507c-7144-11e5-ba71-058fbc01cf0b
        ce13507c-7144-11e5-ba71-058fbc01cf0b
        Golden Retriever
    
    
        ce135194-7144-11e5-ba71-058fbc01cf0b
        ce135194-7144-11e5-ba71-058fbc01cf0b
        Dalmatian
    
    
        ce1353d8-7144-11e5-ba71-058fbc01cf0b
        ce1353d8-7144-11e5-ba71-058fbc01cf0b
        Shetland Sheepdog
    
    
        ce135766-7144-11e5-ba71-058fbc01cf0b
        ce135766-7144-11e5-ba71-058fbc01cf0b
        Shih Tzu
    
    
        ce135ab8-7144-11e5-ba71-058fbc01cf0b
        ce135ab8-7144-11e5-ba71-058fbc01cf0b
        Golden Retriever
    
    
        ce135bd0-7144-11e5-ba71-058fbc01cf0b
        ce135bd0-7144-11e5-ba71-058fbc01cf0b
        Mixed
    
    
        ce135e14-7144-11e5-ba71-058fbc01cf0b
        ce135e14-7144-11e5-ba71-058fbc01cf0b
        Shih Tzu
    
    
        ce135f2c-7144-11e5-ba71-058fbc01cf0b
        ce135f2c-7144-11e5-ba71-058fbc01cf0b
        Mixed
    
    
        ce13615c-7144-11e5-ba71-058fbc01cf0b
        ce13615c-7144-11e5-ba71-058fbc01cf0b
        Siberian Husky
    
    
        ce136a1c-7144-11e5-ba71-058fbc01cf0b
        ce136a1c-7144-11e5-ba71-058fbc01cf0b
        Labrador Retriever
    
    
        ce136ac6-7144-11e5-ba71-058fbc01cf0b
        ce136ac6-7144-11e5-ba71-058fbc01cf0b
        Shih Tzu-Poodle Mix
    
    
        ce136c24-7144-11e5-ba71-058fbc01cf0b
        ce136c24-7144-11e5-ba71-058fbc01cf0b
        German Shepherd Dog-Pembroke Welsh Corgi Mix
    
    
        ce136e36-7144-11e5-ba71-058fbc01cf0b
        ce136e36-7144-11e5-ba71-058fbc01cf0b
        Vizsla
    
    
        ce136ee0-7144-11e5-ba71-058fbc01cf0b
        ce136ee0-7144-11e5-ba71-058fbc01cf0b
        Pug
    
    
        ce136f94-7144-11e5-ba71-058fbc01cf0b
        ce136f94-7144-11e5-ba71-058fbc01cf0b
        Boxer
    
    
        ce137034-7144-11e5-ba71-058fbc01cf0b
        ce137034-7144-11e5-ba71-058fbc01cf0b
        Weimaraner
    
    
        ce1371a6-7144-11e5-ba71-058fbc01cf0b
        ce1371a6-7144-11e5-ba71-058fbc01cf0b
        Beagle
    
    
        ce1373ae-7144-11e5-ba71-058fbc01cf0b
        ce1373ae-7144-11e5-ba71-058fbc01cf0b
        Mixed
    
    
        ce137458-7144-11e5-ba71-058fbc01cf0b
        ce137458-7144-11e5-ba71-058fbc01cf0b
        Finnish Spitz
    
    
        ce13750c-7144-11e5-ba71-058fbc01cf0b
        ce13750c-7144-11e5-ba71-058fbc01cf0b
        Chesapeake Bay Retriever
    
    
        ce1375b6-7144-11e5-ba71-058fbc01cf0b
        ce1375b6-7144-11e5-ba71-058fbc01cf0b
        Border Collie
    
    
        ce137656-7144-11e5-ba71-058fbc01cf0b
        ce137656-7144-11e5-ba71-058fbc01cf0b
        Brussels Griffon
    

100 rows, truncated to displaylimit of 25

Question 13: You might have a good guess by now about why there are duplicate rows in the dogs table and users table, even though most corporate databases are configured to prevent duplicate rows from ever being accepted. To be sure, though, let's adapt this query we wrote above:

SELECT DistinctUUsersID.user_guid AS uUserID, d.user_guid AS dUserID, count(*) AS numrows
FROM (SELECT DISTINCT u.user_guid FROM users u) AS DistinctUUsersID 
LEFT JOIN dogs d
  ON DistinctUUsersID.user_guid=d.user_guid
GROUP BY DistinctUUsersID.user_guid
ORDER BY numrows DESC

Add dog breed and dog weight to the columns that will be included in the final output of your query. In addition, use a HAVING clause to include only UserIDs who would have more than 10 rows in the output of the left join (your output should contain 5 rows).



In [ ]:

    
%%sql
SELECT DistinctUUsersID.user_guid AS uUserID, d.user_guid AS dUserID, d.breed, d.weight, count(*) AS numrows
FROM (SELECT DISTINCT u.user_guid 
      FROM users u
      LIMIT 100 # Without limit, the query take forever to run...
     ) AS DistinctUUsersID 
LEFT JOIN (SELECT DISTINCT dall.user_guid, dall.breed, dall.weight
            FROM dogs dall) AS d
  ON DistinctUUsersID.user_guid=d.user_guid
GROUP BY DistinctUUsersID.user_guid
HAVING numrows>10
ORDER BY numrows DESC;



In [3]:

    
%%sql
SELECT DistictUUsersID.user_guid AS userid, d.breed, d.weight, count(*) AS numrows
FROM (SELECT DISTINCT u.user_guid
FROM users u) AS DistictUUsersID
LEFT JOIN dogs d
ON DistictUUsersID.user_guid=d.user_guid
GROUP BY DistictUUsersID.user_guid
HAVING numrows>10
ORDER BY numrows DESC;









    



5 rows affected.






    Out[3]:





    
        userid
        breed
        weight
        numrows
    
    
        ce7b75bc-7144-11e5-ba71-058fbc01cf0b
        Shih Tzu
        190
        1819
    
    
        ce225842-7144-11e5-ba71-058fbc01cf0b
        Shih Tzu
        190
        26
    
    
        ce2258a6-7144-11e5-ba71-058fbc01cf0b
        Shih Tzu
        190
        20
    
    
        ce135e14-7144-11e5-ba71-058fbc01cf0b
        Shih Tzu
        190
        13
    
    
        ce29675e-7144-11e5-ba71-058fbc01cf0b
        Labrador Retriever- Mix
        60
        11

You can see that almost all of the UserIDs that are causing problems are Shih Tzus that weigh 190 pounds. As we learned in earlier lessons, Dognition used this combination of breed and weight to code for testing accounts. These UserIDs do not represent real data. These types of testing entries would likely be cleaned out of databases used in large established companies, but could certainly still be present in either new databases that are still being prepared and configured, or in small companies which have not had time or resources to perfect their data storage.

There are not very many incorrect entries in the Dognition database and most of the time these entries will not appreciably affect your queries or analyses. However, you have now seen the effects such entries can have in the rare cases when you need to implement outer joins on tables that have duplicate rows or linking columns with many to many relationships. Hopefully, understanding these rare cases has helped you understand more deeply the fundamental concepts behind joining tables in relational databases.

Feel free to practice more subqueries below!



In [ ]:

test_name	TIMESTAMPDIFF(minute,start_time,end_time)
Sociability	345139
Emotions	345139
Shy/Boldness	345139
Perception	345138
Recall	345138
Attachment	345138
Puzzles	345138
Shy/Boldness	345138
Shy/Boldness	345137
Partnership	345137
Emotions	345137
Perception	345137
Obedience	345137
Attachment	345136
Attachment	345136
Puzzles	345136
Recall	345135
Obedience	345135
Perception	345134
Sociability	345134
Sociability	345134
Sociability	345134
Puzzles	345133
Excitability	345132
Attachment	345132

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	PRIMARY	u	ALL	None	None	None	None	35182	Using where; Using temporary
2	DEPENDENT SUBQUERY	d	ref	idx_dogs_user_guid	idx_dogs_user_guid	183	dognitiondb.u.user_guid	1	Using where; Using index

id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	d	index	idx_dogs_user_guid	idx_dogs_user_guid	183	None	34859	Using index; Using temporary
1	SIMPLE	u	ref	unq_users_guid	unq_users_guid	111	dognitiondb.d.user_guid	1	Using where

uUserID	dUserID	numrows
ce7b75bc-7144-11e5-ba71-058fbc01cf0b	ce7b75bc-7144-11e5-ba71-058fbc01cf0b	1819
ce225842-7144-11e5-ba71-058fbc01cf0b	ce225842-7144-11e5-ba71-058fbc01cf0b	26
ce2258a6-7144-11e5-ba71-058fbc01cf0b	ce2258a6-7144-11e5-ba71-058fbc01cf0b	20
ce135e14-7144-11e5-ba71-058fbc01cf0b	ce135e14-7144-11e5-ba71-058fbc01cf0b	13
ce29675e-7144-11e5-ba71-058fbc01cf0b	ce29675e-7144-11e5-ba71-058fbc01cf0b	11
ce134492-7144-11e5-ba71-058fbc01cf0b	ce134492-7144-11e5-ba71-058fbc01cf0b	9
ce6676d0-7144-11e5-ba71-058fbc01cf0b	ce6676d0-7144-11e5-ba71-058fbc01cf0b	8
ce83d2ca-7144-11e5-ba71-058fbc01cf0b	ce83d2ca-7144-11e5-ba71-058fbc01cf0b	8
ce32305a-7144-11e5-ba71-058fbc01cf0b	ce32305a-7144-11e5-ba71-058fbc01cf0b	7
ce7adeea-7144-11e5-ba71-058fbc01cf0b	ce7adeea-7144-11e5-ba71-058fbc01cf0b	7
ce135766-7144-11e5-ba71-058fbc01cf0b	ce135766-7144-11e5-ba71-058fbc01cf0b	6
ce47264a-7144-11e5-ba71-058fbc01cf0b	ce47264a-7144-11e5-ba71-058fbc01cf0b	6
ce66713a-7144-11e5-ba71-058fbc01cf0b	ce66713a-7144-11e5-ba71-058fbc01cf0b	6
ce66b9b0-7144-11e5-ba71-058fbc01cf0b	ce66b9b0-7144-11e5-ba71-058fbc01cf0b	6
ce8c2d08-7144-11e5-ba71-058fbc01cf0b	ce8c2d08-7144-11e5-ba71-058fbc01cf0b	6
ce964888-7144-11e5-ba71-058fbc01cf0b	ce964888-7144-11e5-ba71-058fbc01cf0b	6
ce9a381c-7144-11e5-ba71-058fbc01cf0b	ce9a381c-7144-11e5-ba71-058fbc01cf0b	6
ce262364-7144-11e5-ba71-058fbc01cf0b	ce262364-7144-11e5-ba71-058fbc01cf0b	5
ce26b266-7144-11e5-ba71-058fbc01cf0b	ce26b266-7144-11e5-ba71-058fbc01cf0b	5
ce26e790-7144-11e5-ba71-058fbc01cf0b	ce26e790-7144-11e5-ba71-058fbc01cf0b	5
ce6e67e6-7144-11e5-ba71-058fbc01cf0b	ce6e67e6-7144-11e5-ba71-058fbc01cf0b	5
ce728bf0-7144-11e5-ba71-058fbc01cf0b	ce728bf0-7144-11e5-ba71-058fbc01cf0b	5
ce739220-7144-11e5-ba71-058fbc01cf0b	ce739220-7144-11e5-ba71-058fbc01cf0b	5
ce7c2a66-7144-11e5-ba71-058fbc01cf0b	ce7c2a66-7144-11e5-ba71-058fbc01cf0b	5
ce7da332-7144-11e5-ba71-058fbc01cf0b	ce7da332-7144-11e5-ba71-058fbc01cf0b	5

dog_guid	breed_group	state	zip
fd27b272-7144-11e5-ba71-058fbc01cf0b	Sporting	ND	58201
fd27b5ba-7144-11e5-ba71-058fbc01cf0b	Herding	MA	1005
fd3fb0f2-7144-11e5-ba71-058fbc01cf0b	Herding	MA	1005
fd27b6b4-7144-11e5-ba71-058fbc01cf0b	Sporting	CT	6820
fd27b79a-7144-11e5-ba71-058fbc01cf0b	Sporting	IL	60093
fd27b948-7144-11e5-ba71-058fbc01cf0b	Working	WA	98001
fd27c1c2-7144-11e5-ba71-058fbc01cf0b	Sporting	WA	98117
fd27c0fa-7144-11e5-ba71-058fbc01cf0b	Sporting	WA	98117
fd27c7d0-7144-11e5-ba71-058fbc01cf0b	Sporting	CA	95003
fd27c8d4-7144-11e5-ba71-058fbc01cf0b	Working	VA	22903
fd27cf28-7144-11e5-ba71-058fbc01cf0b	Sporting	WY	82401
fd27cfaa-7144-11e5-ba71-058fbc01cf0b	Herding	IL	60030
fd27d02c-7144-11e5-ba71-058fbc01cf0b	Herding	VA	22044
fd27d2ca-7144-11e5-ba71-058fbc01cf0b	Sporting	CA	90045
fd27d34c-7144-11e5-ba71-058fbc01cf0b	Herding	A	68128
fd27d3d8-7144-11e5-ba71-058fbc01cf0b	Herding	A	68128
fd27d45a-7144-11e5-ba71-058fbc01cf0b	Sporting	AZ	85635
fd27d770-7144-11e5-ba71-058fbc01cf0b	Herding	NC	27713
fd27db08-7144-11e5-ba71-058fbc01cf0b	Herding	CO	80304
fd27e1e8-7144-11e5-ba71-058fbc01cf0b	Herding	83	6230
fd27e31e-7144-11e5-ba71-058fbc01cf0b	Herding	83	6230
fd27e580-7144-11e5-ba71-058fbc01cf0b	Sporting	CA	94952
fd27ec1a-7144-11e5-ba71-058fbc01cf0b	Sporting	NC	27371
fd27efb2-7144-11e5-ba71-058fbc01cf0b	Sporting	CA	92116
fd27f110-7144-11e5-ba71-058fbc01cf0b	Working	WA	98225

uUserID	dUserID	breed
ce134a78-7144-11e5-ba71-058fbc01cf0b	ce134a78-7144-11e5-ba71-058fbc01cf0b	Shih Tzu
ce134be0-7144-11e5-ba71-058fbc01cf0b	ce134be0-7144-11e5-ba71-058fbc01cf0b	German Shepherd Dog-Nova Scotia Duck Tolling Retriever Mix
ce134e42-7144-11e5-ba71-058fbc01cf0b	ce134e42-7144-11e5-ba71-058fbc01cf0b	Labrador Retriever
ce13507c-7144-11e5-ba71-058fbc01cf0b	ce13507c-7144-11e5-ba71-058fbc01cf0b	Golden Retriever
ce135194-7144-11e5-ba71-058fbc01cf0b	ce135194-7144-11e5-ba71-058fbc01cf0b	Dalmatian
ce1353d8-7144-11e5-ba71-058fbc01cf0b	ce1353d8-7144-11e5-ba71-058fbc01cf0b	Shetland Sheepdog
ce135766-7144-11e5-ba71-058fbc01cf0b	ce135766-7144-11e5-ba71-058fbc01cf0b	Shih Tzu
ce135ab8-7144-11e5-ba71-058fbc01cf0b	ce135ab8-7144-11e5-ba71-058fbc01cf0b	Golden Retriever
ce135bd0-7144-11e5-ba71-058fbc01cf0b	ce135bd0-7144-11e5-ba71-058fbc01cf0b	Mixed
ce135e14-7144-11e5-ba71-058fbc01cf0b	ce135e14-7144-11e5-ba71-058fbc01cf0b	Shih Tzu
ce135f2c-7144-11e5-ba71-058fbc01cf0b	ce135f2c-7144-11e5-ba71-058fbc01cf0b	Mixed
ce13615c-7144-11e5-ba71-058fbc01cf0b	ce13615c-7144-11e5-ba71-058fbc01cf0b	Siberian Husky
ce136a1c-7144-11e5-ba71-058fbc01cf0b	ce136a1c-7144-11e5-ba71-058fbc01cf0b	Labrador Retriever
ce136ac6-7144-11e5-ba71-058fbc01cf0b	ce136ac6-7144-11e5-ba71-058fbc01cf0b	Shih Tzu-Poodle Mix
ce136c24-7144-11e5-ba71-058fbc01cf0b	ce136c24-7144-11e5-ba71-058fbc01cf0b	German Shepherd Dog-Pembroke Welsh Corgi Mix
ce136e36-7144-11e5-ba71-058fbc01cf0b	ce136e36-7144-11e5-ba71-058fbc01cf0b	Vizsla
ce136ee0-7144-11e5-ba71-058fbc01cf0b	ce136ee0-7144-11e5-ba71-058fbc01cf0b	Pug
ce136f94-7144-11e5-ba71-058fbc01cf0b	ce136f94-7144-11e5-ba71-058fbc01cf0b	Boxer
ce137034-7144-11e5-ba71-058fbc01cf0b	ce137034-7144-11e5-ba71-058fbc01cf0b	Weimaraner
ce1371a6-7144-11e5-ba71-058fbc01cf0b	ce1371a6-7144-11e5-ba71-058fbc01cf0b	Beagle
ce1373ae-7144-11e5-ba71-058fbc01cf0b	ce1373ae-7144-11e5-ba71-058fbc01cf0b	Mixed
ce137458-7144-11e5-ba71-058fbc01cf0b	ce137458-7144-11e5-ba71-058fbc01cf0b	Finnish Spitz
ce13750c-7144-11e5-ba71-058fbc01cf0b	ce13750c-7144-11e5-ba71-058fbc01cf0b	Chesapeake Bay Retriever
ce1375b6-7144-11e5-ba71-058fbc01cf0b	ce1375b6-7144-11e5-ba71-058fbc01cf0b	Border Collie
ce137656-7144-11e5-ba71-058fbc01cf0b	ce137656-7144-11e5-ba71-058fbc01cf0b	Brussels Griffon