Copyright Jana Schaich Borg/Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)

MySQL Exercise 2: Using WHERE to select specific data

When you are querying a business-related data set, you are usually doing so to answer a question about a subset of the data. In this lesson you will learn how to select subsets of rows of data that meet criteria you specify, to help you prepare for these types of business questions. The mechanism within a SQL query that allows you specify which subset of data you want to retrieve is the WHERE clause.

Before we begin, let's load the SQL library and Dognition database, and make the Dognition database our default database. As a reminder, these are the lines of code you should input:

%load_ext sql
%sql mysql://studentuser:studentpw@mysqlserver/dognitiondb
%sql USE dognitiondb

In [1]:
%load_ext sql
%sql mysql://studentuser:studentpw@mysqlserver/dognitiondb
%sql USE dognitiondb


0 rows affected.
Out[1]:
[]

In [2]:
%config SqlMagic.displaylimit=25

Recall the general syntax structure we learned from the "Introduction to Query Syntax" video at the beginning of the week:

This guide indicates that whenever the data we select need to meet certain criteria (specified using a "WHERE" clause), we specify those criteria after we have specified where the data come from.

Let's say we want to know which Dognition customers received access to Dognition's first four tests for free. These customers have a 1 in the "free_start_user" column of the users table. The syntax you would use to select the data for these customers would be:

SELECT user_guid
FROM users
WHERE free_start_user=1;

(Note: user_guid is the field that specifies the unique User ID number of each customer in the users table)

If you wanted to double-check that the outputted data indeed met the criteria you specified, you could include a second column in your output that would give you the value in the free_start_user field for each row of the output:

SELECT user_guid, free_start_user
FROM users
WHERE free_start_user=1;

Try this on your own below. Remember to use %%sql to indicate that your query will span multiple lines, and consider whether you would like to limit the number of results you ouput using the syntax we learned last lesson. If you do use a LIMIT statement, remember that it has to be the last item in your query, so this time you will place it after your WHERE statement instead of after your FROM statement.


In [3]:
%%sql
SELECT user_guid
FROM users
WHERE free_start_user=1
LIMIT 0,5;


5 rows affected.
Out[3]:
user_guid
ce28a468-7144-11e5-ba71-058fbc01cf0b
ce28ac4c-7144-11e5-ba71-058fbc01cf0b
ce28acba-7144-11e5-ba71-058fbc01cf0b
ce28ad1e-7144-11e5-ba71-058fbc01cf0b
ce28ad82-7144-11e5-ba71-058fbc01cf0b

Question 1: How would you select the Dog IDs for the dogs in the Dognition data set that were DNA tested (these should have a 1 in the dna_tested field of the dogs table)? Try it below (if you do not limit your output, your query should output data from 1433 dogs):


In [4]:
%%sql
DESCRIBE dogs


21 rows affected.
Out[4]:
Field Type Null Key Default Extra
gender varchar(255) YES None
birthday varchar(255) YES None
breed varchar(255) YES None
weight int(11) YES None
dog_fixed tinyint(1) YES None
dna_tested tinyint(1) YES None
created_at datetime NO None
updated_at datetime NO None
dimension varchar(255) YES None
exclude tinyint(1) YES None
breed_type varchar(255) YES None
breed_group varchar(255) YES None
dog_guid varchar(60) YES MUL None
user_guid varchar(60) YES MUL None
total_tests_completed varchar(255) YES None
mean_iti_days varchar(255) YES None
mean_iti_minutes varchar(255) YES None
median_iti_days varchar(255) YES None
median_iti_minutes varchar(255) YES None
time_diff_between_first_and_last_game_days varchar(255) YES None
time_diff_between_first_and_last_game_minutes varchar(255) YES None

In [5]:
%%sql
SELECT dog_guid
FROM dogs
WHERE dna_tested=1;


1433 rows affected.
Out[5]:
dog_guid
fd27b6b4-7144-11e5-ba71-058fbc01cf0b
fd27cd98-7144-11e5-ba71-058fbc01cf0b
fd27ce1a-7144-11e5-ba71-058fbc01cf0b
fd27d144-7144-11e5-ba71-058fbc01cf0b
fd27d1c6-7144-11e5-ba71-058fbc01cf0b
fd27d9fa-7144-11e5-ba71-058fbc01cf0b
fd27dc52-7144-11e5-ba71-058fbc01cf0b
fd27e454-7144-11e5-ba71-058fbc01cf0b
fd27e9a4-7144-11e5-ba71-058fbc01cf0b
fd3cd40e-7144-11e5-ba71-058fbc01cf0b
fd3cf718-7144-11e5-ba71-058fbc01cf0b
fd3cf8ee-7144-11e5-ba71-058fbc01cf0b
fd3d0078-7144-11e5-ba71-058fbc01cf0b
fd3d0492-7144-11e5-ba71-058fbc01cf0b
fd3d0938-7144-11e5-ba71-058fbc01cf0b
fd3d09ce-7144-11e5-ba71-058fbc01cf0b
fd3d0d48-7144-11e5-ba71-058fbc01cf0b
fd3d0dde-7144-11e5-ba71-058fbc01cf0b
fd3d0f96-7144-11e5-ba71-058fbc01cf0b
fd3d1202-7144-11e5-ba71-058fbc01cf0b
fd3d1a5e-7144-11e5-ba71-058fbc01cf0b
fd3d2788-7144-11e5-ba71-058fbc01cf0b
fd3d28aa-7144-11e5-ba71-058fbc01cf0b
fd3d2f08-7144-11e5-ba71-058fbc01cf0b
fd3d2f9e-7144-11e5-ba71-058fbc01cf0b
1433 rows, truncated to displaylimit of 25

The SELECT statement can be used to interact with all data types, and there are many operators and functions that allow you to interact with the data in different ways. Here are some resources that describe these operators and functions:

http://dev.mysql.com/doc/refman/5.7/en/func-op-summary-ref.html
http://www.w3resource.com/mysql/mysql-functions-and-operators.php

Some of the most common operators include: =,<,>,<=, and >=. If you want to select something that is NOT a specific value, use != or <>. You can also use logical operators, such as AND and OR.

Let's start by examining how operators can be used with numerical data.

If you wanted to examine the Dog IDs of dogs who weighed between 10 and 50 pounds, you could query:

SELECT dog_guid, weight
FROM dogs
WHERE weight BETWEEN 10 AND 50;

The above query provided an example of how to use the BETWEEN operator (described in the links provided above), as well as an example of how AND can be used to specify multiple criteria. If you wanted to examine the Dog IDs of dogs who were "fixed" (neutered) OR DNA tested, you could use OR in the following query:

SELECT dog_guid, dog_fixed, dna_tested
FROM dogs
WHERE dog_fixed=1 OR dna_tested=1;

If you wanted to examine the Dog IDs of dogs who were fixed but NOT DNA tested, you could query:

SELECT dog_guid, dog_fixed, dna_tested
FROM dogs
WHERE dog_fixed=1 AND dna_tested!=1;

Question 2: How would you query the User IDs of customers who bought annual subscriptions, indicated by a "2" in the membership_type field of the users table? (If you do not limit the output of this query, your output should contain 4919 rows.)


In [6]:
%%sql
DESCRIBE users


16 rows affected.
Out[6]:
Field Type Null Key Default Extra
sign_in_count int(11) YES 0
created_at datetime NO None
updated_at datetime NO None
max_dogs int(11) YES 0
membership_id int(11) YES None
subscribed tinyint(1) YES 0
exclude tinyint(1) YES None
free_start_user tinyint(1) YES None
last_active_at datetime YES None
membership_type int(11) YES None
user_guid text YES MUL None
city varchar(255) YES None
state varchar(255) YES None
zip varchar(255) YES None
country varchar(255) YES None
utc_correction varchar(255) YES None

In [7]:
%%sql
SELECT user_guid
FROM users
WHERE membership_type=2;


4919 rows affected.
Out[7]:
user_guid
ce134e42-7144-11e5-ba71-058fbc01cf0b
ce135e14-7144-11e5-ba71-058fbc01cf0b
ce135e14-7144-11e5-ba71-058fbc01cf0b
ce136ac6-7144-11e5-ba71-058fbc01cf0b
ce136c24-7144-11e5-ba71-058fbc01cf0b
ce136e36-7144-11e5-ba71-058fbc01cf0b
ce136ee0-7144-11e5-ba71-058fbc01cf0b
ce136f94-7144-11e5-ba71-058fbc01cf0b
ce134be0-7144-11e5-ba71-058fbc01cf0b
ce1371a6-7144-11e5-ba71-058fbc01cf0b
ce136f94-7144-11e5-ba71-058fbc01cf0b
ce136f94-7144-11e5-ba71-058fbc01cf0b
ce1377b4-7144-11e5-ba71-058fbc01cf0b
ce137700-7144-11e5-ba71-058fbc01cf0b
ce137868-7144-11e5-ba71-058fbc01cf0b
ce137868-7144-11e5-ba71-058fbc01cf0b
ce137912-7144-11e5-ba71-058fbc01cf0b
ce137a7a-7144-11e5-ba71-058fbc01cf0b
ce137a7a-7144-11e5-ba71-058fbc01cf0b
ce137034-7144-11e5-ba71-058fbc01cf0b
ce137034-7144-11e5-ba71-058fbc01cf0b
ce137c78-7144-11e5-ba71-058fbc01cf0b
ce135bd0-7144-11e5-ba71-058fbc01cf0b
ce13807e-7144-11e5-ba71-058fbc01cf0b
ce1381c8-7144-11e5-ba71-058fbc01cf0b
4919 rows, truncated to displaylimit of 25

Now let's try using the WHERE statement to interact with text data (called "strings").

Strings need to be surrounded by quotation marks in SQL. MySQL accepts both double and single quotation marks, but some database systems only accept single quotation marks. Whenever a string contains an SQL keyword, the string must be enclosed in backticks instead of quotation marks.

'the marks that surrounds this phrase are single quotation marks'
"the marks that surrounds this phrase are double quotation marks"
`the marks that surround this phrase are backticks`

Strings enclosed in quotation or backticks can be used with many of the same operators as numerical data. For example, imagine that you only wanted to look at data from dogs of the breed "Golden Retrievers." You could query (note that double quotation marks could have been used in this example is well):

SELECT dog_guid, breed
FROM dogs
WHERE breed='golden retriever';

The IN operator allows you to specify multiple values in a WHERE clause. Each of these values must be separated by a comma from the other values, and the entire list of values should be enclosed in parentheses. If you wanted to look at all the data from Golden Retrievers and Poodles, you could certainly use the OR operator, but the IN operator would be even more efficient (note that single quotation marks could have been used in this example, too):

SELECT dog_guid, breed
FROM dogs
WHERE breed IN ("golden retriever","poodle");

The LIKE operator allows you to specify a pattern that the textual data you query has to match. For example, if you wanted to look at all the data from breeds whose names started with "s", you could query:

SELECT dog_guid, breed
FROM dogs
WHERE breed LIKE ("s%");

In this syntax, the percent sign indicates a wild card. Wild cards represent unlimited numbers of missing letters. This is how the placement of the percent sign would affect the results of the query:

  • WHERE breed LIKE ("s%") = the breed must start with "s", but can have any number of letters after the "s"
  • WHERE breed LIKE ("%s") = the breed must end with "s", but can have any number of letters before the "s"
  • WHERE breed LIKE ("%s%") = the breed must contain an "s" somewhere in its name, but can have any number of letters before or after the "s"

Question 3: How would you query all the data from customers located in the state of North Carolina (abbreviated "NC") or New York (abbreviated "NY")? If you do not limit the output of this query, your output should contain 1333 rows.


In [8]:
%%sql
SELECT *
FROM users
WHERE state IN ('NC','NY');


1333 rows affected.
Out[8]:
sign_in_count created_at updated_at max_dogs membership_id subscribed exclude free_start_user last_active_at membership_type user_guid city state zip country utc_correction
181 2013-02-05 17:54:42 2015-01-28 20:51:49 13 2 1 1 0 None 2 ce135e14-7144-11e5-ba71-058fbc01cf0b Raleigh NC 27606 US -5
181 2013-02-05 17:54:42 2015-01-28 20:51:49 13 2 1 1 0 None 2 ce135e14-7144-11e5-ba71-058fbc01cf0b Raleigh NC 27606 US -5
65 2013-02-05 00:52:16 2015-01-28 20:51:49 3 2 1 None None None 2 ce134be0-7144-11e5-ba71-058fbc01cf0b Hillsborough NC 27278 US -5
7 2013-02-06 00:40:59 2015-01-28 20:51:50 1 2 1 None None None 2 ce1371a6-7144-11e5-ba71-058fbc01cf0b New York NY 10023 US -5
15 2013-02-06 14:13:42 2015-01-28 20:51:50 1 2 1 None None None 2 ce137c78-7144-11e5-ba71-058fbc01cf0b Durham NC 27713 US -5
181 2013-02-05 17:54:42 2015-01-28 20:51:49 13 2 1 1 0 None 2 ce135e14-7144-11e5-ba71-058fbc01cf0b Raleigh NC 27606 US -5
2 2013-02-06 19:50:16 2015-01-28 20:51:50 2 2 1 None None None 2 ce138722-7144-11e5-ba71-058fbc01cf0b Charlotte NC 27371 US -5
181 2013-02-05 17:54:42 2015-01-28 20:51:49 13 2 1 1 0 None 2 ce135e14-7144-11e5-ba71-058fbc01cf0b Raleigh NC 27606 US -5
4 2013-02-07 04:19:57 2015-01-28 20:51:50 2 1 0 None None 2014-06-23 20:42:36 1 ce13919a-7144-11e5-ba71-058fbc01cf0b Oakland NY 11730 US -5
2 2013-02-06 01:54:46 2015-01-28 20:51:50 1 1 0 None None None 1 ce137458-7144-11e5-ba71-058fbc01cf0b Durham NC 27707 US -5
355 2013-02-05 17:17:31 2015-05-13 14:44:22 7 2 1 0 0 2015-05-13 14:44:22 2 ce135766-7144-11e5-ba71-058fbc01cf0b Cary NC 27519 US -5
2 2013-02-07 15:22:06 2015-01-28 20:51:50 1 1 1 None None None 1 ce139cb2-7144-11e5-ba71-058fbc01cf0b Carrboro NC 28412 US -5
12 2013-02-07 16:30:09 2015-01-28 20:51:50 3 3 1 None None None 3 ce13a5cc-7144-11e5-ba71-058fbc01cf0b Alexandria NY 14610 US -5
181 2013-02-05 17:54:42 2015-01-28 20:51:49 13 2 1 1 0 None 2 ce135e14-7144-11e5-ba71-058fbc01cf0b Raleigh NC 27606 US -5
2 2013-02-07 22:16:32 2015-01-28 20:51:50 1 1 0 None None None 1 ce21e736-7144-11e5-ba71-058fbc01cf0b Durham NY 10065 US -5
310 2013-02-05 00:45:15 2015-02-23 13:39:48 3 2 1 1 0 2015-02-23 13:39:48 2 ce134a78-7144-11e5-ba71-058fbc01cf0b Durham NC 27701 US -5
50 2013-02-09 14:00:25 2015-02-24 23:55:26 3 2 1 None None 2015-02-24 23:55:26 2 ce221e5e-7144-11e5-ba71-058fbc01cf0b Hillsborough NC 27278 US -5
6 2013-02-09 15:43:59 2015-01-28 20:51:51 1 1 1 None None 2014-08-17 15:17:29 1 ce2220de-7144-11e5-ba71-058fbc01cf0b New York NY 10001 US -5
3 2013-02-09 17:26:32 2015-01-28 20:51:51 2 2 1 None None None 2 ce22248a-7144-11e5-ba71-058fbc01cf0b Charlotte NC 28278 US -5
7 2013-02-09 17:32:29 2015-05-18 01:21:59 1 1 1 None None 2015-05-18 01:21:59 1 ce22252a-7144-11e5-ba71-058fbc01cf0b Raleigh NC 27614 US -5
16 2013-02-09 19:29:29 2015-01-28 20:51:51 1 1 1 None None None 1 ce222674-7144-11e5-ba71-058fbc01cf0b Raleigh NC 27615 US -5
4 2013-02-09 05:54:44 2015-01-28 20:51:51 1 1 1 None None None 1 ce221c88-7144-11e5-ba71-058fbc01cf0b New York NY 10028 US -5
15 2013-02-10 19:18:02 2015-07-12 23:43:41 1 2 1 None None 2015-07-12 23:43:41 2 ce223452-7144-11e5-ba71-058fbc01cf0b New York NY 10003 US -5
8 2013-02-10 20:51:19 2015-01-28 20:51:51 3 2 1 None None 2014-03-19 00:53:29 2 ce2236c8-7144-11e5-ba71-058fbc01cf0b Hillsborough NC 27278 US -5
8 2013-02-10 20:51:19 2015-01-28 20:51:51 3 2 1 None None 2014-03-19 00:53:29 2 ce2236c8-7144-11e5-ba71-058fbc01cf0b Hillsborough NC 27278 US -5
1333 rows, truncated to displaylimit of 25

Next, let's try using the WHERE statement to interact with datetime data. Time-related data is a little more complicated to work with than other types of data, because it must have a very specific format. MySQL comes with the following data types for storing a date or a date/time value in the database:

DATE - format YYYY-MM-DD
DATETIME - format: YYYY-MM-DD HH:MI:SS
TIMESTAMP - format: YYYY-MM-DD HH:MI:SS
YEAR - format YYYY or YY

One of the interesting things about time-related data is that SQL has commands to break the data into different "time parts" or "date parts" as described here:

http://www.tutorialspoint.com/mysql/mysql-date-time-functions.htm

A time stamp stored in one row of data might look like this:

2013-02-07 02:50:52

The year part of that entry would be 2013, the month part would be "02" or "February" (depending on the requested format), the seconds part would be "52", and so on. SQL functions easily allow you to convert those parts into formats you might need for specific analyses. For example, imagine you wanted to know how many tests Dognition customers complete on different days of the week. To complete this analysis, you would need to convert the time stamps of each completed test to a variable that outputted the correct day of the week for that date. DAYNAME is a function that will do this for you. You can combine DAYNAME with WHERE to select data from only a single day of the week:

SELECT dog_guid, created_at
FROM complete_tests
WHERE DAYNAME(created_at)="Tuesday"

You can also use common operators like =,<,>,<=,>=,!=, or <> with dates just like you would with other types of data, but whether you refer to the date as a number or text will depend on whether you are selecting individual date parts or treating the date/time entry as a single clause. For example, you could select all the Dog IDs and time stamps of tests completed after the 15 of every month with this command that extracts the "DAY" date part out of each time stamp:

SELECT dog_guid, created_at
FROM complete_tests
WHERE DAY(created_at) > 15

You could also select all the Dog IDs and time stamps of completed tests from after February 4, 2014 by treating date entries as text clauses with the following query:

SELECT dog_guid, created_at
FROM complete_tests
WHERE created_at > '2014-02-04'

Note that you have to use a different set of functions than you would use for regular numerical data to add or subtract time from any values in these datetime formats. For example, instead of using a minus sign to find the difference in time between two time stamps or dates, you would use the TIMEDIFF or DATEDIFF function. See the references provided above for a list of these functions.

Question 4: Now that you have seen how datetime data can be used to impose criteria on the data you select, how would you select all the Dog IDs and time stamps of Dognition tests completed before October 15, 2015 (your output should have 193,246 rows)?


In [9]:
%%sql
DESCRIBE complete_tests


6 rows affected.
Out[9]:
Field Type Null Key Default Extra
created_at datetime NO None
updated_at datetime NO None
user_guid varchar(60) YES MUL None
dog_guid varchar(60) YES MUL None
test_name varchar(60) YES None
subcategory_name varchar(60) YES None

In [10]:
%%sql
SELECT dog_guid, created_at
FROM complete_tests
WHERE created_at < '2015-10-15'


193246 rows affected.
Out[10]:
dog_guid created_at
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:26:54
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:31:03
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:32:04
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:32:25
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:32:56
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:33:15
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:33:33
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:33:59
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:34:25
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:34:39
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:34:46
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:35:18
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:35:47
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:36:28
fd27b86c-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:36:44
fd27bbbe-7144-11e5-ba71-058fbc01cf0b 2013-02-05 18:57:05
fd27bbbe-7144-11e5-ba71-058fbc01cf0b 2013-02-05 19:01:31
fd27bbbe-7144-11e5-ba71-058fbc01cf0b 2013-02-05 19:04:42
fd27bbbe-7144-11e5-ba71-058fbc01cf0b 2013-02-05 19:07:39
fd27bbbe-7144-11e5-ba71-058fbc01cf0b 2013-02-05 19:15:01
fd27bbbe-7144-11e5-ba71-058fbc01cf0b 2013-02-05 19:21:15
fd27bbbe-7144-11e5-ba71-058fbc01cf0b 2013-02-05 19:28:27
fd27bbbe-7144-11e5-ba71-058fbc01cf0b 2013-02-05 19:51:57
fd27bbbe-7144-11e5-ba71-058fbc01cf0b 2013-02-05 19:58:06
fd27bbbe-7144-11e5-ba71-058fbc01cf0b 2013-02-05 20:05:57
100000 rows, truncated to displaylimit of 25

Last, let's use the WHERE statement in combination with two very important operators: IS NULL and IS NOT NULL. IS NULL will indicate rows of data that have null values. IS NOT NULL will indicate rows that do not have null values. We saw in previous exercises that many of the entries in the free_start_user field of the user table in the Dognition data set had NULL values. To select only the rows that have non-null data you could query:

SELECT user_guid
FROM users
WHERE free_start_user IS NOT NULL;

To select only the rows that have null data so that you can examine if these rows share something else in common, you could query:

SELECT user_guid
FROM users
WHERE free_start_user IS NULL;

Question 5: How would you select all the User IDs of customers who do not have null values in the State field of their demographic information (if you do not limit the output, you should get 17,985 from this query -- there are a lot of null values in the state field!)?


In [11]:
%%sql
SELECT user_guid
FROM users
WHERE state IS NOT NULL;


17985 rows affected.
Out[11]:
user_guid
ce134e42-7144-11e5-ba71-058fbc01cf0b
ce1353d8-7144-11e5-ba71-058fbc01cf0b
ce135ab8-7144-11e5-ba71-058fbc01cf0b
ce13507c-7144-11e5-ba71-058fbc01cf0b
ce135e14-7144-11e5-ba71-058fbc01cf0b
ce13615c-7144-11e5-ba71-058fbc01cf0b
ce135e14-7144-11e5-ba71-058fbc01cf0b
ce135f2c-7144-11e5-ba71-058fbc01cf0b
ce136a1c-7144-11e5-ba71-058fbc01cf0b
ce136ac6-7144-11e5-ba71-058fbc01cf0b
ce136c24-7144-11e5-ba71-058fbc01cf0b
ce136e36-7144-11e5-ba71-058fbc01cf0b
ce136ee0-7144-11e5-ba71-058fbc01cf0b
ce136f94-7144-11e5-ba71-058fbc01cf0b
ce134be0-7144-11e5-ba71-058fbc01cf0b
ce1371a6-7144-11e5-ba71-058fbc01cf0b
ce136f94-7144-11e5-ba71-058fbc01cf0b
ce136f94-7144-11e5-ba71-058fbc01cf0b
ce1373ae-7144-11e5-ba71-058fbc01cf0b
ce13750c-7144-11e5-ba71-058fbc01cf0b
ce1375b6-7144-11e5-ba71-058fbc01cf0b
ce1377b4-7144-11e5-ba71-058fbc01cf0b
ce137700-7144-11e5-ba71-058fbc01cf0b
ce137868-7144-11e5-ba71-058fbc01cf0b
ce137868-7144-11e5-ba71-058fbc01cf0b
17985 rows, truncated to displaylimit of 25

Practice writing your own SELECT and WHERE statements!

These queries will combine what you've learned in the past two lessons.

Question 6: How would you retrieve the Dog ID, subcategory_name, and test_name fields, in that order, of the first 10 reviews entered in the Reviews table to be submitted in 2014?


In [12]:
%%sql
DESCRIBE reviews


7 rows affected.
Out[12]:
Field Type Null Key Default Extra
rating int(11) YES None
created_at datetime NO None
updated_at datetime NO None
user_guid varchar(60) YES MUL None
dog_guid varchar(60) YES MUL None
subcategory_name varchar(60) YES None
test_name varchar(60) YES None

In [13]:
%%sql
SELECT dog_guid, subcategory_name, test_name
FROM reviews
WHERE YEAR(created_at)=2014
#WHERE created_at <= '2014-12-31' and created_at >= '2014-01-01'
LIMIT 0, 10;


10 rows affected.
Out[13]:
dog_guid subcategory_name test_name
ce3ac77e-7144-11e5-ba71-058fbc01cf0b Empathy Yawn Warm-up
ce2aedcc-7144-11e5-ba71-058fbc01cf0b Empathy Eye Contact Warm-up
ce2aedcc-7144-11e5-ba71-058fbc01cf0b Empathy Eye Contact Game
ce2aedcc-7144-11e5-ba71-058fbc01cf0b Communication Treat Warm-up
ce405c52-7144-11e5-ba71-058fbc01cf0b Empathy Yawn Warm-up
ce405c52-7144-11e5-ba71-058fbc01cf0b Empathy Yawn Game
ce405c52-7144-11e5-ba71-058fbc01cf0b Empathy Eye Contact Game
ce405e28-7144-11e5-ba71-058fbc01cf0b Communication Treat Warm-up
ce405e28-7144-11e5-ba71-058fbc01cf0b Cunning Turn Your Back
ce2609c4-7144-11e5-ba71-058fbc01cf0b Communication Treat Warm-up

Question 7: How would you select all of the User IDs of customers who have female dogs whose breed includes the word "terrier" somewhere in its name (if you don't limit your output, you should have 1771 rows in your output)?


In [14]:
%%sql
DESCRIBE dogs


21 rows affected.
Out[14]:
Field Type Null Key Default Extra
gender varchar(255) YES None
birthday varchar(255) YES None
breed varchar(255) YES None
weight int(11) YES None
dog_fixed tinyint(1) YES None
dna_tested tinyint(1) YES None
created_at datetime NO None
updated_at datetime NO None
dimension varchar(255) YES None
exclude tinyint(1) YES None
breed_type varchar(255) YES None
breed_group varchar(255) YES None
dog_guid varchar(60) YES MUL None
user_guid varchar(60) YES MUL None
total_tests_completed varchar(255) YES None
mean_iti_days varchar(255) YES None
mean_iti_minutes varchar(255) YES None
median_iti_days varchar(255) YES None
median_iti_minutes varchar(255) YES None
time_diff_between_first_and_last_game_days varchar(255) YES None
time_diff_between_first_and_last_game_minutes varchar(255) YES None

In [15]:
%%sql
SELECT user_guid
FROM dogs
WHERE gender='female' and breed LIKE ('%terrier%')


1771 rows affected.
Out[15]:
user_guid
ce138722-7144-11e5-ba71-058fbc01cf0b
ce13b152-7144-11e5-ba71-058fbc01cf0b
ce21d7d2-7144-11e5-ba71-058fbc01cf0b
ce2202e8-7144-11e5-ba71-058fbc01cf0b
ce2203f6-7144-11e5-ba71-058fbc01cf0b
ce221774-7144-11e5-ba71-058fbc01cf0b
ce221a76-7144-11e5-ba71-058fbc01cf0b
ce22234a-7144-11e5-ba71-058fbc01cf0b
ce222fa2-7144-11e5-ba71-058fbc01cf0b
ce223628-7144-11e5-ba71-058fbc01cf0b
ce223af6-7144-11e5-ba71-058fbc01cf0b
ce222214-7144-11e5-ba71-058fbc01cf0b
ce222e58-7144-11e5-ba71-058fbc01cf0b
ce225c16-7144-11e5-ba71-058fbc01cf0b
ce23f634-7144-11e5-ba71-058fbc01cf0b
ce240d72-7144-11e5-ba71-058fbc01cf0b
ce242762-7144-11e5-ba71-058fbc01cf0b
ce24291a-7144-11e5-ba71-058fbc01cf0b
ce2420aa-7144-11e5-ba71-058fbc01cf0b
ce24d0cc-7144-11e5-ba71-058fbc01cf0b
ce24d248-7144-11e5-ba71-058fbc01cf0b
ce251af0-7144-11e5-ba71-058fbc01cf0b
ce252342-7144-11e5-ba71-058fbc01cf0b
ce252342-7144-11e5-ba71-058fbc01cf0b
ce252342-7144-11e5-ba71-058fbc01cf0b
1771 rows, truncated to displaylimit of 25

Question 8: How would you select the Dog ID, test name, and subcategory associated with each completed test for the first 100 tests entered in October, 2014?


In [16]:
%%sql
SELECT dog_guid, test_name, subcategory_name
FROM reviews
WHERE created_at >= '2014-10-01'
LIMIT 0, 100;


100 rows affected.
Out[16]:
dog_guid test_name subcategory_name
ce4145ea-7144-11e5-ba71-058fbc01cf0b Yawn Game Empathy
ce46e090-7144-11e5-ba71-058fbc01cf0b Physical Reasoning Warm-up Reasoning
ce46e090-7144-11e5-ba71-058fbc01cf0b Physical Reasoning Game Reasoning
ce470c82-7144-11e5-ba71-058fbc01cf0b Memory versus Pointing Memory
ce46f0d0-7144-11e5-ba71-058fbc01cf0b Delayed Cup Game Memory
ce470bb0-7144-11e5-ba71-058fbc01cf0b Watching Cunning
ce470e30-7144-11e5-ba71-058fbc01cf0b Watching Cunning
ce470bb0-7144-11e5-ba71-058fbc01cf0b Turn Your Back Cunning
ce470e30-7144-11e5-ba71-058fbc01cf0b Turn Your Back Cunning
ce46e090-7144-11e5-ba71-058fbc01cf0b Treat Warm-up Communication
ce470bb0-7144-11e5-ba71-058fbc01cf0b Cover Your Eyes Cunning
ce470bb0-7144-11e5-ba71-058fbc01cf0b Watching - Part 2 Cunning
ce470e30-7144-11e5-ba71-058fbc01cf0b Cover Your Eyes Cunning
ce46e090-7144-11e5-ba71-058fbc01cf0b Arm Pointing Communication
ce470e30-7144-11e5-ba71-058fbc01cf0b Watching - Part 2 Cunning
ce46e090-7144-11e5-ba71-058fbc01cf0b Foot Pointing Communication
ce45996a-7144-11e5-ba71-058fbc01cf0b Watching Cunning
ce45996a-7144-11e5-ba71-058fbc01cf0b Turn Your Back Cunning
ce45996a-7144-11e5-ba71-058fbc01cf0b Cover Your Eyes Cunning
ce46dab4-7144-11e5-ba71-058fbc01cf0b Watching Cunning
ce45996a-7144-11e5-ba71-058fbc01cf0b Watching - Part 2 Cunning
ce46dab4-7144-11e5-ba71-058fbc01cf0b Turn Your Back Cunning
ce470c82-7144-11e5-ba71-058fbc01cf0b Inferential Reasoning Warm-up Reasoning
ce46dab4-7144-11e5-ba71-058fbc01cf0b Cover Your Eyes Cunning
ce45ab4e-7144-11e5-ba71-058fbc01cf0b Watching Cunning
100 rows, truncated to displaylimit of 25

There are many more operators you can use in your WHERE clauses to restrict the data you select as well. We do not have the space to go over each one individually in this lesson, but I encourage you to explore them on your own. *This is a great area to practice being fearless and bold in your desire to learn new things! The more you try, the more you will learn.*

Feel free to practice any other functions or operators you discover in the space below:


In [17]:
%%sql
SELECT user_guid
FROM users
WHERE state='CA' and last_active_at<'2015-01-10';


565 rows affected.
Out[17]:
user_guid
ce137868-7144-11e5-ba71-058fbc01cf0b
ce137868-7144-11e5-ba71-058fbc01cf0b
ce2409ee-7144-11e5-ba71-058fbc01cf0b
ce245926-7144-11e5-ba71-058fbc01cf0b
ce24749c-7144-11e5-ba71-058fbc01cf0b
ce24f50c-7144-11e5-ba71-058fbc01cf0b
ce24f50c-7144-11e5-ba71-058fbc01cf0b
ce24b2cc-7144-11e5-ba71-058fbc01cf0b
ce250fb0-7144-11e5-ba71-058fbc01cf0b
ce251014-7144-11e5-ba71-058fbc01cf0b
ce2525c2-7144-11e5-ba71-058fbc01cf0b
ce252e78-7144-11e5-ba71-058fbc01cf0b
ce253fda-7144-11e5-ba71-058fbc01cf0b
ce253fda-7144-11e5-ba71-058fbc01cf0b
ce254c28-7144-11e5-ba71-058fbc01cf0b
ce25a088-7144-11e5-ba71-058fbc01cf0b
ce252e78-7144-11e5-ba71-058fbc01cf0b
ce25fac4-7144-11e5-ba71-058fbc01cf0b
ce25fe0c-7144-11e5-ba71-058fbc01cf0b
ce25fe0c-7144-11e5-ba71-058fbc01cf0b
ce254e76-7144-11e5-ba71-058fbc01cf0b
ce260e6a-7144-11e5-ba71-058fbc01cf0b
ce261c98-7144-11e5-ba71-058fbc01cf0b
ce2643b2-7144-11e5-ba71-058fbc01cf0b
ce266342-7144-11e5-ba71-058fbc01cf0b
565 rows, truncated to displaylimit of 25

In [ ]: