graded = 7/8
1) What books topped the Hardcover Fiction NYT best-sellers list on Mother's Day in 2009 and 2010? How about Father's Day?
In [2]:
import requests
date='2009-05-10' #Mother's Day 2009 fell on May 10
url="https://api.nytimes.com/svc/books/v2/lists/"+date+"/hardcover-fiction.json?&num_results=1&api-key=4182fa9aca904ae18f4a1f6bef2fc7e9"
response=requests.get(url)
data=response.json()
print("The Best Sellers on",date,"are the following:")
print("")
for n in range(0,len(data['results'])):
    for title in data['results'][n]['book_details']:
        print(n+1,".",title['title'],"by",title['author'])
In [438]:
import requests
date='2010-05-09' #Mother's Day 2010
url="https://api.nytimes.com/svc/books/v2/lists/"+date+"/hardcover-fiction.json?&num_results=1&api-key=4182fa9aca904ae18f4a1f6bef2fc7e9"
response=requests.get(url)
data=response.json()
print("The Best Sellers on",date,"are the following:")
print("")
for n in range(0,len(data['results'])):
    for title in data['results'][n]['book_details']:
        print(n+1,".",title['title'],"by",title['author'])
In [14]:
import requests
date='2009-06-21' #Father's Day 2009
url="https://api.nytimes.com/svc/books/v2/lists/"+date+"/hardcover-fiction.json?&num_results=1&api-key=4182fa9aca904ae18f4a1f6bef2fc7e9"
response=requests.get(url)
data=response.json()
print("The Best Sellers on",date,"are the following:")
print("")
for n in range(0,len(data['results'])):
    for title in data['results'][n]['book_details']:
        print(n+1,".",title['title'],"by",title['author'])
In [15]:
import requests
date='2010-06-20' #Father's Day 2010
url="https://api.nytimes.com/svc/books/v2/lists/"+date+"/hardcover-fiction.json?&num_results=1&api-key=4182fa9aca904ae18f4a1f6bef2fc7e9"
response=requests.get(url)
data=response.json()
print("The Best Sellers on",date,"are the following:")
print("")
for n in range(0,len(data['results'])):
    for title in data['results'][n]['book_details']:
        print(n+1,".",title['title'],"by",title['author'])
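The four cells above differ only in the date, so the lookup can be written once and looped over. The following is a minimal sketch, assuming the v2 response shape used above (`results`, then `book_details`, then `title` and `author`). The names `list_url`, `top_books`, and `YOUR_API_KEY`, and the sample payload, are hypothetical placeholders for illustration, not real API output.

```python
def list_url(date, api_key, list_name="hardcover-fiction"):
    """Build the best-sellers URL for one date (same pattern as the cells above)."""
    return ("https://api.nytimes.com/svc/books/v2/lists/" + date + "/" + list_name
            + ".json?num_results=1&api-key=" + api_key)


def top_books(data):
    """Return (title, author) pairs from a parsed best-sellers response."""
    books = []
    for result in data["results"]:
        for details in result["book_details"]:
            books.append((details["title"], details["author"]))
    return books


# Placeholder payload with the same shape as the response; not real data.
sample = {"results": [{"book_details": [{"title": "EXAMPLE TITLE",
                                         "author": "Example Author"}]}]}

# Mother's Day 2009 and 2010, then Father's Day 2009 and 2010.
dates = ["2009-05-10", "2010-05-09", "2009-06-21", "2010-06-20"]
urls = [list_url(d, "YOUR_API_KEY") for d in dates]

print(urls[0])
print(top_books(sample))
```

With a real key, each URL could be fetched with `requests.get(url).json()` and passed to `top_books`.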
2) What are all the different book categories the NYT ranked on June 6, 2009? How about June 6, 2015?
In [16]:
import requests
date='2009-06-06'
url="https://api.nytimes.com/svc/books/v3/lists/overview.json?&api-key=4182fa9aca904ae18f4a1f6bef2fc7e9"
response=requests.get(url)
data=response.json()
print("The different book categories NYT ranked on",date,"are:")
print("")
#data['results'] is a dictionary; the categories are in its 'lists' entry
for book_list in data['results']['lists']:
    print(book_list['list_name'])
In [17]:
import requests
date='2015-06-06'
url="https://api.nytimes.com/svc/books/v3/lists/overview.json?&api-key=4182fa9aca904ae18f4a1f6bef2fc7e9"
response=requests.get(url)
data=response.json()
print("The different book categories NYT ranked on",date,"are:")
print("")
#data['results'] is a dictionary; the categories are in its 'lists' entry
for book_list in data['results']['lists']:
    print(book_list['list_name'])
In [3]:
#TA-Stephan: You need to pass the date to the API with the published_date parameter. Here is how you would do it.
url ="https://api.nytimes.com/svc/books/v3/lists/overview.json?published_date=2009-06-06&api-key=9bddd887c630b8078e017396214a150a:15:61062085"
response = requests.get(url)
data = response.json()
for book_list in data['results']['lists']:
    print(book_list['list_name'])
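In the overview response, `results` is a dictionary and the categories sit under its `lists` key, so the loop has to run over `data['results']['lists']` rather than over `data['results']` itself. The sketch below illustrates the difference. `category_names` and the sample payload are hypothetical, with placeholder values standing in for a real response.

```python
def category_names(data):
    """Return the list_name of every list in a parsed overview response."""
    return [book_list["list_name"] for book_list in data["results"]["lists"]]


# Placeholder payload with the same nesting as the overview response; not real data.
sample = {
    "results": {
        "published_date": "2009-06-06",
        "lists": [
            {"list_name": "Example List A"},
            {"list_name": "Example List B"},
            {"list_name": "Example List C"},
        ],
    }
}

# len(sample["results"]) counts the dictionary's keys, not the number of lists.
print(len(sample["results"]), len(sample["results"]["lists"]))
print(category_names(sample))
```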
3) Muammar Gaddafi's name can be transliterated many many ways. His last name is often a source of a million and one versions - Gadafi, Gaddafi, Kadafi, and Qaddafi to name a few. How many times has the New York Times referred to him by each of those names? Tip: Add "Libya" to your search to make sure (-ish) you're talking about the right guy.
In [18]:
despot_names = ['Gadafi', 'Gaddafi', 'Kadafi', 'Qaddafi']
for name in despot_names:
    despot_response = requests.get('http://api.nytimes.com/svc/search/v2/articlesearch.json?q=' + name +'+Libya&api-key=0c3ba2a8848c44eea6a3443a17e57448')
    despot_data = despot_response.json()
    despot_hits_meta = despot_data['response']['meta']
    despot_hit_count = despot_hits_meta['hits']
    print("The NYT has referred to the Libyan despot", despot_hit_count, "times using the spelling", name)
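Building the query by string concatenation works here because the names contain no special characters. One alternative, sketched with the standard library, is to let `urllib.parse.urlencode` escape the query string. `search_url` and `YOUR_API_KEY` are hypothetical names used only for this example.

```python
from urllib.parse import urlencode

BASE = "http://api.nytimes.com/svc/search/v2/articlesearch.json"


def search_url(query, api_key):
    """Build an Article Search URL with the query string properly escaped."""
    return BASE + "?" + urlencode({"q": query, "api-key": api_key})


for name in ["Gadafi", "Gaddafi", "Kadafi", "Qaddafi"]:
    # The space between the name and "Libya" is encoded as "+".
    print(search_url(name + " Libya", "YOUR_API_KEY"))
```

`requests.get` also accepts a `params` dictionary, which has a similar effect without building the URL by hand.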
4) What's the title of the first story to mention the word 'hipster' in 1995? What's the first paragraph?
In [19]:
import requests
term='hipster'
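#sort=oldest puts the earliest 1995 article first, so docs[0] is the first story of the year
url='http://api.nytimes.com/svc/search/v2/articlesearch.json?q='+ term +'&begin_date=19950101&end_date=19951231&sort=oldest&api-key=0c3ba2a8848c44eea6a3443a17e57448'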
response=requests.get(url)
data=response.json()
#print(data.keys())
#print(data['response'].keys())
print("The main headline for the first article in 1995 mentioning the term hipster is:",data['response']['docs'][0]['headline']['main'])
print("The kicker headline for the first article in 1995 mentioning the term hipster is:",data['response']['docs'][0]['headline']['kicker'])
print("")
print("The first paragraph for the first article in 1995 mentioning the term hipster is:",data['response']['docs'][0]['lead_paragraph'])
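The code above treats `docs[0]` as the first story of the year, which only holds if the results come back oldest first. The sketch below is a local check that picks the earliest document by its `pub_date` string from whatever page was returned. `earliest_doc` and the sample documents are hypothetical placeholders. It assumes `pub_date` is an ISO-style timestamp in a consistent format, so that string comparison matches chronological order.

```python
def earliest_doc(docs):
    """Return the document with the smallest pub_date string."""
    return min(docs, key=lambda doc: doc["pub_date"])


# Placeholder documents shaped like entries of data['response']['docs']; not real data.
sample_docs = [
    {"headline": {"main": "Placeholder B"}, "pub_date": "1995-03-12T00:00:00Z",
     "lead_paragraph": "Placeholder text B"},
    {"headline": {"main": "Placeholder A"}, "pub_date": "1995-01-05T00:00:00Z",
     "lead_paragraph": "Placeholder text A"},
    {"headline": {"main": "Placeholder C"}, "pub_date": "1995-11-30T00:00:00Z",
     "lead_paragraph": "Placeholder text C"},
]

first = earliest_doc(sample_docs)
print(first["headline"]["main"])
print(first["lead_paragraph"])
```

This only inspects the documents in one returned page, so it is a sanity check rather than a substitute for asking the API to sort the results.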
5) How many times was gay marriage mentioned in the NYT between 1950-1959, 1960-1969, 1970-1979, 1980-1989, 1990-1999, 2000-2009, and 2010-present? Tip: You'll want to put quotes around the search term so it isn't just looking for "gay" and "marriage" in the same article. Tip: Write code to find the number of mentions between Jan 1, 1950 and Dec 31, 1959.
In [20]:
search_term='"gay marriage"' #Quoted so the search matches the exact phrase
begin_date='19500101'
end_date='19591231'
gay_response = requests.get('http://api.nytimes.com/svc/search/v2/articlesearch.json?q='+search_term+'&begin_date='+begin_date+'&end_date='+end_date+'&api-key=0c3ba2a8848c44eea6a3443a17e57448')
gay_data=gay_response.json()
print(gay_data['response']['meta']['hits'],"articles mention the term",search_term,"between",begin_date,"and",end_date)
In [21]:
search_term='"gay marriage"' #Quoted so the search matches the exact phrase
begin_date='19700101'
end_date='19791231'
gay_response = requests.get('http://api.nytimes.com/svc/search/v2/articlesearch.json?q='+search_term+'&begin_date='+begin_date+'&end_date='+end_date+'&api-key=0c3ba2a8848c44eea6a3443a17e57448')
gay_data=gay_response.json()
print(gay_data['response']['meta']['hits'],"articles mention the term",search_term,"between",begin_date,"and",end_date)
In [22]:
search_term='"gay marriage"' #Quoted so the search matches the exact phrase
begin_date='19800101'
end_date='19891231'
gay_response = requests.get('http://api.nytimes.com/svc/search/v2/articlesearch.json?q='+search_term+'&begin_date='+begin_date+'&end_date='+end_date+'&api-key=0c3ba2a8848c44eea6a3443a17e57448')
gay_data=gay_response.json()
print(gay_data['response']['meta']['hits'],"articles mention the term",search_term,"between",begin_date,"and",end_date)
In [23]:
search_term='"gay marriage"' #Quoted so the search matches the exact phrase
begin_date='19900101'
end_date='19991231'
gay_response = requests.get('http://api.nytimes.com/svc/search/v2/articlesearch.json?q='+search_term+'&begin_date='+begin_date+'&end_date='+end_date+'&api-key=0c3ba2a8848c44eea6a3443a17e57448')
gay_data=gay_response.json()
print(gay_data['response']['meta']['hits'],"articles mention the term",search_term,"between",begin_date,"and",end_date)
In [24]:
search_term='"gay marriage"' #Quoted so the search matches the exact phrase
begin_date='20000101'
end_date='20091231'
gay_response = requests.get('http://api.nytimes.com/svc/search/v2/articlesearch.json?q='+search_term+'&begin_date='+begin_date+'&end_date='+end_date+'&api-key=0c3ba2a8848c44eea6a3443a17e57448')
gay_data=gay_response.json()
print(gay_data['response']['meta']['hits'],"articles mention the term",search_term,"between",begin_date,"and",end_date)
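The cells above repeat the same request once per decade and leave out 1960-1969 and 2010-present. The sketch below generates every date range in one place, in the `YYYYMMDD` format the requests above use. `decade_ranges` is a hypothetical helper, and the fixed `today` value is an arbitrary example date used only to make the output reproducible.

```python
import datetime


def decade_ranges(first_year=1950, today=None):
    """Return (begin_date, end_date) strings for each decade up to today."""
    today = today or datetime.date.today()
    ranges = []
    for start in range(first_year, today.year + 1, 10):
        begin = "%d0101" % start
        end_year = start + 9
        if end_year >= today.year:
            # The current decade is still open, so it ends today.
            end = today.strftime("%Y%m%d")
        else:
            end = "%d1231" % end_year
        ranges.append((begin, end))
    return ranges


for begin_date, end_date in decade_ranges(today=datetime.date(2016, 6, 11)):
    print(begin_date, end_date)
```

Each pair could then be dropped into the `begin_date` and `end_date` parameters of a single request inside the loop.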
6) What section talks about motorcycles the most? Tip: You'll be using facets
In [25]:
search_term='motorcycles'
motor_response = requests.get('http://api.nytimes.com/svc/search/v2/articlesearch.json?q='+search_term+'&facet_field=section_name&api-key=0c3ba2a8848c44eea6a3443a17e57448')
motor_data=motor_response.json()
#Assumes the facet terms are returned sorted by count, highest first
print("The section that talks about",search_term,"the most is",motor_data['response']['facets']['section_name']['terms'][0]['term'])
print("It is mentioned",motor_data['response']['facets']['section_name']['terms'][0]['count'],"times in the section.")
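Taking `terms[0]` relies on the facet terms arriving sorted by count. A small sketch that does not depend on the order picks the term with the largest `count` explicitly. `top_section` and the sample terms are hypothetical placeholders shaped like the `terms` list read above.

```python
def top_section(facet_terms):
    """Return the facet entry with the highest count."""
    return max(facet_terms, key=lambda entry: entry["count"])


# Placeholder facet terms; not real counts.
sample_terms = [
    {"term": "Example Section A", "count": 3},
    {"term": "Example Section B", "count": 7},
    {"term": "Example Section C", "count": 5},
]

best = top_section(sample_terms)
print(best["term"], best["count"])
```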
7) How many of the last 20 movies reviewed by the NYT were Critics' Picks? How about the last 40? The last 60? Tip: You really don't want to do this 3 separate times (1-20, 21-40 and 41-60) and add them together. What if, perhaps, you were able to figure out how to combine two lists? Then you could have a 1-20 list, a 1-40 list, and a 1-60 list, and then just run similar code for each of them.
In [40]:
import requests
movie_list=[]
for page in range(3):
    offset_value=page*20 #Each page holds 20 reviews: offsets 0, 20, and 40
    movie_response = requests.get('https://api.nytimes.com/svc/movies/v2/reviews/search.json?publication_date=20160611&api-key=07c67436f1864abc8a144c14adff69c8&offset='+ str(offset_value))
    movie_data=movie_response.json()
    #Combine this page with the earlier ones to get the 1-20, 1-40, and 1-60 lists
    movie_list=movie_list+movie_data['results']
    critics_pick_count=0
    meh_movie=0
    for movie in movie_list:
        if(movie['critics_pick']==1):
            critics_pick_count=critics_pick_count+1
        else:
            meh_movie=meh_movie+1
    print("Of the last",len(movie_list),"movies reviewed,",critics_pick_count,"were Critics' Picks and",meh_movie,"were not")
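The tip in the question amounts to combining the pages into one list and counting over its first 20, 40, and 60 entries. A sketch of that counting step on its own follows, with no network calls. `picks_in_first` is a hypothetical helper, and the reviews are synthetic placeholders that carry only the `critics_pick` field used above.

```python
def picks_in_first(reviews, n):
    """Count Critics' Picks among the first n reviews."""
    return sum(1 for review in reviews[:n] if review["critics_pick"] == 1)


# 60 synthetic reviews; every third one is marked as a Critics' Pick.
reviews = [{"display_title": "Movie %d" % i, "critics_pick": 1 if i % 3 == 0 else 0}
           for i in range(60)]

for n in (20, 40, 60):
    print("Critics' Picks in the last", n, "reviews:", picks_in_first(reviews, n))
```

Slicing one combined list keeps the three answers consistent with each other, since each larger window contains the smaller one.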
8) Out of the last 40 movie reviews from the NYT, which critic has written the most reviews?
In [54]:
import requests
from collections import Counter
reviewers=[]
#Two pages of 20 reviews each cover the last 40 reviews
for offset in (0,20):
    movie_response = requests.get('https://api.nytimes.com/svc/movies/v2/reviews/search.json?publication_date=20160611&offset='+str(offset)+'&api-key=07c67436f1864abc8a144c14adff69c8')
    movie_data=movie_response.json()
    movie_results=movie_data['results']
    for review in movie_results:
        reviewers.append(review['byline'])
print("The list of all the reviewers is",reviewers)
print("")
print("")
most_common,num_most_common = Counter(reviewers).most_common(1)[0]
print("The reviewer who has reviewed the most in the last 40 films is",most_common,"and that person has written",num_most_common,"reviews")
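The counting step relies on `collections.Counter`, whose `most_common(1)` returns a list holding one `(value, count)` pair. A minimal sketch of that step in isolation follows. `top_reviewer` and the sample bylines are hypothetical placeholders.

```python
from collections import Counter


def top_reviewer(reviews):
    """Return (byline, number_of_reviews) for the most frequent byline."""
    bylines = Counter(review["byline"] for review in reviews)
    return bylines.most_common(1)[0]


# Placeholder reviews carrying only the byline field; not real data.
sample_reviews = [
    {"byline": "CRITIC A"},
    {"byline": "CRITIC B"},
    {"byline": "CRITIC A"},
    {"byline": "CRITIC C"},
]

name, count = top_reviewer(sample_reviews)
print(name, count)
```

If two critics are tied, `most_common(1)` reports only one of them, so a tie would need a separate check.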