In [1]:
from __future__ import print_function, absolute_import, division
During the first session of the DSFP we spent a significant amount of time learning about version control and git/github. As we have continued to use git as the software management system for the LSSTC DSFP, we will not be reviewing that material at this time.
Instead we are going to review the basic elements of software engineering as introduced to us by Jake VanderPlas. The four steps are (and note that I've omitted step 0, which is to use git for version control throughout this process):
__init__.py file so you can import your library).
Before we begin with the actual exercise, a quick aside.
A quick note on modular programming. [Previously we urged you to build talks in a modular fashion - the idea here is similar but not exactly the same.]
The idea - each individual "idea" should be contained within a single module. This does not mean every call to NumPy should be its own module, but the code should be organized as a series of small code snippets.
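As a rough illustration of this idea, here is a minimal sketch in which each small function holds one "idea" and a final function combines them. All of the function names below are hypothetical and are not part of the exercise.

```python
def parse_handle(raw):
    """One idea: normalize a user-supplied handle (strip spaces and leading '@')."""
    return raw.strip().lstrip("@")


def truncate(text, max_chars=40):
    """One idea: shorten a string for display, marking the cut with '...'."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars - 3] + "..."


def format_tweets(handle, texts, max_chars=40):
    """Combine the small pieces; each piece can be tested and reused on its own."""
    header = "Tweets from @{}:".format(parse_handle(handle))
    lines = ["  " + truncate(t, max_chars) for t in texts]
    return "\n".join([header] + lines)
```

Because each piece is small, a bug in the display logic can be found by checking truncate alone, without touching the rest of the code.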
The basic appeal -- modular programming improves:
tmpAcct4Edu PleaseDontHackUsThisIs4Education
For this problem we are going to focus on the steps associated with development, and skip (most of) the nitty-gritty of writing the code. We will start by creating a Jupyter notebook with the basics of our software. The task is relatively simple: we will develop a script to retrieve the last $N$ tweets from any specified twitter user. The basics of such a script are as follows:
import tweepy
consumer_key = "bhzpKBdspYr2xSDb0RxpI586q"
consumer_secret = "FfddeX3qatIeXoA51LJbgHs4qNsYAoNoWIqnlMISr3E7P4x03L"
access_key = "855466876364877825-pUkJcfH48x3rEnlFKvSLJaWZ0jzg6Nc"
access_secret = "9JHPalnxb6PVineBeCFFU5L98PD7EMOUBuwemM8vj8hA9"
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
tweets = api.user_timeline(screen_name=twitter_acct, count=Ntweets)
return [tweet.text for tweet in tweets]
Note 1 - you likely need to pip install tweepy. You may also need to restart your kernel after that installation.
Note 2 - I have created a dummy twitter account to provide keys and secret codes to use the twitter API. I will change those keys after Monday. It goes without saying that secret keys should not be uploaded to github.
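One common way to keep secret keys out of a repository is to read them from environment variables instead of writing them in the code. The sketch below assumes this approach; the function name and the environment variable names are hypothetical choices, not something required by tweepy or by this exercise.

```python
import os

# Hypothetical environment variable names; pick whatever names you prefer.
_ENV_NAMES = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_key": "TWITTER_ACCESS_KEY",
    "access_secret": "TWITTER_ACCESS_SECRET",
}


def load_twitter_credentials(environ=None):
    """Return a dict of credentials read from environment variables.

    Raises KeyError listing every variable that is not set.
    """
    if environ is None:
        environ = os.environ
    missing = [name for name in _ENV_NAMES.values() if name not in environ]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(sorted(missing)))
    return {key: environ[name] for key, name in _ENV_NAMES.items()}
```

The returned values could then be passed to tweepy.OAuthHandler and set_access_token in place of the string literals, so the file that goes to github contains no secrets.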
Problem 1a
Create a notebook with a function get_recent_tweets that returns the last $N$ tweets from any specified twitter user.
Test the function by retrieving tweets. If you don't have a favorite twitter user, you can check my account, MillerAdamA (likely boring), or Lucianne's account, shaka_lulu (probably more interesting).
Hint - only a small modification is needed to the example code given above.
In [4]:
import tweepy


def get_recent_tweets(twitter_acct, Ntweets, print_results=False):
    """Get the last N tweets from a twitter user

    Parameters
    ----------
    twitter_acct : str
        Twitter handle for the user
    Ntweets : int
        Number of tweets to be returned
    print_results : bool (default = False)
        Print the tweets at the command line.

    Returns
    -------
    list of str
        The Ntweets most recent tweets from twitter user twitter_acct
    """
    consumer_key = "bhzpKBdspYr2xSDb0RxpI586q"
    consumer_secret = "FfddeX3qatIeXoA51LJbgHs4qNsYAoNoWIqnlMISr3E7P4x03L"
    access_key = "855466876364877825-pUkJcfH48x3rEnlFKvSLJaWZ0jzg6Nc"
    access_secret = "9JHPalnxb6PVineBeCFFU5L98PD7EMOUBuwemM8vj8hA9"

    # Authenticate and build the API client
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)

    # Retrieve the timeline and keep only the text of each tweet
    tweets = api.user_timeline(screen_name=twitter_acct, count=Ntweets)
    tweet_text = [tweet.text for tweet in tweets]

    if print_results:
        print("The last {:d} tweets from {:s} are:".format(Ntweets, twitter_acct))
        print(tweet_text)

    return tweet_text


get_recent_tweets("shaka_lulu", 2)
Out[4]:
Create a directory retrieve_tweets/ which will serve as your new Python library to retrieve tweets from twitter.
Create an __init__.py file in retrieve_tweets/. The contents of this file can be empty.
Create the file get_recent_tweets.py in retrieve_tweets/, and include the get_recent_tweets function that you previously developed in a Jupyter notebook as the contents of this file.
Problem 2a
Check your work by importing the retrieve_tweets library and running the get_recent_tweets function from that library.
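The sketch below shows roughly what this check looks like. So that it runs anywhere without network access, it builds a throwaway copy of the directory layout in a temporary directory and uses a placeholder function body; in your own work the file would hold the real get_recent_tweets function, and the temporary directory is purely an assumption of this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the library layout described above.
root = Path(tempfile.mkdtemp())
pkg = root / "retrieve_tweets"
pkg.mkdir()
(pkg / "__init__.py").write_text("")  # an empty file is enough

# Placeholder body (NOT the real function) so the sketch runs offline.
(pkg / "get_recent_tweets.py").write_text(
    "def get_recent_tweets(twitter_acct, Ntweets, print_results=False):\n"
    "    return ['placeholder tweet'] * Ntweets\n"
)

# The directory that CONTAINS retrieve_tweets/ must be on the import path.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from retrieve_tweets.get_recent_tweets import get_recent_tweets

result = get_recent_tweets("shaka_lulu", 2)
print(result)
```

If the import fails in your own setup, the usual culprits are a missing __init__.py file or running Python from a directory that does not contain retrieve_tweets/.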
Use github to store the results of your work.
Create a directory tests/ in retrieve_tweets/. In tests/, create an __init__.py file, and a test_get_recent_tweets.py file.
Problem 3a
Write a unit test for get_recent_tweets.py in test_get_recent_tweets.py.
Run your unit test to make sure your software is working.
Hint - Recall that nosetests is a great package for executing your unit tests.
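A test that calls the live Twitter API depends on the network and on whatever the account has posted most recently, so one option is to test against a stand-in API object. The sketch below assumes a hypothetical variant of the function, fetch_tweet_texts, that takes the API client as an argument; it is not the get_recent_tweets function from the exercise, and the fake client is built with unittest.mock from the standard library.

```python
from unittest import mock


def fetch_tweet_texts(api, twitter_acct, Ntweets):
    """Hypothetical variant of get_recent_tweets that receives the API client."""
    tweets = api.user_timeline(screen_name=twitter_acct, count=Ntweets)
    return [tweet.text for tweet in tweets]


def make_fake_api(texts):
    """Build a stand-in client whose user_timeline returns canned tweets."""
    api = mock.Mock()
    api.user_timeline.return_value = [mock.Mock(text=t) for t in texts]
    return api


def test_returns_tweet_text():
    api = make_fake_api(["first", "second"])
    result = fetch_tweet_texts(api, "shaka_lulu", 2)
    assert result == ["first", "second"]
    # The handle and count should be passed through unchanged.
    api.user_timeline.assert_called_once_with(screen_name="shaka_lulu", count=2)


def test_empty_timeline():
    api = make_fake_api([])
    assert fetch_tweet_texts(api, "shaka_lulu", 5) == []
```

Test runners that collect functions named test_* should pick up both tests when this code lives in test_get_recent_tweets.py.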
In [ ]: