TEAM Members:
Haley Huang
Helen Hong
Tom Meagher
Tyler Reese
Required Readings:
NOTE
Our topic of interest is the New England Patriots. All Patriots fans ourselves, we were disappointed by their elimination from the post-season and unsure how to approach the upcoming Super Bowl. To sample how others may be feeling about the Patriots, we use the search term "Patriots" with the Twitter Streaming API.
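At its core, that sampling is a single call on a TwitterStream instance. A minimal sketch (assuming the `twitter` package used throughout this notebook, with `auth` as the OAuth object constructed in the cells below):

# Minimal sketch: sample the public stream for tweets mentioning "Patriots".
stream = twitter.TwitterStream(auth=auth).statuses.filter(track="Patriots")
for tweet in stream:
    print tweet.get('text')  # each item is a JSON-decoded status (or stream event)
    break                    # stop after the first item in this sketch

The full, parameterized collection loop appears in a later cell.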
In [25]:
# HELPER FUNCTIONS
import io
import json

import twitter

def oauth_login(token, token_secret, consumer_key, consumer_secret):
    """
    Snag an auth from Twitter
    """
    auth = twitter.oauth.OAuth(token, token_secret,
                               consumer_key, consumer_secret)
    return auth

def save_json(filename, data):
    """
    Save json data to a filename
    """
    print 'Saving data into {0}.json...'.format(filename)
    with io.open('{0}.json'.format(filename),
                 'w', encoding='utf-8') as f:
        f.write(unicode(json.dumps(data, ensure_ascii=False)))

def load_json(filename):
    """
    Load json data from a filename
    """
    print 'Loading data from {0}.json...'.format(filename)
    with open('{0}.json'.format(filename)) as f:
        return json.load(f)
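A quick round-trip of the two JSON helpers (hypothetical filename; assumes the json/ directory used throughout this notebook already exists):

# Hypothetical round-trip: save a small object, then load it back.
save_json('json/example', {'team': 'Patriots', 'tweets': []})
assert load_json('json/example') == {'team': 'Patriots', 'tweets': []}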
In [26]:
# API CONSTANTS (credentials redacted; substitute your own application's keys)
CONSUMER_KEY = 'XXXXXXXXXXXXXXXXXXXXXXXXX'
CONSUMER_SECRET = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
OAUTH_TOKEN = 'XXXXXXXXX-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
OAUTH_TOKEN_SECRET = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
In [27]:
# CREATE AND CHECK API AND STREAM
# Pass the credentials by keyword; the positional order of oauth_login is
# (token, token_secret, consumer_key, consumer_secret).
auth = oauth_login(token=OAUTH_TOKEN, token_secret=OAUTH_TOKEN_SECRET,
                   consumer_key=CONSUMER_KEY, consumer_secret=CONSUMER_SECRET)
twitter_api = twitter.Twitter(auth=auth)
twitter_stream = twitter.TwitterStream(auth=auth)
if twitter_api and twitter_stream:
    print 'Bingo! API and stream set up!'
else:
    print 'Hmmmm, something is wrong here.'
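Both objects are always truthy, so the check above only confirms they were constructed, not that the keys are valid. A stricter sketch (our addition, using the package's GET account/verify_credentials endpoint):

# Sketch: confirm the OAuth credentials actually work by hitting the API.
try:
    profile = twitter_api.account.verify_credentials()
    print 'Authenticated as @{0}'.format(profile['screen_name'])
except twitter.api.TwitterHTTPError, e:
    print 'Credential check failed:', e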
In [3]:
# COLLECT TWEETS FROM STREAM WITH TRACK, 'PATRIOTS'
# from twitter import TwitterStream
# track = "Patriots"  # Tweets for Patriots
# TOTAL_TWEETS = 2500
# patriots = []
# patriots_counter = 0
# while patriots_counter < TOTAL_TWEETS:  # keep collecting until we have TOTAL_TWEETS tweets
#     # Create a stream instance
#     auth = oauth_login(consumer_key=CONSUMER_KEY, consumer_secret=CONSUMER_SECRET,
#                        token=OAUTH_TOKEN, token_secret=OAUTH_TOKEN_SECRET)
#     twitter_stream = TwitterStream(auth=auth)
#     stream = twitter_stream.statuses.filter(track=track)
#     counter = 0
#     for tweet in stream:
#         if patriots_counter == TOTAL_TWEETS:
#             print 'break'
#             break
#         elif counter % 500 == 0 and counter != 0:
#             print 'get new stream'
#             break
#         else:
#             patriots.append(tweet)
#             patriots_counter += 1
#             counter += 1
#             print patriots_counter, counter
# save_json('json/patriots', patriots)
In [29]:
# Use this code to load tweets that have already been collected
filename = "stream/json/patriots"
results = load_json(filename)
print 'Number of tweets loaded:', len(results)
In [5]:
# Compute additional statistics about the tweets collected

# Determine the average number of words in the text of each tweet
def average_words(tweet_texts):
    total_words = sum([len(s.split()) for s in tweet_texts])
    return 1.0 * total_words / len(tweet_texts)

tweet_texts = [tweet['text'] for tweet in results]
print 'Average number of words:', average_words(tweet_texts)

# Calculate the lexical diversity of all words contained in the tweets
def lexical_diversity(tokens):
    return 1.0 * len(set(tokens)) / len(tokens)

words = [word
         for tweet in tweet_texts
         for word in tweet.split()]
print 'Lexical Diversity:', lexical_diversity(words)
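For intuition, lexical diversity is simply the ratio of unique tokens to total tokens. A hypothetical example:

# 4 unique tokens out of 6 total gives a diversity of about 0.67
print lexical_diversity(['go', 'pats', 'go', 'go', 'tom', 'brady'])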
In [6]:
from collections import Counter
from prettytable import PrettyTable
import nltk

tweet_texts = [tweet['text'] for tweet in results]
words = [word
         for tweet in tweet_texts
         for word in tweet.split()
         if word not in ['RT', '&']]  # filter out RT and ampersand

# Use the natural language toolkit to eliminate stop words
# nltk.download('stopwords')  # download stop words if you do not have them
stop_words = nltk.corpus.stopwords.words('english')
non_stop_words = [w for w in words if w.lower() not in stop_words]

# Frequency of words
count = Counter(non_stop_words).most_common()

# Table of the top 30 words with their counts
pretty_table = PrettyTable(field_names=['Word', 'Count'])
[pretty_table.add_row(w) for w in count[:30]]
pretty_table.align['Word'] = 'l'
pretty_table.align['Count'] = 'r'
print pretty_table
2. Find the most popular tweets in your collection of tweets
Please plot a table of the top 10 tweets that are the most popular among your collection, i.e., the tweets with the largest number of retweet counts.
In [7]:
from collections import Counter
from prettytable import PrettyTable

# Create a list of all tweets with a retweeted_status key, and index the
# originator of that tweet and the text.
retweets = [
    (tweet['retweet_count'],
     tweet['retweeted_status']['user']['screen_name'],
     tweet['text'])
    # Ensure that a retweet exists
    for tweet in results
    if 'retweeted_status' in tweet
]

pretty_table = PrettyTable(field_names=['Count', 'Screen Name', 'Text'])
# Sort tweets by descending number of retweets and display the top 10 results in a table.
[pretty_table.add_row(row) for row in sorted(retweets, reverse=True)[:10]]
pretty_table.max_width['Text'] = 50
pretty_table.align = 'l'
print pretty_table
Another measure of tweet "popularity" is the number of times a tweet has been favorited. The following computes the top 10 tweets with the most "favorites".
In [8]:
from prettytable import PrettyTable

# Determine the number of "favorites" for each tweet collected.
favorites = [
    (tweet['favorite_count'],
     tweet['text'])
    for tweet in results
]

pretty_table = PrettyTable(field_names=['Count', 'Text'])
# Sort tweets by descending number of favorites and display the top 10 results in a table.
[pretty_table.add_row(row) for row in sorted(favorites, reverse=True)[:10]]
pretty_table.max_width['Text'] = 75
pretty_table.align = 'l'
print pretty_table
3. Find the most popular Tweet Entities in your collection of tweets
Please plot a table of the top 10 hashtags and top 10 user mentions that are the most popular in your collection of tweets.
In [9]:
from collections import Counter
from prettytable import PrettyTable

# Extract the screen names which appear among the collection of tweets
screen_names = [user_mention['screen_name']
                for tweet in results
                for user_mention in tweet['entities']['user_mentions']]

# Extract the hashtags which appear among the collection of tweets
hashtags = [hashtag['text']
            for tweet in results
            for hashtag in tweet['entities']['hashtags']]

# Simultaneously determine the frequency of screen names/hashtags, and display
# the top 10 most common of each in a table.
for label, data in (('Screen Name', screen_names),
                    ('Hashtag', hashtags)):
    pretty_table = PrettyTable(field_names=[label, 'Count'])
    counter = Counter(data)
    [pretty_table.add_row(entity) for entity in counter.most_common()[:10]]
    pretty_table.align[label] = 'l'
    pretty_table.align['Count'] = 'r'
    print pretty_table
Our chosen Twitter user is @RobGronkowski, one of the Patriots players.
In [10]:
# ----------------------------------------------
import sys
import time
from urllib2 import URLError
from httplib import BadStatusLine

# The following is the "general-purpose API wrapper" presented in "Mining the
# Social Web" for making robust Twitter requests. This function can wrap any
# twitter API function. It force-breaks after receiving more than max_errors
# error messages from the Twitter API. It also sleeps and later retries when
# rate limits are enforced.
def make_twitter_request(twitter_api_func, max_errors=10, *args, **kw):

    # A nested helper that handles common HTTP error codes and returns an
    # updated wait period (or None when the request should not be retried).
    def handle_twitter_http_error(e, wait_period=2, sleep_when_rate_limited=True):
        if wait_period > 3600:  # seconds
            print >> sys.stderr, 'Too many retries. Quitting.'
            raise e
        if e.e.code == 401:
            print >> sys.stderr, 'Encountered 401 Error (Not Authorized)'
            return None
        elif e.e.code == 404:
            print >> sys.stderr, 'Encountered 404 Error (Not Found)'
            return None
        elif e.e.code == 429:
            print >> sys.stderr, 'Encountered 429 Error (Rate Limit Exceeded)'
            if sleep_when_rate_limited:
                print >> sys.stderr, "Retrying again in 15 Minutes...ZzZ..."
                sys.stderr.flush()
                time.sleep(60*15 + 5)
                print >> sys.stderr, '...ZzZ...Awake now and trying again.'
                return 2
            else:
                raise e  # Caller must handle the rate-limiting issue
        elif e.e.code in (500, 502, 503, 504):
            print >> sys.stderr, 'Encountered %i Error. Retrying in %i seconds' % \
                (e.e.code, wait_period)
            time.sleep(wait_period)
            wait_period *= 1.5
            return wait_period
        else:
            raise e

    wait_period = 2
    error_count = 0
    while True:
        try:
            return twitter_api_func(*args, **kw)
        except twitter.api.TwitterHTTPError, e:
            error_count = 0
            wait_period = handle_twitter_http_error(e, wait_period)
            if wait_period is None:
                return
        except URLError, e:
            error_count += 1
            print >> sys.stderr, "URLError encountered. Continuing."
            if error_count > max_errors:
                print >> sys.stderr, "Too many consecutive errors...bailing out."
                raise
        except BadStatusLine, e:
            error_count += 1
            print >> sys.stderr, "BadStatusLine encountered. Continuing."
            if error_count > max_errors:
                print >> sys.stderr, "Too many consecutive errors...bailing out."
                raise
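A usage sketch (hypothetical call, not from the book): pass the bound API method as the first argument, followed by the endpoint's keyword arguments.

# Robustly fetch a user profile; screen_name is forwarded to users.lookup.
response = make_twitter_request(twitter_api.users.lookup, screen_name='RobGronkowski')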
In [11]:
# This function uses the robust request wrapper above to retrieve all friends
# and followers of a given user. This code can be found in Chapter 9, the
# "Twitter Cookbook", of "Mining the Social Web".
from functools import partial
from sys import maxint

def get_friends_followers_ids(twitter_api, screen_name=None, user_id=None,
                              friends_limit=maxint, followers_limit=maxint):

    # Must have either screen_name or user_id (logical xor)
    assert (screen_name != None) != (user_id != None), \
        "Must have screen_name or user_id, but not both"

    # See https://dev.twitter.com/docs/api/1.1/get/friends/ids and
    # https://dev.twitter.com/docs/api/1.1/get/followers/ids for details
    # on API parameters
    get_friends_ids = partial(make_twitter_request, twitter_api.friends.ids,
                              count=5000)
    get_followers_ids = partial(make_twitter_request, twitter_api.followers.ids,
                                count=5000)

    friends_ids, followers_ids = [], []

    for twitter_api_func, limit, ids, label in [
            [get_friends_ids, friends_limit, friends_ids, "friends"],
            [get_followers_ids, followers_limit, followers_ids, "followers"]
        ]:

        if limit == 0:
            continue

        cursor = -1
        while cursor != 0:
            # Use make_twitter_request via the partially bound callable...
            if screen_name:
                response = twitter_api_func(screen_name=screen_name, cursor=cursor)
            else:  # user_id
                response = twitter_api_func(user_id=user_id, cursor=cursor)

            if response is not None:
                ids += response['ids']
                cursor = response['next_cursor']

            print 'Fetched {0} total {1} ids for {2}'.format(len(ids), label,
                                                             (user_id or screen_name))

            if len(ids) >= limit or response is None:
                break

    return friends_ids[:friends_limit], followers_ids[:followers_limit]
Use the following code to retrieve all friends and followers of @RobGronkowski, one of the Patriots players.
In [12]:
# Retrieve the friends and followers of a user, and save to a json file.
# screen_name = 'RobGronkowski'
# gronk_friends_ids, gronk_followers_ids = get_friends_followers_ids(twitter_api, screen_name = screen_name)
# filename = "json/gronk_friends"
# save_json(filename, gronk_friends_ids)
# filename = "json/gronk_followers"
# save_json(filename, gronk_followers_ids)
In [13]:
# Use this code to load the already-retrieved friends and followers from a json file.
gronk_followers_ids = load_json('json/gronk_followers')
gronk_friends_ids = load_json('json/gronk_friends')
In [14]:
# The following function retrieves the screen names of Twitter users, given
# their user IDs. If a certain number of screen names is desired (for example,
# 20), max_ids limits the number retrieved.
def get_screen_names(twitter_api, user_ids=None, max_ids=None):
    response = []
    items = user_ids

    # Due to individual user security settings, not all user profiles can be
    # obtained. Iterate over the user IDs (in batches of 100, the limit of
    # users/lookup) until at least max_ids screen names are obtained.
    while items and len(response) < max_ids:
        items_str = ','.join([str(item) for item in items[:100]])
        items = items[100:]
        responses = make_twitter_request(twitter_api.users.lookup, user_id=items_str)
        response += responses

    # The above loop has retrieved all needed user information.
    items_to_info = {}
    for user_info in response:
        items_to_info[user_info['id']] = user_info

    # Extract only the screen names obtained. The keys of items_to_info are
    # the user ID numbers.
    names = [items_to_info[number]['screen_name']
             for number in items_to_info.keys()]
    numbers = [number for number in items_to_info.keys()]
    return names, numbers
In [15]:
from prettytable import PrettyTable

# Given a set of user IDs, this function calls get_screen_names and plots a
# table of the first (max_ids) IDs and screen names.
def table_ids_screen_names(twitter_api, user_ids=None, max_ids=None):
    names, numbers = get_screen_names(twitter_api, user_ids=user_ids, max_ids=max_ids)
    ids_screen_names = zip(numbers, names)
    pretty_table = PrettyTable(field_names=['User ID', 'Screen Name'])
    [pretty_table.add_row(row) for row in ids_screen_names[:max_ids]]
    pretty_table.align = 'l'
    print pretty_table
In [16]:
# Given a list of friends_ids and followers_ids, this function counts and
# prints the size of each collection. It then plots a table of the first
# (max_ids) listed friends and followers.
def display_friends_followers(screen_name, friends_ids, followers_ids, max_ids=None):
    friends_ids_set, followers_ids_set = set(friends_ids), set(followers_ids)
    print
    print '{0} has {1} friends. Here are {2}:'.format(screen_name, len(friends_ids_set), max_ids)
    print
    table_ids_screen_names(twitter_api, user_ids=friends_ids, max_ids=max_ids)
    print
    print '{0} has {1} followers. Here are {2}:'.format(screen_name, len(followers_ids_set), max_ids)
    print
    table_ids_screen_names(twitter_api, user_ids=followers_ids, max_ids=max_ids)
    print
In [17]:
# screen_name is passed explicitly here because its earlier definition sits in a commented-out cell.
display_friends_followers(screen_name='RobGronkowski', friends_ids=gronk_friends_ids, followers_ids=gronk_followers_ids, max_ids=20)
In [18]:
# Given a list of friends_ids and followers_ids, this function uses set
# intersection to find the mutual friends. It then plots a table of the first
# (max_ids) listed mutual friends.
def display_mutual_friends(screen_name, friends_ids, followers_ids, max_ids=None):
    friends_ids_set, followers_ids_set = set(friends_ids), set(followers_ids)
    mutual_friends_ids = list(friends_ids_set.intersection(followers_ids_set))
    print
    print '{0} has {1} mutual friends. Here are {2}:'.format(screen_name, len(mutual_friends_ids), max_ids)
    print
    table_ids_screen_names(twitter_api, user_ids=mutual_friends_ids, max_ids=max_ids)
In [19]:
display_mutual_friends(screen_name='RobGronkowski', friends_ids=gronk_friends_ids, followers_ids=gronk_followers_ids, max_ids=20)
The following code was used to collect all Twitter followers of the Patriots, Broncos, and Panthers. Once collected, the followers were saved to files.
In [20]:
# ## PATRIOTS
# patriots_friends_ids, patriots_followers_ids = get_friends_followers_ids(twitter_api, screen_name = 'Patriots')
# save_json('json/Patriots_Followers',patriots_followers_ids)
# save_json('json/Patriots_Friends', patriots_friends_ids)
# ## BRONCOS
# broncos_friends_ids, broncos_followers_ids = get_friends_followers_ids(twitter_api, screen_name = 'Broncos')
# save_json('json/Broncos_Followers',broncos_followers_ids)
# save_json('json/Broncos_Friends', broncos_friends_ids)
# ## PANTHERS
# panthers_friends_ids, panthers_followers_ids = get_friends_followers_ids(twitter_api, screen_name = 'Panthers')
# save_json('json/Panthers_Followers',panthers_followers_ids)
# save_json('json/Panthers_Friends', panthers_friends_ids)
This code loads the previously collected followers, then draws a Venn diagram comparing the mutual followers of the three teams.
In [30]:
patriots_followers_ids = load_json('json/Patriots_Followers')
broncos_followers_ids = load_json('json/Broncos_Followers')
panthers_followers_ids = load_json('json/Panthers_Followers')
In [24]:
%matplotlib inline
from matplotlib_venn import venn3
patriots_followers_set = set(patriots_followers_ids)
broncos_followers_set = set(broncos_followers_ids)
panthers_followers_set = set(panthers_followers_ids)
venn3([patriots_followers_set, broncos_followers_set, panthers_followers_set],
      ('Patriots Followers', 'Broncos Followers', 'Panthers Followers'))
Out[24]: [Venn diagram of the follower overlap among the Patriots, Broncos, and Panthers]
Next we wanted to estimate the popularity of the Broncos and Panthers (the two remaining Super Bowl teams) in the Boston area. Our chosen metric of "popularity" is the speed at which tweets are generated. The following code periodically collects tweets (constrained to a New England geo box) filtered for "Broncos" and "Panthers." It tracks the number of such tweets collected in each time window, allowing us to estimate a tweets-per-minute rate (computed after the counts table below).
In [31]:
# COLLECT TWEETS FROM STREAM WITH BRONCOS AND PANTHERS IN TWEET TEXT FROM BOSTON GEO ZONE
# from datetime import timedelta, datetime
# from time import sleep
# from twitter import TwitterStream
# track = "Broncos, Panthers"  # Tweets for Broncos OR Panthers
# locations = '-73.313057,41.236511,-68.826305,44.933163'  # New England / Boston geo zone
# NUMBER_OF_COLLECTIONS = 5    # number of times to collect tweets from stream
# COLLECTION_TIME = 2.5        # length of each collection in minutes
# WAIT_TIME = 10               # sleep time in between collections in minutes
# date_format = '%m/%d/%Y %H:%M:%S'  # i.e. 1/1/2016 13:00:00
# broncos, panthers, counts = [], [], []
# for counter in range(1, NUMBER_OF_COLLECTIONS + 1):
#     print '------------------------------------------'
#     print 'COLLECTION NUMBER %s out of %s' % (counter, NUMBER_OF_COLLECTIONS)
#     broncos_counter, panthers_counter = 0, 0  # reset the per-collection counters for Broncos and Panthers
#     count_dict = {'start_time': datetime.now().strftime(format=date_format)}  # add collection start time
#     # Create a new stream instance every collection to avoid rate limits
#     auth = oauth_login(consumer_key=CONSUMER_KEY, consumer_secret=CONSUMER_SECRET, token=OAUTH_TOKEN, token_secret=OAUTH_TOKEN_SECRET)
#     twitter_stream = TwitterStream(auth=auth)
#     stream = twitter_stream.statuses.filter(track=track, locations=locations)
#     endTime = datetime.now() + timedelta(minutes=COLLECTION_TIME)
#     while datetime.now() <= endTime:  # collect tweets while current time is less than endTime
#         for tweet in stream:
#             if 'text' in tweet.keys():  # check to see if tweet contains text
#                 if datetime.now() > endTime:
#                     break  # if the collection time is up, break out of the loop
#                 elif 'Broncos' in tweet['text'] and 'Panthers' in tweet['text']:
#                     # if a tweet contains both Broncos and Panthers, add the tweet to both arrays
#                     broncos.append(tweet), panthers.append(tweet)
#                     broncos_counter += 1
#                     panthers_counter += 1
#                     print 'Panthers: %s, Broncos: %s' % (panthers_counter, broncos_counter)
#                 elif 'Broncos' in tweet['text']:
#                     broncos.append(tweet)
#                     broncos_counter += 1
#                     print 'Broncos: %s' % broncos_counter
#                 elif 'Panthers' in tweet['text']:
#                     panthers.append(tweet)
#                     panthers_counter += 1
#                     print 'Panthers: %s' % panthers_counter
#                 else:
#                     print 'continue'  # if the tweet text does not match 'Panthers' or 'Broncos', keep going
#                     continue
#     count_dict['broncos'] = broncos_counter
#     count_dict['panthers'] = panthers_counter
#     count_dict['end_time'] = datetime.now().strftime(format=date_format)  # add collection end time
#     counts.append(count_dict)
#     print counts
#     if counter != NUMBER_OF_COLLECTIONS:
#         print 'Sleeping until %s' % (datetime.now() + timedelta(minutes=WAIT_TIME))
#         sleep(WAIT_TIME * 60)  # sleep for WAIT_TIME minutes
#     else:
#         print '------------------------------------------'
# # Save arrays to files
# save_json('stream/json/counts', counts)
# save_json('stream/json/broncos', broncos)
# save_json('stream/json/panthers', panthers)
In [56]:
# LOAD JSON FOR BRONCOS AND PANTHERS
broncos = load_json('stream/json/broncos')
panthers = load_json('stream/json/panthers')
counts = load_json('stream/json/counts')

pretty_table = PrettyTable(field_names=['Collection Start Time', 'Collection End Time',
                                        'Broncos Tweets', 'Panthers Tweets'])
# Add the fields explicitly; relying on dict.values() ordering is not safe.
[pretty_table.add_row([row['start_time'], row['end_time'],
                       row['broncos'], row['panthers']]) for row in counts]
pretty_table.align = 'l'
print pretty_table
print 'TOTALS – Broncos: %s, Panthers: %s' % (len(broncos), len(panthers))
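To turn the window counts into the tweets-per-minute rate described above, divide each team's count by the length of its collection window. A minimal sketch (our addition, assuming the count_dict fields saved by the collection cell):

from datetime import datetime

date_format = '%m/%d/%Y %H:%M:%S'  # matches the format used during collection

def tweets_per_minute(row, team):
    start = datetime.strptime(row['start_time'], date_format)
    end = datetime.strptime(row['end_time'], date_format)
    minutes = (end - start).total_seconds() / 60.0
    return row[team] / minutes

for row in counts:
    print 'Broncos: %.2f tweets/min, Panthers: %.2f tweets/min' % (
        tweets_per_minute(row, 'broncos'), tweets_per_minute(row, 'panthers'))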
In [101]:
%matplotlib inline
# CUMULATIVE TWEET DISTRIBUTION FOR BRONCOS AND PANTHERS
import numpy as np
import matplotlib.pyplot as plt

# create numpy arrays of the per-collection tweet counts for Broncos and Panthers
broncos_tweets = np.array([row['broncos'] for row in counts])
panthers_tweets = np.array([row['panthers'] for row in counts])
bins = len(counts) * 10

# evaluate histograms
broncos_values, broncos_base = np.histogram(broncos_tweets, bins=bins)
panthers_values, panthers_base = np.histogram(panthers_tweets, bins=bins)

# evaluate cumulative functions
broncos_cumulative = np.cumsum(broncos_values)
panthers_cumulative = np.cumsum(panthers_values)

# plot cumulative functions
plt.plot(broncos_base[:-1], broncos_cumulative, c='darkorange')
plt.plot(panthers_base[:-1], panthers_cumulative, c='blue')
plt.grid(True)
plt.title('Cumulative Distribution of Broncos & Panthers Tweets')
plt.xlabel('Tweets')
plt.ylabel('Collection')
plt.show()
All set!
What do you need to submit?
PPT Slides: please prepare PPT slides (for a 10-minute talk) to present your case study. Two randomly selected teams will be asked to present their case studies in class.
Report: please prepare a report (fewer than 10 pages) on what you found in the data.
What did you find in the data?
(please include figures or tables in the report, but no source code)
Please compress all the files into one zipped file.
How to submit:
Please submit through myWPI, in the Assignment "Case Study 1".
Note: Each team needs to make only one submission in myWPI.
Total Points: 120
Notebook: Points: 80
-----------------------------------
Question 1:
Points: 20
-----------------------------------
(1) Select a topic that you are interested in.
Points: 6
(2) Use the Twitter Streaming API to sample a collection of tweets about this topic in real time. (It is recommended that the number of tweets be larger than 200 but smaller than 1 million. Please check that the total number of tweets collected is larger than 200.)
Points: 10
(3) Store the tweets you downloaded in a local file (txt file or json file).
Points: 4
-----------------------------------
Question 2:
Points: 20
-----------------------------------
1. Word Count
(1) Use the tweets you collected in Problem 1, and compute the frequencies of the words being used in these tweets.
Points: 4
(2) Plot a table of the top 30 words with their counts
Points: 4
2. Find the most popular tweets in your collection of tweets
Plot a table of the top 10 tweets that are the most popular in your collection, i.e., the tweets with the largest retweet counts.
Points: 4
3. Find the most popular Tweet Entities in your collection of tweets
(1) plot a table of the top 10 hashtags,
Points: 4
(2) top 10 user mentions that are the most popular in your collection of tweets.
Points: 4
-----------------------------------
Question 3:
Points: 20
-----------------------------------
(1) Choose a popular Twitter user who has many followers, such as "ladygaga".
Points: 4
(2) Get the list of all friends and all followers of the twitter user.
Points: 4
(3) Pick 20 of the followers, and plot their ID numbers and screen names in a table.
Points: 4
(4) Pick 20 of the friends (if the user has more than 20 friends), and plot their ID numbers and screen names in a table.
Points: 4
(5) Compute the mutual friends within the two groups, i.e., the users who are in both the friend list and the follower list, and plot their ID numbers and screen names in a table.
Points: 4
-----------------------------------
Question 4: Explore the data
Points: 20
-----------------------------------
Novelty: 10
Interestingness: 10
-----------------------------------
Run some additional experiments with your data to gain familiarity with the Twitter data and the Twitter API.
Report: communicate the results. Points: 20
(1) What data did you collect? Points: 5
(2) Why is this topic interesting or important to you? (Motivations) Points: 5
(3) How did you analyze the data? Points: 5
(4) What did you find in the data? (please include figures or tables in the report, but no source code) Points: 5
Slides (for 10 minutes of presentation): Story-telling Points: 20
Motivation for the data collection: why the topic is interesting to you. Points: 5
Communicating Results (figure/table) Points: 10
Story-telling (How do all the parts (data, analysis, results) fit together as a story?) Points: 5