Unit 7 | Assignment - Distinguishing Sentiments

Background

Twitter has become a wildly sprawling jungle of information—140 characters at a time. Somewhere between 350 million and 500 million tweets are estimated to be sent out per day. With such an explosion of data, on Twitter and elsewhere, it becomes more important than ever to tame it in some way, to concisely capture the essence of the data.

Choose one of the following two assignments, in which you will do just that. Good luck!

News Mood

In this assignment, you'll create a Python script to perform a sentiment analysis of the Twitter activity of various news outlets and to present your findings visually.

Your final output should provide a visualized summary of the sentiments expressed in tweets sent out by the following news organizations: BBC, CBS, CNN, Fox, and the New York Times.

The first plot will:

  • Be a scatter plot of the sentiments of the last 100 tweets sent out by each news organization, ranging from -1.0 to 1.0, where a score of 0 expresses a neutral sentiment, -1 the most negative sentiment possible, and +1 the most positive sentiment possible.
  • Show each plot point as the compound sentiment of a single tweet.
  • Sort the plot points by their relative timestamps.
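
One way to approach the scatter plot is sketched below. It assumes the compound scores have already been computed and are stored per outlet as a list ordered from newest to oldest tweet; the function name `plot_sentiment_scatter`, the axis wording, and the "tweets ago" x-axis are illustrative choices, not requirements of the assignment.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # render to a file without needing a display
import matplotlib.pyplot as plt


def plot_sentiment_scatter(scores_by_outlet, out_path):
    """Scatter compound scores against 'tweets ago' and save a PNG.

    scores_by_outlet: dict mapping an outlet name to its compound scores,
    ordered newest first (an assumed layout for this sketch).
    """
    fig, ax = plt.subplots(figsize=(9, 6))
    for outlet, scores in scores_by_outlet.items():
        tweets_ago = range(len(scores))  # 0 = most recent tweet
        ax.scatter(tweets_ago, scores, label=outlet, alpha=0.8, edgecolors="black")

    ax.set_ylim(-1.05, 1.05)  # compound scores fall between -1 and +1
    ax.invert_xaxis()         # oldest tweets on the left, newest on the right
    ax.set_xlabel("Tweets ago")
    ax.set_ylabel("Compound sentiment score")
    ax.set_title(f"Sentiment Analysis of Media Tweets ({dt.date.today().isoformat()})")
    ax.legend(title="Media sources", bbox_to_anchor=(1.02, 1), loc="upper left")
    fig.savefig(out_path, bbox_inches="tight")
    plt.close(fig)
    return out_path
```

Calling `plot_sentiment_scatter({"@BBC": bbc_scores, "@CNN": cnn_scores}, "scatter.png")` would write one image containing both series; `bbc_scores` and `cnn_scores` stand in for whatever lists your own analysis produces.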

The second plot will be a bar plot visualizing the overall sentiments of the last 100 tweets from each organization. For this plot, you will again aggregate the compound sentiments analyzed by VADER.
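
The assignment does not say how to aggregate, so the sketch below assumes the arithmetic mean of the compound scores per outlet. The function name `mean_compound_by_outlet` and the sample values are placeholders for illustration, not real results.

```python
from collections import defaultdict
from statistics import mean


def mean_compound_by_outlet(scored_tweets):
    """Return {outlet: mean compound score} from (outlet, compound) pairs."""
    grouped = defaultdict(list)
    for outlet, compound in scored_tweets:
        grouped[outlet].append(compound)
    return {outlet: mean(scores) for outlet, scores in grouped.items()}


# Placeholder scores for illustration only.
sample = [("@BBC", 0.5), ("@BBC", -0.25), ("@CNN", 0.0)]
overall = mean_compound_by_outlet(sample)
print(overall)
```

The resulting dictionary maps directly onto a bar plot, for example `plt.bar(list(overall), list(overall.values()))` with Matplotlib.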

The tools of the trade you will need for your task as a data analyst include the following: tweepy, pandas, matplotlib, seaborn, textblob, and VADER.

Your final Jupyter notebook must:

  • Pull the last 100 tweets from each outlet.
  • Perform a sentiment analysis with the compound, positive, neutral, and negative scoring for each tweet.
  • Pull into a DataFrame the tweet's source account, its text, its date, and its compound, positive, neutral, and negative sentiment scores.
  • Export the data in the DataFrame into a CSV file.
  • Save PNG images for each plot.
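
The scoring and export steps above might be organized as in the following sketch. It assumes each tweet has already been reduced to a dict with `source`, `text`, and `date` keys (a hypothetical shape, not Tweepy's own objects), and it takes the scoring function as a parameter so the example runs without a live analyzer. The column names are likewise illustrative.

```python
import pandas as pd

COLUMNS = ["source", "text", "date", "compound", "positive", "neutral", "negative"]


def build_sentiment_frame(tweets, polarity_scores):
    """Score each tweet and collect one DataFrame row per tweet.

    tweets: iterable of dicts with "source", "text", and "date" keys
            (an assumed shape for this sketch).
    polarity_scores: callable mapping text to a dict with VADER's
            "compound", "pos", "neu", and "neg" keys.
    """
    rows = []
    for tweet in tweets:
        scores = polarity_scores(tweet["text"])
        rows.append({
            "source": tweet["source"],
            "text": tweet["text"],
            "date": tweet["date"],
            "compound": scores["compound"],
            "positive": scores["pos"],
            "neutral": scores["neu"],
            "negative": scores["neg"],
        })
    return pd.DataFrame(rows, columns=COLUMNS)


def export_csv(frame, path):
    """Write the frame to CSV without the numeric index column."""
    frame.to_csv(path, index=False, encoding="utf-8")
```

With the `vaderSentiment` package installed, `SentimentIntensityAnalyzer().polarity_scores` can be passed as the second argument, since it returns a dict with exactly those four keys.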

As final considerations:

  • Use the Matplotlib and Seaborn libraries.
  • Include a written description of three observable trends based on the data.
  • Include proper labeling of your plots, including plot titles (with date of analysis) and axes labels.
  • Include an exported markdown version of your Notebook called README.md in your GitHub repository.

PlotBot

In this activity, more challenging than the last, you will build a Twitter bot that sends out a visualized sentiment analysis of a Twitter account's recent tweets.

Visit https://twitter.com/PlotBot5 for an example of what your script should do.

The bot receives tweets via mentions and in turn performs a sentiment analysis on the most recent tweets of the Twitter account specified in the mention.

For example, when a user tweets, "@PlotBot Analyze: @CNN," it will trigger a sentiment analysis on the CNN Twitter feed.
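
Extracting the target account from a mention like the one above could look like the following sketch. The function name `parse_analysis_request`, the default bot name, and the exact pattern are assumptions; the regular expression simply mirrors the "@PlotBot Analyze: @CNN" wording and ignores case.

```python
import re

HANDLE = r"[A-Za-z0-9_]{1,15}"  # Twitter handles: letters, digits, underscore


def parse_analysis_request(text, bot_name="PlotBot"):
    """Return the account to analyze (without '@'), or None if no request is found."""
    pattern = rf"@{re.escape(bot_name)}\s+analyze:\s*@({HANDLE})\b"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    return match.group(1) if match else None


print(parse_analysis_request("@PlotBot Analyze: @CNN"))
```

Mentions that do not follow the pattern return `None`, which lets the bot skip ordinary replies instead of raising an error.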

A plot from the sentiment analysis is then tweeted to the PlotBot5 Twitter feed. See below for examples of scatter plots you will generate:

Hints, requirements, and considerations:

  • Your bot should scan your account every five minutes for mentions.
  • Your bot should pull the 500 most recent tweets to analyze for each incoming request.
  • Your script should prevent abuse by analyzing only Twitter accounts that have not previously been analyzed.
  • Your plot should include a meaningful legend and labels.
  • It should also mention the Twitter account name of the requesting user.
  • When submitting your assignment, be sure to have at least three analyses tweeted out from your account (enlist the help of classmates, friends, or family, if necessary!).
  • Notable libraries used to complete this application include: Matplotlib, Pandas, Tweepy, TextBlob, and Seaborn.
  • You may find it helpful to organize your code in function(s), then call them.
  • If you're not yet familiar with creating functions in Python, here is a tutorial you may wish to consult: https://www.tutorialspoint.com/python/python_functions.htm.
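
The polling and abuse-prevention hints above can be combined roughly as follows. This is a minimal sketch: `fetch_requests` and `analyze` are hypothetical callables that you would implement yourself (for example, on top of Tweepy), and the in-memory set of analyzed accounts is lost whenever the script restarts.

```python
import time

POLL_SECONDS = 5 * 60  # the assignment asks for a scan every five minutes


def process_requests(requests, analyzed, analyze):
    """Run `analyze` once per target account that has not been seen before.

    requests: iterable of (requesting_user, target_account) pairs.
    analyzed: set of lowercased target accounts already handled (mutated in place).
    analyze:  callable(requesting_user, target_account), supplied by the caller.
    Returns the list of target accounts analyzed during this pass.
    """
    handled = []
    for requester, target in requests:
        key = target.lstrip("@").lower()  # Twitter handles are case-insensitive
        if key in analyzed:
            continue  # skip repeats to prevent abuse
        analyze(requester, target)
        analyzed.add(key)
        handled.append(target)
    return handled


def run_bot(fetch_requests, analyze, poll_seconds=POLL_SECONDS):
    """Poll forever; `fetch_requests` is a hypothetical callable returning new pairs."""
    analyzed = set()
    while True:
        process_requests(fetch_requests(), analyzed, analyze)
        time.sleep(poll_seconds)
```

Keeping the per-pass logic in `process_requests` separate from the infinite loop makes it possible to test the duplicate check without waiting five minutes between runs.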

Coding Boot Camp (C) 2017. All Rights Reserved.

