Now You Code 5: Reddit Sentiment Analysis

In this assignment you're tasked with performing a sentiment analysis on the top Reddit articles in a subreddit. If you are unfamiliar with Reddit, here are some API URLs to a few subreddits:

From these URLs you should be able to figure out the pattern for any subreddit, such as news or ama.

Start by getting the Reddit API to work for the URL above and extracting a list of titles only. You can easily view the JSON in a browser by clicking the link above. However, to use the same URL with Python requests, you'll need to set a custom User-Agent in your HTTP headers (see the Rules section of https://github.com/reddit/reddit/wiki/API). Figuring this out is the point of the homework, as the rest is rather trivial. Note: You do NOT need to follow the OAuth2 instructions, as you will access the Reddit API unauthenticated.

You should perform the analysis on the titles only.
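
As a starting point, here is a minimal sketch of the two pieces involved: sending a custom User-Agent and pulling the titles out of the response. Several things here are assumptions, not part of the assignment: the URL pattern in `reddit_url`, the `USER_AGENT` value, the helper names, and the hand-made `sample_listing`, which only imitates the nesting that Reddit listings typically use (`data` → `children` → `data` → `title`). Check the real JSON in your browser before relying on it.

```python
# Hypothetical User-Agent string; pick something that identifies your own script.
USER_AGENT = "ist256-sentiment-lab/0.1 (student exercise)"


def reddit_url(subreddit):
    """Build the JSON URL for a subreddit's top stories (assumed pattern)."""
    return f"https://www.reddit.com/r/{subreddit}/top.json"


def fetch_listing(subreddit):
    """Request the listing with a custom User-Agent and decode the JSON."""
    import requests  # imported here so the helpers above and below also work offline

    response = requests.get(
        reddit_url(subreddit),
        headers={"User-Agent": USER_AGENT},
        timeout=10,
    )
    response.raise_for_status()  # raise an error on 4xx/5xx responses
    return response.json()


def extract_titles(listing):
    """Return only the story titles from a decoded listing."""
    return [child["data"]["title"] for child in listing["data"]["children"]]


# Hand-made sample in the assumed listing shape, so no network is needed here.
sample_listing = {
    "data": {
        "children": [
            {"data": {"title": "First story"}},
            {"data": {"title": "Second story"}},
        ]
    }
}
print(extract_titles(sample_listing))
```

With a working connection, `extract_titles(fetch_listing("news"))` would be the equivalent call against the live site.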

After you get Reddit working, move on to sentiment analysis. Once again, we will use http://text-processing.com/api/sentiment/, as we did in the in-class coding lab. Figuring this out should be trivial.
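
For reference, a rough sketch of a call to the sentiment endpoint follows. It assumes the API accepts a form-encoded POST with a `text` field and answers with JSON containing a `label` and a `probability` dictionary; verify this against what you saw in the lab. The helper names and the numbers in `sample_response` are made up for illustration.

```python
SENTIMENT_URL = "http://text-processing.com/api/sentiment/"


def fetch_sentiment(text):
    """POST the text to the sentiment API and return the decoded JSON."""
    import requests  # imported here so label_confidence can be tried offline

    response = requests.post(SENTIMENT_URL, data={"text": text}, timeout=10)
    response.raise_for_status()
    return response.json()


def label_confidence(sentiment):
    """Return the label together with the probability reported for that label."""
    label = sentiment["label"]
    return label, sentiment["probability"][label]


# Made-up response in the assumed shape; real values will differ.
sample_response = {
    "label": "neg",
    "probability": {"neg": 0.68, "neutral": 0.21, "pos": 0.32},
}
print(label_confidence(sample_response))
```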

We will start by writing the GetRedditStories and GetSentiment functions, then put it all together.

Step 1: Problem Analysis for GetRedditStories

First let's write a function GetRedditStories to get the top news articles from the http://www.reddit.com site.

Inputs: a subreddit name as a string, e.g. news or worldnews

Outputs: the top stories as a Python object converted from JSON

Algorithm (Steps in Program):

todo write algorithm here

In [ ]:
# Step 2: write code 

import requests

def GetRedditStories(subreddit):
    # todo write code return a list of dict of stories
    

# testing 
GetRedditStories('news') # you should see some stories

Step 3: Problem Analysis for GetSentiment

Now let's write a function that, when given text, returns the sentiment score for the text. We will use the http://text-processing.com API for this.

Inputs: text string

Outputs: a Python dictionary of sentiment information based on text

Algorithm (Steps in Program):

todo write algorithm here

In [ ]:
# Step 4: write code 

def GetSentiment(text):
    # todo write code to return dict of sentiment for text
    

# testing
GetSentiment("You are a bad, bad man!")

Step 5: Problem Analysis for entire program

Now let's write the entire program. This program should take the titles of the Reddit stories and run sentiment analysis on each one. It should output the sentiment label and story title, like this:

Example Run (Your output will vary as news stories change...)

Enter Subreddit: news
Sentiment Analysis of news:
neutral : FBI Chief Comey 'Rejects' Phone Tap Allegation
pos : New Peeps-flavored Oreos reportedly turning people's poop pink
neutral : President Trump Signs Revised Travel Ban Executive Order
neutral : Police: Overdose survivors to be charged with misdemeanor
neutral : Struggling students forced to wait 3-4 weeks as Utah's public colleges don't have enough mental health therapists
neutral : Army Veteran Faces Possible Deportation to Mexico
neutral : Rep. Scott Taylor called out at town hall for ‘blocking’ constituents on social media
neutral : GM to suspend third shift at Delta Township plant, layoff 1,100 workers
neutral : American citizen Khizr Khan reportedly cancels trip to Canada after being warned his 'travel privileges are being reviewed'
neg : Mars far more likely to have had life than we thought, researchers find after new water discovery
neutral : Bird Flu Found at U.S. Farm That Supplies Chickens to Tyson
neutral : Investigation Reveals Huge Volume of Shark Fins Evading International Shipping Bans
neg : Sikh man's shooting in Washington investigated as hate crime
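
The `label : title` lines in the example run can be produced with a simple loop. The sketch below only illustrates that loop: `analyze_titles` and `fake_sentiment` are hypothetical names, and `fake_sentiment` is a stand-in that imitates the shape of a sentiment result without calling any API. In your program you would pass your own GetSentiment function instead.

```python
def analyze_titles(titles, get_sentiment):
    """Return one 'label : title' line per title, using the given sentiment function."""
    lines = []
    for title in titles:
        label = get_sentiment(title)["label"]
        lines.append(f"{label} : {title}")
    return lines


def fake_sentiment(text):
    """Stand-in for the real API call: flags text containing 'bad' as negative."""
    return {"label": "neg" if "bad" in text.lower() else "neutral"}


titles = ["You are a bad, bad man!", "City council meets Tuesday"]
for line in analyze_titles(titles, fake_sentiment):
    print(line)
```

Passing the sentiment function as an argument keeps the loop easy to test without an Internet connection.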

Problem Analysis

Inputs: a user input subreddit name.

Outputs: Sentiment Label and story title for each story.

Algorithm (Steps in Program):

todo write algorithm here

In [ ]:
## Step 6 Write final program here using the functions 
## you wrote in the previous steps!

Step 7: Questions

  1. What happens to this program when you do not have connectivity to the Internet? How can this code be modified to correct the issue?

Answer:

  2. Most of the stories come back with a neutral sentiment score. Does this surprise you? Explain your answer.

Answer:

  3. In what ways can the sentiment be more accurate?

Answer:

Step 8: Reflection

Reflect upon your experience completing this assignment. This should be a personal narrative, in your own voice, that cites specifics relevant to the activity to help the grader understand how you arrived at the code you submitted. Things to consider touching upon: Elaborate on the process itself. Did your original problem analysis work as designed? How many iterations did you go through before you arrived at the solution? Where did you struggle along the way and how did you overcome it? What did you learn from completing the assignment? What do you need to work on to get better? What was most valuable and least valuable about this exercise? Do you have any suggestions for improvements?

To make a good reflection, you should journal your thoughts, questions and comments while you complete the exercise.

Keep your response to between 100 and 250 words.

--== Write Your Reflection Below Here ==--


In [ ]:
# RUN THIS CODE CELL TO TURN IN YOUR WORK!
from ist256.submission import Submission
Submission().submit()