In [4]:
import pandas as pd

In [5]:
with open('C:\\Users\\yourusername\\Dropbox\\Development\\sentiment\\reddit-comments.json', 'rb') as f:
    data = f.readlines()
    
data = map(lambda x: x.rstrip(), data)

data_json_str = "[" + ','.join(data) + "]"

data_df = pd.read_json(data_json_str)

In [6]:
data_df


Out[6]:
archived author author_flair_css_class author_flair_text body controversiality created_utc distinguished downs gilded id link_id name parent_id retrieved_on score score_hidden subreddit subreddit_id ups
0 NaN Van_Herenhuis None None How there are different kinds of epileptics th... 0 1490180032 None NaN 1 df97q2c t3_60tvvm NaN t3_60tvvm 1491690385 8894 NaN AskReddit t5_2qh1i NaN
1 NaN Abaddon_Jones None None My bro did a very similar thing back in compre... 0 1488404127 None NaN 0 dedwikl t3_5wynp1 NaN t3_5wynp1 1491125814 6816 NaN tifu t5_2to41 NaN
2 NaN t_rex_arms_444 None None Aghhhh NOOO, I'm ashamed all over again. I had... 0 1489078148 None NaN 0 deppb2l t3_5yfdiz NaN t1_depjztj 1491343778 1676 NaN AskReddit t5_2qh1i NaN
3 NaN Van_Herenhuis None None It's cool. Most of my having to explain comes ... 0 1490181989 None NaN 0 df98gwp t3_60tvvm NaN t1_df98g66 1491690747 983 NaN AskReddit t5_2qh1i NaN
4 NaN CapRodgers None None Harrison Ford, he auctioned his The Force Awak... 0 1490716990 None NaN 0 dfin1qq t3_61yoko NaN t3_61yoko 1491854864 675 NaN AskReddit t5_2qh1i NaN
5 NaN _OldBay None None As someone with epilepsy, this is some scary s... 0 1489541048 None NaN 0 dexw29d t3_5zg77b NaN t3_5zg77b 1491486529 596 NaN news t5_2qh3l NaN
6 NaN LlamasInLingerie None None I'm really glad to hear that it's working for ... 0 1489129357 None NaN 0 deqtx6k t3_5yk8yt NaN t3_5yk8yt 1491363459 584 NaN Futurology t5_2t7no NaN
7 NaN KHFanboy None None As someone with epilepsy, his analogy is prett... 0 1489876531 None NaN 0 df3wjae t3_605mu7 NaN t1_df3tfqh 1491597864 414 NaN WTF t5_2qh61 NaN
8 NaN ichdru21 None None > * At some point in their lives, 1 in 6 ch... 0 1490031741 None NaN 0 df6icnn t3_60fllq NaN t1_df6827m 1491643229 413 NaN AskReddit t5_2qh1i NaN
9 NaN JoLLand713 None None Oh Jeff you old prick. Yes, let's keep on spen... 0 1489606704 None NaN 0 dez1d1g t3_5zlct1 NaN t3_5zlct1 1491506562 405 NaN nottheonion t5_2qnts NaN
10 NaN --Hello_World-- None None It's broken. I got epilepsy from looking at th... 0 1488953316 None NaN 0 deniufs t3_5y4wt5 NaN t1_denhrqr 1491305684 292 NaN mildlyinteresting t5_2ti4h NaN
11 NaN JuliusCaesarAMA None None Epilepsy is no laughing matter, amice 0 1489586239 None NaN 0 deyj35d t3_5zj69v NaN t1_deyizqk 1491497705 287 NaN AskReddit t5_2qh1i NaN
12 NaN Deep_Grady None None After about a month off booze (I was drinking ... 0 1490206162 None NaN 1 df9rr5k t3_60vatf NaN t1_df9p1n2 1491700057 243 NaN pics t5_2qh0u NaN
13 NaN Dox_Bulldops None None This is actually a new Mavis Beacon model keyb... 0 1488951268 None NaN 0 denhrqr t3_5y4wt5 NaN t1_denayaw 1491305164 236 NaN mildlyinteresting t5_2ti4h NaN
14 NaN flashcats None None It's a pretty damning compliant. \n\nThe guy t... 0 1490040336 None NaN 0 df6q873 t3_60i47g NaN t3_60i47g 1491647039 186 NaN law t5_2qh9k NaN
15 NaN Xenri None None Well, glad I don't have epilepsy. 0 1488821333 None NaN 0 dekxhtr t3_5xt9nr NaN t1_dekx16h 1491260550 161 NaN HighQualityGifs t5_2ylxz NaN
16 NaN Hunnyhelp None ★★★★★ 4.879 So your obviously not a millennial, image if s... 0 1490460713 None NaN 0 dfea9gu t3_61fvmg NaN t1_dfea3qk 1491778796 153 NaN blackmirror t5_2v08h NaN
17 NaN Fuckingshitstupid Wizards Wizards "I wish I could go somewhere and feel safe"\n\... 0 1490372907 None NaN 0 dfctw44 t3_619nlm NaN t1_dfcrott 1491753412 153 NaN nba t5_2qo4s NaN
18 NaN RichardArschmann None None Epilepsy is a serious disease. Esports comme... 0 1489777197 None NaN 0 df2a59t t3_5zyqpj NaN t3_5zyqpj 1491563066 149 NaN news t5_2qh3l NaN
19 NaN Optewe None None Another source here:\n\nhttp://www.huffingtonp... 0 1489827796 None NaN 0 df34s32 t3_6037of NaN t3_6037of 1491577916 148 NaN EnoughTrumpSpam t5_39usd NaN
20 NaN NoSpicyFood None None Since this is one of those shitbag sites that ... 0 1489561174 None NaN 0 dey8nqr t3_5zg77b NaN t3_5zg77b 1491492653 143 NaN news t5_2qh3l NaN
21 NaN lvl100Warlock None ★★★★☆ 4.137 He has messages where he stated his intention ... 0 1490461928 None NaN 0 dfeb3zu t3_61fvmg NaN t1_dfea3qk 1491779205 137 NaN blackmirror t5_2v08h NaN
22 NaN helix19 None None This is done sometimes for severe epilepsy. If... 0 1490218678 None NaN 0 dfa3iae t3_60vwyu NaN t1_dfa1j8v 1491705721 137 NaN todayilearned t5_2qqjc NaN
23 NaN sydthakyd1 None None The recordings of Anneliese Michel's exorcisms... 0 1488550256 None NaN 0 degiful t3_5x9jwa NaN t3_5x9jwa 1491174328 134 NaN AskReddit t5_2qh1i NaN
24 NaN failbears None None You might want to [have a look at this then](h... 0 1488771367 None NaN 0 dek7pd0 t3_5xp1fu NaN t1_dek0mbu 1491243405 131 NaN interestingasfuck t5_2qhsa NaN
25 NaN DEEP_SEA_MAX silly bitch Joe Logan, the improved wolverine, aka the Coy... 0 1490114347 None NaN 0 df80rg4 t3_60o6dn NaN t3_60o6dn 1491669622 128 NaN JoeRogan t5_2s4tv NaN
26 NaN pahco87 None None I'd just like to point out that no currently p... 0 1489134889 None NaN 0 deqw15q t3_5yk8yt NaN t3_5yk8yt 1491364477 127 NaN Futurology t5_2t7no NaN
27 NaN Slinkyfest2005 None None Kind of sad as he appears to be impoverished. ... 0 1489602886 None NaN 0 deyxsh9 t3_5zjwc3 NaN t1_deyv0hp 1491504829 126 NaN askscience t5_2qm4e NaN
28 NaN [deleted] None None Probably should include a bit more...My wife a... 0 1489783536 None NaN 0 df2fnp6 t3_60004m NaN t3_60004m 1491565744 117 NaN BeforeNAfterAdoption t5_35qtc NaN
29 NaN Canopenerdude aatrox Kill me now... He was also incredibly intelligent and the gre... 0 1489173905 None NaN 0 derkjlx t3_5ymqv4 NaN t1_derjnms 1491376332 116 NaN leagueoflegends t5_2rfxx NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
3016 NaN OR-1992 None None only [3%](http://www.epilepsy.com/learn/trigge... 0 1490213777 None NaN 0 df9yzrk t3_60u01b NaN t1_df9h8g5 1491703544 -6 NaN news t5_2qh3l NaN
3017 NaN tzporidge None None Pretty sure that dog has epilepsy 1 1488830198 None NaN 0 del58in t3_5xuxov NaN t3_5xuxov 1491264325 -6 NaN aww t5_2qh1o NaN
3018 NaN Tsukino_Stareine None None Can you prove that though? Maybe he's just enj... 0 1488921204 None NaN 0 demuz7w t3_5y284z NaN t1_demulfg 1491294131 -6 NaN runescape t5_2qwxl NaN
3019 NaN orangensaft9 None None Because the whole thing stinks. Wife getting o... 0 1490268783 None NaN 0 dfavs0o t3_60unk2 NaN t1_df9sitg 1491719387 -7 NaN Fuckthealtright t5_3fdcn NaN
3020 NaN Lebagel None None > english setters\n\nThe English Setter, wh... 1 1490016116 None NaN 0 df655lc t3_60fnvv NaN t1_df651i0 1491636847 -7 NaN AskReddit t5_2qh1i NaN
3021 NaN lucycohen None None If magicians and illusionists try to prove som... 0 1489934033 None NaN 0 df4p3t0 t3_6088ny NaN t1_df4ngju 1491611685 -7 NaN videos t5_2qh1e NaN
3022 NaN [deleted] None None Oh comon, epileptic people have feelings too, ... 0 1490294861 None NaN 0 dfbg1pa t3_611q8g NaN t3_611q8g 1491729186 -7 NaN Jokes t5_2qh72 NaN
3023 NaN tmhoc None None Look what this does for Parkinson\n\nLook what... 0 1490804583 None NaN 0 dfkbuik t3_626mw9 NaN t3_626mw9 1491885179 -8 NaN news t5_2qh3l NaN
3024 NaN PinochetIsMyHero None None Obama's cure for epilepsy. Her benefits were ... 1 1489556796 None NaN 0 dey6rzo t3_5zg77b NaN t1_dexxka6 1491491734 -8 NaN news t5_2qh3l NaN
3025 NaN icansupportthat None None Are you a child with epilepsy? If not, don't d... 0 1489642277 None NaN 0 deztna9 t3_5zos8d NaN t1_deztlcm 1491520261 -9 NaN Fitness t5_2qhx4 NaN
3026 NaN Macarogi None None "The Twitter service regularly shows pictures,... 0 1489777685 None NaN 0 df2akxr t3_5zyqpj NaN t1_df2a06s 1491563278 -10 NaN news t5_2qh3l NaN
3027 NaN StunamiRS skill-quest Only people with photosensitive epilepsy can h... 0 1490053857 None NaN 0 df71erv t3_60hiwt NaN t1_df6z3q2 1491652464 -10 NaN runescape t5_2qwxl NaN
3028 NaN GreasyBub None None Uh... Yeah. If you're epileptic and choose to ... 0 1489209512 None NaN 0 desak99 t3_5ypd3r NaN t1_des4o7x 1491388937 -11 NaN runescape t5_2qwxl NaN
3029 NaN meddlingbiscuit None None wow this guy used a big teleport animation, su... 0 1488927403 None NaN 0 den0jvf t3_5y284z NaN t3_5y284z 1491296836 -11 NaN runescape t5_2qwxl NaN
3030 NaN CattleRaider None None Except he did show remorse and even donated to... 0 1490216856 None NaN 0 dfa1vei t3_60u01b NaN t1_df9adl2 1491704933 -13 NaN news t5_2qh3l NaN
3031 NaN Nobody1795 None None >My mother has epilepsy. She uses the inter... 0 1490213010 None NaN 0 df9y9va t3_60u01b NaN t1_df9vdxq 1491703197 -13 NaN news t5_2qh3l NaN
3032 NaN bomi3ster None None >posting an image that induces seizures in ... 0 1490128983 None NaN 0 df8e25t t3_60or27 NaN t1_df878vy 1491676044 -14 NaN worldnews t5_2qh13 NaN
3033 NaN Macarogi None None 99.9% chance Eichenwald fabricated his epileps... 0 1490203007 None NaN 0 df9oot9 t3_60u01b NaN t1_df9m3bf 1491698578 -15 NaN news t5_2qh3l NaN
3034 NaN GooseNZ None None I wanted to make a joke about epilepsy. \n\nSo... 0 1488917952 None NaN 0 demrwj4 t3_5y2rn7 NaN t3_5y2rn7 1491292649 -16 NaN gifs t5_2qt55 NaN
3035 NaN bluebonnet_bouquet None None People with epilepsy shouldn't be allowed to d... 0 1488420466 None NaN 0 dee9mch t3_5wysz2 NaN t1_dedxeyx 1491132151 -16 NaN Austin t5_2qhn5 NaN
3036 NaN ChanceV None None Some games are simply not made for you then.\n... 0 1490656570 None NaN 0 dfhn8a7 t3_61topr NaN t3_61topr 1491837406 -17 NaN MECoOp t5_2tkk1 NaN
3037 NaN benjimaestro None None Isn't that the shit which causes epilepsy? 0 1490050919 None NaN 0 df6z4v7 t3_60i5wf NaN t1_df6kk9e 1491651360 -19 NaN quityourbullshit t5_2y8xf NaN
3038 NaN lucycohen None None The difficulty is that vaccines are causing fa... 0 1489415443 None NaN 0 devcd7b t3_5z0j8g NaN t1_dev9vm1 1491442170 -21 NaN worldnews t5_2qh13 NaN
3039 NaN Macarogi None None "The Twitter service regularly shows pictures,... 0 1489775472 None NaN 0 df28kd0 t3_5zyqpj NaN t1_df28fvx 1491562303 -22 NaN news t5_2qh3l NaN
3040 NaN StunamiRS skill-quest Except they don't trigger epilepsy. 0 1490036923 None NaN 0 df6n3yn t3_60hiwt NaN t1_df6h3wf 1491645532 -24 NaN runescape t5_2qwxl NaN
3041 NaN GrownUpTurk Lakers1 Lakers GODDAMN LOL \n\nthis is fucked up, I have a do... 0 1490374594 None NaN 0 dfcvg9y t3_619nlm NaN t1_dfctw44 1491754183 -25 NaN nba t5_2qo4s NaN
3042 NaN fauxgnaws None None What I don't understand about this story is if... 0 1489856104 None NaN 0 df3i08i t3_604hf5 NaN t3_604hf5 1491587855 -26 NaN technology t5_2qh16 NaN
3043 NaN fauxgnaws None None Don't have epilepsy, and if I did I'd probably... 0 1489859202 None NaN 0 df3ka0x t3_604hf5 NaN t1_df3k49z 1491588972 -29 NaN technology t5_2qh16 NaN
3044 NaN k0mbine None None People clearly disagree with me, though, so I ... 0 1488871702 None NaN 0 delz40i t3_5xwuo4 NaN t1_delz21j 1491278760 -32 NaN zelda t5_2r61g NaN
3045 NaN steamwhy None ★★★★☆ 3.671 > "We consider the message like a bomb or s... 0 1490457678 None NaN 0 dfe86hb t3_61fvmg NaN t3_61fvmg 1491777790 -105 NaN blackmirror t5_2v08h NaN

3046 rows × 20 columns


In [8]:
import graphlab
graphlab.canvas.set_target('ipynb')

In [9]:
import matplotlib.pyplot as plt
%matplotlib inline

In [10]:
reddit_data = graphlab.SFrame(data_df)


[INFO] graphlab.cython.cy_server: GraphLab Create v2.1 started. Logging: C:\Users\yourusername\AppData\Local\Temp\graphlab_server_1492781002.log.0
This non-commercial license of GraphLab Create for academic use is assigned to na and will expire on April 21, 2018.

In [12]:
reddit_data = graphlab.SFrame(data_df)
import time
import datetime
reddit_data['Comment Date'] = reddit_data.apply(lambda row: (datetime.datetime.fromtimestamp(row['created_utc'])).date())

In [13]:
reddit_data.show(view="Bar Chart", x="Comment Date")



In [56]:
with open('C:\\Users\\yourusername\\Downloads\\reddit_epilepsy_adjectives.json', 'rb') as f:
    data = f.readlines()
    
data = map(lambda x: x.rstrip(), data)

data_json_str = "[" + ','.join(data) + "]"

data_df = pd.read_json(data_json_str)

In [57]:
reddit_adjective_data = graphlab.SFrame(data_df)

In [58]:
reddit_adjective_data


Out[58]:
adj_count tokens_text_content
539 other
370 more
301 medical
289 good
271 many
237 sure
230 different
221 first
212 bad
209 few
[20 rows x 2 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

In [38]:
reddit_adjective_data.show(view='Bar Chart', x="tokens_text_content")



In [53]:
reddit_adjective_data['tokens_text_content'].show()



In [42]:
reddit_adjective_data.show(view="Bar Chart")



In [43]:
reddit_adjective_data


Out[43]:
tokens_text_content
different
popular
rarest
confusing
Epileptic
violent
conscious
afraid
ashamed
terrifying
[22086 rows x 1 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

In [46]:
import graphlab.aggregate as agg
adj_count = reddit_adjective_data.groupby(key_columns='tokens_text_content',
            operations={'count': agg.COUNT()})

In [47]:
adj_count


Out[47]:
tokens_text_content count
unstable 2
costly 3
open 30
darkened 1
misinformed 1
neutral 4
puzzled 1
abhorrent 1
inclined 2
decent 12
[3139 rows x 2 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

In [50]:
adj_count.sort('count')
adj_count.show(view="Bar Chart")



In [59]:
reddit_adjective_data


Out[59]:
adj_count tokens_text_content
539 other
370 more
301 medical
289 good
271 many
237 sure
230 different
221 first
212 bad
209 few
[20 rows x 2 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

In [100]:
import numpy as np
plt.rcdefaults()
reddit_adjective_data.dropna()
objects = reddit_adjective_data['tokens_text_content']
y_pos = np.arange(len(objects))
performance = reddit_adjective_data['adj_count']

plt.bar(y_pos, performance, alpha=0.5, align="center")
plt.xticks(y_pos, objects, rotation="vertical", ha="center")
plt.ylabel('Usage')
plt.title('Adjective usage')
 
plt.axis('tight')

plt.show()



In [102]:
with open('C:\\Users\\yourusername\\Downloads\\results-20170421-pos.json', 'rb') as f:
    data = f.readlines()
    
data = map(lambda x: x.rstrip(), data)

data_json_str = "[" + ','.join(data) + "]"

data_df = pd.read_json(data_json_str)

reddit_adjective_pos_data = graphlab.SFrame(data_df)

plt.rcdefaults()
reddit_adjective_data.dropna()
objects = reddit_adjective_pos_data['tokens_text_content']
y_pos = np.arange(len(objects))
performance = reddit_adjective_pos_data['adj_count']

plt.bar(y_pos, performance, alpha=0.5, align="center")
plt.xticks(y_pos, objects, rotation="vertical", ha="center")
plt.ylabel('Usage')
plt.title('Adjective usage positive sentiment')
 
plt.axis('tight')

plt.show()



In [105]:
with open('C:\\Users\\yourusername\\Downloads\\results-20170421-neg.json', 'rb') as f:
    data = f.readlines()
    
data = map(lambda x: x.rstrip(), data)

data_json_str = "[" + ','.join(data) + "]"

data_df = pd.read_json(data_json_str)

reddit_adjective_neg_data = graphlab.SFrame(data_df)

plt.rcdefaults()
reddit_adjective_neg_data.dropna()
objects = reddit_adjective_neg_data['tokens_text_content']
y_pos = np.arange(len(objects))
performance = reddit_adjective_neg_data['adj_count']

plt.bar(y_pos, performance, alpha=0.5, align="center")
plt.xticks(y_pos, objects, rotation="vertical", ha="center")
plt.ylabel('Usage')
plt.title('Adjective usage negative sentiment')
 
plt.axis('tight')

plt.show()



In [117]:
with open('C:\\Users\\yourusername\\Downloads\\reddit_subreddit_group.json', 'rb') as f:
    data = f.readlines()
    
data = map(lambda x: x.rstrip(), data)

data_json_str = "[" + ','.join(data) + "]"

data_df = pd.read_json(data_json_str)

reddit_adjective_pos_data = graphlab.SFrame(data_df)
reddit_adjective_pos_data = reddit_adjective_pos_data.topk('comment_count',k=25)
plt.rcdefaults()
reddit_adjective_data.dropna()
objects = reddit_adjective_pos_data['subreddit']
y_pos = np.arange(len(objects))
performance = reddit_adjective_pos_data['comment_count']
avg_score = reddit_adjective_pos_data['avg_score']

plt.bar(y_pos, performance, alpha=0.5, align="center")
plt.xticks(y_pos, objects, rotation="vertical", ha="center")
plt.ylabel('Comments')
plt.title('Comments By Reddit Group')
 
plt.axis('tight')

plt.show()

plt.ylabel('Sentiment')
plt.title('Comment Sentiment By Reddit Group')
plt.xticks(y_pos, objects, rotation="vertical", ha="center")
plt.bar(y_pos, avg_score, alpha=0.5, align="center")
plt.axis('tight')
plt.show()



In [180]:
with open('C:\\Users\\yourusername\\Dropbox\\Development\\sentiment\\reddit-comments.json', 'rb') as f:
    data = f.readlines()
    
data = map(lambda x: x.rstrip(), data)

data_json_str = "[" + ','.join(data) + "]"

data_df = pd.read_json(data_json_str)

reddit_data = graphlab.SFrame(data_df)
import time
import datetime
reddit_data['Comment Date'] = reddit_data.apply(lambda row: (datetime.datetime.fromtimestamp(row['created_utc'])).date())

import graphlab.aggregate as agg
comments_by_month = reddit_data.groupby(key_columns='Comment Date',
            operations={'count': agg.COUNT()})
comments_by_month.sort('count')  
plt.rcdefaults()
comments_by_month.dropna()

comments_by_month = comments_by_month.sort('Comment Date')
objects = comments_by_month['Comment Date']
y_pos = np.arange(len(objects))
performance = comments_by_month['count']

from matplotlib.dates import DateFormatter

plt.bar(y_pos, performance, alpha=0.5, align="center")
 
plt.axis('tight')

import datetime
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker

years = mdates.YearLocator()   # every year
months = mdates.MonthLocator()  # every month
yearsFmt = mdates.DateFormatter('%Y')

fig, ax = plt.subplots()
ax.plot(comments_by_month['Comment Date'],comments_by_month['count'])

ax.format_xdata = mdates.DateFormatter('%Y-%m-%d')
# ax.format_ydata = price
ax.grid(False)
# rotates and right aligns the x labels, and moves the bottom of the
# axes up to make room for them
fig.autofmt_xdate()
plt.xticks(rotation=90, ha="center")
ax.xaxis.set_major_locator(ticker.MultipleLocator(2))
plt.ylabel('Comments')
plt.title('Comments By Date')

plt.show()



In [131]:
comments_by_month


Out[131]:
Comment Date count
2017-02-28 00:00:00 5
2017-03-01 00:00:00 85
2017-03-02 00:00:00 73
2017-03-03 00:00:00 101
2017-03-04 00:00:00 70
2017-03-05 00:00:00 76
2017-03-06 00:00:00 92
2017-03-07 00:00:00 83
2017-03-08 00:00:00 91
2017-03-09 00:00:00 102
[32 rows x 2 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

In [185]:
with open('C:\\Users\\yourusername\\Downloads\\epilepsy_entities2.json', 'rb') as f:
    data = f.readlines()
    
data = map(lambda x: x.rstrip(), data)

data_json_str = "[" + ','.join(data) + "]"

data_df = pd.read_json(data_json_str)

reddit_adjective_pos_data = graphlab.SFrame(data_df)
reddit_adjective_pos_data = reddit_adjective_pos_data.topk('usage_count',k=30)
plt.rcdefaults()
reddit_adjective_data.dropna()
objects = reddit_adjective_pos_data['entities_name']
y_pos = np.arange(len(objects))
performance = reddit_adjective_pos_data['usage_count']

plt.bar(y_pos, performance, alpha=0.5, align="center")
plt.xticks(y_pos, objects, rotation="vertical", ha="center")
plt.ylabel('Mentions')
plt.title('Mentions Of Entities')
 
plt.axis('tight')

plt.show()



In [12]:
import graphlab
graphlab.canvas.set_target('ipynb')
import pandas as pd
import datetime
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker

import matplotlib.pyplot as plt
%matplotlib inline

with open('C:\\Users\\yourusername\\Downloads\\results-neg-words.json', 'rb') as f:
    data = f.readlines()
    
data = map(lambda x: x.rstrip(), data)

data_json_str = "[" + ','.join(data) + "]"

data_df = pd.read_json(data_json_str)

reddit_adjective_neg_data = graphlab.SFrame(data_df)

plt.rcdefaults()
reddit_adjective_neg_data.dropna()
objects = reddit_adjective_neg_data['entities_name']
y_pos = np.arange(len(objects))
performance = reddit_adjective_neg_data['usage_count']

plt.bar(y_pos, performance, alpha=0.5, align="center")
plt.xticks(y_pos, objects, rotation="vertical", ha="center")
plt.ylabel('Usage')
plt.title('Negative Entities Used')
 
plt.axis('tight')

plt.show()



In [11]:
import graphlab
graphlab.canvas.set_target('ipynb')
import pandas as pd
import datetime
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker

import matplotlib.pyplot as plt
%matplotlib inline

with open('C:\\Users\\yourusername\\Downloads\\results-pos.json', 'rb') as f:
    data = f.readlines()
    
data = map(lambda x: x.rstrip(), data)

data_json_str = "[" + ','.join(data) + "]"

data_df = pd.read_json(data_json_str)

reddit_adjective_neg_data = graphlab.SFrame(data_df)

plt.rcdefaults()
reddit_adjective_neg_data.dropna()
objects = reddit_adjective_neg_data['entities_name']
y_pos = np.arange(len(objects))
performance = reddit_adjective_neg_data['usage_count']

plt.bar(y_pos, performance, alpha=0.5, align="center")
plt.xticks(y_pos, objects, rotation="vertical", ha="center")
plt.ylabel('Usage')
plt.title('Positive Entities Used')
 
plt.axis('tight')

plt.show()



In [29]:
import graphlab
graphlab.canvas.set_target('ipynb')
import pandas as pd
import datetime
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as ticker
import time
import datetime

import matplotlib.pyplot as plt
%matplotlib inline

with open('C:\\Users\\yourusername\\Downloads\\2017_q1.json', 'rb') as f:
    data = f.readlines()
    
data = map(lambda x: x.rstrip(), data)

data_json_str = "[" + ','.join(data) + "]"

data_df = pd.read_json(data_json_str)

reddit_adjective_neg_data = graphlab.SFrame(data_df)

plt.rcdefaults()

years = mdates.YearLocator()   # every year
months = mdates.MonthLocator()  # every month
yearsFmt = mdates.DateFormatter('%Y')

reddit_adjective_neg_data['Comment Date'] = reddit_adjective_neg_data.apply(lambda row: datetime.datetime.strptime(row['comment_dt'], "%Y-%m-%d").date())
#data_df

fig, ax = plt.subplots()
ax.plot(reddit_adjective_neg_data['Comment Date'],reddit_adjective_neg_data['f0_'])

#ax.format_xdata = mdates.DateFormatter('%Y-%m-%d')
# ax.format_ydata = price
ax.grid(False)
# rotates and right aligns the x labels, and moves the bottom of the
# axes up to make room for them
fig.autofmt_xdate()
plt.xticks(rotation=90, ha="center")
ax.xaxis.set_major_locator(ticker.MultipleLocator(7))
plt.ylabel('Comments')
plt.title('Comments By Date')

plt.show()



In [20]:
data_df


Out[20]:
comment_dt f0_
0 2017-01-01 126
1 2017-01-02 122
2 2017-01-03 260
3 2017-01-04 194
4 2017-01-05 212
5 2017-01-06 138
6 2017-01-07 144
7 2017-01-08 116
8 2017-01-09 200
9 2017-01-10 204
10 2017-01-11 152
11 2017-01-12 184
12 2017-01-13 314
13 2017-01-14 136
14 2017-01-15 216
15 2017-01-16 186
16 2017-01-17 194
17 2017-01-18 172
18 2017-01-19 196
19 2017-01-20 174
20 2017-01-21 152
21 2017-01-22 172
22 2017-01-23 180
23 2017-01-24 172
24 2017-01-25 118
25 2017-01-26 190
26 2017-01-27 168
27 2017-01-28 192
28 2017-01-29 194
29 2017-01-30 160
... ... ...
60 2017-03-02 77
61 2017-03-03 95
62 2017-03-04 81
63 2017-03-05 66
64 2017-03-06 91
65 2017-03-07 90
66 2017-03-08 90
67 2017-03-09 96
68 2017-03-10 180
69 2017-03-11 117
70 2017-03-12 70
71 2017-03-13 63
72 2017-03-14 82
73 2017-03-15 101
74 2017-03-16 75
75 2017-03-17 106
76 2017-03-18 107
77 2017-03-19 114
78 2017-03-20 127
79 2017-03-21 115
80 2017-03-22 274
81 2017-03-23 115
82 2017-03-24 89
83 2017-03-25 68
84 2017-03-26 86
85 2017-03-27 86
86 2017-03-28 102
87 2017-03-29 80
88 2017-03-30 64
89 2017-03-31 66

90 rows × 2 columns


In [33]:
with open('C:\\Users\\yourusername\\Downloads\\entities_q1.json', 'rb') as f:
    data = f.readlines()
    
data = map(lambda x: x.rstrip(), data)

data_json_str = "[" + ','.join(data) + "]"

data_df = pd.read_json(data_json_str)

reddit_adjective_data = graphlab.SFrame(data_df)
reddit_adjective_data = reddit_adjective_data.topk('usage_count',k=30)
plt.rcdefaults()
reddit_adjective_data.dropna()
objects = reddit_adjective_pos_data['entities_name']
y_pos = np.arange(len(objects))
performance = reddit_adjective_pos_data['usage_count']

plt.bar(y_pos, performance, alpha=0.5, align="center")
plt.xticks(y_pos, objects, rotation="vertical", ha="center")
plt.ylabel('Mentions')
plt.title('Mentions Of Entities')
 
plt.axis('tight')

plt.show()



In [ ]: