In [2]:
import pandas as pd
import graphlab as gl
In [5]:
sf = gl.SFrame('../data/jokes.dat', format='tsv')
Finished parsing file /Users/apoorvc/recommender_caseStudy/data/jokes.dat
Parsing completed. Parsed 100 lines in 0.044875 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as
column_type_hints=[str]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Finished parsing file /Users/apoorvc/recommender_caseStudy/data/jokes.dat
Parsing completed. Parsed 1276 lines in 0.02049 secs.
In [8]:
sf[3]
Out[8]:
{'1:': 'The man replies, "Well, thank God I don't have cancer!"'}
In [19]:
with open('../data/jokes.dat', 'r') as f:
    # One DataFrame row per raw line of the file
    df = pd.DataFrame(i for i in f)
print(df)
0
0 1:\n
1 <p>\r\n
2 A man visits the doctor. The doctor says, &quo...
3 <br />\r\n
4 The man replies, "Well, thank God I don&#...
5 </p>\r\n
6 \n
7 2:\n
8 <p>\r\n
9 This couple had an excellent relationship goin...
10 <br />\r\n
11 "What could they possibly have said to ma...
12 <br />\r\n
13 "They told me that you were a pedophile.&...
14 <br />\r\n
15 He replied, "That's an awfully big w...
16 </p>\r\n
17 \n
18 3:\n
19 <p>\r\n
20 Q. What's 200 feet long and has 4 teeth?<...
21 <br />\r\n
22 A. The front row at a Willie Nelson concert.\r\n
23 </p>\r\n
24 \n
25 4:\n
26 <p>\r\n
27 Q. What's the difference between a man an...
28 <br />\r\n
29 A. A toilet doesn't follow you around aft...
... ...
1397 <br />\r\n
1398 The teacher answered quickly, "That would...
1399 <br />\r\n
1400 St. Peter turned to the garbage man and, figur...
1401 <br />\r\n
1402 Fortunately for him, the trash man had just se...
1403 <br />\r\n
1404 "That's right! You may enter."<...
1405 <br />\r\n
1406 St. Peter turned to the lawyer: "Name the...
1407 </p>\r\n
1408 \n
1409 149:\n
1410 <p>\r\n
1411 A little girl asked her father, "Daddy? D...
1412 <br />\r\n
1413 He replied, "No, there is a whole series ...
1414 </p>\r\n
1415 \n
1416 150:\n
1417 <p>\r\n
1418 In an interview with David Letterman, Carter p...
1419 <br />\r\n
1420 He told the joke, then waited for the translat...
1421 <br />\r\n
1422 After the speech, Carter wanted to meet the tr...
1423 <br />\r\n
1424 When Carter asked how the joke had been told i...
1425 </p>\r\n
1426 \n
[1427 rows x 1 columns]
In [22]:
df.iloc(4)  # calling iloc returns the indexer object itself; df.iloc[4] selects row 4
Out[22]:
<pandas.core.indexing._iLocIndexer at 0x11b30ac90>
In [23]:
numpymatrx = df.values  # as_matrix() was removed in pandas 1.0; .values gives the same array
In [24]:
numpymatrx
Out[24]:
array([['1:\n'],
['<p>\r\n'],
[ 'A man visits the doctor. The doctor says, "I have bad news for you. You have cancer and Alzheimer's disease".<br />\r\n'],
...,
[ 'When Carter asked how the joke had been told in Japanese, the translator responded, "I told them, 'President Carter has told a very funny joke. Please laugh now.'"\r\n'],
['</p>\r\n'],
['\n']], dtype=object)
In [26]:
numpymatrx[0:30]
Out[26]:
array([['1:\n'],
['<p>\r\n'],
[ 'A man visits the doctor. The doctor says, "I have bad news for you. You have cancer and Alzheimer's disease".<br />\r\n'],
['<br />\r\n'],
[ 'The man replies, "Well, thank God I don't have cancer!"\r\n'],
['</p>\r\n'],
['\n'],
['2:\n'],
['<p>\r\n'],
[ 'This couple had an excellent relationship going until one day he came home from work to find his girlfriend packing. He asked her why she was leaving him and she told him that she had heard awful things about him. <br />\r\n'],
['<br />\r\n'],
[ '"What could they possibly have said to make you move out?"<br />\r\n'],
['<br />\r\n'],
['"They told me that you were a pedophile."<br />\r\n'],
['<br />\r\n'],
[ 'He replied, "That's an awfully big word for a ten year old."\r\n'],
['</p>\r\n'],
['\n'],
['3:\n'],
['<p>\r\n'],
['Q. What's 200 feet long and has 4 teeth?<br />\r\n'],
['<br />\r\n'],
['A. The front row at a Willie Nelson concert.\r\n'],
['</p>\r\n'],
['\n'],
['4:\n'],
['<p>\r\n'],
['Q. What's the difference between a man and a toilet?<br />\r\n'],
['<br />\r\n'],
['A. A toilet doesn't follow you around after you use it.\r\n']], dtype=object)
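The slice above shows the record layout: an `N:` id line, a `<p>` ... `</p>` block whose lines are separated by `<br />`, then a blank line. A minimal sketch of grouping those raw lines into one record per joke follows. It assumes Python 3.4+ (for `html.unescape`) and that no joke body contains a line consisting only of digits and a colon; `parse_jokes`, `ID_LINE`, `TAG`, and `SAMPLE` are hypothetical names, and `SAMPLE` is made-up text in the same layout, not data from `jokes.dat`.

```python
import html
import re

# Made-up sample in the same layout as jokes.dat (not real data from the file)
SAMPLE = (
    "1:\n"
    "<p>\r\n"
    "Q. First line?<br />\r\n"
    "<br />\r\n"
    "A. It&#039;s the &quot;second&quot; line.\r\n"
    "</p>\r\n"
    "\n"
    "2:\n"
    "<p>\r\n"
    "A one-line joke.\r\n"
    "</p>\r\n"
    "\n"
)

ID_LINE = re.compile(r"^(\d+):$")  # e.g. "12:" starts a new joke
TAG = re.compile(r"<[^>]+>")       # crude tag stripper, enough for <p> and <br />


def parse_jokes(lines):
    """Group raw lines into one {'joke_id', 'text'} record per joke."""
    jokes = []
    current = None
    for raw in lines:
        line = raw.strip()
        match = ID_LINE.match(line)
        if match:
            # An id line opens a new record
            current = {"joke_id": int(match.group(1)), "parts": []}
            jokes.append(current)
            continue
        # Drop tags, decode entities, and skip lines that end up empty
        text = html.unescape(TAG.sub("", line)).strip()
        if text and current is not None:
            current["parts"].append(text)
    return [{"joke_id": j["joke_id"], "text": " ".join(j["parts"])} for j in jokes]


jokes = parse_jokes(SAMPLE.splitlines(keepends=True))
for joke in jokes:
    print(joke)
```

On the real file, the same function could be fed the open file handle (`parse_jokes(f)`), and the resulting list of dicts can be passed to `pd.DataFrame(...)` to get one row per joke.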
In [27]:
df_n = pd.read_table('../data/jokes.dat')
In [28]:
df_n
Out[28]:
1:
0 <p>
1 A man visits the doctor. The doctor says, &quo...
2 <br />
3 The man replies, "Well, thank God I don&#...
4 </p>
5 2:
6 <p>
7 This couple had an excellent relationship goin...
8 <br />
9 "What could they possibly have said to ma...
10 <br />
11 "They told me that you were a pedophile.&...
12 <br />
13 He replied, "That's an awfully big w...
14 </p>
15 3:
16 <p>
17 Q. What's 200 feet long and has 4 teeth?<...
18 <br />
19 A. The front row at a Willie Nelson concert.
20 </p>
21 4:
22 <p>
23 Q. What's the difference between a man an...
24 <br />
25 A. A toilet doesn't follow you around aft...
26 </p>
27 5:
28 <p>
29 Q. What's O. J. Simpson's web addres...
... ...
1246 Recently a teacher, a garbage collector, and a...
1247 <br />
1248 St. Peter addressed the teacher and asked, &qu...
1249 <br />
1250 The teacher answered quickly, "That would...
1251 <br />
1252 St. Peter turned to the garbage man and, figur...
1253 <br />
1254 Fortunately for him, the trash man had just se...
1255 <br />
1256 "That's right! You may enter."<...
1257 <br />
1258 St. Peter turned to the lawyer: "Name the...
1259 </p>
1260 149:
1261 <p>
1262 A little girl asked her father, "Daddy? D...
1263 <br />
1264 He replied, "No, there is a whole series ...
1265 </p>
1266 150:
1267 <p>
1268 In an interview with David Letterman, Carter p...
1269 <br />
1270 He told the joke, then waited for the translat...
1271 <br />
1272 After the speech, Carter wanted to meet the tr...
1273 <br />
1274 When Carter asked how the joke had been told i...
1275 </p>
1276 rows × 1 columns
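The row counts differ between the two reads: the line-by-line `DataFrame` has 1427 rows, while `read_table` reports 1276 (the same count the SFrame parse logged). A likely explanation, assuming exactly one blank separator line after each of the 150 jokes: `read_table` skips blank lines by default (`skip_blank_lines=True`) and uses the first line, `1:`, as the column header, so 1427 - 150 - 1 = 1276. The sketch below checks that arithmetic on a made-up miniature of the file; `count_lines` and `sample` are hypothetical names.

```python
def count_lines(lines):
    """Return (total, blank, non_blank) counts for an iterable of raw lines."""
    total = 0
    blank = 0
    for raw in lines:
        total += 1
        if not raw.strip():  # whitespace-only lines count as blank
            blank += 1
    return total, blank, total - blank


# Made-up miniature of the jokes.dat layout: two jokes, each followed by a blank line
sample = [
    "1:\n", "<p>\r\n", "A joke.\r\n", "</p>\r\n", "\n",
    "2:\n", "<p>\r\n", "Another.\r\n", "</p>\r\n", "\n",
]

total, blank, non_blank = count_lines(sample)
# The first non-blank line becomes the header, so it is not a data row
expected_rows = non_blank - 1
print(total, blank, expected_rows)
```

Running `count_lines` on the open file handle would confirm or refute the assumption about blank lines. To keep `1:` as a data row rather than a header, `read_table` accepts `header=None`; to keep blank lines, it accepts `skip_blank_lines=False`.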
In [ ]:
Content source: aflaisler/recommender_caseStudy