In [3]:
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import mrjob as mr
Copy this notebook and rename it: YOURNAME-HW4-mapreduce-XX
with your name replacing YOURNAME and XX replaced with the date you submit or copy this HW.
Upload your completed Jupyter notebook to the eLearning site as your homework submission. Do not put this notebook on your GitHub.
Do all the homework problems below. As noted, completing the homework earns a 3 out of 5. The extension of the homework is to implement a TF-IDF algorithm (see below).
Use the data/bible+shakes.nonpunc.txt file as the source of your analysis in this homework.
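As a starting point for the TF-IDF extension, here is a minimal sketch in plain Python. It assumes one common variant of the weighting (tf = term count / document length, idf = ln(N / df)); other variants exist and the assignment may expect a different one. The function name tf_idf, the whitespace tokenization, and the toy documents are all hypothetical choices made for illustration, and the sketch does not use a MapReduce framework.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {doc_name: {term: score}} for a dict of doc_name -> text.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    # Lowercase and split on whitespace (an illustrative tokenization choice).
    tokens = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokens)

    # Document frequency: the number of documents that contain each term.
    df = Counter()
    for words in tokens.values():
        df.update(set(words))

    scores = {}
    for name, words in tokens.items():
        counts = Counter(words)
        total = len(words)
        scores[name] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores


# Toy documents, made up for illustration only.
toy_docs = {
    "a": "the king and the queen",
    "b": "the hound on the moor",
}
scores = tf_idf(toy_docs)
```

With this variant, a term that appears in every document (such as "the" in the toy data) scores zero, while terms unique to one document score highest. The same counting steps map naturally onto a mapper that emits (term, document) pairs and a reducer that sums them.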
In [ ]:
The Adventures of Sherlock Holmes - http://www.gutenberg.org/ebooks/1661.txt.utf-8
A Study in Scarlet - http://www.gutenberg.org/files/244/244-0.txt
The Hound of the Baskervilles - http://www.gutenberg.org/files/2852/2852-0.txt
The Return of Sherlock Holmes - http://www.gutenberg.org/files/108/108-0.txt
The Sign of the Four - http://www.gutenberg.org/ebooks/2097.txt.utf-8
In [ ]: