This notebook is a simple way to test whether Python has been properly installed on your machine. If you receive no errors in this notebook, then your computer is ready to run the materials for this workshop! If you do receive an error, read the message closely since it offers clues to resolving the issue.
Running the code in this notebook is relatively easy. Click on the cell you wish to run (a segment of code with a gray background) to highlight it. Then either click the "Play" button in the toolbar above the code window or press CTRL+RETURN on your keyboard.
During the course of this week, we will try to build a community of practice around programming and its applications to research on human-language texts. One facet of this effort will be a shared repository of knowledge regarding bugs and Python itself. To that end, we ask that each time you get an error message, you create a new thread on the bCourses page for this workshop. We hope to see dialogue around its resolution as well as a record that we can refer back to later. After all, if you see an error message, it's almost certain that someone else will get it too.
A quick check to make sure that you are running Python 3. If the number "2" is printed below, install Anaconda for Python 3.5 from here: https://www.continuum.io/downloads
In [ ]:
import sys
sys.version_info.major
In [ ]:
import os
import numpy
import matplotlib
import pandas
import sklearn
import nltk
import gensim
print("Success!")
In [ ]:
%pylab inline
In order to fully use the NLTK package for Natural Language Processing, we need to download a couple of language models that give Python extra instructions. For example, the 'punkt' model below tells Python how to break strings of text into individual words or sentences.
Running this cell will require a stable internet connection and perhaps a little patience. If it completes successfully, then it will print the word "True" at the bottom.
In [ ]:
nltk_data = ["punkt", "words", "stopwords", "averaged_perceptron_tagger", "maxent_ne_chunker", "wordnet"]
nltk.download(nltk_data)