[Data, the Humanist's New Best Friend](index.ipynb)
*Class 01*


*This is `Data`. Say hello to `Data`*

Welcome to Data, the Humanist's New Best Friend. Yaay!

Very briefly, in this course you will learn several skills that are gaining momentum among so-called digital humanists. To be honest, digital humanists are not the only ones who can benefit from them.

  1. Data Mining. Don't let the name scare you! If you have already heard buzzwords such as big data, data science, and their derivatives mentioned in the media, chances are high that you already have a clue about data mining. Data mining is about explaining the past and predicting the future by means of data analysis. It combines statistics, machine learning, artificial intelligence, and database technology. Unfortunately, we won't have enough time to cover all of those topics.
  2. Text Analysis. Whenever you need to identify the most relevant actor in a play, analyze a corpus to extract the main entities, or process natural language to attribute authorship of an anonymous book, you will make use of text analysis techniques. Text analysis can be seen as the means to produce valuable information (i.e., data to be analyzed) from text sources.
  3. Network Science. Facebook, Twitter, Tumblr, Instagram, etc. are all examples of social networks. But networks can be built by interconnecting any kind of entity, not only people. That is why Social Network Analysis (SNA) builds on the more general graph theory. Network science tries to understand complex structures by analyzing the relationships among their entities.
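To give a first taste of the last two topics, here is a minimal sketch that uses only tools shipped with Python. The sentence and the little network of names are invented for illustration; they are not course data, and the course libraries will offer far richer tools than this.

```python
from collections import Counter

# Text analysis: count how often each word appears in a made-up sentence.
text = "to be or not to be that is the question"
word_counts = Counter(text.split())
print(word_counts.most_common(2))

# Network science: a tiny, invented network stored as a dictionary that
# maps each person to the set of people they are connected to.
network = {
    "Ana": {"Ben", "Cat"},
    "Ben": {"Ana"},
    "Cat": {"Ana"},
}

# The number of connections (the "degree") is a first hint of who is central.
degrees = {person: len(friends) for person, friends in network.items()}
most_connected = max(degrees, key=degrees.get)
print(most_connected)
```

Don't worry if the syntax looks unfamiliar for now; the point is only that both topics boil down to counting and connecting things.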

For a more detailed description, please see the course syllabus.

Setting things up!

Our only and almighty tool in this course will be the Python programming language.

So Python is the first thing you need to have installed on your computer. It is available for almost every platform, and it comes with a standard library. The Python standard library is a set of tools that covers very common and basic tasks. However, some of the tools that we will use in this course are not included in the standard library.

To solve this issue, and in order to guarantee that everyone has the same setup, we will use the Anaconda distribution, which already packages all of the libraries that we will cover in the course. If you feel confident enough with Python and your system, you can always install the libraries on your own. The libraries are listed in the requirements.txt file.
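If you go the do-it-yourself route, one way to check whether a library is available is sketched below. The names in `wanted` are placeholders chosen for illustration, not the actual contents of requirements.txt, so replace them with the names listed there. The helper `is_installed` is a hypothetical name of ours; it relies on `importlib`, which is part of the standard library.

```python
from importlib.util import find_spec

# Placeholder names: replace them with the libraries listed in requirements.txt.
wanted = ["numpy", "pandas", "networkx"]


def is_installed(module_name):
    """Return True if Python can locate the module, without importing it."""
    return find_spec(module_name) is not None


for name in wanted:
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

Anything reported as missing would have to be installed by hand, which is exactly the chore Anaconda saves you.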

Anaconda

As you have just seen, Anaconda is an application that installs Python as well as other useful third-party libraries. Depending on your operating system (Linux, Mac, or Windows), there are different instructions on how to get Anaconda running.

First you need to download the right version for your system (32 or 64 bits) from the official site. Then just follow the instructions for your system.
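If you are not sure which version you need, and assuming some Python interpreter is already available on your machine, the following sketch may help. Note that it reports the bitness of the running Python interpreter, which usually, but not always, matches the operating system.

```python
import platform
import struct

# Name of the operating system: "Linux", "Darwin" (Mac) or "Windows".
print(platform.system())

# Size of a pointer in bits. This describes the Python interpreter that is
# running the code, which on most machines matches the operating system.
bits = struct.calcsize("P") * 8
print(f"{bits}-bit Python")
```

If no Python is available yet, the system settings of your operating system will tell you the same thing.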

Linux

After downloading the installer, execute the following in the shell (note that $ is the symbol we use for the shell prompt):

$ bash <downloaded file>

Mac OSX

The shell of Mac OSX is pretty similar to the one shipped with Linux systems. However, some users just prefer to use a graphical interface when installing software. If that's you, simply double-click on the Anaconda file you downloaded before, and an installer will be launched.

If you see a message that says You cannot install Anaconda in this location, follow these instructions.

Windows

Double-clicking on the installer should be enough. If you do not have permission to install applications, use the zipped version of the installer.

The Shell

The shell, also referred to as terminal or console, is the command line interface (CLI) to your computer. It is the component that makes it possible for people to interact with the operating system of their machine. And to run Python code we need to use the shell. Learn Code the Hard Way has a good guide to launching the shell, which I summarize here.

Mac OSX

For Mac OSX you'll need to do this:

  1. Hold down Cmd (⌘) and hit the spacebar.
  2. In the top right the blue "search bar" will pop up.
  3. Type: terminal
  4. Click on the Terminal application that looks kind of like a black box. This will open Terminal.
  5. You can now go to your Dock and Ctrl-click the Terminal icon to pull up the menu, then select Options $\rightarrow$ Keep In Dock.

Now you have your Terminal open and it's in your Dock so you can get to it.

Linux

If you are using Linux, I'm assuming that you already know how to open your terminal. If not, look through the menu of your window manager for anything named Shell or Terminal.

Windows

On Windows we're going to use PowerShell. People used to work with a program called cmd.exe, but it's not nearly as usable as PowerShell. If you have Windows 7 or later, do this:

  1. Click Start.
  2. In Search programs and files type: powershell
  3. Hit Enter.

Now that you know how to launch your shell, you shouldn't miss the opportunity to learn the basic tricks by following The Command Line Crash Course. At the very least, you need to be familiar with the tree structure of your disk and how to navigate folders.
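The same tree structure can also be explored from Python once it is running. The sketch below uses the standard `pathlib` module; the folders it prints depend entirely on where you launch it, so no particular output is assumed.

```python
from pathlib import Path

here = Path.cwd()       # the folder you are currently "in"
print(here)
print(here.parent)      # one level up in the tree
print(Path.home())      # your personal home folder

# List what the current folder contains, marking sub-folders with a slash.
for entry in sorted(here.iterdir()):
    print(entry.name + ("/" if entry.is_dir() else ""))
```

This is roughly what shell commands such as `pwd` and `ls` show you on Linux and Mac.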

Running Python

We are almost there!

Python is an interpreted language, which means that it doesn't need complicated steps like compiling or linking in order to run. All you need to do is create a plain text file with the extension .py. For example, code.py and my_program.py follow the naming convention for Python programs, whereas my code.txt or program do not.

In order to run your code, you need to tell the interpreter where the file you want to run is, either by navigating to its location and then launching the interpreter, or by passing the whole path:

$ python my_code.py

$ python /home/users/me/course/homeworks/ass1.py
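To see the difference between the two ways of launching a script, you could save something like the following in a file. The file name my_code.py and the helper `describe_script` are only illustrative; this is a small sketch, not a required template for your homework.

```python
# Contents of a hypothetical file named my_code.py
import sys
from pathlib import Path


def describe_script(argv):
    """Build a short message about how the script was launched."""
    script = Path(argv[0])
    return f"Running {script.name} from {Path.cwd()}"


if __name__ == "__main__":
    # sys.argv[0] holds the script path exactly as it was typed in the shell.
    print(describe_script(sys.argv))
```

Whichever way you launch it, the script name stays the same; only the folder you are standing in changes.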

IPython Notebook

IPython is a library for Python that allows interactive and exploratory coding. It is built around the idea of a REPL (read-eval-print loop): the user types something, and then IPython Reads it, Evaluates it, Prints the result, and Loops back to wait for more input.
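The read, evaluate, print cycle can be imitated in a few lines. This is a toy sketch and not how IPython is actually implemented: the function name `toy_repl` is made up, the input comes from a list instead of the keyboard, and `eval` only understands simple expressions (and should never be fed text you do not trust).

```python
def toy_repl(lines):
    """Read, evaluate and print each line in turn; return what was printed."""
    printed = []
    for count, line in enumerate(lines, start=1):   # Loop
        source = line.strip()                       # Read
        result = eval(source)                       # Evaluate (expressions only)
        output = f"Out[{count}]: {result!r}"
        print(output)                               # Print
        printed.append(output)
    return printed


toy_repl(["1 + 1", "'data'.upper()", "len([1, 2, 3])"])
```

Each input produces one numbered output line, much like the numbered prompts you will meet in the Notebook.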

Another cool feature of IPython is the Notebook. This whole course itself is written using IPython Notebook, and through the classes, it will become your most beloved tool as well :-)

To launch it, just open a shell and type

$ ipython notebook

If you are using the Anaconda distribution, you should see ipython-notebook followed by a Launch button in the main menu after executing or double-clicking Anaconda. Just press the Launch button.

Wakari

Getting and installing the environment is the right thing to do. But if for whatever reason you end up desperate and unable to get things done, there is an online alternative called Wakari. Wakari is a service that provides cloud-hosted IPython Notebooks. It has a free account that you can use, although it is pretty limited, and I presume the computing power available for the free account is far less than that of your own computer.

Hello World!

After some sweat, you have the whole environment set up and ready. Now, if your browser (Chrome and Firefox are browsers, by the way) hasn't automatically opened up a new window, just open a new tab, type localhost:8888 in the address bar, and hit Enter.

That will open the IPython Notebook files list. On this screen you can either create a New Notebook or drag in one of the course notebooks.
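If the page refuses to load, it may simply be that the Notebook server is not running. A rough way to check from a second shell is sketched below. The helper `is_port_open` is a made-up name, and 8888 is the port mentioned above, which may differ on your machine if something else was already using it.

```python
import socket


def is_port_open(host="localhost", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(is_port_open())
```

A result of False suggests that nothing is listening there, so go back to the shell and launch the Notebook again.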

The first thing you'll see is something like


In [ ]:

That's the IPython Notebook prompt. Now type print("Hello World!"), hit Ctrl+Enter, and see what happens!


In [1]:
print("Hello World!")


Hello World!

And that, dear students, is your very first Python program (well, if we could call that a program).
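If you want something that looks a bit more like a program, here is one small step further. The function name `greet` is our own invention, just to show that a piece of code can be written once and reused.

```python
def greet(name):
    """Return a greeting for the given name."""
    return "Hello " + name + "!"


print(greet("World"))
print(greet("Data"))
```

Type it into a new cell, run it the same way as before, and try it with your own name.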


*You're now officially a programmer!*

For the next class