Lecture #2: Setting up your Development Environment

Here is what I intend to cover today:

  • Python Basics
  • What is Interactive Python (IPython)?
  • What are IPython or Jupyter Notebooks?
  • How do I make my code available to others (git)?
  • What is GitHub?

At the end of this process, I would like each of you to be able to create a Jupyter Notebook locally on your computer and then let anyone else see it using the online Jupyter Notebook Viewer (https://nbviewer.jupyter.org/).

This very same file we have on the screen now will make that journey.

Before you begin

Things you'll need to do ahead of time:

  1. Create an account on github.com
  2. Install the Anaconda Python distribution
  3. Install git on your computer, which you can get here
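
Once everything is installed, a quick way to confirm that the tools are reachable is to ask Python where they live. The snippet below is only a minimal sketch: the function name find_tools is my own, and I am assuming the installers put commands called git, conda and jupyter on your PATH.

```python
import shutil


def find_tools(names=("git", "conda", "jupyter")):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        # a missing tool shows up as NOT FOUND instead of a path
        print(f"{name:8s} {path or 'NOT FOUND'}")
```

If any of the tools is reported as missing, revisit the corresponding installation step before moving on.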

Some references that will be very helpful to ensure you understand what we are doing and how it all works:

  1. Git References
    • What is it and what is it used for?
      • Official Documentation, especially the first three videos on this page.
      • Official Git Tutorial, if you are already familiar with the command line interface to some other version control software and just need to get started.
    • How does it work?
  2. Python References

OK. Let's get you started.

Cloning the course's git repository

This file you are currently viewing is part of the course's git repository, which you can find here:

http://github.com/marioberges/F16-12-752/

Since it is a public repository, you should be able to clone it onto your computer and edit each file locally. You can clone it using either the command line interface to git or a graphical user interface (whichever you installed along with git). From the command line, for instance, you would issue this command:

git clone http://github.com/marioberges/F16-12-752.git

Make sure that you can clone the repository into your computer by issuing that command.

If you are successful, you will see a new folder called F16-12-752 inside the folder where you issued the command. A copy of this Jupyter Notebook file should be in there as well, and you can view it by starting a Jupyter Notebook server as follows:

jupyter notebook

Just make sure you issue this last command from inside the corresponding folder.

Creating and using your own repositories

The steps we followed above were for cloning the course's official repository. However, you will want to repeat these steps for any other repository you may be interested in working with, especially the ones that you end up creating under your GitHub account. Thus, let's practice cloning one of your repositories.

Follow these steps:

  1. Head over to github.com and log in using your credentials.
  2. Create a new repository and name it whatever you like.
  3. At the end of the process you will be given a checkout string (the URL used to clone the repository). Copy that.
  4. Use the checkout string in place of the one we used earlier, so that the command looks like this:
     git clone http://github.com/yourusername/yourrepository.git
  5. Try issuing that command on your computer (replacing yourusername and yourrepository with the right information).
  6. If all goes well, you'll have your (empty) repository available for use on your computer.
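
The clone command in the steps above always has the same shape, so it can be assembled from the two pieces that change. This is just a sketch: clone_command is a hypothetical helper, and the URL pattern is copied from the example above rather than taken from what GitHub will show you.

```python
def clone_command(username, repository):
    """Return the git command (as an argument list) that clones a GitHub repository."""
    # URL pattern assumed from the example in the steps above
    url = f"http://github.com/{username}/{repository}.git"
    return ["git", "clone", url]


print(" ".join(clone_command("yourusername", "yourrepository")))
# prints: git clone http://github.com/yourusername/yourrepository.git
```

The list form is what subprocess.run expects, should you ever want to issue the command from Python instead of typing it.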

Now it's time for you to practice some of your recently learned git skills.

Create a new Jupyter notebook, making sure to place it inside the folder of the repository you just cloned.

Add a couple of Python commands to it, or some comments, and save it.

Now go back to the terminal and add, commit and push the changes to your repository:

git add yourfile.ipynb
git commit -m "Made my first commit"
git push origin master

If this worked, you should be able to see the file added to your repository by simply pointing your browser to:

http://github.com/yourusername/yourrepository
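
The add, commit and push sequence above can also be scripted. Here is a minimal sketch using Python's subprocess module. The function name publish and its dry_run flag are my own inventions for illustration, and I am assuming the remote is called origin and the branch master, as in the commands above.

```python
import subprocess


def publish(filename, message, branch="master", dry_run=True):
    """Stage one file, commit it, and push it to the remote named origin."""
    commands = [
        ["git", "add", filename],
        ["git", "commit", "-m", message],
        ["git", "push", "origin", branch],
    ]
    for command in commands:
        if dry_run:
            # only show what would be executed
            print("would run:", " ".join(command))
        else:
            # check=True raises CalledProcessError as soon as one step fails
            subprocess.run(command, check=True)
    return commands


publish("yourfile.ipynb", "Made my first commit")
```

With dry_run left at True nothing is executed, which makes it safe to try. Pass dry_run=False from inside your repository folder to actually run the three commands.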

Doing away with the terminal

Because Jupyter can issue commands directly to a shell, you can avoid switching to a terminal window if you want to. This means we could have performed all of the git operations above directly from this notebook. The trick is to create a Code cell (the default cell type) in the Jupyter notebook and then issue each command preceded by a ! sign, as follows:


In [1]:
!git status


On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   data/surveyresults.csv

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	Lecture5_Assignment2-2014-ReDo.ipynb

no changes added to commit (use "git add" and/or "git commit -a")

Try running the above cell and see what you get.
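
Keep in mind that the ! prefix is a feature of IPython, not of the Python language, so it will not work in a plain .py script. A rough equivalent in ordinary Python is subprocess.run. The sketch below uses a helper I am calling run (a hypothetical name) and launches the Python interpreter itself so that it works even on a machine without git.

```python
import subprocess
import sys


def run(command):
    """Run a command given as a list of arguments and return its standard output."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


# Inside a cloned repository, run(["git", "status"]) would return the same
# text that the !git status cell prints.
print(run([sys.executable, "-c", "print('hello from a subprocess')"]))
```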


In [2]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [3]:
!file -I data/surveyresults.csv


data/surveyresults.csv: text/plain; charset=utf-16le

In [4]:
!iconv -f utf-16le -t utf-8 < data/surveyresults.csv > data/surveyresults_fixed.csv

In [5]:
!file -I data/surveyresults_fixed.csv


data/surveyresults_fixed.csv: text/plain; charset=utf-8
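
The iconv step above depends on a command line tool that may not be available on every system. As an alternative, the same re-encoding can be sketched in pure Python. The function name convert_encoding and the sample file contents are made up for illustration. Note that if the source file starts with a byte order mark, it is carried over as a character rather than removed.

```python
import os
import tempfile


def convert_encoding(src, dst, src_encoding="utf-16-le", dst_encoding="utf-8"):
    """Re-encode the text file at src into a new file at dst."""
    # newline="" on both sides leaves line endings exactly as they were
    with open(src, encoding=src_encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding=dst_encoding, newline="") as fout:
        fout.write(text)


# Demo on a throwaway file with made-up content
folder = tempfile.mkdtemp()
src = os.path.join(folder, "sample_utf16.csv")
dst = os.path.join(folder, "sample_utf8.csv")
with open(src, "w", encoding="utf-16-le", newline="") as f:
    f.write('name,"1,2,3"\n')

convert_encoding(src, dst)
with open(dst, encoding="utf-8", newline="") as f:
    print(f.read())
```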

In [6]:
with open('data/surveyresults_fixed.csv') as f:
    contents = f.readlines()

In [7]:
len(contents)


Out[7]:
29

In [8]:
for line in contents:
    print(line[-15:])


sing platform"

"4,6,13,11,21"

""14"",""25"""

","3,2,7,8,15"

14,11,15,7,20"

5,15,16,18,38"

,"7,18,8,21,1"

","8,1,2,12,5"

,"13,12,9,3,2"

"13,15,3,12,6"

"6,10,12,13,7"

3,18,22,36,48"

0,18,19,20,24"

"15,4,5,22,23"

18,26,6,11,13"

16,6,22,15,13"

"13,12,3,6,23"

"4,15,23,16,6"

","8,6,10,7,3"

6,11,14,22,24"

,"3,6,8,11,24"

17,24,13,11,7"

,"2,7,14,8,12"

>","3,1,5,2,7"

"8,10,6,22,13"

"11,4,13,7,20"

,"6,14,7,24,3"

,"13,14,8,6,5"

,"2,7,8,13,15"
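
Judging by the tails printed above, most rows seem to end with a quoted, comma-separated list of numbers. One way to pull such a field out is Python's csv module, which understands quoted fields that contain commas. This is only a sketch under that assumption: last_field_as_ints is a hypothetical helper, the sample row is invented, and the real file may contain rows that do not parse this cleanly.

```python
import csv


def last_field_as_ints(line):
    """Return the last CSV field of a line as a list of ints, or None if it is not one."""
    fields = next(csv.reader([line]))
    if not fields:
        # blank line: nothing to parse
        return None
    try:
        return [int(x) for x in fields[-1].split(",")]
    except ValueError:
        # e.g. a header row whose last field is text
        return None


sample = 'respondent,"some, free text","4,6,13,11,21"\n'
print(last_field_as_ints(sample))
```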


In [9]:
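out = []

for line in contents:
    # keep the text that follows the first ' </ul>' marker on the line
    d = line.split(' </ul>')[1]
    # the ranking is the third piece when splitting on double quotes
    nums = d.split('"')[2]
    if len(nums) > 0:
        n = nums.split(',')
        try:
            out.append([int(x) for x in n])
        except ValueError:
            # int() fails on rows such as the header, which holds paper titles
            print('couldnt parse:', n)


# Borda count: first choice earns 5 points, second 4, and so on
b_score = {}
for vote in out:
    for i, paper in enumerate(vote):
        if paper not in b_score:
            b_score[paper] = 0
        b_score[paper] += 5 - i

# sort by Borda score, highest first
scores = [(paper, b_score[paper]) for paper in b_score]
scores.sort(key=lambda x: x[1], reverse=True)

# print the top 5 papers (score, not b_score, so the dictionary is not overwritten)
for paper, score in scores[:5]:
    print('Paper: #', paper, ' Borda score:', score)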


couldnt parse: ['Analysing energy usage on a city scale using utility smart meters', 'MotionSync: Personal energy analytics through motion sensing and wearable sensing', 'SunSpot: exposing the location of annymous solar powered homes', 'Manual shade control simulation', ' algorithm and impact', 'AURES: A wide band ultrasonic occupancy sensing platform']
Paper: # 6  Borda score: 43
Paper: # 13  Borda score: 41
Paper: # 8  Borda score: 31
Paper: # 7  Borda score: 28
Paper: # 3  Borda score: 25
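
To see how the Borda count in the cell above arrives at its totals, it helps to run it on a handful of ballots small enough to check by hand. The sketch below uses the same scoring rule (5 points for a first choice, one point less for each later position), but the function name borda_scores and the three ballots are made up for illustration.

```python
from collections import defaultdict


def borda_scores(ballots, top_points=5):
    """Score each candidate: top_points for a first choice, one less per later position."""
    scores = defaultdict(int)
    for ballot in ballots:
        for position, candidate in enumerate(ballot):
            scores[candidate] += top_points - position
    # highest score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# three invented ballots, each ranking three papers
ballots = [[6, 13, 8], [13, 6, 7], [6, 8, 13]]
for paper, score in borda_scores(ballots):
    print("Paper:", paper, "Borda score:", score)
```

Paper 6 is ranked first twice and second once, so it collects 5 + 4 + 5 = 14 points and comes out on top.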
