Exercise 1

The purpose of this exercise is to familiarize yourself with the Python programming language.

1: INSTALL – The first step is to install Python on your computer. If you’re using a Mac, it’s already installed: open a terminal (look in Applications/Utilities to find it), type python, and it should start an interactive Python session. If you’re not using a Mac, you’ll need to go here and follow the instructions for downloading. I’m using the 2.7 version of Python, which should be fine for our needs.

2: WARM UP – If you’re new to Python (or programming), then I suggest you start here to get some background on what Python is capable of. The official Python page has its own tutorial as well, but it’s fairly long and text-heavy. Don’t worry about learning everything now; just explore.

3: NUMPY – This is where you start to learn the core of Python’s mathematical routines. NumPy is the main package for scientific computing, and it is the building block of just about everything we’ll do in this course. Go through a bit of the quick-start tutorial to get used to the indexing of arrays (vectors, matrices, tensors). Here is a short reference guide to linear algebra, which will help because much of machine learning (and most of scientific mathematics!) is linear algebra. Let me know if you need me to help out.
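If you want a quick taste before diving in, here are a few of the standard indexing operations the tutorial covers (nothing course-specific; this is all plain NumPy, and the values are made up purely for illustration):

import numpy as np

a = np.array([10, 20, 30, 40])   # a one-dimensional array (vector)
M = np.arange(12).reshape(3, 4)  # a two-dimensional array (3x4 matrix)

print(a[0], a[-1])  # first and last elements: 10 40
print(M[1, 2])      # row 1, column 2 (counting from 0): 6
print(M[:, 0])      # the entire first column: [0 4 8]
print(M[1, :])      # the entire second row: [4 5 6 7]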

4: ASSIGNMENT Part 1 – Write a Python script and save it in a text file called firstname_lastname_softmax.py (substituting your own name, of course). The script should contain a function at the top called $softmax()$ that takes as input a one-dimensional array (vector) of any length and computes the function

$$\mathrm{softmax}(x^i) = \frac{e^{x^i}}{\sum_{j} e^{x^j}}$$

where $x^i$ is the value of the $i$’th component of the array. This turns the elements of an array into a kind of probability distribution: all of the outputs are positive and they sum to 1. Right after your function in the script, check that it works by trying it out on an array. The script should look something like:


In [10]:
import numpy as np

# Define your function
def softmax(x):
    # This is where you write your code!
    vector = "This is only psudo code.\nYou will have to write this function yourself!"
    return vector # Replace this with the new array

# Test it out on an array
test = [1, 3, 2]
print(softmax(test))
# The result should be [0.09003057 0.66524096 0.24472847]


This is only pseudo code.
You will have to write this function yourself!
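If you’re stuck on where to begin, the two NumPy routines you’ll want are np.exp() and np.sum(), both of which operate on a whole array at once. These are just the building blocks, not the finished function:

import numpy as np

v = np.array([1., 3., 2.])
print(np.exp(v))          # element-wise exponential: [ 2.71828183 20.08553692  7.3890561 ]
print(np.sum(np.exp(v)))  # the sum in the denominator: about 30.1929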

5: ASSIGNMENT Part 2 – Now that you’ve got that down, modify the same script so that your function can take any two-dimensional matrix and compute the $softmax()$ of every column (each column is its own independent array that you’re taking the $softmax()$ of). The solution is simple in NumPy, and it has to do with how you take the sum in the denominator; see the hint after the example below. Test it out on a sample array. It should look like:


In [9]:
import numpy as np

# Define your function
def softmax(x):
    # This is where you write your code!
    vector = "This is only psudo code.\nYou will have to write this function yourself!"
    return vector # Replace this with the new array

# Test it out on an array
test = np.ones((3, 4))
test[0, :] = 2.
print(test)
print(softmax(test))
# The result should be [[ 2.  2.  2.  2.]
#                       [ 1.  1.  1.  1.]
#                       [ 1.  1.  1.  1.]]
# [[ 0.57611688  0.57611688  0.57611688  0.57611688]
#  [ 0.21194156  0.21194156  0.21194156  0.21194156]
#  [ 0.21194156  0.21194156  0.21194156  0.21194156]]


[[ 2.  2.  2.  2.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]
This is only pseudo code.
You will have to write this function yourself!
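The hint above about the denominator refers to the axis argument of np.sum(). Summing with axis=0 collapses the rows and gives one sum per column, and keepdims=True keeps the result shaped as a row so it broadcasts against the original matrix. Again, just the pieces, not the whole solution:

import numpy as np

M = np.ones((3, 4))
M[0, :] = 2.
print(np.sum(M, axis=0))                         # one sum per column: [ 4.  4.  4.  4.]
print(np.sum(np.exp(M), axis=0, keepdims=True))  # column sums of e^M, shape (1, 4)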

6: ASSIGNMENT Part 3 – Now you are going to write another function called linearClassifier(). This function takes three inputs: $x$, $W$, and $b$. The input $x$ is a one-dimensional array which we will call our feature vector, $W$ is a two-dimensional array of weights, and $b$ is a one-dimensional bias array. These names may sound strange now, but they’ll come back later. The function should take these values and return the value of $softmax(Wx+b)$. It’s important that the length of $b$ equals the number of rows in $W$, and the length of $x$ equals the number of columns in $W$. The $softmax()$ then turns the result into an array of probabilities the same size as $b$.

After you write this function, make up your own values for $x$, $W$, and $b$ and test out the function. To get full points, $W$ can’t be a square matrix (e.g., try a 3×4 matrix).
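Before writing the function, it can help to sanity-check the shapes involved. np.dot() computes the matrix-vector product and will raise an error if the dimensions don’t line up. The values below are made up purely for illustration:

import numpy as np

W = np.arange(12.).reshape(3, 4)  # a non-square 3x4 weight matrix
x = np.ones(4)                    # feature vector: length = number of columns of W
b = np.zeros(3)                   # bias: length = number of rows of W
print(np.dot(W, x) + b)           # a length-3 array, ready for softmax(): [  6.  22.  38.]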

ADVANCED (optional): If this is all too easy and you’re bored, write a class for the linear classifier, where the above computation is one of its methods (something like .evaluate()), and where the weights $W$ and bias $b$ are stored as attributes that can be read and updated by appropriate methods.
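If you’d like a starting point, here is one possible skeleton (only a sketch: the class and method names are made up, and .evaluate() assumes the softmax() you wrote above is defined in the same script):

import numpy as np

class LinearClassifier(object):
    # Stores the weights and bias as attributes and evaluates softmax(Wx + b).
    def __init__(self, W, b):
        self.W = np.asarray(W)  # weight matrix, stored as an attribute
        self.b = np.asarray(b)  # bias vector, stored as an attribute

    def set_weights(self, W):
        self.W = np.asarray(W)  # update the stored weights

    def set_bias(self, b):
        self.b = np.asarray(b)  # update the stored bias

    def evaluate(self, x):
        # the linear classifier from Part 3, now as a method
        return softmax(np.dot(self.W, x) + self.b)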