In [ ]:
a = 100
print("a is", a)
a + 200
This is an IPython Notebook. You can write whatever Python code you like here - output (like the print above) is shown below the cell, and the value of the final expression (the result of a + 200) is shown as well.
Note - your Python code is running on a server I've set up (which has everything you need), not on your local machine.
Exercise - save the notebook (do this regularly) by pressing Ctrl+s (or the save icon)
Hint - if you are stuck on what to write at any point, try pressing Tab - IPython should offer some sensible completions. If you want to know what a function does, try Shift+Tab to bring up its documentation.
In [ ]:
%matplotlib inline
import dlt
import numpy as np
import chainer as C
Now we'll learn how to use these libraries to create deep learning functions (later, in the full tutorial, we'll use this to train a handwriting recognizer).
Here are two ways to create a numpy array:
In [ ]:
a = np.array([1, 2, 3, 4, 5], dtype=np.int32)
print("a =", a)
print("a.shape =", a.shape)
print()
b = np.zeros((2, 3), dtype=np.float32)
print("b =", b)
print("b.shape =", b.shape)
A np.array is a multidimensional array - a very flexible thing. It can be:
- a vector of numbers (e.g. shape (5,), like a above)
- a matrix of numbers (e.g. shape (2, 3), like b above)
It can also contain either whole numbers (np.int32) or real numbers (np.float32).
OK, I've done quite a bit now - time for you...
Exercise - create the following numpy arrays, and print out the shape:
In [ ]:
# EXERCISE
# 1. an array scalar containing the integer 5
# 2. a (10, 20) array of zeros
# 3. a (3, 3) array of different numbers (hint: use a list-of-lists)
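If you get stuck, here is a minimal sketch of one possible solution (try it yourself first - the variable names are just suggestions):
In [ ]:
# One possible solution (variable names are just suggestions)
s = np.array(5, dtype=np.int32)                                  # 1. an array scalar - shape ()
print(s.shape)
z = np.zeros((10, 20), dtype=np.float32)                         # 2. a (10, 20) array of zeros
print(z.shape)
m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int32)  # 3. a (3, 3) array from a list-of-lists
print(m.shape)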
Now we just need a few ways of working with these arrays - here are some examples of things that you can do:
In [ ]:
x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
print("x =\n%s" % x)
print()
# Indexing
print("x[0, 1] =", x[0, 1]) # 0th row, 1st column
print("x[1, 1] =", x[1, 1]) # 1st row, 1st column
print()
# Slicing
print("x[0, :] =", x[0, :]) # 0th row, all columns
print("x[:, 2] =", x[:, 2]) # 2nd column, all rows
print("x[1, :] =", x[1, :]) # 1st row, all columns
print("x[1, 0:2] =", x[1, 0:2]) # 1st row, first two columns
print()
# Other numpy functions (there are very many more...)
print("np.argmax(x[0, :]) =", np.argmax(x[0, :])) # Find the index of the maximum element in the 0th row
I won't explain all of this in detail, but have a play around with arrays and see what you can do with the operations above.
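For example, here are a few more standard numpy operations you could experiment with (not part of the exercise - just suggestions):
In [ ]:
# A few more standard numpy operations to experiment with
print("x.T =\n%s" % x.T)                         # transpose - shape becomes (3, 2)
print("x.sum() =", x.sum())                      # sum of all elements
print("x * 10 =\n%s" % (x * 10))                 # elementwise arithmetic
print("np.max(x, axis=1) =", np.max(x, axis=1))  # maximum of each row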
Exercise - try using these numpy operations to do the following with M:
In [ ]:
M = np.arange(900, dtype=np.float32).reshape(45, 20)
print(M.shape)
# EXERCISE
# 1. print out row number 0 (hint, it should be shape (20,))
# 2. print out row number 34
# 3. select column 15, print out the shape
# 4. select rows 30-40 inclusive, columns 5-8 inclusive, print out the shape (hint: should be (11, 4))
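If you want to check your answers, here is one possible solution (spoilers - try it yourself first):
In [ ]:
# One possible solution (try it yourself first)
print(M[0, :])              # 1. row 0 - shape (20,)
print(M[34, :])             # 2. row 34
print(M[:, 15].shape)       # 3. column 15 - shape (45,)
print(M[30:41, 5:9].shape)  # 4. rows 30-40 inclusive, columns 5-8 inclusive - shape (11, 4)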
In [ ]:
a = C.Variable(np.zeros((10, 20), dtype=np.float32))
print("a.data.shape =", a.data.shape)
transformation = C.links.Linear(20, 30)
b = transformation(a)
print("b.data.shape =", b.data.shape)
c = C.functions.tanh(b)
print("c.data.shape =", c.data.shape)
This may not seem particularly special, but this is the heart of a deep learning function. Take an input array, make various transformations that mess around with the shape, and produce an output array.
Some concepts:
- Variable holds an array - this is some data going through the function.
- Link contains some parameters (these start random); it processes an input Variable and produces an output Variable.
- Function is a Link without any parameters (like sin, cos, tan, tanh, max... so many more...).
Exercise - use Chainer to calculate the following:
In [ ]:
# EXERCISE
# 1. Create an array, shape (2, 3) of various float numbers, put it in a variable
a = None # your array here
# 2. Print out tanh(a) (for the whole array)
# 3. Create a linear link of shape (3, 5) - this means it takes (N, 3) and produces (N, 5)
mylink = None # your link here
# 4. Use your link to transform `a`, then take the tanh, check the shape of the result
# 5. Uncomment the following; what happens when you re-run the code?
# print("W =", mylink.W.data)
If you can do all of this, you're ready to create a deep learning function.
In the last step, you may have noticed something interesting - the parameters inside the link change every time it is re-created. This is because deep learning functions start off random! Random functions don't sound too useful, so later we're going to learn how to "teach" them to be useful functions.
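To see this directly, you could compare the weights of two freshly created links (a minimal sketch - this assumes, as in the exercise above, that W is available as soon as the link is created):
In [ ]:
# Two links created the same way still have different (random) parameters
l1 = C.links.Linear(3, 5)
l2 = C.links.Linear(3, 5)
print(np.allclose(l1.W.data, l2.W.data))  # almost certainly False
Finally, here's dlt's Log, which lets us plot values (like a training loss) on a graph: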
In [ ]:
log = dlt.Log()
for i in range(100):
    # The first argument "loss" says which plot to put the value on
    # The second argument "train" gives it a name on that plot
    # The third argument is the y-value
    log.add("loss", "train", i)
    log.add("loss", "valid", 2 * i)
log.show()
Exercise - try to add another curve to the plot, e.g. np.sqrt(i) - you'll need to give it a different name.
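One way to do it, reusing only the log.add call from above (the series name "sqrt" is just a suggestion):
In [ ]:
log = dlt.Log()
for i in range(100):
    log.add("loss", "train", i)
    log.add("loss", "valid", 2 * i)
    log.add("loss", "sqrt", np.sqrt(i))  # a third curve, with its own name
log.show()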
OK - this was quite a lot to learn! To review, you've learnt how to:
- run Python code in the notebook
- create numpy arrays and check their shapes
- index and slice arrays
- transform a Variable with Chainer Links and Functions
- plot values with dlt.Log
Next, we'll put all of this together and add training, teaching a deep learning function to recognize handwritten digits - see the DIY guide & Tutorial.ipynb.