Introduction to Keras

This is a short introductory tutorial for Keras. Keras is a high-level interface for machine learning with neural networks.

In order to introduce and illustrate the principle of neural networks, we will consider the well-known classification problem of iris species.

Classification of iris species

Iris setosa, Iris versicolor and Iris virginica are closely related species. They differ in the size of their petals and sepals.

For instance, the dataset below lists the length (called height in the rest of this tutorial) and width of the petals of different specimens, and whether each specimen belongs to the species setosa (1.0) or not (0.0).


In [1]:
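import pandas as pd
# Load the training set: petal measurements and a 0/1 setosa label
df = pd.read_csv('./data/setosa/train.csv')
# Display the first six specimens
df.head(6)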


Out[1]:
   petal length (cm)  petal width (cm)  setosa
0                1.3               0.2     1.0
1                1.6               0.4     1.0
2                4.7               1.2     0.0
3                5.5               2.1     0.0
4                1.3               0.3     1.0
5                3.7               1.0     0.0

Problem statement

Based on this dataset, and given the petal height and width of a new specimen, we would like to predict the probability that this specimen belongs to the species setosa.

Using neural networks for this problem

General structure of fully-connected neural networks

Neural networks are an extremely versatile machine learning technique.

They consist of stacked individual units (a.k.a. artificial neurons) that:

  • perform a weighted sum of their inputs ($\,S = \sum_i w_i x_i + w_0\,$)
  • apply a non-linear function to this sum ($\,y = f(S)\,$) and return it as output

In such a network:

  • The number of units and the number of layers are chosen freely by the user.
  • The non-linear function $f$ is also chosen by the user (typical examples include the sigmoid function and the tanh function).
  • The values of the weights are determined by an algorithm, so as to produce the right output on a known dataset.

Training the network consists of finding the right values for the weights $w_i$.
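
The two operations performed by a single unit (weighted sum, then non-linear function) can be sketched in a few lines of plain Python. This is only an illustration: the names `sigmoid` and `unit_output`, as well as the input and weight values, are hypothetical and chosen so that the result is easy to check by hand.

```python
import math

def sigmoid(s):
    """Logistic sigmoid: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))

def unit_output(x, w, w0, f=sigmoid):
    """Output of one artificial neuron: y = f(sum_i w_i * x_i + w_0)."""
    s = sum(w_i * x_i for w_i, x_i in zip(w, x)) + w0  # weighted sum S
    return f(s)                                        # non-linear function applied to S

# Hypothetical inputs and weights, for illustration only
x = [1.0, 2.0]
w = [0.5, -0.25]
w0 = 0.0

# Here S = 0.5 * 1.0 - 0.25 * 2.0 + 0.0 = 0
y_sigmoid = unit_output(x, w, w0)               # sigmoid(0) = 0.5
y_tanh = unit_output(x, w, w0, f=math.tanh)     # tanh(0) = 0.0
```

Passing `f` as an argument mirrors the fact that the non-linear function is a choice made by the user, whereas `w` and `w0` are the quantities that training would adjust.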

Using a neural network for the classification of Iris setosa

To solve the classification problem for Iris setosa, we will start with the simplest kind of neural network: a single-layer network.

For the non-linear function, we will choose the sigmoid function, since its output is between 0 and 1 and can thus easily be interpreted as a probability.

With this model, we have: $p_{setosa} = f( w_0 + w_1 \times height + w_2 \times width )$, and training the model consists of finding $w_0$, $w_1$ and $w_2$.
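
As a rough sketch of what this formula computes, the snippet below evaluates it on the six specimens displayed in Out[1]. The weights are hypothetical and hand-picked for illustration; they are not the result of any training. The function name `predict_setosa` is likewise an assumption made for this example.

```python
import math

def predict_setosa(height, width, w0, w1, w2):
    """p_setosa = f(w0 + w1 * height + w2 * width), with f the sigmoid function."""
    s = w0 + w1 * height + w2 * width
    return 1.0 / (1.0 + math.exp(-s))

# Hypothetical, hand-picked weights (NOT trained values)
w0, w1, w2 = 10.0, -3.0, -2.0

# (petal height, petal width, setosa label) for the six specimens in Out[1]
specimens = [
    (1.3, 0.2, 1.0),
    (1.6, 0.4, 1.0),
    (4.7, 1.2, 0.0),
    (5.5, 2.1, 0.0),
    (1.3, 0.3, 1.0),
    (3.7, 1.0, 0.0),
]

probabilities = [predict_setosa(h, wd, w0, w1, w2) for h, wd, _ in specimens]

for (h, wd, label), p in zip(specimens, probabilities):
    print(f"height={h}, width={wd}, setosa={label}: p_setosa={p:.3f}")
```

With these particular weights, the predicted probability is above 0.5 for the three setosa specimens and below 0.5 for the three others. Training aims to find such weights automatically rather than by hand.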

Next steps in this tutorial

We will first build a neural network by hand, before using Keras to automate the process.