Multilayer Perceptron

  • Written by Jaeyeong Yang, urisa12@snu.ac.kr
  • Experimental Psychology Seminar, 2017 Spring

How to Use

Compilation

make

Manual

gcc main.c -o main.o

or

gcc-6 main.c -o main.o

Run

./main.o [-a ALPHA] [-d DATAFILE] [-e ETA] [-f FUNCTION] [-t N_TRAIN] [-v]
  • -a ALPHA (Default: 0.001)
    • Set the momentum coefficient to ALPHA.
  • -d DATAFILE (Default: "data.txt")
    • Set the data file to DATAFILE.
  • -e ETA (Default: 0.1)
    • Set the learning rate of the model to ETA.
  • -f FUNCTION (Default: sigmoid)
    • Change the activation function for the model.
    • Available
      • identity: Identity function
      • logistic or sigmoid: Logistic (sigmoid) function
      • tanh: Hyperbolic Tangent
      • relu: Rectified Linear Unit
      • lrelu: Leaky ReLU
      • softplus: SoftPlus
  • -t N_TRAIN (Default: 10000)
    • Set the number of training iterations to N_TRAIN.
  • -v
    • Print verbose information for each training iteration.

Test with Sample Data

make test

or

./main.o

Test Output

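The four target columns in the sample data are, respectively, Boolean AND, OR, NOT (of the first input), and XNOR (the negation of XOR; the fourth target is 1 when the two inputs are equal, as the output below shows).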

./main.o
[PARAMETERS]
    ACT_FUNC: Sigmoid function
     N_TRAIN: 10000
     N_LAYER: 2
      N_NODE: [ 2 3 4 ]
     N_INPUT: 2
    N_OUTPUT: 4
         ETA: 0.100000
       ALPHA: 0.001000

           w: [ [ [  2.554752 -5.885810 -6.037463 ],
                  [  6.903305 -4.302409 -5.145883 ],
                  [ -2.980794  5.578378  0.802870 ] ],
                [ [ -0.015197 -1.812910 -7.211772  3.813242 ],
                  [  3.729935 -6.422554 -1.156519  4.435602 ],
                  [  1.502379  1.134523  2.860297 -8.014769 ],
                  [  2.423113  8.414084 -7.152882  1.065576 ] ] ]

[CASE 0]   x: [ 0.000 0.000 ] / t: [ 0.000 0.000 1.000 1.000 ] / y: [ 0.000 0.040 0.993 0.958 ]
[CASE 1]   x: [ 0.000 1.000 ] / t: [ 0.000 1.000 1.000 0.000 ] / y: [ 0.003 0.953 0.959 0.035 ]
[CASE 2]   x: [ 1.000 0.000 ] / t: [ 0.000 1.000 0.000 0.000 ] / y: [ 0.038 0.999 0.037 0.050 ]
[CASE 3]   x: [ 1.000 1.000 ] / t: [ 1.000 1.000 0.000 1.000 ] / y: [ 0.959 1.000 0.002 0.950 ]

Implementation

Data File Structure

$L$ $P$

$N_0$ $N_1$ $\ldots$ $N_L$

$x_{1, 1}$ $x_{1, 2}$ $\ldots$ $x_{1, N_0}$ $t_{1, 1}$ $t_{1, 2}$ $\ldots$ $t_{1, N_L}$

$x_{2, 1}$ $x_{2, 2}$ $\ldots$ $x_{2, N_0}$ $t_{2, 1}$ $t_{2, 2}$ $\ldots$ $t_{2, N_L}$

$\vdots$

$x_{P, 1}$ $x_{P, 2}$ $\ldots$ $x_{P, N_0}$ $t_{P, 1}$ $t_{P, 2}$ $\ldots$ $t_{P, N_L}$

  • $L$: # of weight layers in the model, i.e. the number of layers excluding the input layer.
  • $P$: # of training cases.
  • $N_l$: # of nodes in the layer $l$, excluding bias node.
    • $N_0$: # of nodes in the input layer.
    • $N_L$: # of nodes in the output layer.
    • $N_1$, $\cdots$, $N_{L-1}$: # of nodes in each hidden layer.
  • $x_{p, i}$: $i$th input of the training case $p$.
  • $t_{p, i}$: $i$th target of the training case $p$.
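A file in this format can be read with a few `fscanf` calls. The sketch below is one possible reader, not the one in `main.c`; the `Dataset` struct and the function names are hypothetical, and it assumes whitespace-separated numbers exactly as laid out above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container for the contents of a data file. */
typedef struct {
    int n_layer;   /* L */
    int n_case;    /* P */
    int *n_node;   /* N_0 .. N_L, length L + 1 */
    double *x;     /* P * N_0 inputs, row-major */
    double *t;     /* P * N_L targets, row-major */
} Dataset;

void free_dataset(Dataset *d)
{
    free(d->n_node); free(d->x); free(d->t);
    d->n_node = NULL; d->x = NULL; d->t = NULL;
}

/* Read "L P", then N_0..N_L, then P rows of inputs followed by targets.
 * Returns 0 on success, -1 on malformed input or allocation failure. */
int read_dataset(FILE *fp, Dataset *d)
{
    int n_in, n_out;

    assert(fp != NULL && d != NULL);
    d->n_node = NULL; d->x = NULL; d->t = NULL;

    if (fscanf(fp, "%d %d", &d->n_layer, &d->n_case) != 2 ||
        d->n_layer < 1 || d->n_case < 1)
        return -1;

    d->n_node = malloc((size_t)(d->n_layer + 1) * sizeof *d->n_node);
    if (d->n_node == NULL)
        return -1;
    for (int l = 0; l <= d->n_layer; l++)
        if (fscanf(fp, "%d", &d->n_node[l]) != 1 || d->n_node[l] < 1)
            goto fail;

    n_in = d->n_node[0];
    n_out = d->n_node[d->n_layer];
    d->x = malloc((size_t)d->n_case * n_in * sizeof *d->x);
    d->t = malloc((size_t)d->n_case * n_out * sizeof *d->t);
    if (d->x == NULL || d->t == NULL)
        goto fail;

    for (int p = 0; p < d->n_case; p++) {
        for (int i = 0; i < n_in; i++)
            if (fscanf(fp, "%lf", &d->x[p * n_in + i]) != 1)
                goto fail;
        for (int i = 0; i < n_out; i++)
            if (fscanf(fp, "%lf", &d->t[p * n_out + i]) != 1)
                goto fail;
    }
    return 0;

fail:
    free_dataset(d);
    return -1;
}
```

Judging from the sample run (N_LAYER 2, N_NODE [ 2 3 4 ], four cases), the bundled data file presumably begins with the lines `2 4` and `2 3 4`, followed by four rows of two inputs and four targets each.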

Activation Functions

I implemented additional activation functions with reference to the Wikipedia page.

| Name | Equation | Derivative | Range |
| --- | --- | --- | --- |
| Identity | $f(x) = x$ | $f'(x) = 1$ | $(-\infty, \infty)$ |
| Logistic | $f(x) = \frac{1}{1 + e^{-x}}$ | $f'(x) = f(x)\big\{1 - f(x)\big\}$ | $(0, 1)$ |
| Hyperbolic Tangent | $f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$ | $f'(x) = 1 - f(x)^2$ | $(-1, 1)$ |
| ReLU | $f(x) = \begin{cases} 0 & \text{for } x < 0 \\ x & \text{for } x \ge 0 \end{cases}$ | $f'(x) = \begin{cases} 0 & \text{for } x < 0 \\ 1 & \text{for } x \ge 0 \end{cases}$ | $[0, \infty)$ |
| Leaky ReLU | $f(x) = \begin{cases} 0.01 x & \text{for } x < 0 \\ x & \text{for } x \ge 0 \end{cases}$ | $f'(x) = \begin{cases} 0.01 & \text{for } x < 0 \\ 1 & \text{for } x \ge 0 \end{cases}$ | $(-\infty, \infty)$ |
| SoftPlus | $f(x) = \ln(1 + e^x)$ | $f'(x) = \mathtt{logistic}(x) = \frac{1}{1 + e^{-x}}$ | $(0, \infty)$ |

Resources