A neuron responds when it is given a stimulus. This response can be strong, weak, or even null. If I were to draw a figure of this behavior, it would look like this.
A simple network is a collection of neurons that respond to stimuli, and those stimuli can themselves be the responses of other neurons.
A given input signal is spread onto three different neurons. Each neuron responds to the signal separately, and the responses are then summed with different weights. In the language of math, given input $x$, the output $y(x)$ is
$$ y(x) = \sum_{k=1}^{3} v_k \, \text{activation}( w_k x + u_k ), $$ where $\text{activation}$ is the activation function, i.e., the response behavior of the neuron. This is a single-layer structure.
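As a quick sketch (the weights here are made-up illustrative values, not a trained network), this single-layer formula translates directly into NumPy:

```python
import numpy as np

def activation(z):
    # Sigmoid response: near 0 for very negative z, near 1 for very positive z.
    return 1.0 / (1.0 + np.exp(-z))

def single_layer(x, v, w, u):
    # y(x) = sum_k v_k * activation(w_k * x + u_k)
    return np.sum(v * activation(w * x + u))

v = np.array([1.0, -0.5, 2.0])   # output weights (illustrative)
w = np.array([0.3, 1.0, -0.7])   # input weights (illustrative)
u = np.array([0.0, 0.1, -0.2])   # biases (illustrative)
print(single_layer(1.5, v, w, u))
```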
There are many different ways to extend this network.
Here is an example of how the network works.
Suppose we have only two neurons in the network.
As this example shows, we can expect neural networks to be good at classification. Even a single neuron can perform a classification. For example, we can choose the parameters so that, given an input temperature, the output tells us whether that temperature is high or low.
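A minimal sketch of that one-neuron classifier (the weight and bias are picked by hand so the decision boundary sits at 25 degrees; they are illustrative, not trained):

```python
import numpy as np

def neuron(T, w=1.0, u=-25.0):
    # Sigmoid neuron: output near 1 for high temperature, near 0 for low.
    return 1.0 / (1.0 + np.exp(-(w * T + u)))

for T in [10.0, 25.0, 40.0]:
    label = "high" if neuron(T) > 0.5 else "low"
    print(T, round(neuron(T), 4), label)
```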
Setting up the network gives us a lot of parameters. These parameters are the degrees of freedom we have. The question is how to find the right parameters.
The network NEEDS TRAINING. Just like a human learner, the neural network has to be trained using prepared data. One example would be
In [5]:
import numpy as np
print(np.linspace(0,9,10), np.exp(-np.linspace(0,9,10)))
Balance between the 'speed' (beta coefficient) and the 'momentum' of the learning.
Problems: an over-trained or 'grandmothered' network responds well to only one set of problems.
A very basic introduction: http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html
In [3]:
# This line configures matplotlib to show figures embedded in the notebook,
# instead of opening a new window for each figure. More about that later.
# If you are using an old version of IPython, try using '%pylab inline' instead.
%matplotlib inline
from scipy.optimize import minimize
from scipy.special import expit
import matplotlib.pyplot as plt
import timeit
This is practice in minimizing an expression using scipy.optimize.minimize().
In [46]:
fun = lambda x: (x[0] - 1)**2 + (x[1] - 2.5)**2
minimize(fun,(2,1),method="Nelder-Mead")
Out[46]:
In [47]:
def fun_jacf(x):
    x = np.asarray(x)
    return np.array([2*(x[0] - 1), 2*(x[1] - 2.5)])
minimize(fun,(2,1),method="BFGS",jac=fun_jacf)
Out[47]:
Here is a summary:
The problem to solve is the differential equation $$\frac{d}{dt}y(t)= - y(t),$$ with initial condition $y(0)=1$. Using the network, the trial solution is $$y_i= 1+t_i v_k f(t_i w_k+u_k),$$ with summation over $k$ implied.
The procedures are
Deal with the function first.
The cost is $$I=\sum_i\left( \frac{dy_i}{dt}+y_i \right)^2.$$ Our purpose is to minimize this cost.
To calculate the derivative of $y$, we can write down its explicit expression: $$\frac{dy}{dt} = v_k f(t w_k+u_k) + t v_k f(tw_k+u_k) (1-f(tw_k+u_k))w_k,$$ where the function $f$ is the sigmoid, implemented below as trigf().
So the cost becomes $$I = \sum_i \left( v_k f(t_i w_k+u_k) + t_i v_k f(t_iw_k+u_k) (1-f(t_iw_k+u_k)) w_k + y_i \right)^2.$$
In [48]:
def cost(v,w,u,t):
    v = np.array(v) # Make sure the parameters are arrays so the arithmetic below is elementwise.
    w = np.array(w)
    u = np.array(u)
    fvec = np.array( trigf(t*w + u) ) # This is a vector!
    yt = 1 + np.sum( t * v * fvec ) # For a given t, this calculates the value of y(t), given the parameters v, w, u.
    return ( np.sum( v*fvec + t * v * fvec * ( 1 - fvec ) * w ) + yt ) ** 2
    # trigf() should return an array with the same length as its input.
Caution: multiplying a plain Python list by a number repeats the list instead of scaling it, and list + list concatenates rather than adding elementwise, so the length is not preserved! Convert to NumPy arrays first.
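A small demonstration of this pitfall: the same operators mean very different things for Python lists and NumPy arrays.

```python
import numpy as np

# Plain Python lists: * repeats and + concatenates.
print(3 * [1, 2])        # [1, 2, 1, 2, 1, 2]
print([1, 2] + [2, 3])   # [1, 2, 2, 3]

# NumPy arrays: the same operators act elementwise.
print(3 * np.array([1, 2]))                  # [3 6]
print(np.array([1, 2]) + np.array([2, 3]))   # [3 5]
```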
Next, define trigf(); usually we use the sigmoid $$\text{trigf}(x)=\frac{1}{1+\exp(-x)}.$$
In [49]:
def trigf(x):
    #return 1/(1+np.exp(-x))
    return expit(x) # SciPy's vectorized sigmoid
Test cost function:
In [50]:
test11 = np.ones(30)
cost(np.array([1,1,1]),[1,1,1],[1,1,1],1)
Out[50]:
The next step is to optimize this cost. To do that we will need the derivative, but let's try a simple minimization first.
In [51]:
def costTotal(v,w,u,t):
    t = np.array(t)
    costt = 0
    for temp in t:
        costt = costt + cost(v,w,u,temp)
    return costt
Test total cost
In [52]:
test11 = np.ones(30)
tlintest = np.linspace(0,1,2)
print(costTotal(np.ones(10),np.ones(10),2*np.ones(10),tlintest))
print(costTotal(np.ones(10),np.ones(10),np.ones(10),tlintest))
Suppose each of the three parameter vectors is ten dimensional and we have 11 data points.
In [53]:
tlin = np.linspace(0,5,11)
print(tlin)
Define a list divider that splits an array into three arrays.
In [54]:
## No need to define such a function! Use np.split(x,3) instead.
np.zeros(30)
Out[54]:
In [55]:
# This is only an example of a small (10-neuron) neural network.
costTotalF = lambda x: costTotal(np.split(x,3)[0],np.split(x,3)[1],np.split(x,3)[2],tlin)
initGuess = np.zeros(30)
# initGuess = np.random.rand(1,30)+2
start1 = timeit.default_timer()
minimize(costTotalF,initGuess,method="Nelder-Mead")
# minimize(costTotalF,initGuess,method="L-BFGS-B")
# minimize(costTotalF,initGuess,method="TNC")
stop1 = timeit.default_timer()
print(stop1 - start1)
It shows that the minimization depends greatly on the initial guess. This is not an issue for gradient descent on a simple landscape, but it can be when the landscape is complicated.
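A toy illustration of this initial-guess dependence (a hypothetical double-well cost, not our network cost): starting on different sides of the barrier, Nelder-Mead lands in different minima.

```python
import numpy as np
from scipy.optimize import minimize

# Double-well cost with two minima, at x = -1 and x = +1.
fun = lambda x: (x[0]**2 - 1.0)**2

res_a = minimize(fun, [0.5], method="Nelder-Mead")   # starts right of the barrier
res_b = minimize(fun, [-0.5], method="Nelder-Mead")  # starts left of the barrier
print(res_a.x, res_b.x)  # one run finds +1, the other -1
```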
I can define a function that deals with this part: $$M_i = v_k f(t_i w_k+u_k) + t_i v_k f(t_iw_k+u_k) (1-f(t_iw_k+u_k))w_k + y_i,$$ which gives an array for an array of input times.
So the cost is $$I = M_i M_i,$$ using the Einstein summation convention.
The derivative is always $$\partial_X I = 2 M_i \partial_X M_i .$$
So we have $$\partial_{w_{k'}}f(tw_{k'}+u_{k'}) = f(t w_{k'}+u_{k'}) (1 - f(t w_{k'}+u_{k'}) )\, t, $$ $$\partial_{u_{k'}}f(t w_{k'}+u_{k'}) = f(t w_{k'}+u_{k'}) (1 - f(t w_{k'}+u_{k'}) ). $$
One of the useful relation is $$\frac{df(x)}{dx} = f(x)(1-f(x)).$$
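This identity is easy to verify numerically against a central finite difference (using scipy.special.expit as $f$):

```python
import numpy as np
from scipy.special import expit

x = np.linspace(-5, 5, 101)
h = 1e-6
numeric = (expit(x + h) - expit(x - h)) / (2 * h)  # central-difference derivative
analytic = expit(x) * (1 - expit(x))               # f(x)(1 - f(x))
print(np.max(np.abs(numeric - analytic)))          # tiny; limited only by roundoff
```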
Derived by hand (double checked), and writing $f \equiv f(tw_{k'}+u_{k'})$ for brevity, the jac is a list of the following components, where $k'$ is a free index that is not summed over. For $v_{k'}$: $$2M_i\left(f + t f(1-f)w_{k'} + tf\right),$$
for $w_{k'}$: $$2M_i\left( v_{k'}t f(1-f) + t v_{k'}f(1-f)\,t(1-f) w_{k'} - t v_{k'} f\,f(1-f)\, t w_{k'} + tv_{k'} f(1-f) + t v_{k'} f(1-f)\, t \right),$$
and for $u_{k'}$: $$2M_i\left(v_{k'} f(1-f) + t v_{k'} f(1-f)(1-f)w_{k'} - t v_{k'} f\,f(1-f) w_{k'} + t v_{k'} f(1-f)\right).$$
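Hand-derived Jacobians like these are easy to get wrong, so it is worth checking them against finite differences. scipy.optimize.check_grad does exactly that; here is a sketch using the simple quadratic from the minimize() practice above:

```python
import numpy as np
from scipy.optimize import check_grad

# The quadratic and its hand-derived Jacobian from the earlier practice cell.
fun = lambda x: (x[0] - 1)**2 + (x[1] - 2.5)**2
fun_jacf = lambda x: np.array([2*(x[0] - 1), 2*(x[1] - 2.5)])

# check_grad returns the norm of the difference between the analytic
# gradient and a finite-difference estimate; near zero means they agree.
err = check_grad(fun, fun_jacf, np.array([2.0, 1.0]))
print(err)
```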
Define a help function M here:
In [56]:
def mhelper(v,w,u,t): # Computes M. Here t is a scalar, not an array.
    v = np.array(v)
    w = np.array(w)
    u = np.array(u)
    return np.sum( v*trigf( t*w + u ) + t* v* trigf(t*w + u) * ( 1 - trigf( t*w +u) ) * w ) + ( 1 + np.sum( t * v * trigf( t*w +u ) ) )
# Checked # Pass
def vhelper(v,w,u,t): # The derivative of M with respect to v_k.
    v = np.array(v)
    w = np.array(w)
    u = np.array(u)
    return trigf(t*w+u) + t*trigf(t*w+u)*( 1-trigf(t*w+u) )*w + t*trigf(t*w+u)
def whelper(v,w,u,t): # The derivative of M with respect to w_k.
    v = np.array(v)
    w = np.array(w)
    u = np.array(u)
    return v*t*trigf(t*w+u)*( 1- trigf(t*w+u) ) + t*v*( trigf(t*w+u)*(1-trigf(t*w+u))*t* (1-trigf(t*w+u)) )*w - t*v*trigf(t*w+u)*trigf(t*w+u)*(1-trigf(t*w+u))*t*w + t*v*trigf(t*w+u)*(1-trigf(t*w+u)) + t*v*trigf(t*w+u)*(1-trigf(t*w+u))*t
def uhelper(v,w,u,t): # The derivative of M with respect to u_k.
    v = np.array(v)
    w = np.array(w)
    u = np.array(u)
    return v*trigf(t*w+u)*( 1 - trigf(t*w+u)) + t* v * trigf(t*w+u) * (1-trigf(t*w+u))*(1-trigf(t*w+u))*w - t*v*trigf(t*w+u)*trigf(t*w+u)*(1-trigf(t*w+u))*w + t*v*trigf(t*w+u)*(1-trigf(t*w+u))
In [57]:
mhelper([1,2],[2,3],[3,4],[1])
Out[57]:
In [58]:
vhelper([1,2],[2,3],[3,4],[1,2])
Out[58]:
Define the jac of cost function
In [59]:
def mhelperT(v,w,u,t):
    t = np.array(t)
    mhelperT = 0
    for temp in t:
        mhelperT = mhelperT + mhelper(v,w,u,temp)
    return mhelperT
def vhelperT(v,w,u,t):
    t = np.array(t)
    vhelperT = 0
    for temp in t:
        vhelperT = vhelperT + vhelper(v,w,u,temp)
    return vhelperT
def whelperT(v,w,u,t):
    t = np.array(t)
    whelperT = 0
    for temp in t:
        whelperT = whelperT + whelper(v,w,u,temp)
    return whelperT
def uhelperT(v,w,u,t):
    t = np.array(t)
    uhelperT = 0
    for temp in t:
        uhelperT = uhelperT + uhelper(v,w,u,temp)
    return uhelperT
def costJac(v,w,u,t):
    v = np.array(v)
    w = np.array(w)
    u = np.array(u)
    vout = 0
    wout = 0
    uout = 0
    for temp in t:
        vout = vout + 2*mhelper(v,w,u,temp)*vhelper(v,w,u,temp)
        wout = wout + 2*mhelper(v,w,u,temp)*whelper(v,w,u,temp)
        uout = uout + 2*mhelper(v,w,u,temp)*uhelper(v,w,u,temp)
    out = np.hstack((vout,wout,uout))
    return np.array(out)
In [60]:
print(uhelperT([1,2],[2,3],[3,4],[1,2,3]), mhelperT([1,2],[2,3],[3,4],[1]), whelperT([1,2],[2,3],[3,4],[1]), vhelperT([1,2],[2,3],[3,4],[1]))
In [61]:
costJac([1,2,3],[2,3,1],[3,4,3],[1,2])
Out[61]:
In [62]:
costJacF = lambda x: costJac(np.split(x,3)[0],np.split(x,3)[1],np.split(x,3)[2],tlin)
initGuessJ = np.zeros(30)
# initGuessJ = np.random.rand(1,30)+2
minimize(costTotalF,initGuessJ,method="Newton-CG",jac=costJacF)
Out[62]:
Plot!
In [73]:
# funYNN(np.ones(10),np.ones(10),np.ones(10),2)
test13=np.array([-57.2424592 , -57.2424592 , -57.2424592 , -57.2424592 ,
-57.2424592 , -57.2424592 , -57.2424592 , -57.2424592 ,
-57.2424592 , -57.2424592 , -0.28879104, -0.28879104,
-0.28879104, -0.28879104, -0.28879104, -0.28879104,
-0.28879104, -0.28879104, -0.28879104, -0.28879104,
-6.5643978 , -6.5643978 , -6.5643978 , -6.5643978 ,
-6.5643978 , -6.5643978 , -6.5643978 , -6.5643978 ,
-6.5643978 , -6.5643978 ])
for i in np.linspace(0,5,11):
    print(i, functionYNN(np.split(test13,3)[0],np.split(test13,3)[1],np.split(test13,3)[2],np.array([i]))[0])
temp14 = np.array([])
for i in np.linspace(0,5,11):
    temp14 = np.append(temp14, functionYNN(np.split(test13,3)[0],np.split(test13,3)[1],np.split(test13,3)[2],np.array([i]))[0])
testTLin = np.linspace(0,5,11)
plt.figure(figsize=(10,6.18))
plt.plot(testTLin,functionY(testTLin),'bs')
plt.plot(testTLin,temp14,'r-')
plt.show()
In [ ]:
temp16 = np.array([1.,0.60129567, 0.36281265 , 0.22220159 , 0.13660321,0.08295538 , 0.04904239 ,0.02817984 , 0.01636932 , 0.01048201, 0.00741816])
In [ ]:
temp15 = np.linspace(0,5,11)
print(temp15)
plt.plot(temp15,temp16)
plt.plot(temp15,functionY(temp15),'bs')
plt.show()
In [ ]:
test17 = np.array([])
for temp in np.linspace(0,5,11):
    test171 = 1 + expit(10*temp)
    test17 = np.append(test17,test171)
print(np.array(test17))
1 + expit(10*0)
In [68]:
def functionYNNSt(v,w,u,t): # t is a single scalar value
    t = np.array(t)
    return 1 + np.sum(t * v * trigf( t*w +u ) )
def functionYNN(v,w,u,t): # t is an array of points
    t = np.array(t)
    func = np.asarray([])
    for temp in t:
        func = np.append(func, functionYNNSt(v,w,u,temp) )
    return np.array(func)
def functionY(t): # The exact solution, for comparison.
    return np.exp(-t)
In [ ]:
print(functionYNN(np.array([1,2]),np.array([1,2]),np.array([1,2]),tlin))
In [ ]:
# structArray=np.array([-1.77606225*np.exp(-01), -3.52080053*np.exp(-01), -1.77606225*np.exp(-01),
# -1.77606225*np.exp(-01), -8.65246997*np.exp(-14), 1.00000000,
# -8.65246997*np.exp(-14), -8.65246997*np.exp(-14), -1.13618293*np.exp(-14),
# -7.57778017*np.exp(-16), -1.13618293*np.exp(-14), -1.13618293*np.exp(-14)])
#structArray=np.array([-1.6001368 , -1.6001368 , -2.08065131, -2.06818762, -2.07367757,
# -2.06779168, -2.07260669, -2.08533436, -2.07112826, -2.06893266,
# -0.03859167, -0.03859167, -0.25919807, -0.66904303, -0.41571841,
# -0.76917468, -0.4483773 , -0.17544777, -1.03122022, -0.90581106,
# -3.46409689, -3.46409689, -2.83715218, -2.84817563, -2.8434598 ,
# -2.84773205, -2.84446398, -2.85001617, -2.83613622, -2.84402863])
structArray=np.array([ 0.1330613 , 1.05982273, 0.18777729, -0.60789078, -0.96393469,
-0.65270373, -1.55257864, 0.8002259 , -0.12414033, -0.21230861,
-0.88629202, 0.47527367, 0.21401419, 0.2130512 , -1.5236408 ,
1.35208616, -0.48922234, -0.85850735, 0.72135512, -1.03407686,
2.29041152, 0.91184671, -0.56987761, 0.16597395, -0.43267372,
2.1772668 , -0.1318482 , -0.80817762, 0.44533168, -0.28545885])
structArrayJ = np.array([-11.45706046, -11.45706046, -11.45706046, -11.45706046,
-11.45706046, -11.45706046, -11.45706046, -11.45706046,
-11.45706046, -11.45706046, -0.44524438, -0.44524438,
-0.44524438, -0.44524438, -0.44524438, -0.44524438,
-0.44524438, -0.44524438, -0.44524438, -0.44524438,
-4.7477771 , -4.7477771 , -4.7477771 , -4.7477771 ,
-4.7477771 , -4.7477771 , -4.7477771 , -4.7477771 ,
-4.7477771 , -4.7477771 ])
print("The Structure Array is \n {}".format(structArray))
# print np.split(structArray,3)[0],np.split(structArray,3)[1],np.split(structArray,3)[2]
testTLin = np.linspace(0,5,11)
print("\n \n The plot is")
plt.figure(figsize=(10,6.18))
plt.plot(testTLin,functionY(testTLin),'bs')
plt.plot(testTLin,functionYNN(np.split(structArray,3)[0],np.split(structArray,3)[1],np.split(structArray,3)[2],testTLin),'g^')
plt.plot(testTLin,functionYNN(np.split(structArrayJ,3)[0],np.split(structArrayJ,3)[1],np.split(structArrayJ,3)[2],testTLin),'r^')
plt.yscale('log')
plt.show()
print(functionY(testTLin), functionYNN(np.split(structArray,3)[0],np.split(structArray,3)[1],np.split(structArray,3)[2],testTLin), functionYNN(np.split(structArrayJ,3)[0],np.split(structArrayJ,3)[1],np.split(structArrayJ,3)[2],testTLin))
In [ ]:
## Test of Numpy
temp1=np.asarray([1,2,3])
temp2=np.asarray([4,5,6])
temp3=np.asarray([7,8,9])
temp1*temp2
print(3*temp1)
temp1+temp2
print(temp1*temp2*temp3*temp1)
1/(1+np.exp(-temp1))
temp1 + temp2
[1,2] + [2,3]
1 - 3*np.array([1,2])
temp1**2
In [ ]:
1+np.asarray([1,2,3])
In [ ]:
def testfunction(v,w,u,t):
    v = np.array(v)
    w = np.array(w)
    u = np.array(u)
    return t*w + u
    #return np.sum(v*trigf( t*w + u ))
In [ ]:
testfunction([2,3,4],[3,4,5],[4,5,7],2)
Test a very simple equation $$\frac{dy}{dx}=4x^3-3x^2+2,$$ with initial condition $$y(0)=0.$$
As before, the trial solution is $$y_i = y(0) + x_i v_k f(x_iw_k+u_k) = x_i v_k f(x_i w_k+u_k),$$ since $y(0)=0$ here.
$$\frac{dy}{dx} = v_k f(x w_k+u_k) + x v_k f(x w_k+u_k) (1-f(xw_k+u_k))w_k,$$ where the function $f$ is again the sigmoid trigf().
Cost is
$$I = \sum_i \left(\frac{dy_i}{dx}-(4x_i^3-3x_i^2+2) \right)^2$$
In [ ]:
def costS(v,w,u,x):
    v = np.array(v) # Make sure the parameters are arrays.
    w = np.array(w)
    u = np.array(u)
    fvec = np.array( trigf(x*w + u) ) # This is a vector!
    dydx = np.sum( v*fvec + x * v * fvec * ( 1 - fvec ) * w ) # dy/dx of the trial solution at this x
    return ( dydx - (4*x**3 - 3*x**2 + 2) )**2 # squared residual of dy/dx = 4x^3 - 3x^2 + 2
In [ ]:
costS(np.array([2,3,4]),[3,4,5],[4,5,7],4)
In [ ]:
def costSTotal(v,w,u,x):
    x = np.array(x)
    costSt = 0
    for temp in x:
        costSt = costSt + costS(v,w,u,temp)
    return costSt
print(costSTotal([1,2,3],[2,3,2],[3,4,1],[1,2,3,4,5,2,6,1]))
In [ ]:
xlinS = np.linspace(0,1,10)
print(xlinS)
In [ ]:
# This is only an example of a small (10-neuron) neural network.
costSTotalF = lambda x: costSTotal(np.split(x,3)[0],np.split(x,3)[1],np.split(x,3)[2],xlinS)
# initGuessS = np.zeros(30)
initGuessS = np.random.rand(1,30)+2
# minimize(costTotalF,([1,0,3,0,1,1,2,0,1,0,1,0]),method="Nelder-Mead")
minimize(costSTotalF,(initGuessS),method="L-BFGS-B")
# minimize(costTotalF,([1,0,3,0,1,1,2,0,1,0,1,0]),method="TNC")
In [ ]:
def functionSYNN(v,w,u,x): # x is an array of points
    x = np.array(x)
    func = np.asarray([])
    for temp in x:
        tempfunc = np.sum(temp * v * trigf( temp*w +u ) )
        func = np.append(func, tempfunc)
    return np.array(func)
def functionSY(x): # The exact solution, for comparison.
    return x**4 - x**3 + 2*x
In [ ]:
structArrayS=np.array([ 0.01462306, 0.13467016, 0.43137834, 0.32915392, 0.16398891,
-0.36502654, -0.1943661 , 0.16082714, -0.2923346 , -0.38280994,
2.23127245, 1.97866504, 2.95181241, 2.70643394, 2.19371603,
2.63386948, 2.20213407, 2.81089774, 2.43916804, 2.80375489,
2.32389017, 2.16118574, 2.7346048 , 2.18630694, 2.19932286,
2.52525807, 2.22125577, 2.81758156, 2.27231039, 2.6118171 ])
print("The Structure Array is \n {}".format(structArrayS))
# print(np.split(structArrayS,3)[0],np.split(structArrayS,3)[1],np.split(structArrayS,3)[2])
testXLinS = np.linspace(0,1,10)
print("\n \n The plot is")
plt.figure(figsize=(10,6.18))
plt.plot(testXLinS,functionSY(testXLinS),'bs')
plt.plot(testXLinS,functionSYNN(np.split(structArrayS,3)[0],np.split(structArrayS,3)[1],np.split(structArrayS,3)[2],testXLinS),'g^')
## plt.plot(testXLinS,functionYNN(structArrayJ[0],structArrayJ[1],structArrayJ[2],testXLinS),'r^')
plt.show()
print(functionSY(testXLinS), functionSYNN(np.split(structArrayS,3)[0],np.split(structArrayS,3)[1],np.split(structArrayS,3)[2],testXLinS))