The cesium library for machine learning with time-series data
In [1]:
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_context('poster')
def sigmoid(x): return 1 / (1 + np.exp(-x))
fig, ax = plt.subplots(1, 3, figsize=(15, 6))
x = np.linspace(-4, 4, 501)
ax[0].plot(x, sigmoid(x)); ax[0].set_title("Sigmoid")
ax[1].plot(x, np.tanh(x)); ax[1].set_title("Tanh")
ax[2].plot(x, np.maximum(x, 0)); ax[2].set_title("ReLU");
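The shapes above matter mostly through their derivatives: sigmoid and tanh saturate, so their gradients vanish for large |x|, while ReLU passes gradients through unchanged for x > 0. A quick numerical check (the derivative formulas are standard; nothing here is specific to this notebook):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1 - s)            # peaks at 0.25, vanishes in the tails

def d_tanh(x):
    return 1 - np.tanh(x) ** 2    # peaks at 1.0, also vanishes in the tails

def d_relu(x):
    return (np.asarray(x) > 0).astype(float)  # 1 for positive inputs, else 0

print(d_sigmoid(0.0))   # 0.25
print(d_sigmoid(5.0))   # ~0.0066 -- the "vanishing gradient" regime
print(float(d_relu(5.0)))  # 1.0
```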
In [2]:
from mpl_toolkits.mplot3d import Axes3D
def f(x, y):
    return (1 - x / 2 + x ** 5 + y ** 3) * np.exp(-x ** 2 - y ** 2)

def df(x, y, h=1e-3):
    # forward-difference approximation of the gradient of f
    return np.r_[(f(x + h, y) - f(x, y)) / h,
                 (f(x, y + h) - f(x, y)) / h]
n = 256
x = np.linspace(-3, 3, n)
y = np.linspace(-2, 2, n)
X, Y = np.meshgrid(x, y)
fig = plt.figure(figsize=(15, 15))
ax = fig.add_subplot(2, 1, 1, projection='3d', azim=-90)
ax.plot_surface(X, Y, f(X, Y), cmap='inferno')
x0 = -0.55; y0 = -0.1
step = 1.0
ax = fig.add_subplot(2, 1, 2)
plt.contourf(X, Y, f(X, Y), 12, alpha=0.75, cmap='inferno')
plt.contour(X, Y, f(X, Y), 12, colors='black', linewidths=0.5)
ax.scatter(x0, y0, s=160., c='w', edgecolors='k', linewidths=2.5)
ax.scatter(x0 - step * df(x0, y0)[0], y0 - step * df(x0, y0)[1],
s=160., c='w', edgecolors='k', linewidths=2.5)
ax.arrow(x0, y0, *(-step * df(x0, y0)), linewidth=2.5, head_width=0.1, color='k',
length_includes_head=True);
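The arrow above shows a single gradient-descent update. To make it concrete, a minimal sketch that iterates the same update from the same starting point (the step size 0.05 and the iteration count are assumptions chosen so the iterates settle into a local minimum, unlike the single large step = 1.0 drawn above):

```python
import numpy as np

def f(x, y):
    return (1 - x / 2 + x ** 5 + y ** 3) * np.exp(-x ** 2 - y ** 2)

def df(x, y, h=1e-3):
    # forward-difference gradient, as in the cell above
    return np.r_[(f(x + h, y) - f(x, y)) / h,
                 (f(x, y + h) - f(x, y)) / h]

x0, y0 = -0.55, -0.1
pt = np.array([x0, y0])
lr = 0.05                          # assumed step size for the loop
for _ in range(200):
    pt = pt - lr * df(*pt)         # gradient descent update

print(f(*pt) < f(x0, y0))  # True: repeated steps decrease f toward a local minimum
```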
"The Great AI Awakening" (NYTimes 12/2016) https://www.nytimes.com/2016/12/14/magazine/the-great-ai-awakening.html
"Deep learning" (Nature 2015) http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html
"Deep learning algorithm does as well as dermatologists in identifying skin cancer" (yesterday!) http://news.stanford.edu/2017/01/25/artificial-intelligence-used-identify-skin-cancer/
INPUT: [224x224x3] memory: 224*224*3=150K weights: 0
CONV3-64: [224x224x64] memory: 224*224*64=3.2M weights: (3*3*3)*64 = 1,728
CONV3-64: [224x224x64] memory: 224*224*64=3.2M weights: (3*3*64)*64 = 36,864
POOL2: [112x112x64] memory: 112*112*64=800K weights: 0
CONV3-128: [112x112x128] memory: 112*112*128=1.6M weights: (3*3*64)*128 = 73,728
CONV3-128: [112x112x128] memory: 112*112*128=1.6M weights: (3*3*128)*128 = 147,456
POOL2: [56x56x128] memory: 56*56*128=400K weights: 0
CONV3-256: [56x56x256] memory: 56*56*256=800K weights: (3*3*128)*256 = 294,912
CONV3-256: [56x56x256] memory: 56*56*256=800K weights: (3*3*256)*256 = 589,824
CONV3-256: [56x56x256] memory: 56*56*256=800K weights: (3*3*256)*256 = 589,824
POOL2: [28x28x256] memory: 28*28*256=200K weights: 0
CONV3-512: [28x28x512] memory: 28*28*512=400K weights: (3*3*256)*512 = 1,179,648
CONV3-512: [28x28x512] memory: 28*28*512=400K weights: (3*3*512)*512 = 2,359,296
CONV3-512: [28x28x512] memory: 28*28*512=400K weights: (3*3*512)*512 = 2,359,296
POOL2: [14x14x512] memory: 14*14*512=100K weights: 0
CONV3-512: [14x14x512] memory: 14*14*512=100K weights: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512] memory: 14*14*512=100K weights: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512] memory: 14*14*512=100K weights: (3*3*512)*512 = 2,359,296
POOL2: [7x7x512] memory: 7*7*512=25K weights: 0
FC: [1x1x4096] memory: 4096 weights: 7*7*512*4096 = 102,760,448
FC: [1x1x4096] memory: 4096 weights: 4096*4096 = 16,777,216
FC: [1x1x1000] memory: 1000 weights: 4096*1000 = 4,096,000
TOTAL memory: 24M * 4 bytes ~= 93MB / image (only forward! ~*2 for bwd)
TOTAL params: 138M parameters
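The per-layer counts above can be checked mechanically: each CONV3 layer has (3·3·C_in)·C_out weights and each FC layer C_in·C_out (biases excluded, as in the table). A short sketch that reproduces the 138M total:

```python
# (in_channels, out_channels) for the 13 VGG-16 conv layers listed above
convs = [(3, 64), (64, 64),
         (64, 128), (128, 128),
         (128, 256), (256, 256), (256, 256),
         (256, 512), (512, 512), (512, 512),
         (512, 512), (512, 512), (512, 512)]

conv_params = sum(3 * 3 * cin * cout for cin, cout in convs)
fc_params = 7 * 7 * 512 * 4096 + 4096 * 4096 + 4096 * 1000
total = conv_params + fc_params
print(total)  # 138344128 -> the "138M parameters" above
```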
For details see http://colah.github.io/posts/2015-08-Understanding-LSTMs/.
"Unreasonable Effectiveness of Recurrent Neural Networks" http://karpathy.github.io/2015/05/21/rnn-effectiveness/
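As a complement to those posts, a minimal vanilla-RNN step in NumPy (the hidden size, input size, and random weights `W_xh`, `W_hh` here are placeholders; an LSTM, as described in the first link, adds gating on top of this basic recurrence):

```python
import numpy as np

rng = np.random.default_rng(0)
H, D = 8, 4                       # hidden and input sizes (arbitrary choices)
W_xh = rng.normal(scale=0.1, size=(H, D))
W_hh = rng.normal(scale=0.1, size=(H, H))
b = np.zeros(H)

def rnn_step(h, x):
    # the core recurrence: the new state mixes the old state with the new input
    return np.tanh(W_hh @ h + W_xh @ x + b)

h = np.zeros(H)
for t in range(5):                # unroll over a short input sequence
    x_t = rng.normal(size=D)
    h = rnn_step(h, x_t)
print(h.shape)  # (8,) -- the state carries information across time steps
```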