Part 1: An Introduction to the Torch Machine Learning Framework

Torch is a scientific computing framework with machine learning at its core. Because it is built on the fast scripting language LuaJIT and provides a flexible interactive computing environment, Torch is particularly well suited to developing machine learning tasks. It also has first-class C/CUDA support, so it handles large-scale computation, such as training deep neural networks, equally well.

Key features:

  • A powerful N-dimensional tensor library (Tensor) with rich indexing, slicing, and view operations (a short sketch follows the banner below)
  • Full linear algebra support, with a Matlab-like interface
  • State-of-the-art implementations of neural networks and probabilistic graphical models
  • Built-in routines for common numerical optimization
  • Friendly interactive computing and visualization support
  • Easy integration of native C code
  • First-class CUDA support
  • Easy to port to mobile platforms such as iOS and Android
$ th

  ______             __   |  Torch7                                   
 /_  __/__  ________/ /   |  Scientific computing for Lua.         
  / / / _ \/ __/ __/ _ \  |                                           
 /_/  \___/_/  \__/_//_/  |  https://github.com/torch   
                          |  http://torch.ch
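As a first taste of the tensor library from the th prompt shown above, here is a minimal indexing and slicing sketch (illustrative only):

a = torch.range(1, 12):reshape(3, 4)  -- 3x4 tensor holding 1..12
print(a[2])             -- second row, as a 1D view
print(a[{2, 3}])        -- the single element at row 2, column 3
print(a[{{1, 2}, {}}])  -- first two rows, all columns (a view, not a copy)
print(a:t())            -- 4x3 transposed view sharing the same storage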

1. Basic Matrix Computation Demo

The most commonly used matrix functions include:

  • rand(), which creates a tensor drawn from the uniform distribution
  • t(), which transposes a tensor (note that it returns a new view)
  • dot(), which computes the dot product of two tensors (sketched below)
  • eye(), which returns an identity matrix (sketched below)
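rand() and t() both appear in the cells that follow; dot() and eye() do not, so here is a minimal sketch of those two (values will vary from run to run):

u = torch.rand(5)
w = torch.rand(5)
print(torch.dot(u, w))       -- scalar inner product of the two vectors
I = torch.eye(5)             -- 5x5 identity matrix
print(torch.dist(I * u, u))  -- I*u equals u, so the distance is 0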

In [1]:
-- construct a random 5x5 matrix
N = 5
A = torch.rand(N, N)
print(A)


Out[1]:
 0.3669  0.7558  0.3807  0.7782  0.4420
 0.2461  0.7218  0.9170  0.2530  0.4834
 0.2714  0.6528  0.2854  0.3949  0.3828
 0.2879  0.6680  0.2238  0.8213  0.7976
 0.7888  0.7018  0.9198  0.7489  0.4952
[torch.DoubleTensor of size 5x5]

The following code overwrites A with the symmetric matrix $$ A = A A^{T} $$ (a matrix multiplied by its own transpose is always symmetric):


In [2]:
A = A*A:t()
print(A)


Out[2]:
 1.6518  1.3954  1.1782  1.6874  1.9718
 1.3954  1.7200  1.0846  1.3515  1.9729
 1.1782  1.0846  0.8838  1.2078  1.4201
 1.6874  1.3515  1.2078  1.8899  1.9118
 1.9718  1.9729  1.4201  1.9118  2.7669
[torch.DoubleTensor of size 5x5]

The following code computes a matrix-vector product, which is equivalently a linear combination of the columns of A weighted by the entries of v:

$$ B = A v $$

In [4]:
v = torch.rand(5,1)
B = A*v
print("v=")
print(v)
print("B=")
print(B)


Out[4]:
v=	
 0.3445
 0.6221
 0.8973
 0.5904
 0.7642
[torch.DoubleTensor of size 5x1]

B=	
 4.9974
 4.8296
 3.6720
 5.0827
 6.4241
[torch.DoubleTensor of size 5x1]
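To check the linear-combination interpretation, we can accumulate v[i] times the i-th column of A and compare the result with B (a small verification sketch reusing the A and v above):

acc = torch.zeros(5, 1)
for i = 1, 5 do
    acc = acc + A[{ {}, {i} }] * v[i][1]  -- v[i] times the i-th column of A
end
print(torch.dist(acc, B))  -- ~0, up to floating-point rounding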

Computing the inverse of the matrix:

$$ C = A^{-1} $$

In [5]:
C = torch.inverse(A)
print(C)
print(C*A)


Out[5]:
 33.3314   7.1113 -27.3597  -7.9975  -9.2554
  7.1113   6.0192 -11.1720   0.5108  -3.9786
-27.3597 -11.1720  44.2274  -2.2147   6.2943
 -7.9975   0.5108  -2.2147   7.2185   1.4840
 -9.2554  -3.9786   6.2943   1.4840   5.5380
[torch.DoubleTensor of size 5x5]

Out[5]:
 1.0000 -0.0000 -0.0000 -0.0000 -0.0000
-0.0000  1.0000 -0.0000 -0.0000 -0.0000
 0.0000  0.0000  1.0000 -0.0000  0.0000
-0.0000 -0.0000 -0.0000  1.0000  0.0000
 0.0000  0.0000  0.0000  0.0000  1.0000
[torch.DoubleTensor of size 5x5]
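A practical aside: when the goal is to solve a linear system rather than to inspect the inverse itself, torch.gesv is usually preferred, since it solves A x = b directly by LU factorization instead of forming A^{-1}. A minimal sketch, where b is an arbitrary right-hand side introduced here for illustration:

b = torch.rand(5, 1)
sol = torch.gesv(b, A)         -- solves A * sol = b
print(torch.dist(A * sol, b))  -- residual, ~0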

2. Visualization Demo


In [6]:
require('image')
-- display two copies of the classic Lena test image side by side
itorch.image({image.lena(), image.lena()})



In [7]:
require('nn')
-- build a 3x3 convolution with randomly initialized weights over the
-- 3 RGB input planes (1 output plane), then filter the image with it
m = nn.SpatialConvolution(3, 1, 3, 3)
n = m:forward(image.lena())
itorch.image(n)



In [8]:
Plot = require 'itorch.Plot'
-- three clouds of 40 normally distributed points, at two different scales
x1 = torch.randn(40):mul(100)
y1 = torch.randn(40):mul(100)
x2 = torch.randn(40):mul(100)
y2 = torch.randn(40):mul(100)
x3 = torch.randn(40):mul(200)
y3 = torch.randn(40):mul(200)
plot = Plot():circle(x1, y1, 'red', 'hi'):circle(x2, y2, 'blue', 'bye'):draw()
plot:circle(x3,y3,'green', 'yolo'):redraw()
plot:title('Scatter Plot Demo'):redraw()
plot:xaxis('length'):yaxis('width'):redraw()
plot:legend(true)
plot:redraw()


3. Gradient Descent Demo

Torch supports a variety of numerical optimization algorithms through the optim package, including SGD, Adagrad, Conjugate Gradient, L-BFGS, RProp, and more.

This package contains several optimization routines for Torch. Each optimization algorithm is based on the same interface:

x*, {f}, ... = optim.method(func, x, state)

where:

  • func: a user-defined closure that respects this API: f, df/dx = func(x)
  • x: the current parameter vector (a 1D torch.Tensor)
  • state: a table of parameters, and state variables, dependent upon the algorithm
  • x*: the new parameter vector that minimizes f, x* = argmin_x f(x)
  • {f}: a table of all f values, in the order they've been evaluated (for some simple algorithms, like SGD, #f == 1)
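To make the interface concrete before the regression demo, here is a minimal self-contained sketch (not part of the original demo) that minimizes the one-dimensional quadratic f(x) = (x - 3)^2 with optim.sgd; note how func returns both f and df/dx:

require('optim')

-- f(x) = (x - 3)^2, with gradient df/dx = 2 * (x - 3)
function quadFunc(p)
    local fx = (p[1] - 3) ^ 2
    local dfdx = torch.Tensor({ 2 * (p[1] - 3) })
    return fx, dfdx
end

p = torch.Tensor({0})               -- start at x = 0
quadState = { learningRate = 0.1 }
for i = 1, 100 do
    optim.sgd(quadFunc, p, quadState)
end
print(p[1])                         -- close to the minimizer, 3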

Here we set up a one-dimensional linear regression, that is, an example of fitting a straight line to noisy samples.


In [9]:
-- construct training samples around the line y = 0.7x + 5.0, with uniform noise
N = 32
x = {}
y = {}
for i=1, N do
    x[i] = (math.random() - 0.5) * 20
    y[i] = 0.7*x[i] + 5.0 + (math.random()-0.5) 
end

Plot = require 'itorch.Plot'
local plot = Plot()
plot:circle(x,y,'black', 'yolo'):draw()
plot:title('Line Fitting'):redraw()



In [10]:
require('optim')

-- log the objective value at each evaluation
batchLog = {}
batchLog.value = {}
batchLog.seq = {}

-- parameter[1] is the slope, parameter[2] the intercept, both starting at 0
parameter = torch.Tensor(2)
parameter[1] = 0
parameter[2] = 0

-- first, define func(x): it returns the batch loss f and the gradient df/dx
function batchFunc(inParameter) 
  
  local sum = 0.0
  local deltaP = torch.Tensor(2)
    
  deltaP[1] = 0.0
  deltaP[2] = 0.0
  -- accumulate the squared error and its gradient over the full batch
  for i=1,#x do
    sum = sum + math.pow(inParameter[1] * x[i] + inParameter[2] - y[i],2)
    deltaP[1] = deltaP[1] + (inParameter[1] * x[i] + inParameter[2] - y[i]) * x[i]
    deltaP[2] = deltaP[2] + (inParameter[1] * x[i] + inParameter[2] - y[i])
  end
  sum = 0.5 * sum / #x
  deltaP = deltaP / #x

  batchLog.value[#batchLog.value+1] = sum
  batchLog.seq[#batchLog.seq+1] = #batchLog.seq+1
    
  return sum , deltaP
end


local state = {
   learningRate = 1.0e-2,
}

for i = 1,500 do
  optim.sgd(batchFunc, parameter ,state)
end

local plot = Plot()
plot:line(batchLog.seq, batchLog.value,'black', 'yolo'):draw()
plot:title('BGD'):redraw()



In [11]:
-- 绘制拟合出来的直线
drawResultLine = function()
  local resultValue = {}
  local resultSeq = {}
  for i=-10,10,0.1 do
    resultSeq[#resultSeq+1] = i
    resultValue[#resultValue+1] = i*parameter[1] + parameter[2]
  end
  local plot = Plot()
  plot:circle(x,y,'red', 'yolo'):draw()
  plot:line(resultSeq, resultValue,'black', 'yolo'):redraw()
  plot:title('Line Fitting'):redraw()
    
end
drawResultLine()


The curve above also shows that the choice of learningRate matters a great deal: too small and convergence is slow; too large and batch gradient descent diverges.
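As a hypothetical illustration (not part of the original run), rerunning the batch loop with an aggressive step size typically makes the objective grow instead of shrink, because every update overshoots the minimum:

-- hypothetical rerun with a step size that is too large for this data
-- (note: this also appends to batchLog)
parameter[1] = 0
parameter[2] = 0
badState = { learningRate = 0.5 }
for i = 1, 50 do
  optim.sgd(batchFunc, parameter, badState)
end
print(parameter)  -- typically blown up rather than converged

With a sensible rate restored, we now demonstrate the stochastic variant, SGD, which evaluates the loss and gradient on a single sample per step.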


In [12]:
require('optim')

-- log the objective value at each evaluation
sgdLog = {}
sgdLog.value = {}
sgdLog.seq = {}

-- reset the parameters before the new run
parameter[1] = 0
parameter[2] = 0

local sgdNumber = 0

-- define func(x): it evaluates the loss and gradient on a single sample
function sgdFunc(inParameter) 
  
  local sum = 0.0
  local deltaP = torch.Tensor(2)
    
    
  -- cycle through the samples, one per call
  sgdNumber = (sgdNumber + 1) % #x
  local i = sgdNumber + 1
    
  sum = 0.5 * math.pow(inParameter[1] * x[i] + inParameter[2] - y[i],2)
  deltaP[1] = (inParameter[1] * x[i] + inParameter[2] - y[i]) * x[i]
  deltaP[2] = (inParameter[1] * x[i] + inParameter[2] - y[i])
    
  sgdLog.value[#sgdLog.value+1] = sum
  sgdLog.seq[#sgdLog.seq+1] = #sgdLog.seq+1
    
  return sum , deltaP
end


local state = {
   learningRate = 1.0e-2,
}

for i = 1,500 do
  optim.sgd(sgdFunc, parameter ,state)
end

local plot = Plot()
plot:line(sgdLog.seq, sgdLog.value,'black', 'yolo'):draw()
plot:title('SGD'):redraw()

drawResultLine()
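The per-sample loss curve produced above is inherently noisy. A common refinement, not used in the original demo, is to let optim.sgd anneal the step size through the learningRateDecay field, which scales the effective rate to learningRate / (1 + t * learningRateDecay) as the evaluation count t grows:

-- hypothetical rerun with an annealed step size
-- (note: this also appends to sgdLog)
parameter[1] = 0
parameter[2] = 0
decayState = {
   learningRate = 1.0e-2,
   learningRateDecay = 1.0e-3,  -- effective rate shrinks roughly like 1/t
}
for i = 1, 500 do
  optim.sgd(sgdFunc, parameter, decayState)
end
drawResultLine()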


