1

2

3

4

5

6

7

8

9

10

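$(X^TX)^{-1}X^T$ is called the pseudo-inverse of $X$. Only square matrices have an inverse, but this product behaves like one for a non-square $X$: provided $X^TX$ is invertible, $(X^TX)^{-1}X^TX = I$.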
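
The "acts like an inverse" claim can be checked numerically. The following is a minimal sketch, assuming a random tall matrix with full column rank; the helper name `left_pseudo_inverse` is made up for illustration and does not come from the slides.

```python
import numpy as np

def left_pseudo_inverse(X):
    """Return (X^T X)^{-1} X^T, assuming X has full column rank."""
    # Solve (X^T X) P = X^T rather than forming the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))   # tall, non-square: no ordinary inverse exists
P = left_pseudo_inverse(X)

print(P.shape)                             # shape of the pseudo-inverse
print(np.allclose(P @ X, np.eye(3)))       # left-inverse property: P X = I
print(np.allclose(P, np.linalg.pinv(X)))   # compare with NumPy's Moore-Penrose pseudo-inverse
```

Note that `P @ X` is the identity, while `X @ P` is in general only a projection onto the column space of `X`, so this is a left inverse rather than a full inverse.
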
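If $(X^TX)$ is close to singular, i.e. its determinant is close to zero, the entries of $\theta$ can become very large. Adding $\lambda$ (using $X^TX + \lambda I$ in place of $X^TX$) shrinks $\theta$.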
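
A small experiment makes the effect visible. This is a sketch under assumed synthetic data: two almost identical feature columns make $X^TX$ nearly singular, and `solve_theta` is a hypothetical helper that solves $(X^TX + \lambda I)\theta = X^Ty$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 1e-4 * rng.normal(size=n)   # nearly a copy of x1, so X^T X is close to singular
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=n)

def solve_theta(X, y, lam=0.0):
    """Solve (X^T X + lam * I) theta = X^T y; lam = 0 is ordinary least squares."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

theta_ols = solve_theta(X, y)             # no regularization
theta_ridge = solve_theta(X, y, lam=1.0)  # lam = 1.0 is an arbitrary illustrative value

print("condition number of X^T X:", np.linalg.cond(X.T @ X))
print("OLS   theta:", theta_ols, "norm:", np.linalg.norm(theta_ols))
print("ridge theta:", theta_ridge, "norm:", np.linalg.norm(theta_ridge))
```

With $\lambda > 0$, every eigenvalue of $X^TX$ is shifted up by $\lambda$, so the matrix being inverted is no longer close to singular and the norm of the solution can only go down.
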
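Adding $\lambda$ gives ridge regression; page 225 of MLAPP explains it extremely well. Ridge regression places a prior on the parameters $w$, requiring $w \sim N(0, \gamma^2)$, where $\gamma$ is the standard deviation of the prior. The smaller this standard deviation, the more tightly $w$ is constrained toward zero. The $\lambda$ here satisfies $\lambda = \sigma^2 / \gamma^2$, where $\sigma^2$ is the variance of the noise term $\epsilon$ introduced earlier.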
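
The relation $\lambda = \sigma^2 / \gamma^2$ can be recovered with a short MAP derivation. This is a sketch under the usual assumptions, which the note does not spell out: $y_i = w^Tx_i + \epsilon_i$ with $\epsilon_i \sim N(0, \sigma^2)$, an independent $N(0, \gamma^2)$ prior on each component of $w$, and $w$ playing the role of $\theta$.

$$
\begin{aligned}
\hat{w}_{MAP} &= \arg\max_w \left[ \sum_i \log N(y_i \mid w^T x_i, \sigma^2) + \sum_j \log N(w_j \mid 0, \gamma^2) \right] \\
&= \arg\min_w \left[ \frac{1}{2\sigma^2} \|y - Xw\|^2 + \frac{1}{2\gamma^2} \|w\|^2 \right] \\
&= \arg\min_w \left[ \|y - Xw\|^2 + \lambda \|w\|^2 \right], \quad \lambda = \frac{\sigma^2}{\gamma^2} \\
\hat{w}_{MAP} &= (X^TX + \lambda I)^{-1} X^T y
\end{aligned}
$$

Multiplying the objective by $2\sigma^2$ does not change the minimizer, which is where the ratio of the two variances comes from.
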

Below is a screenshot from MLAPP.

The posterior is simply the likelihood times the prior, normalized.
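
To see "likelihood times prior, normalized" in action, the sketch below evaluates a posterior on a grid for a single weight. The data, the values of `sigma` and `gamma`, and the grid range are all assumptions chosen for illustration. The grid maximum is compared with the one-dimensional ridge solution using $\lambda = \sigma^2 / \gamma^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, gamma = 0.5, 1.0              # noise std and prior std (illustrative values)
x = rng.normal(size=20)
y = 2.0 * x + sigma * rng.normal(size=20)

w = np.linspace(-5, 5, 10001)        # grid of candidate weights
log_prior = -0.5 * (w / gamma) ** 2  # log N(w | 0, gamma^2), up to a constant
resid = y[:, None] - x[:, None] * w[None, :]
log_lik = -0.5 * np.sum(resid ** 2, axis=0) / sigma ** 2

# Posterior = likelihood * prior, normalized so it integrates to 1 over the grid.
log_post = log_lik + log_prior
post = np.exp(log_post - log_post.max())   # subtract the max for numerical stability
dw = w[1] - w[0]
post /= post.sum() * dw

w_map = w[np.argmax(post)]                          # posterior mode on the grid
lam = sigma ** 2 / gamma ** 2
w_ridge = np.sum(x * y) / (np.sum(x ** 2) + lam)    # 1-D ridge solution
print(w_map, w_ridge)
```

The two printed values should agree up to the grid spacing, since the posterior mode under a Gaussian prior is the ridge estimate.
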

11

12

13

14

15

16

17



In [23]:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

# Evaluate the function on a 1000 x 1000 grid over [0, 50] x [0, 50].
x = np.linspace(0, 50, 1000)
y = np.linspace(0, 50, 1000)
xx, yy = np.meshgrid(x, y)

# Quadratic bowl centered at (25, 25); the cross term tilts the elliptical contours.
z = 3 * (xx - 25) ** 2 + 3 * (yy - 25) ** 2 - 2 * (xx - 25) * (yy - 25)
plt.contour(xx, yy, z)

Out[23]:

<matplotlib.contour.QuadContourSet at 0x116db5790>
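
The shape of these contours can be read off without plotting. As a rough sketch, the function in the cell above is the quadratic form $[u, v] A [u, v]^T$ with $u = x - 25$, $v = y - 25$; the matrix `A` below is my reading of the coefficients, not something stated in the notes.

```python
import numpy as np

# z = 3u^2 + 3v^2 - 2uv  corresponds to  [u, v] A [u, v]^T with this symmetric A.
A = np.array([[3.0, -1.0],
              [-1.0, 3.0]])

eigvals, eigvecs = np.linalg.eigh(A)   # eigenvalues in ascending order
print(eigvals)    # both positive, so the contours are closed ellipses around (25, 25)
print(eigvecs)    # columns are the directions of the ellipse axes

# Spot check that the matrix form matches the formula used for the plot.
u, v = 1.5, -0.7
z_formula = 3 * u ** 2 + 3 * v ** 2 - 2 * u * v
z_matrix = np.array([u, v]) @ A @ np.array([u, v])
print(np.isclose(z_formula, z_matrix))
```

The smaller eigenvalue belongs to the direction $(1, 1)$, so the ellipses are stretched along the diagonal, which matches the tilt seen in the contour plot.
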

18

19

20

The word "normalize" here should be translated as 归一化 (normalization) or 规范化. 正则化 corresponds to regularization, and 标准化 corresponds to standardization. See this blog post.
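
The difference between the terms is easier to see on numbers. The sketch below uses one common reading of each: normalization as min-max scaling into $[0, 1]$ and standardization as the z-score. The sample array is made up.

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Normalization (min-max scaling): map the values into [0, 1].
x_norm = (x - x.min()) / (x.max() - x.min())

# Standardization (z-score): shift to zero mean and scale to unit standard deviation.
x_std = (x - x.mean()) / x.std()

print(x_norm)
print(x_std.mean(), x_std.std())
```

Regularization is a different operation altogether: it adds a penalty term such as $\lambda \|w\|^2$ to the objective instead of rescaling the data.
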

21

Frankly, the code snippets here are all written in a very ugly way.

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56


