The usage of np.random.seed()

If np.random.seed() is called with a fixed value, the random numbers generated afterwards are reproducible: re-seeding with the same value produces the same numbers again. The two cells below show the difference between seeding once before a loop and re-seeding inside the loop.


In [13]:
import numpy as np

np.random.seed(20)  # seed once, before the loop

runningtimes = 0
while runningtimes <= 10:
    x = np.random.randn(5)  # a different draw on every iteration
    print(x)
    print('**')
    runningtimes = runningtimes + 1


[ 0.88389311  0.19586502  0.35753652 -2.34326191 -1.08483259]
**
[ 0.55969629  0.93946935 -0.97848104  0.50309684  0.40641447]
**
[ 0.32346101 -0.49341088 -0.79201679 -0.84236793 -1.27950266]
**
[ 0.24571517 -0.0441948   1.56763255  1.05110868  0.40636843]
**
[-0.1686461  -3.18970279  1.12013226  1.33277821 -0.24333877]
**
[-0.13003071 -0.10901737  1.55618644  0.12877835 -2.06694872]
**
[-0.88549315 -1.10457948  0.93286635  2.059838   -0.93493796]
**
[-1.61299022  0.52706972 -1.55110074  0.32961334 -1.13652654]
**
[-0.3384906   0.32097078 -0.60230802  1.54472836  0.64703408]
**
[0.59321721 0.4380245  1.35778902 1.20451128 1.35179619]
**
[ 4.93437236e-01 -2.70436525e+00 -5.55185797e-01  1.50856026e-03
  8.57093817e-01]
**

In [11]:
import numpy as np

runningtimes = 0
while runningtimes <= 10:
    np.random.seed(5)       # re-seed with the same value on every iteration
    x = np.random.randn(5)  # so the same numbers are drawn every time
    print(x)
    print('**')
    runningtimes = runningtimes + 1


[ 0.44122749 -0.33087015  2.43077119 -0.25209213  0.10960984]
**
[ 0.44122749 -0.33087015  2.43077119 -0.25209213  0.10960984]
**
[ 0.44122749 -0.33087015  2.43077119 -0.25209213  0.10960984]
**
[ 0.44122749 -0.33087015  2.43077119 -0.25209213  0.10960984]
**
[ 0.44122749 -0.33087015  2.43077119 -0.25209213  0.10960984]
**
[ 0.44122749 -0.33087015  2.43077119 -0.25209213  0.10960984]
**
[ 0.44122749 -0.33087015  2.43077119 -0.25209213  0.10960984]
**
[ 0.44122749 -0.33087015  2.43077119 -0.25209213  0.10960984]
**
[ 0.44122749 -0.33087015  2.43077119 -0.25209213  0.10960984]
**
[ 0.44122749 -0.33087015  2.43077119 -0.25209213  0.10960984]
**
[ 0.44122749 -0.33087015  2.43077119 -0.25209213  0.10960984]
**

Understand how a path is joined

Here, I use a simple example to show how to build a file path in Python.


In [15]:
import os
PROJECT_ROOT_DIR = '.'
CHAPTER_ID = '001'
fig_id = '300'
path = os.path.join(PROJECT_ROOT_DIR, "images", CHAPTER_ID, fig_id + ".png")
print(path)


./images/001/300.png

The numpy.c_ object

Translates slice objects to concatenation along the second axis.


In [22]:
import numpy as np

arr1 = np.c_[np.array([1,2,3]), np.array([4,5,6])]
print(arr1)
print('*******')

arr2 = np.c_[np.array([[1,2,3]]), np.array([[4,5,6]])]
print(arr2)
print('*******')

arr3 = np.c_[np.array([[1,2,3]]), 0, 0, np.array([[4,5,6]])]
print(arr3)


[[1 4]
 [2 5]
 [3 6]]
*******
[[1 2 3 4 5 6]]
*******
[[1 2 3 0 0 4 5 6]]

coef_ and intercept_

  • coef_: array, shape (n_features, ) or (n_targets, n_features) Estimated coefficients for the linear regression problem. If multiple targets are passed during the fit (y 2D), this is a 2D array of shape (n_targets, n_features), while if only one target is passed, this is a 1D array of length n_features.

  • intercept_ : array. Independent term in the linear model (i.e., the intercept); see the sketch below.
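
As a minimal sketch (using a small synthetic dataset that is not part of the original text), both attributes can be inspected after fitting sklearn.linear_model.LinearRegression:


In [ ]:
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical synthetic data: y = 3*x0 - 2*x1 + 5 plus a little noise
rng = np.random.RandomState(0)
X = rng.randn(100, 2)
y = 3 * X[:, 0] - 2 * X[:, 1] + 5 + 0.01 * rng.randn(100)

lin_reg = LinearRegression()
lin_reg.fit(X, y)

print(lin_reg.coef_)       # 1D array of length n_features (single target), roughly [3, -2]
print(lin_reg.intercept_)  # the independent term (intercept), roughly 5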

sklearn.linear_model.Ridge

Ref: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html

Linear least squares with l2 regularization.

Minimizes the objective function (i.e., it minimizes the cost function):

||y - Xw||^2_2 + alpha * ||w||^2_2

This model solves a regression model where the loss function is the linear least squares function and regularization is given by the l2-norm. Also known as Ridge Regression or Tikhonov regularization.

Note that alpha is an important parameter to tune when fitting this model.
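
A minimal sketch of fitting Ridge with an explicitly chosen alpha (the data and the alpha value below are synthetic and only for illustration):


In [ ]:
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(42)
X = rng.randn(50, 3)
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.randn(50)

# alpha is the regularization strength (the lambda in the formulas below);
# larger values shrink the coefficients more strongly.
ridge_reg = Ridge(alpha=1.0)
ridge_reg.fit(X, y)

print(ridge_reg.coef_)
print(ridge_reg.intercept_)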

L1 and L2 Regularization Methods

Ref: https://towardsdatascience.com/l1-and-l2-regularization-methods-ce25e7fc831c (the notes below follow the original author's article).

In order to create a less complex (parsimonious) model when you have a large number of features in your dataset, some of the regularization techniques used to address over-fitting and feature selection are:

  • L1 Regularization

  • L2 Regularization

A regression model that uses the L1 regularization technique is called Lasso (Least Absolute Shrinkage and Selection Operator) Regression, and a model which uses L2 is called Ridge Regression.

The key difference between these two is the penalty term.

Ridge regression adds the "squared magnitude" of the coefficients as a penalty term to the loss function. Here the term in red represents the L2 regularization element.

(Note: this formula was originally rendered in LaTeX; its LaTeX source is:)

\sum_{i=1}^n (y_i - \sum_{j=1}^p x_{ij}\beta_{j})^2 + {\color[rgb]{0.986246,0.007121,0.027434}\lambda\sum_{j=1}^p\beta_j^2}

Here, if lambda is zero then we get back OLS (ordinary least squares). However, if lambda is very large then it will add too much weight and lead to under-fitting. Having said that, how lambda is chosen is important. This technique works very well to avoid the over-fitting issue.
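
To make this concrete, here is a rough sketch (synthetic data and arbitrary alpha values chosen for illustration) comparing a near-zero alpha, which should essentially reproduce the OLS coefficients, with a very large alpha, which shrinks the coefficients toward zero:


In [ ]:
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.RandomState(0)
X = rng.randn(100, 2)
y = 4 * X[:, 0] + 2 * X[:, 1] + rng.randn(100)

ols = LinearRegression().fit(X, y)
ridge_small = Ridge(alpha=1e-8).fit(X, y)  # almost no penalty: close to OLS
ridge_large = Ridge(alpha=1e5).fit(X, y)   # huge penalty: coefficients shrink toward 0

print(ols.coef_)
print(ridge_small.coef_)
print(ridge_large.coef_)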

Lasso Regression adds the "absolute value of magnitude" of the coefficients as a penalty term to the loss function.

(Note: this formula was originally rendered in LaTeX; its LaTeX source is:)

\sum_{i=1}^n (y_i - \sum_{j=1}^p x_{ij}\beta_{j})^2 + {\color[rgb]{0.986246,0.007121,0.027434}\lambda\sum_{j=1}^p|\beta_j|}

The key difference between these techniques is that Lasso shrinks the coefficients of the less important features to zero, thus removing some features altogether. So, this works well for feature selection when we have a huge number of features.
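
As a rough sketch of that effect (synthetic data with some irrelevant features; nothing here is taken from the referenced article):


In [ ]:
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.RandomState(1)
X = rng.randn(200, 5)
# Only the first two features actually matter for y.
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.randn(200)

lasso_reg = Lasso(alpha=0.1)
lasso_reg.fit(X, y)

# Coefficients of the unimportant features are driven to zero (or very close to it).
print(lasso_reg.coef_)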

"hands-on machine learning with scikit-learn and "这本书的作者称呼 $\lambda$ 为 hyperparameter,他也帮我强调出来了:这个参数是learning algorithm的参数,而并不是model的参数。因此,这个参数并不受算法本身的影响,而是应该在训练之前人为设定,并且在训练中保持不变。

