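Linear Regression

The independent variable (X) is the number of fires; the dependent variable (Y) is the number of thefts.

Location: Chicago

Goal: fit a linear regression so that Y can be predicted from X.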


In [1]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import seaborn as sns
import xlrd

In [2]:
DATA_FILE = "../data/fire_theft.xls"

1. Read the data


In [3]:
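# Open the .xls workbook and take its first sheet
book = xlrd.open_workbook(DATA_FILE, encoding_override="utf-8")
sheet = book.sheet_by_index(0)
# Row 0 is the header, so the samples start at row 1
data = np.asarray([sheet.row_values(i) for i in range(1, sheet.nrows)])
n_samples = sheet.nrows - 1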

In [4]:
data[:10]


Out[4]:
array([[  6.2,  29. ],
       [  9.5,  44. ],
       [ 10.5,  36. ],
       [  7.7,  37. ],
       [  8.6,  53. ],
       [ 34.1,  68. ],
       [ 11. ,  75. ],
       [  6.9,  18. ],
       [  7.3,  31. ],
       [ 15.1,  25. ]])

2. Build the graph


In [5]:
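# Placeholders for one input (X) and its label (Y)
X = tf.placeholder(tf.float32, name='X')
Y = tf.placeholder(tf.float32, name='Y')

# Trainable weight and bias, both initialized to 0
w = tf.Variable(0.0, name="weights")
b = tf.Variable(0.0, name="bias")

# Linear model
Y_predicted = X * w + b

# Squared error for a single sample
loss = tf.square(Y - Y_predicted, name='loss')

# Plain gradient descent on the loss
optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001).minimize(loss)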
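
To make the optimizer step less opaque, here is a minimal pure-Python sketch of one gradient descent update for the squared error loss (Y - (X * w + b))^2. The function name `sgd_step` is hypothetical and not part of the notebook. It only illustrates the update rule and is not how TensorFlow implements it internally.

```python
def sgd_step(w, b, x, y, learning_rate=0.001):
    """One gradient descent update for the loss (y - (w * x + b)) ** 2.

    Hypothetical helper for illustration only.
    """
    error = y - (w * x + b)
    # d(loss)/dw = -2 * x * error, d(loss)/db = -2 * error
    grad_w = -2.0 * x * error
    grad_b = -2.0 * error
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# First sample shown in Out[4]: X = 6.2, Y = 29.0, starting from w = b = 0
w_new, b_new = sgd_step(0.0, 0.0, x=6.2, y=29.0)
print(w_new, b_new)
```

Because the gradient with respect to w is scaled by x, samples with large X values move the weight much more than the bias in a single step.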

3. Run the graph in a Session


In [6]:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    writer = tf.summary.FileWriter('../graphs/linear_reg', sess.graph)
    
    # Train for 100 epochs, updating once per sample
    for i in range(100):
        total_loss = 0
        for x, y in data:
            _, l = sess.run([optimizer, loss], {X: x, Y: y})
            total_loss += l
        if not i % 20:
            print('Epoch {0}: {1}'.format(i, total_loss))
            
    writer.close()
    
    # Fetch the trained values; this rebinds w and b from tf.Variable objects to plain numbers
    w, b = sess.run([w, b])
    print('After training, w is {0}, b is {1}'.format(w, b))


Epoch 0: 86924.54120270908
Epoch 20: 74470.3043830581
Epoch 40: 66766.05836592615
Epoch 60: 62731.20885742456
Epoch 80: 60533.684946784284
After training, w is 1.7183812856674194, b is 15.789156913757324
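
Since the stated goal is to predict Y from X, the learned parameters can be applied directly. The sketch below copies w and b from the training output above. The names `W_LEARNED`, `B_LEARNED` and `predict_y` are hypothetical and do not appear in the notebook.

```python
# Values copied from the "After training" line of the output above
W_LEARNED = 1.7183812856674194
B_LEARNED = 15.789156913757324


def predict_y(x, w=W_LEARNED, b=B_LEARNED):
    """Apply the fitted line Y = X * w + b (hypothetical helper)."""
    return x * w + b


# Predicted Y for X = 10
print(predict_y(10.0))
```

The printed total loss is still decreasing at epoch 80, so these parameters are probably not fully converged and the prediction is only approximate.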

4. Plot the results


In [7]:
%matplotlib inline
# Rebind X and Y to the data columns (they no longer refer to the placeholders)
X, Y = data.T[0], data.T[1]
plt.plot(X, Y, 'bo', label='Real Data')
plt.plot(X, X * w + b, 'r', label='Predicted Data')
plt.legend()
plt.show()
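
As a cross-check on the fitted line, an ordinary least-squares fit can be computed in closed form with `np.polyfit`. This sketch assumes only the ten rows shown in Out[4], not the full file, so its slope and intercept will differ from a fit on the whole dataset. The variable names are hypothetical.

```python
import numpy as np

# The ten rows displayed in Out[4] (a subset of the data)
sample = np.array([
    [6.2, 29.0], [9.5, 44.0], [10.5, 36.0], [7.7, 37.0], [8.6, 53.0],
    [34.1, 68.0], [11.0, 75.0], [6.9, 18.0], [7.3, 31.0], [15.1, 25.0],
])
x_sample, y_sample = sample[:, 0], sample[:, 1]

# A degree-1 polynomial fit returns [slope, intercept]
slope, intercept = np.polyfit(x_sample, y_sample, deg=1)
print('Least-squares slope: {0}, intercept: {1}'.format(slope, intercept))
```

Running the same two lines on the full `data` array gives the least-squares line that gradient descent is moving toward, which helps judge whether 100 epochs were enough.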


