This exercise is about modelling a linear relationship between "chirps of a cricket" and ground temperature.
In 1948, G. W. Pierce in his book "Songs of Insects" mentioned that we can predict temperature by listening to the frequency of songs(chirps) made by stripped Crickets. He recorded change in behaviour of crickets by recording number of chirps made by them at several "different temperatures" and found that there is a pattern in the way crickets respond to the rate of change in ground temperature 60 to 100 degrees of farenhite. He also found out that Crickets did not sing
above or below this temperature.
This data is derieved from the above mentioned book and aim is to fit a linear model and predict the "Best Fit Line" for the given "Chirps(per 15 Second)" in Column 'A' and the corresponding "Temperatures(Farenhite)" in Column 'B' using TensorFlow. So that one could easily tell what temperature it is just by listening to the songs of cricket.
In [2]:
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
In [3]:
pd.__version__
Out[3]:
In [3]:
#downloading dataset
!wget -nv -O ../data/PierceCricketData.csv https://ibm.box.com/shared/static/fjbsu8qbwm1n5zsw90q6xzfo4ptlsw96.csv
df = pd.read_csv("../data/PierceCricketData.csv")
df.head()
Out[3]:
In [4]:
%matplotlib inline
x_data, y_data = (df["Chirps"].values,df["Temp"].values)
# plots the data points
plt.plot(x_data, y_data, 'ro')
# label the axis
plt.xlabel("# Chirps per 15 sec")
plt.ylabel("Temp in Farenhiet")
Out[4]:
Looking at the scatter plot we can analyse that there is a linear relationship between the data points that connect chirps to the temperature and optimal way to infer this knowledge is by fitting a line that best describes the data. Which follows the linear equation:
We have to estimate the values of the slope 'm' and the inrtercept 'c' to fit a line where, X is the "Chirps" and Ypred is "Predicted Temperature" in this case.
Model the above equation by assigning arbitrary values of your choice for slope "m" and intercept "c" which can predict the temp "Ypred" given Chirps "X" as input.
example m=3 and c=2
Also, create a place holder for actual temperature "Y" which we will be needing for Optimization to estimate the actual values of slope and intercept.
In [5]:
# Create place holders and Variables along with the Linear model.
m = tf.Variable(3, dtype=tf.float32)
c = tf.Variable(2, dtype=tf.float32)
x = tf.placeholder(dtype=tf.float32, shape=x_data.size)
y = tf.placeholder(dtype=tf.float32, shape=y_data.size)
# Linear model
y_pred = m * x + c
In [6]:
#create session and initialize variables
session = tf.Session()
session.run(tf.global_variables_initializer())
In [7]:
#get prediction with initial parameter values
y_vals = session.run(y_pred, feed_dict={x: x_data})
#Your code goes here
plt.plot(x_data, y_vals, label='Predicted')
plt.scatter(x_data, y_data, color='red', label='GT')
Out[7]:
The essence of estimating the values for "m" and "c" lies in minimizing the difference between predicted "Ypred" and actual "Y" temperature values which is defined in the form of Mean Squared error loss function.
$$ loss = \frac{1}{n}\sum_{i=1}^n{[Ypred_i - {Y}_i]^2} $$Note: There are also other ways to model the loss function based on distance metric between predicted and actual temperature values. For this exercise Mean Suared error criteria is considered.
In [8]:
loss = tf.reduce_mean(tf.squared_difference(y_pred*0.1, y*0.1))
In [9]:
# Your code goes here
optimizer = tf.train.GradientDescentOptimizer(0.01)
train_op = optimizer.minimize(loss)
In [10]:
session.run(tf.global_variables_initializer())
Get the predicted m and c values by running a session on Training a linear model. Also collect the loss for different steps to print and plot.
In [12]:
convergenceTolerance = 0.0001
previous_m = np.inf
previous_c = np.inf
steps = {}
steps['m'] = []
steps['c'] = []
losses=[]
for k in range(10000):
########## Your Code goes Here ###########
_, _l, _m, _c = session.run([train_op, loss, m, c], feed_dict={x: x_data, y: y_data})
steps['m'].append(_m)
steps['c'].append(_c)
losses.append(_l)
if (np.abs(previous_m - _m) or np.abs(previous_c - _c) ) <= convergenceTolerance :
print("Finished by Convergence Criterion")
print(k)
print(_l)
break
previous_m = _m
previous_c = _c
In [13]:
# Your Code Goes Here
plt.plot(losses)
Out[13]:
In [14]:
y_vals_pred = y_pred.eval(session=session, feed_dict={x: x_data})
plt.scatter(x_data, y_vals_pred, marker='x', color='blue', label='Predicted')
plt.scatter(x_data, y_data, label='GT', color='red')
plt.legend()
plt.ylabel('Temperature (Fahrenheit)')
plt.xlabel('# Chirps per 15 s')
Out[14]:
In [15]:
session.close()
This Exercise is about giving Overview about how to use TensorFlow for Predicting Ground Temperature given the number of Cricket Chirps per 15 secs. Idea is to use TnesorFlow's dataflow graph to define Optimization and Training graphs to find out the actual values of 'm' and 'c' that best describes the given Data.
Created by Shashibushan Yenkanchi </h4>