# Performing Linear Regression in TensorFlow

I gathered this data for current real estate listing prices in North Bergen from Zillow. Let's see if we can use it to develop a model for housing costs based on home size.

``````

In :

%matplotlib inline
#Typical imports
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import pandas as pd

# plots on fleek
matplotlib.style.use('ggplot')

``````
``````

In :

# Read the housing data from the csv file into a pandas dataframe
# the names keyword allows us to name the columns,
# while the dtype sets the data type.

df = pd.read_csv('data/nb home sales.csv', names=['Square Feet', 'Price'],
dtype=np.float32)

``````
``````

In :

# Display the dataframe
df

``````
``````

Out:

Square Feet
Price

0
670.0
144900.0

1
2760.0
508000.0

2
2860.0
600000.0

3
503.0
139000.0

4
1575.0
435000.0

5
935.0
260000.0

6
680.0
229000.0

7
1593.0
559000.0

8
2552.0
475000.0

9
1008.0
275000.0

10
1060.0
200000.0

11
1976.0
459000.0

``````
``````

In :

# Visualize the data as a scatter plot
# with sq. ft. as the independent variable.
df.plot(x='Square Feet', y='Price', kind='scatter')

``````
``````

Out:

<matplotlib.axes._subplots.AxesSubplot at 0x110f37160>

``````

It seems a linear model could be appropriate in this case. How can we build it with TensorFlow?

``````

In :

# First we declare our placeholders
x = tf.placeholder(tf.float32, [None, 1])
y_ = tf.placeholder(tf.float32, [None, 1])

# Then our variables
W = tf.Variable(tf.zeros([1,1]))
b = tf.Variable(tf.zeros())

# And now we can make our linear model: y = Wx + b
y = tf.matmul(x, W) + b

# Finally we choose our cost function (SSE in this case)
cost = tf.reduce_sum(tf.square(y_-y))

``````

And here's where all the magic will happen:

``````

In :

# Call tf's gradient descent function with a learning rate and instructions to minimize the cost
learn_rate = .0000000001

# Prepare our data to be read into the training session. The data needs to match the
# shape we specified earlier -- in this case (n, 1) where n is the number of data points.
xdata = np.asarray([[i] for i in df['Square Feet']])
y_data = np.asarray([[i] for i in df['Price']])

# Create a tensorflow session, initialize the variables, and run gradient descent
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for i in range(10000):
# This is the actual training step - feed_dict specifies the data to be read into
# the placeholders x and y_ respectively.
sess.run(train, feed_dict={x:xdata, y_:y_data})

# Convert our variables from tensors to scalars so we can use them outside tf
price_sqft = np.asscalar(sess.run(W))
cost_0 = np.asscalar(sess.run(b))

print("Model: y = %sx + %s" % (round(price_sqft,2), round(cost_0,2)))

``````
``````

Model: y = 222.19x + 0.61

``````
``````

In :

# Create the empty plot
fig, axes = plt.subplots()

# Draw the scatter plot on the axes we just created
df.plot(x='Square Feet', y='Price', kind='scatter', ax=axes)

# Create a range of x values to plug into our model
sqft = np.arange(500, 3000, 1)

# Plot the model
plt.plot(sqft, price_sqft*sqft + cost_0)
plt.show()

``````
``````

``````