Introduction

We're going to explore the Pizza Franchise data set from http://college.cengage.com/mathematics/brase/understandable_statistics/7e/students/datasets/slr/frames/frame.html

We want to know whether we should open the next pizza franchise.

In the following data, X = annual franchise fee ($1000) and Y = start-up cost ($1000) for a pizza franchise.



In [44]:

    
%matplotlib inline
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

Data Exploration



In [47]:

    
df = pd.read_csv('slr12.csv', names=['annual', 'cost'], header=0)
df.describe()









    Out[47]:

	annual	cost
count	36.000000	36.000000
mean	1134.777778	1291.055556
std	158.583211	124.058038
min	700.000000	1050.000000
25%	1080.000000	1250.000000
50%	1162.500000	1277.500000
75%	1250.000000	1300.000000
max	1375.000000	1830.000000



In [48]:

    
df.head()



In [49]:

    
df.annual.plot()












In [50]:

    
df.cost.plot()












In [24]:

    
df.plot(kind='scatter', x='annual', y='cost');



In [34]:

    
slope, intercept, r_value, p_value, std_err = stats.linregress(df['annual'], df['cost'])
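For intuition about what the fit returns, the slope and intercept of a simple least-squares line can be written out by hand. The sketch below is not the notebook's own method: it uses only the five rows shown by `df.head()`, so its numbers will differ from a fit on all 36 rows, and the helper name `fit_line` is hypothetical.

```python
# First five (annual, cost) rows of the data set, as shown by df.head().
annual = [1000, 1125, 1087, 1070, 1100]
cost = [1050, 1150, 1213, 1275, 1300]


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations in x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope5, intercept5 = fit_line(annual, cost)
print(f"slope={slope5:.3f}, intercept={intercept5:.1f}")
```

On the full data, `stats.linregress` returns the same two quantities together with `r_value`, `p_value` and `std_err`.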



In [40]:

    
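# Scatter the raw data and overlay the fitted regression line
plt.plot(df['annual'], df['cost'], 'o', label='Original data', markersize=2)
plt.plot(df['annual'], slope*df['annual'] + intercept, 'r', label='Fitted line')
plt.xlabel('Annual franchise fee ($1000)')
plt.ylabel('Start-up cost ($1000)')
plt.legend()
plt.show()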

From this trend we can predict that if the annual fee is high, the start-up cost will tend to be high as well.
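How strong such a trend is can be gauged with the Pearson correlation coefficient, which is what the `r_value` returned by `stats.linregress` reports. A minimal sketch, assuming only the five rows shown by `df.head()` (so it does not reproduce the value for the full data) and a hypothetical helper name `pearson_r`:

```python
import math

# First five (annual, cost) rows of the data set, as shown by df.head().
annual = [1000, 1125, 1087, 1070, 1100]
cost = [1050, 1150, 1213, 1275, 1300]


def pearson_r(x, y):
    """Return the Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(annual, cost)
# r squared is the share of the variance in cost explained by the line.
print(f"r={r:.3f}, r^2={r * r:.3f}")
```

Values of r near 1 indicate a tight upward trend. Values nearer 0 mean the line explains little of the spread, so any prediction from it should be read with that in mind.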

Output of df.head() (first five rows):

	annual	cost
0	1000	1050
1	1125	1150
2	1087	1213
3	1070	1275
4	1100	1300
