House price


In [3]:
size = [1400, 2400, 1800, 1900, 1300, 1100]
cost = [112000, 192000, 144000, 152000, 104000, 88000]

How much should you pay for a 2,100 square foot house?


In [6]:
2100 * (sum(cost)/sum(size))


Out[6]:
168000

In [7]:
1500 * (sum(cost)/sum(size))


Out[7]:
120000
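The two estimates above multiply the target size by the average price per square foot. A minimal sketch of that arithmetic, assuming Python 3 (where `/` returns a float, so the results print as `168000.0` and `120000.0` rather than the integers shown above); the function names `price_per_sqft` and `estimate_price` are illustrative and not part of the original notebook:

```python
size = [1400, 2400, 1800, 1900, 1300, 1100]             # square feet
cost = [112000, 192000, 144000, 152000, 104000, 88000]  # price of each house


def price_per_sqft(sizes, costs):
    """Average price per square foot over all listed houses."""
    return sum(costs) / sum(sizes)


def estimate_price(target_size, sizes, costs):
    """Estimate a price by scaling the average rate to the target size."""
    return target_size * price_per_sqft(sizes, costs)


print(price_per_sqft(size, cost))        # 80.0
print(estimate_price(2100, size, cost))  # 168000.0
print(estimate_price(1500, size, cost))  # 120000.0
```

This works cleanly here because every house in the list costs exactly 80 per square foot; with less regular data the same formula only gives an average rate.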

Scatter Plot


In [1]:
%matplotlib inline
import matplotlib.pyplot as plt

In [2]:
size = [1700, 2100, 1900, 1300, 1600, 2200]
cost = [51000, 63000, 57000, 39000, 48000, 66000]

In [3]:
plt.scatter(size, cost)


Out[3]:
<matplotlib.collections.PathCollection at 0x10e7cead0>

In [4]:
size = [1700, 2100, 1900, 1300, 1600, 2200]
cost = [53000, 65000, 59000, 41000, 50000, 68000]
plt.scatter(size, cost)


Out[4]:
<matplotlib.collections.PathCollection at 0x10eab26d0>

In [5]:
size = [1700, 2100, 1900, 1300, 1600, 2200]
cost = [53000, 44000, 59000, 82000, 50000, 68000]
plt.scatter(size, cost)


Out[5]:
<matplotlib.collections.PathCollection at 0x10ee0d110>

Bar Charts

Noise: the deviations of the data points from the straight-line fit. In the housing example, factors other than size may genuinely affect the price and push it up or down. If those factors are not included in the model, a statistician treats their effect as "random noise".
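One way to make "deviation from the linear graph" concrete is to fit a straight line and look at the residuals (actual cost minus the cost the line predicts). The sketch below uses an ordinary least-squares fit, which is an assumption on my part since the notebook does not say how the line is chosen; `fit_line`, `residuals`, `cost_linear`, and `cost_noisy` are illustrative names for the data from the scatter plots above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys):
    """Deviation of each point from the fitted line (the 'noise')."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


size = [1700, 2100, 1900, 1300, 1600, 2200]
cost_linear = [53000, 65000, 59000, 41000, 50000, 68000]  # second scatter plot
cost_noisy = [53000, 44000, 59000, 82000, 50000, 68000]   # third scatter plot

print(fit_line(size, cost_linear))   # slope 30, intercept 2000
print(residuals(size, cost_linear))  # all zero: no noise
print(residuals(size, cost_noisy))   # large deviations: noisy data
```

For the second dataset every residual is zero, so the points lie exactly on the line cost = 30 * size + 2000. For the third dataset the residuals run to thousands, which is what the scattered plot shows visually.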

  • Bar Chart: 2D data; each bar pairs a category (or x value) with a quantity.
  • Histogram: 1D data; a special case of the bar chart that shows the frequency count of the data in each bin.
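The difference between the two chart types can be sketched with the house data from the first section. The bar chart needs both lists (size and cost), while the histogram needs only the sizes and counts how many fall into each bin. The bin width of 500 square feet and the helper names `bin_counts` and `draw_charts` are assumptions made for illustration.

```python
from collections import Counter

size = [1400, 2400, 1800, 1900, 1300, 1100]
cost = [112000, 192000, 144000, 152000, 104000, 88000]

BIN_WIDTH = 500  # assumed bin width in square feet


def bin_counts(values, width):
    """Count how many values fall into each bin [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


size_counts = bin_counts(size, BIN_WIDTH)
print(size_counts)  # {1000: 3, 1500: 2, 2000: 1}


def draw_charts():
    """Draw the bar chart (2D data) next to the histogram (1D data)."""
    import matplotlib.pyplot as plt

    fig, (left, right) = plt.subplots(1, 2, figsize=(10, 4))

    # Bar chart: one bar per house, height is that house's cost.
    left.bar([str(s) for s in size], cost)
    left.set_xlabel("size (sq ft)")
    left.set_ylabel("cost")

    # Histogram: only the sizes are used; height is the count per bin.
    right.hist(size, bins=list(range(1000, 2501, BIN_WIDTH)), edgecolor="black")
    right.set_xlabel("size (sq ft)")
    right.set_ylabel("number of houses")

    plt.show()
```

Calling `draw_charts()` in the notebook displays both charts side by side.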

Pie Charts

  • Pie charts are used to visualize relative data, that is, how a whole is divided into shares.
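As a small illustration of relative data, the sketch below turns counts into shares of the whole and draws them as a pie chart. The counts are the number of houses per 500 square foot band in the first section's `size` list; the band labels and the names `houses_per_band`, `shares`, and `draw_pie` are assumptions made for this example.

```python
# Houses per size band, counted from the first section's size list.
houses_per_band = {"1000-1499": 3, "1500-1999": 2, "2000-2499": 1}

total = sum(houses_per_band.values())

# A pie chart shows each count as a fraction of the total.
shares = {band: count / total for band, count in houses_per_band.items()}
print(shares)


def draw_pie():
    """Draw the shares as a pie chart with percentage labels."""
    import matplotlib.pyplot as plt

    plt.pie(
        list(houses_per_band.values()),
        labels=list(houses_per_band.keys()),
        autopct="%1.1f%%",
    )
    plt.axis("equal")  # keep the pie circular
    plt.show()
```

The slice sizes depend only on the ratios between the counts, so doubling every count would produce the same chart.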