How well does your model explain your data? R-squared is a useful statistic for answering this question. In this episode we explore how it applies to the problem of valuing a house. Attributes like the number of bedrooms go a long way toward explaining why different houses have different prices. Some of the variance in price can be explained by a model, and some cannot, because it comes from factors the model does not capture or that cannot be directly measured. R-squared is the ratio of the explained variance to the total variance. It is not a measure of accuracy; it is a measure of the explanatory power of one's model.
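
To make the ratio concrete, here is a minimal sketch that computes R-squared by hand. The bedroom counts and prices are made-up numbers for illustration only, and the variable names are assumptions, not anything from the episode. It uses sums of squares instead of variances; the ratio is the same because both would be divided by the same number of observations.

```python
import numpy as np

# Hypothetical data: number of bedrooms and sale price (in thousands of dollars)
bedrooms = np.array([1, 2, 2, 3, 3, 4, 4, 5], dtype=float)
price = np.array([150, 200, 230, 260, 310, 330, 400, 420], dtype=float)

# Fit price = intercept + slope * bedrooms by least squares
slope, intercept = np.polyfit(bedrooms, price, 1)
predicted = intercept + slope * bedrooms

ss_total = np.sum((price - price.mean()) ** 2)          # total variation in price
ss_explained = np.sum((predicted - price.mean()) ** 2)  # variation captured by the model
ss_residual = np.sum((price - predicted) ** 2)          # variation left unexplained

r_squared = ss_explained / ss_total
print("R-squared:", r_squared)
```

For a least-squares line with an intercept, the explained and unexplained parts add up to the total, so the same value can also be written as `1 - ss_residual / ss_total`.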

In [1]:

```python
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
```

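In [6]:

```python
# Nearly noiseless data: y = 0.5 * x plus uniform noise in [-0.005, 0.005)
x = 1.0 * np.arange(100) / 100
y = 0.5 * np.arange(100) / 100
yy = y + (np.random.rand(100)-.5) * .01
# No constant column is added, so the fitted line passes through the origin
f = sm.OLS(yy, x).fit()
plt.scatter(x, yy)
plt.plot(x, f.predict(x))
plt.title('R-squared = ' + str(f.rsquared))
plt.xlim(0, 1)
plt.ylim(-1, 1)
plt.show()
```
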
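
One detail worth knowing about the fits in this notebook: `sm.OLS(yy, x)` is called without a constant column, so the line is forced through the origin. In that case statsmodels reports the uncentered R-squared, where total variation is measured around zero instead of around the mean. The sketch below uses plain NumPy as a stand-in for `sm.OLS` to show the two versions side by side; the fixed seed and the variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption) so the numbers are repeatable

x = np.arange(100) / 100
yy = 0.5 * x + (rng.random(100) - 0.5) * 0.1

# Least-squares slope for a line through the origin (no intercept term)
slope = np.dot(x, yy) / np.dot(x, x)
ss_res = np.sum((yy - slope * x) ** 2)

# Uncentered: total sum of squares measured around zero
r2_uncentered = 1 - ss_res / np.sum(yy ** 2)
# Centered: total sum of squares measured around the mean of yy
r2_centered = 1 - ss_res / np.sum((yy - yy.mean()) ** 2)

print("uncentered R-squared:", r2_uncentered)
print("centered R-squared:  ", r2_centered)
```

The uncentered value is never smaller than the centered one for the same fit. To get the centered value from statsmodels, one option is to add an intercept column with `sm.add_constant(x)` before fitting.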
In [7]:

```python
# Same line, ten times more noise: uniform in [-0.05, 0.05)
yy = y + (np.random.rand(100)-.5) * .1
f = sm.OLS(yy, x).fit()
plt.plot(x, f.predict(x))
plt.scatter(x, yy)
plt.title('R-squared = ' + str(f.rsquared))
plt.xlim(0, 1)
plt.ylim(-1, 1)
plt.show()
```

In [10]:

```python
# Same line, with noise as large as the signal: uniform in [-0.5, 0.5)
yy = y + (np.random.rand(100)-.5)
f = sm.OLS(yy, x).fit()
plt.plot(x, f.predict(x))
plt.scatter(x, yy)
plt.title('R-squared = ' + str(f.rsquared))
plt.xlim(0, 1)
plt.ylim(-1, 1)
plt.show()
```

In [11]:

```python
# Pure noise: yy is uniform in [-1, 1) and has no relationship to x
yy = np.random.rand(100)*2 - 1
f = sm.OLS(yy, x).fit()
plt.plot(x, f.predict(x))
plt.scatter(x, yy)
plt.title('R-squared = ' + str(f.rsquared))
plt.xlim(0, 1)
plt.ylim(-1, 1)
plt.show()
```
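
The noisy-line cells above follow one pattern: the same underlying line, more noise each time, and a lower R-squared. A compact way to see that trend without plotting is sketched below. It is not the notebook's own code: it swaps `sm.OLS` for a small NumPy helper (`r_squared_through_origin`, a hypothetical name) that computes the uncentered R-squared of a through-origin fit, and it fixes the random seed so the run is repeatable.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed, an assumption for repeatability

x = np.arange(100) / 100
y = 0.5 * x


def r_squared_through_origin(x, yy):
    """Uncentered R-squared of a least-squares line through the origin."""
    slope = np.dot(x, yy) / np.dot(x, x)
    ss_res = np.sum((yy - slope * x) ** 2)
    return 1 - ss_res / np.sum(yy ** 2)


results = {}
for noise_width in (0.01, 0.1, 1.0):
    # Uniform noise centered on zero, spanning noise_width in total
    yy = y + (rng.random(100) - 0.5) * noise_width
    results[noise_width] = r_squared_through_origin(x, yy)
    print(f"noise width {noise_width}: R-squared = {results[noise_width]:.4f}")
```

The model is the same in every iteration; only the share of variance it can explain changes.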