Following exercise comes from Hogg, Bovy, and Lang (2010). The link is http://adsabs.harvard.edu/abs/2010arXiv1008.4686H.

Step 1: Import packages used


In [1]:
from astropy.io import ascii
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

Step 2: Read data file


In [2]:
dat = ascii.read("../data/hogg-et-al-2010-table1.csv")
print(dat)


 id  x   y   ey  ex  rxy 
--- --- --- --- --- -----
  1 201 592  61   9 -0.84
  2 244 401  25   4  0.31
  3  47 583  38  11  0.64
  4 287 402  15   7 -0.27
  5 203 495  21   5 -0.33
  6  58 173  15   9  0.67
  7 210 479  27   4 -0.02
  8 202 504  14   4 -0.05
  9 198 510  30  11 -0.84
 10 158 416  16   7 -0.69
 11 165 393  14   5   0.3
 12 201 442  25   5 -0.46
 13 157 317  52   5 -0.03
 14 131 311  16   6   0.5
 15 166 400  34   6  0.73
 16 160 337  31   5 -0.52
 17 186 423  42   9   0.9
 18 125 334  26   8   0.4
 19 218 533  16   6 -0.78
 20 146 344  22   5 -0.56

Step 3: Form matrices

\begin{equation} \textbf{Y} = \begin{bmatrix} y_1 \\ y_2 \\ ... \\ y_N \\ \end{bmatrix} , \textbf{A} = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ ... & ... \\ 1 & x_N \\ \end{bmatrix} , \textbf{C} = \begin{bmatrix} \sigma_{y_1}^2 & 0 & ... & 0 \\ 0 & \sigma_{y_1}^2 & ... & 0 \\ ... & ... & ... & ... \\ 0 & 0 & ... & \sigma_{y_N}^2 \\ \end{bmatrix} \end{equation}

In [3]:
A = np.ones((len(dat["x"]), 2))
A[:, 1] = dat["x"]
C = np.zeros((len(dat["ey"]), len(dat["ey"])))
np.fill_diagonal(C, pow(dat["ey"], 2.))

Step 4: Perform best-fit

\begin{equation} \begin{bmatrix} b \\ m \\ \end{bmatrix} = \textbf{X} = [\textbf{A}^T \textbf{C}^{-1} \textbf{A}]^{-1} [\textbf{A}^T \textbf{C}^{-1} \textbf{Y}] \end{equation}

In [4]:
term1 = np.linalg.inv(np.matmul(np.matmul(A.T, np.linalg.inv(C)), A))
term2 = np.matmul(np.matmul(A.T, np.linalg.inv(C)), dat["y"])
b, m = np.matmul(term1, term2)

Step 5: Standard errors in b and m parameters


In [5]:
eb, em = np.sqrt(term1[0, 0]), np.sqrt(term1[1, 1])

Step 6: Plot results


In [6]:
plt.errorbar(dat["x"], dat["y"], dat["ey"], marker=".", ls="none")
plt.plot(dat["x"], m * dat["x"] + b)
plt.xlabel("x-coordinate")
plt.ylabel("y-coordinate")
pf = "y = ({:.2f}$\pm${:.2f})x + ({:.2f}$\pm${:.2f})".format(m, em, b, eb)
plt.text(s=pf, x=50, y=200, color="C1", size=16)
plt.show();



In [ ]: