In 1929, Edwin Hubble published a paper in which he compared the radial velocity of objects with their distance. The former can be done pretty precisely with spectroscopy, the latter is much more uncertain. His original data are here.
He saw that the velocity increases with distance and speculated that this could be the sign of a cosmological expansion. Let's find out what he did.
Load the data into an array with numpy.genfromtxt, make use of its arguments names and dtype to read in the column names from the header and choosing the data type on its own as needed. You should get 6 columns
CAT, NUMBER: These two combined give you the name of the galaxy.R: distance in MpcV: radial velocity in km/sRA, DEC: equatorial coordinates of the galaxyMake a scatter plot of V vs R. Don't forget labels and units...
In [4]:
# load file into variable `data`
In [1]:
# plot data with plt.scatter
%matplotlib inline
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
Use np.linalg.lstsq to fit a linear regression function and determine the slope $H_0$ of the line $V=H_0 R$. For that, reshape $R$ as a $N\times1$ matrix (the design matrix) and solve for 1 unknown parameter. Add the best-fit line to the plot.
In [ ]:
Why is there scatter with respect to the best-fit curve? Is it fair to only fit for the slope and not also for the intercept? How would $H_0$ change if you include an intercept in the fit?
In [ ]:
$V$ as given in the table is a combination of any assumed cosmic expansion and the motion of the sun with respect to that cosmic frame. So, we need to generalize the model to $V=H_0 R + V_s$, where the solar velocity is given by $V_s = X \cos(RA)\cos(DEC) + Y\sin(RA)\cos(DEC)+Z\sin(DEC)$. We'll use astropy to read in the RA/DEC coordinate strings and properly convert them to degrees (and then radians):
In [ ]:
import astropy.coordinates as coord
import astropy.units as u
pos = coord.SkyCoord(ra=data['RA'].astype('U8'), dec=data['DEC'].astype('U9'), unit=(u.hourangle,u.deg),frame='fk5')
ra_ = pos.ra.to(u.deg).value * np.pi/180
dec_ = pos.dec.to(u.deg).value * np.pi/180
Construct a new $N\times4$ design matrix for the four unknown parameters $H_0$, $X$, $Y$, $Z$ to account for the solar motion. The resulting $H_0$ is Hubble's own version of the "Hubble constant". What do you get?
In [ ]:
Make a scatter plot of $V-V_S$ vs $R$. How is it different from the previous one without the correction for solar velicity. Add the best-fit linear regression line.
In [ ]:
Using astropy.units, can you estimate the age of the universe from $H_0$? Does it make sense?
In [ ]:
In [ ]:
Let see how adopting a suitable value $\sigma$ for those uncertainties would affect the estimate of $H_0$?
The problem you solved so far is $Ax=b$, and errors don't occur. With errors the respective equation is changed to $A^\top \Sigma^{-1} Ax=A^\top \Sigma^{-1}b$, where in this case the covariance matrix $\Sigma=\sigma^2\mathbf{1}$. This problem can still be solved by np.linalg.lstsq.
Construct the modified design matrix and data vector and get a new estimate of $H_0$. Has it changed? Use np.dot, np.transpose, and np.linalg.inv (or their shorthands).
In [ ]:
Compute the parameter covariance matrix $S=(A^\top \Sigma^{-1} A)^{-1}$ and read off the variance of $H_0$. Update your plot to illustrate that uncertainty.
In [ ]:
How large is the relative error? Would that help with the problematic age estimate above?
In [ ]:
Compare the noise-free result from above (Hubble's result) with $SA^\top \Sigma^{-1}b$. Did adopting errors change the result?
In [ ]: