This section is a bit odd in that it exists mainly to show that one non-parametric method for estimating luminosity functions is superior to all others, without really introducing the others or going into much detail about their limitations. They note only in passing that many (if not most) published luminosity functions are parametric, and that the parameter values can be found optimally using Maximum Likelihood. They also end with an extremely brief mention of a full Bayesian approach to recovering luminosity functions, probably because the paper describing that approach would be impossible to summarize in a single textbook section.
Luminosity functions, in their simplest form, ask: what is the distribution of objects in intrinsic luminosity within some volume? They note that an LF is a differential quantity, i.e.,
$$\phi(L,z)\,dL$$
(Note that they generalize this to $y=z$ and $x=L$, but I had to keep reminding myself what $y$ and $x$ were. Of course, if you're interested in stars or nearby galaxies, then $y$ could be physical distance.)
This question of course gets complicated because essentially all surveys are flux-limited, and hence the volume probed is a function of both luminosity and redshift. The method they more or less skip over, but which is more intuitive, is the $V_{max}$ formalism developed by Schmidt. The basic idea is that you recover the "effective" volume of your survey by asking, for each source, what is the volume in which I could have detected that source? Given a flux limit and ignoring details like $k$-corrections and selection functions, the maximum volume is:
$$V_{max} = \frac{4}{3} \pi r_{max}^3$$
where
$$r_{max} = \sqrt{\frac{L}{4\pi f_{lim}}}$$
In other words, if I take my source with luminosity $L$ and place it at a distance $r_{max}$, it will be at the flux limit ($f_{lim}$) of my survey. The volume it could occupy and still be within my survey is $V_{max}$.
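Since the $1/V_{max}$ estimator is simple enough to write down, here is a minimal sketch of the binned version (my own illustration, not code from the book); `L_obs`, `f_lim`, and `L_bins` are hypothetical inputs, and $k$-corrections and the selection function are ignored just as above:

import numpy as np

def vmax_lf(L_obs, f_lim, L_bins):
    """Binned 1/Vmax luminosity function (Euclidean geometry, single flux limit)."""
    # distance at which each source would fall to the flux limit
    r_max = np.sqrt(L_obs / (4 * np.pi * f_lim))
    # maximum volume within which each source could have been detected
    V_max = (4. / 3.) * np.pi * r_max**3
    phi = np.zeros(len(L_bins) - 1)
    for k in range(len(phi)):
        in_bin = (L_obs >= L_bins[k]) & (L_obs < L_bins[k + 1])
        dL = L_bins[k + 1] - L_bins[k]
        # each source contributes 1/V_max to the number density in its luminosity bin
        phi[k] = np.sum(1.0 / V_max[in_bin]) / dL
    return phi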
The main issue with the $V_{max}$ method is that it assumes the source distribution doesn't evolve with distance (time), i.e., that the sources are uniformly distributed within the volume. That assumption fails for basically every astrophysical source at cosmological distances. This is often glossed over in practice by using binned luminosity functions, where the sources are divided into narrow redshift bins such that the effects of evolution are somewhat mitigated. There are even methods for correcting binned luminosity functions given an assumed form for the source evolution, but obviously that requires some fairly strong assumptions. Of course, with a parametric LF, you just need a parametric form for the redshift evolution.
Okay, so if you don't want to assume any functional form for your LF, and you know $V_{max}$ won't work, then you should use Lynden-Bell's $C^-$ method. Since there are no parameters, all you need are your observed sources (pairs of $L_i$, $z_i$) and a selection function. For the purposes of their demonstration, they assume the selection function is $S=1$ if a hypothetical source with a given $(L,z)$ lands within your flux limit, and $S=0$ otherwise.
The first thing to note is that you can only apply this method if the two coordinates -- $L$ and $z$ (I'm going to refuse to stick with $x,y$) -- are separable. That is, you could write your LF as:
$$n(L,z) = \Psi(L) \rho(z)$$
where $\Psi$ describes the distribution of sources in luminosity (in a parametric form, this could be a power law) and $\rho$ describes the evolution with redshift (e.g., an exponential decline). To demonstrate that the two are separable using only the data, you follow the procedure outlined in the book. I'm going to do it graphically using an example where the two are clearly separable: randomly distributing objects in $L$ and $z$.
In [35]:
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1)

# toy data: L and z drawn independently, so they are separable by construction
L = np.random.random(30)
z = np.random.random(30)

# toy flux limit: a source of luminosity L is detectable out to redshift zmax(L);
# equivalently, at redshift z only sources with L > Lmax(z) are detectable
zmax = lambda lum: np.sqrt(lum)
Lmax = lambda zed: zed**2
S = lambda lum, zed: zed < zmax(lum)   # selection function (True = detected)

def lzplot(z, L):
    # detected sources in blue, undetected in red, with the flux limit in orange
    ptcolors = np.where(S(L, z), 'b', 'r')
    plt.figure()
    plt.scatter(z, L, s=90, c=ptcolors)
    plt.plot(zmax(np.linspace(0, 1, 100)), np.linspace(0, 1, 100), c='orange', lw=2)
    plt.xlim(-0.05, 1.05)
    plt.ylim(1.05, -0.05)
    plt.ylabel("luminosity")
    plt.xlabel("redshift")

lzplot(z, L)
I'm just going to pick an individual source, and then its comparable set is defined as the sources with a luminosity greater than that source, and with a redshift less than the maximum redshift allowed by the selection function (flux limit).
In [36]:
# pick one source and highlight its comparable set: sources more luminous than it,
# at redshifts below zmax(L[i]), the limit out to which source i is detectable
i = 17
print(z[i], L[i])
lzplot(z, L)
plt.scatter(z[i], L[i], s=200, edgecolor='g', lw=2, facecolor='none', marker='s')
plt.fill_between([0,]*2 + [zmax(L[i]),]*2, [1.0, L[i], L[i], 1.0], y2=[1.0]*4, color='k', alpha=0.25)
plt.plot([z[i]]*2, [1.0, L[i]], c='c', lw=1.4)
Out[36]:
In [37]:
# associated set J_i: indices of sources more luminous than source i
# that lie at redshifts below zmax(L[i])
J_i = np.where((L > L[i]) & (z < zmax(L[i])))[0]
N_i = len(J_i)
print(N_i)
In [38]:
R_j = np.argsort(z[J_i])   # ordering of the associated set by redshift
print(R_j)
In [39]:
# rank of source i (by redshift) within its own associated set
R_i = np.sum(z[J_i] < z[i])
print(R_i)
Okay, so that's the example for building the associated set for a single source. Next, you do this for all sources. The rank of each source within its associated set should be uniformly distributed between 0 and $N_i$ if the two variables are independent. Since the distribution is uniform, the expectation value and variance are simply $E_i = N_i/2$ and $V_i = N_i^2/12$. Then you can quantify the degree of correlation using Kendall's $\tau$ statistic (here I'm following the notation given in Fan et al. 2001, instead of $x,y$):
$$\tau = \frac{\sum_i (R_i - E_i)}{\sqrt{\sum_i V_i}}$$
Note that the book says that $\tau < 1$ means you are uncorrelated, but actually it is $|\tau| < 1$.
In [43]:
def calc_tau(_L, _z):
    V = np.empty_like(_L)
    E = np.empty_like(_L)
    R = np.empty_like(_L)
    for i in range(len(_L)):
        # associated set of source i, as defined above
        J_i = np.where((_L > _L[i]) & (_z < zmax(_L[i])))[0]
        N_i = len(J_i)
        R[i] = np.sum(_z[J_i] < _z[i])   # redshift rank within the associated set
        E[i] = N_i / 2.                  # expectation of a uniform rank
        V[i] = N_i**2 / 12.              # variance of a uniform rank
    return np.sum(R - E) / np.sqrt(np.sum(V))

calc_tau(L, z)
Out[43]:
So $|\tau| < 1$, and our uniform distribution is uncorrelated! Now let's try it after correlating the two variables:
In [44]:
# introduce an explicit correlation between L and z, and recompute tau
L2 = L * (2*z)
lzplot(z, L2)
calc_tau(L2, z)
Out[44]:
This time $|\tau|$ is not $< 1$, so the statistic flags the correlation. All is good.
The last step is to actually get the luminosity function. Basically, you compute a similar rank statistic in the other dimension, and then get the cumulative distributions of the actual point values in each dimension using the rank statistics. This just shows graphically what the other rank statistic ($M_k$ in the book's notation) looks like:
In [45]:
# same highlighted source, but now the associated set runs along the other axis:
# sources at lower redshift that are luminous enough to be detectable at z[i]
lzplot(z, L)
plt.scatter(z[i], L[i], s=200, edgecolor='g', lw=2, facecolor='none', marker='s')
plt.fill_between([0,]*2 + [z[i],]*2, [1.0, Lmax(z[i]), Lmax(z[i]), 1.0], y2=[1.0]*4, color='k', alpha=0.25)
plt.plot([0, z[i]], [L[i]]*2, c='c', lw=1.4)
Out[45]:
Just as in the other dimension, the rank of the selected point is 3.
You then finally go from the cumulative distribution to a differential distribution by binning.
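To make that concrete, here's a rough sketch of how the cumulative distribution in luminosity can be assembled from the associated-set counts and then binned into a differential LF, using the toy `L`, `z`, and `zmax` from above. I'm using what I recall as the product form of Lynden-Bell's recursion, $\Phi(L_j) = \Phi(L_1)\prod_{k \le j}(1 + 1/N_k)$, with the sources sorted from most to least luminous and an arbitrary normalization; this is only an illustration, not the book's implementation, and a real application would restrict everything to sources that actually pass the selection.

# cumulative luminosity distribution via the Lynden-Bell product formula
order = np.argsort(L)[::-1]            # sort sources from most to least luminous
Lsort = L[order]
Phi = np.ones_like(Lsort)              # Phi(L_1) arbitrarily normalized to 1
for j in range(1, len(Lsort)):
    # size of the associated set of the j-th most luminous source
    N_j = np.sum((L > Lsort[j]) & (z < zmax(Lsort[j])))
    Phi[j] = Phi[j - 1] * (1.0 + 1.0 / N_j) if N_j > 0 else Phi[j - 1]

# differential LF: interpolate the cumulative distribution onto bin edges
# and difference it (objects per luminosity bin, schematically)
L_edges = np.linspace(0, 1, 6)
Phi_edges = np.interp(L_edges, Lsort[::-1], Phi[::-1])
psi = -np.diff(Phi_edges)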