Quality Assesment Notebook

This notebook displays basic info and plots. It should run on your local computer, provided that you've download the *-exp.fits and *-exp.h5 files, but you can run at jupyter.nersc.gov, once you've setup your environment (see below).

WARNING: If you open a .h5 file (fit outputs) make sure you use the 'r' option. Otherwise, you might overwrite the file and lose its contents. @TODO: add a protection againts this attack, for now please be careful.

Reminder: your modifications will modify your local copy of the notebook, if you want to share those modifications, please do a PR. You might lose them if the notebook is modified and you update picca to the latest master. If you are unsure what this means, make a copy of the file (e.g. to your home) and work with it.

How to setup the jupyter environment at NERSC: (you need to do this only once)

First, create a conda environment dedicated to picca plots:

conda config --add channels intel conda-forge defaults
conda create -n picca_plots ipykernel fitsio h5py scipy matplotlib healpy
source activate picca_plots
cd $HOME/igmhub/picca
python setup.py install
python -m ipykernel install --user --name picca_plots --display-name picca_plots
source deactivate

Point your browser to jupyter.nersc.gov. After you login, you should see a picca_plots kernel. Double click on it. Navigate the left menu to open $HOME/igmhub/picca/tutorials/QA_plots.ipynb

notes:

The setup.py install step will install picca only in your picca_plots environment and will not conflict with your previous installation, independently of whether you used PYTHONPATH or setup.py install (we need the setup.py install setp in the env because PYTHONPATH is not visible from the jupyter.nersc.gov).



In [ ]:

    
import os
import glob
import h5py
import fitsio
import healpy
import matplotlib.pyplot as plt
from matplotlib.pyplot import rcParams
import scipy as sp
import scipy.stats
from scipy.interpolate import interp1d

from picca import wedgize

%matplotlib inline
rcParams['figure.figsize'] = 10, 6
rcParams['lines.linewidth'] = 4
rcParams['axes.labelsize'] = 20
rcParams['xtick.labelsize'] = 18
rcParams['ytick.labelsize'] = 18
rcParams['legend.fontsize'] = 20
plt.rc('text',usetex=True)

INFO: you can safely ignore the h5py error message (if any) in the cell above

Set up the different path to the data and to the mocks



In [ ]:

    
### Get mocks
mock_base = '/project/projectdirs/desi/mocks/lya_forest/london/v4.0/'
quick_ver = 'quick-0.0'



In [ ]:

    
### Get mock products
mock = {}
mock['raw_path'] = '/project/projectdirs/desi/mocks/lya_forest/picca/london/v4.0/quick-0.0/'
mock['delta'] = mock['raw_path']+'/deltas/'
mock['delta_attributes'] = mock['raw_path']+'/iter.fits.gz'
mock['delta_log'] = mock['raw_path']+'/input.log'
mock['cf1d'] = '/global/homes/h/hdumasde/Run_programs/igmhub/picca/CoLoRe_mocks/cf1d.fits.gz'



In [ ]:

    
### Get data products
data = {}
data['raw_path'] = '/global/homes/h/hdumasde/Run_programs/igmhub/picca/DR14_results/'
data['delta'] = data['raw_path']+'/Delta_LYA/Delta/'
data['delta_attributes'] = data['raw_path']+'/Delta_LYA/Log/delta_attributes.fits.gz'
data['cf1d'] = data['raw_path']+'/Correlations/cf1d.fits.gz'
data['cf'] = data['raw_path']+'/Correlations_redone/e_cf.fits.gz'
data['xcf'] = data['raw_path']+'/Correlations_redone/e_xcf.fits.gz'
data['fit_cf'] = data['raw_path']+'/Correlations_redone/Fit/LYA_LYA/result.h5'
data['fit_xcf'] = data['raw_path']+'/Correlations_redone/Fit/LYA_QSO/result.h5'

Catalogs



In [ ]:

    
# Catalogs
cat_path = {'master':mock_base+'master.fits',
            'zcat':mock_base+quick_ver+'/zcat.fits',
            'desiDRQ':mock_base+quick_ver+'/zcat_desi_drq.fits',
            'random':mock_base+'/master_randoms.fits.gz'}
redshift_key = {'master':'Z_QSO_RSD', 'zcat':'Z', 'desiDRQ':'Z','random':'Z'}

Redshift distribution of the different catalogs

The following cell calculates the redshift distribution of the catalogs



In [ ]:

    
bins = sp.arange(1.5,5.,0.01)
for name,path in cat_path.items():
    h = fitsio.FITS(path)
    z = h[1][redshift_key[name]][:]
    h.close()
    plt.hist(z,bins=bins,histtype='step',label=name,density=True,linewidth=4)
plt.xlabel(r'$z$')
plt.ylabel(r'$\mathrm{\#}$')
plt.legend()
plt.grid()

The following function allows to have a 2d histogram



In [ ]:

    
def Get_2DTProfile(ar1, ar2, ar3, nbBinsX, nbBinsY,we=None):

    if we is None:
        we = sp.ones_like(ar1)
    d = sp.array(list(zip(ar1,ar2,ar3)))
    number, axis = sp.histogramdd( d, (nbBinsX,nbBinsY,1))
    weight, axis = sp.histogramdd( d, (nbBinsX,nbBinsY,1), weights=we  )
    mean,   axis = sp.histogramdd( d, (nbBinsX,nbBinsY,1), weights=we*ar3)
    err,    axis = sp.histogramdd( d, (nbBinsX,nbBinsY,1), weights=we*(ar3**2.))

    w = number>1
    mean[w]   /= weight[w]
    err[w]    = sp.sqrt((err[w]/weight[w]-mean[w]**2.)/number[w])

    mean   = mean[:,:,0]
    err    = err[:,:,0]
    number = number[:,:,0]

    return mean, err, number

Density of objects on the sky



In [ ]:

    
sp.random.seed(42)
nbQSO = 1000000
for name,path in cat_path.items():
    h = fitsio.FITS(path)
    ra = h[1]['RA'][:]
    dec = h[1]['DEC'][:]
    z = h[1][redshift_key[name]][:]
    h.close()
    w = sp.random.choice(sp.arange(ra.size,dtype=int),size=min(ra.size,nbQSO),replace=False)
    ra = ra[w]
    dec = dec[w]
    z = z[w]
    extent = [ra.min(), ra.max(), dec.min(), dec.max()]

    f, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(14,7))
    plt.suptitle(r'$\mathrm{'+name+'}$',fontsize=20)
    
    mean_nb, err_nb, number_nb = Get_2DTProfile(ra, dec, sp.ones(ra.size), 200, 100)
    number_nb = sp.transpose(number_nb)
    number_nb[number_nb==0.] = sp.nan
    plot1 = ax1.imshow(number_nb, interpolation='nearest', origin='lower',extent=extent, aspect='auto')
    cbar1 = plt.colorbar(plot1, ax=ax1)
    cbar1.set_label(r'$\#$',size=20)
    cbar1.update_ticks()
    ax1.set_xlabel(r'$\mathrm{R.A. \, [deg]}$')
    ax1.set_ylabel(r'$\mathrm{Dec. \, [deg]}$')
    ax1.grid()
    
    mean_z, err_z, number_z = Get_2DTProfile(ra, dec, z, 200, 100)
    mean_z = sp.transpose(mean_z)
    number_z = sp.transpose(number_z)
    mean_z[number_z==0.] = sp.nan
    plot2 = ax2.imshow(mean_z, interpolation='nearest', origin='lower',extent=extent, aspect='auto')
    cbar2 = plt.colorbar(plot2,ax=ax2)
    cbar2.set_label(r'$<z>$',size=20)
    cbar2.update_ticks()
    ax2.set_xlabel(r'$\mathrm{R.A. \, [deg]}$')
    ax2.set_ylabel(r'$\mathrm{Dec. \, [deg]}$')
    ax2.grid()
    
    plt.subplots_adjust(wspace=0.2)

Magnitude distribution



In [ ]:

    
name_mag = {}
name_mag[0] = 'g mag: decam'
name_mag[1] = 'r mag: decam'
name_mag[2] = 'z mag: decam'
name_mag[3] = 'wise1'
name_mag[4] = 'wise2'



In [ ]:

    
h = fitsio.FITS(cat_path['zcat'])
mag = h['ZCATALOG']['MAG'][:]
z = h['ZCATALOG']['Z'][:]
ra = h['ZCATALOG']['RA'][:]
dec = h['ZCATALOG']['DEC'][:]
h.close()



In [ ]:

    
bins = sp.arange(mag.min(), mag.max(), 0.1)
for i in range(5):
    plt.hist(mag[:,i],bins=bins,linewidth=4,histtype='step',label=r'$\mathrm{'+name_mag[i]+'}$')
plt.xlabel(r'magnitude')
plt.ylabel(r'number')
plt.legend(loc=2)
plt.grid()

The following function allows to get the profile



In [ ]:

    
def Get_TProfile(ar1, ar2, nbBin1, we2=None):

    if we2 is None:
        we2 = sp.ones_like(ar1)
    number, axisX, axisY = sp.histogram2d(ar1, ar2, (nbBin1,1))
    weight, axisX, axisY = sp.histogram2d(ar1, ar2, (nbBin1,1), weights=we2)
    mean,   axisX, axisY = sp.histogram2d(ar1, ar2, (nbBin1,1), weights=we2*ar2)
    err,    axisX, axisY = sp.histogram2d(ar1, ar2, (nbBin1,1), weights=we2*(ar2**2.))

    ### find the axis X
    axisX = sp.array([ axisX[i]+(axisX[i+1]-axisX[i])/2. for i in range(0,axisX.size-1) ])
    
    ### Get only not empty bins
    bool_number = (number[:,0]>1)
    
    axisX  = axisX[bool_number]
    number = number[:,0][bool_number]
    weight = weight[:,0][bool_number]
    mean   = mean[:,0][bool_number]
    err    = err[:,0][bool_number]

    mean  = mean/weight
    err   = sp.sqrt((err/weight-mean**2.)/number)

    return axisX, mean, err, number

Magnitude vs. redshift



In [ ]:

    
bins = sp.arange(1.5, 5., 0.01)
for i in range(5):
    axisX, mean, err, number = Get_TProfile(z,mag[:,i], bins)
    plt.plot(axisX, mean,linewidth=4,label=r'$\mathrm{'+name_mag[i]+'}$')
plt.xlabel(r'$z$')
plt.ylabel(r'$mag$')
plt.legend(loc=2)
plt.grid()

Magnitude on the sky



In [ ]:

    
for i in range(5):        
    extent = [ra.min(), ra.max(), dec.min(), dec.max()]
    mean, err, number = Get_2DTProfile(ra, dec, mag[:,i], 200, 100)
    mean = sp.transpose(mean)
    number = sp.transpose(number)
    mean[number==0.] = sp.nan
    plt.imshow(mean, interpolation='nearest', origin='lower',extent=extent, aspect='auto')
    cbar = plt.colorbar()
    cbar.set_label(r'$<\mathrm{'+name_mag[i]+'}>$',size=20)
    cbar.update_ticks()
    plt.xlabel(r'$\mathrm{R.A.} \, [\mathrm{deg}]$')
    plt.ylabel(r'$\mathrm{Dec.} \, [\mathrm{deg}]$')
    plt.grid()
    plt.show()

Lya Transmission



In [ ]:

    
### Get all transmission files
fs = sp.sort(glob.glob(mock_base+'/*/*/transmission-*-*.fits'))
fs = fs[:10]
lLYA = 1215.67
lminforest = 1215.67



In [ ]:

    
### Along the observed wavelength 
lObs_min = 3000.
lObs_max = 8000.
lObs_stack = sp.arange(lObs_min,lObs_max,1.)
z_stack = lObs_stack/lLYA-1.
T_stack = sp.zeros(lObs_stack.size)
TVar_stack = sp.zeros(lObs_stack.size)
n_stack = sp.zeros(lObs_stack.size)
T_stack_delta = sp.zeros(lObs_stack.size)
TVar_stack_delta = sp.zeros(lObs_stack.size)
n_stack_delta = sp.zeros(lObs_stack.size)



In [ ]:

    
### Along the rest-frame wavelength
lRF_min = 100.
lRF_max = 4000.
lRF_stack = sp.arange(lRF_min,lRF_max,1.)
T_lRF_stack = sp.zeros(lRF_stack.size)
TVar_lRF_stack = sp.zeros(lRF_stack.size)
n_lRF_stack = sp.zeros(lRF_stack.size)
T_lRF_stack_delta = sp.zeros(lRF_stack.size)
TVar_lRF_stack_delta = sp.zeros(lRF_stack.size)
n_lRF_stack_delta = sp.zeros(lRF_stack.size)



In [ ]:

    
### Get the transmission
for f in fs:
    h = fitsio.FITS(f)
    z = h['METADATA']['Z'][:]
    lObs = h['WAVELENGTH'].read()
    trans = h['TRANSMISSION'].read()
    lRF = lObs/(1.+z[:,None])
    lObs = lObs*sp.ones(z.size)[:,None]
    w = lRF<lminforest
    lObs = lObs[w]
    lRF = lRF[w]
    trans = trans[w]

    bins = ( ( lObs-lObs_min )/(lObs_max-lObs_min)*lObs_stack.size ).astype(int)
    T_stack += sp.bincount(bins,weights=trans,minlength=lObs_stack.size)
    TVar_stack += sp.bincount(bins,weights=trans**2,minlength=lObs_stack.size)
    n_stack += sp.bincount(bins,minlength=lObs_stack.size)

    bins = ( ( lRF-lRF_min )/(lRF_max-lRF_min)*lRF_stack.size ).astype(int)
    T_lRF_stack += sp.bincount(bins,weights=trans,minlength=lRF_stack.size)
    TVar_lRF_stack += sp.bincount(bins,weights=trans**2,minlength=lRF_stack.size)
    n_lRF_stack += sp.bincount(bins,minlength=lRF_stack.size)



In [ ]:

    
### Normalize the vectors (obseved wavelength)
w = n_stack>0.
T_stack[w] /= n_stack[w]
TVar_stack[w] /= n_stack[w]
TVar_stack -= T_stack**2
trans_vs_lObs = interp1d(lObs_stack[w],T_stack[w],fill_value='extrapolate',kind='nearest')

### Normalize the vectors (rest-frame wavelength)
w = n_lRF_stack>0.
T_lRF_stack[w] /= n_lRF_stack[w]
TVar_lRF_stack[w] /= n_lRF_stack[w]
TVar_lRF_stack -= T_lRF_stack**2



In [ ]:

    
### Get the delta
for f in fs:
    h = fitsio.FITS(f)
    z = h['METADATA']['Z'][:]
    lObs = h['WAVELENGTH'].read()
    trans = h['TRANSMISSION'].read()
    h.close()
    lRF = lObs/(1.+z[:,None])
    lObs = lObs*sp.ones(z.size)[:,None]
    w = lRF<lminforest
    lObs = lObs[w]
    lRF = lRF[w]
    trans = trans[w]/trans_vs_lObs(lObs)-1.

    bins = ( ( lObs-lObs_min )/(lObs_max-lObs_min)*lObs_stack.size ).astype(int)
    T_stack_delta += sp.bincount(bins,weights=trans,minlength=lObs_stack.size)
    TVar_stack_delta += sp.bincount(bins,weights=trans**2,minlength=lObs_stack.size)
    n_stack_delta += sp.bincount(bins,minlength=lObs_stack.size)

    bins = ( ( lRF-lRF_min )/(lRF_max-lRF_min)*lRF_stack.size ).astype(int)
    T_lRF_stack_delta += sp.bincount(bins,weights=trans,minlength=lRF_stack.size)
    TVar_lRF_stack_delta += sp.bincount(bins,weights=trans**2,minlength=lRF_stack.size)
    n_lRF_stack_delta += sp.bincount(bins,minlength=lRF_stack.size)



In [ ]:

    
### Normalize the vectors (obseved wavelength)
w = n_stack_delta>0.
T_stack_delta[w] /= n_stack_delta[w]
TVar_stack_delta[w] /= n_stack_delta[w]
TVar_stack_delta -= T_stack_delta**2

### Normalize the vectors (rest-frame wavelength)
w = n_lRF_stack_delta>0.
T_lRF_stack_delta[w] /= n_lRF_stack_delta[w]
TVar_lRF_stack_delta[w] /= n_lRF_stack_delta[w]
TVar_lRF_stack_delta -= T_lRF_stack_delta**2



In [ ]:

    
### from measurement on data
### The input mean flux that we were trying to match corresponds
### to F(z=2.25)=0.8, and a slope in optical depth of (1+z)^3.2.
A = -sp.log(0.8)/(1.+2.25)**3.2
B = 3.2
print(' optical depth = ', A, B)
T_measure = sp.exp(-A*(1.+z_stack)**B)
DVar_measure = 0.0018*(1.+z_stack)**(2.*2.9-2.)
TVar_measure = DVar_measure*T_measure**2

Transmission vs. redshift



In [ ]:

    
w = n_stack>0.
plt.plot(z_stack[w],T_stack[w],linewidth=2,label=r'$\mathrm{Mocks: \, T}$')
plt.plot(z_stack[w],T_measure[w],linewidth=2,label=r'$T = e^{-0.0051(1+z)^{3.2}}$')
w = n_stack_delta>0.
plt.plot(z_stack[w],T_stack_delta[w],linewidth=2,label=r'$\mathrm{Mocks: \, \delta}$')
plt.xlabel(r'$z$')
plt.ylabel(r'$T(z)$')
plt.legend()
plt.grid()
_=plt.ylim(-1,1)

Variance of the transmission vs. of redshift



In [ ]:

    
w = n_stack>0.
plt.plot(z_stack[w],TVar_stack[w],linewidth=2,label=r'$\mathrm{Mocks:\,T}$')
plt.plot(z_stack[w],TVar_measure[w],linewidth=2,label=r'$Var(T) = 0.0018 (1+z)^{2*2.9-2} * <T(z)>^{2}$')
w = n_stack_delta>0.
plt.plot(z_stack[w],TVar_stack_delta[w],linewidth=2,label=r'$\mathrm{Mocks:\,\delta}$')
plt.plot(z_stack[w],DVar_measure[w],linewidth=2,label=r'$Var(\delta) = 0.0018 (1+z)^{2*2.9-2}$')
plt.xlabel(r'$z$')
plt.ylabel(r'$Var(T)(z)$')
plt.legend()
plt.grid()
_=plt.ylim(0,1)

Transmission vs. rest-frame wavelength



In [ ]:

    
w = n_lRF_stack>0.
plt.plot(lRF_stack[w],T_lRF_stack[w],linewidth=2,label=r'$\mathrm{Mocks:\,T}$')
w = n_lRF_stack_delta>0.
plt.plot(lRF_stack[w],T_lRF_stack_delta[w],linewidth=2,label=r'$\mathrm{Mocks:\,\delta}$')
plt.xlabel(r'$\lambda_{\mathrm{R.F.}}$')
plt.ylabel(r'$T$')
plt.legend()
plt.ylim(-0.2,1)
plt.grid()

Variance of the transmission vs. rest-frame wavelength



In [ ]:

    
w = n_lRF_stack>0.
plt.plot(lRF_stack[w],TVar_lRF_stack[w],linewidth=2,label=r'$\mathrm{Mocks:\,T}$')
w = n_lRF_stack_delta>0.
plt.plot(lRF_stack[w],TVar_lRF_stack_delta[w],linewidth=2,label=r'$\mathrm{Mocks:\,\delta}$')
plt.xlabel(r'$\lambda_{\mathrm{R.F.}}$')
plt.ylabel(r'$Var(T)$')
plt.legend()
plt.grid()

Example spectrum



In [ ]:

    
### Get the different path to spectrum
trans_path = mock_base+'/0/0/transmission-16-0.fits'
spec_path = mock_base+quick_ver+'/spectra-16/0/0/spectra-16-0.fits'



In [ ]:

    
### Get delta nside
h = fitsio.FITS(mock['delta_attributes'])
nside = h[1].read_header()['NSIDE']
h.close()



In [ ]:

    
### Get the spectrum
h = fitsio.FITS(spec_path)
hra = h['FIBERMAP']['RA_TARGET'][:]
hde = h['FIBERMAP']['DEC_TARGET'][:]
tid = h['FIBERMAP']['TARGETID'][:]
b_ll = sp.log10(h['B_WAVELENGTH'].read())
b_iv = h['B_IVAR'].read()*(h['B_MASK'].read()==0)
b_fl = h['B_FLUX'].read()
r_ll = sp.log10(h['R_WAVELENGTH'].read())
r_iv = h['R_IVAR'].read()*(h['R_MASK'].read()==0)
r_fl = h['R_FLUX'].read()
z_ll = sp.log10(h['Z_WAVELENGTH'].read())
z_iv = h['Z_IVAR'].read()*(h['Z_MASK'].read()==0)
z_fl = h['Z_FLUX'].read()



In [ ]:

    
### Get Transmission
h = fitsio.FITS(trans_path)
z = h['METADATA']['Z'][:]
ra = h['METADATA']['RA'][:]
de = h['METADATA']['DEC'][:]
lObs = h['WAVELENGTH'].read()
trans = h['TRANSMISSION'].read()
h.close()
lRF = lObs/(1.+z[:,None])
lObs = lObs*sp.ones(z.size)[:,None]



In [ ]:

    
### Plot the first n spectra
for idx in range(1):
    ra_idx = hra[idx]
    de_idx = hde[idx]
    z_idx = z[ra==hra[idx]][0]
    tid_idx = tid[idx]

    delt_pix = healpy.ang2pix(nside, sp.pi/2.-de_idx*sp.pi/180., ra_idx*sp.pi/180.)
    delta = fitsio.FITS(mock['delta']+'/delta-'+str(delt_pix)+'.fits.gz')
    deltaa = delta[str(tid_idx)]

    nidx = sp.arange(ra.size)[ra==ra_idx][0]
    plt.plot(10**b_ll,b_fl[idx,:],color='blue',label=r'$\mathrm{Spectrum B}$',linewidth=1)
    plt.plot(10**r_ll,r_fl[idx,:],color='red',label=r'$\mathrm{Spectrum R}$',linewidth=1)
    plt.plot(10**z_ll,z_fl[idx,:],color='orange',label=r'$\mathrm{Spectrum Z}$',linewidth=1)
    plt.plot(lObs[nidx,:],trans[nidx,:],color='black',label=r'$\mathrm{Transmission}$',linewidth=1)
    plt.plot(10**deltaa['LOGLAM'][:],deltaa['DELTA'][:],color='green',label=r'$\mathrm{Delta}$',linewidth=1)
    plt.plot(10**deltaa['LOGLAM'][:],deltaa['CONT'][:],color='violet',label=r'$\mathrm{Continuum}$',linewidth=1)
    
    plt.title(r'$z_{q} = '+str(z[ra==hra[idx]][0])+'$')
    plt.xlabel(r'$\lambda_{\mathrm{Obs.}} \, [\AA]$')
    plt.ylabel(r'$Flux$')
    plt.legend(ncol=2,fontsize=10)
    plt.grid()

The delta field



In [ ]:

    
### List of all the delta attributes functions to plot
all_delta_attributes = {'Mock':mock['delta_attributes'], 'Data':data['delta_attributes']}

Flux stack vs. observed wavelength



In [ ]:

    
for name,path in all_delta_attributes.items():
    h = fitsio.FITS(path)
    loglam = h[1]['LOGLAM'][:]
    stack  = h[1]['STACK'][:]
    w = h[1]['WEIGHT'][:]>0.
    loglam = loglam[w]
    stack  = stack[w]
    plt.plot(10.**loglam, stack, linewidth=4,marker='o',label=name)
    h.close()
plt.xlabel(r'$\lambda_{\mathrm{Obs.}} \, [\AA]$')
plt.ylabel(r'$\mathrm{\overline{Flux}}$')
plt.legend()
plt.grid()

Mean continuum



In [ ]:

    
for name,path in all_delta_attributes.items():
    h = fitsio.FITS(path)
    loglam_rest = h[3]['LOGLAM_REST'][:]
    mean_cont = h[3]['MEAN_CONT'][:]
    w = h[3]['WEIGHT'][:]>0.
    loglam_rest = loglam_rest[w]
    mean_cont = mean_cont[w]
    plt.plot(10.**loglam_rest, mean_cont, linewidth=4,marker='o',label=name)
    h.close()
plt.xlabel(r'$\lambda_{\mathrm{R.F.}} \, [\AA]$')
plt.ylabel(r'$\mathrm{\overline{Flux}}$')
plt.legend()
plt.grid()

Correction to the pipeline variance



In [ ]:

    
for name,path in all_delta_attributes.items():
    h = fitsio.FITS(path)
    loglam = h[2]['LOGLAM'][:]
    eta = h[2]['ETA'][:]
    w = h[2]['NB_PIXELS'][:]>0.
    loglam = loglam[w]
    eta = eta[w]
    plt.errorbar(10.**loglam, eta, linewidth=4,label=name)
    h.close()
plt.xlabel(r'$\lambda_{\mathrm{Obs.}} \, [\AA]$')
plt.ylabel(r'$\eta$')
plt.legend()
plt.grid()

Large Scale Structure variance



In [ ]:

    
for name,path in all_delta_attributes.items():
    h = fitsio.FITS(path)
    loglam = h[2]['LOGLAM'][:]
    var_lss = h[2]['VAR_LSS'][:]
    w = h[2]['NB_PIXELS'][:]>0.
    loglam = loglam[w]
    var_lss = var_lss[w]
    plt.errorbar(10.**loglam, var_lss, linewidth=4,label=name)
    h.close()
plt.xlabel(r'$\lambda_{\mathrm{Obs.}} \, [\AA]$')
plt.ylabel(r'$\sigma^{2}_{\mathrm{LSS}}$')
plt.legend()
plt.grid()

Let's look at the contribution of the variation of the continuum



In [ ]:

    
for name,path in all_delta_attributes.items():
    h = fitsio.FITS(path)
    loglam = h[2]['LOGLAM'][:]
    fudge = h[2]['FUDGE'][:]
    w = h[2]['NB_PIXELS'][:]>0
    loglam = loglam[w]
    fudge = fudge[w]
    plt.errorbar(10.**loglam, fudge, linewidth=4,label=name)
    h.close()
plt.xlabel(r'$\lambda_{\mathrm{Obs.}} \, [\AA]$')
plt.ylabel(r'$\mathrm{Fudge}$')
plt.legend()
plt.grid()

1D Correlation



In [ ]:

    
### List of all the correlation to plot
all_cf1d = {'Mock':mock['cf1d'], 'Data':data['cf1d']}

The variance of delta in the forest



In [ ]:

    
### The variance
for name,path in all_cf1d.items():
    try:
        h = fitsio.FITS(path)
    except:
        print('Couldnt find file {}, skipping'.format(path))
        continue
    head = h[1].read_header() #- in the future replace '1' by '1DCOR'
    llmin = head['LLMIN']
    llmax = head['LLMAX']
    dll = head['DLL']
    n1d = int((llmax-llmin)/dll+1)
    x = sp.arange(0.,n1d)*dll+llmin
    y = h[1]['v1d'][:]
    w = h[1]['wv1d'][:]>0.
    x = x[w]
    y = y[w]
    plt.plot(10.**x,y,linewidth=4,label=name)
    h.close()
plt.xlabel(r'$\lambda_{\mathrm{Obs.}} \, [\AA] $')
plt.ylabel(r'$\xi^{ff,1D}$')
plt.legend()
plt.grid()

The normalized correlation in the forest



In [ ]:

    
### The normalized correlation
f, (ax1, ax2) = plt.subplots(nrows=2, ncols=1, figsize=(14,14))

### all
for name,path in all_cf1d.items():
    try:
        h = fitsio.FITS(path)
    except:
        print('cant find {}, skipping'.format(path))
        continue
    y = h[1]['c1d'][:] #- in the future replace '1' by '1DCOR'
    binsize = dll
    bins = sp.arange(y.size)
    x = sp.power(10,bins*binsize)
    w = h[1]['nc1d'][:]>0.
    x = x[w]
    y = y[w]
    ax1.errorbar(x,y,marker='o',linewidth=4,label=name)
    h.close()
ax1.set_xlabel(r'$\lambda_{1}/\lambda_{2}$')
ax1.set_ylabel(r'$\xi^{ff,1D}_{normed}(1,2)$')
ax1.legend()
ax1.grid()

### zoom
ymin = 1.e6
ymax = -1.e6
for name,path in all_cf1d.items():
    try:
        h = fitsio.FITS(path)
    except:
        print("Can't find file {}, skipping".format(path))
        continue
    y = h[1]['c1d'][:] #- in the future replace '1' by '1DCOR'
    binsize = dll
    bins = sp.arange(y.size)
    x = sp.power(10,bins*binsize)
    w = h[1]['nc1d'][:]>0.
    x = x[w]
    y = y[w]
    ymin = min(ymin,y.min())
    ymax = max(ymax,y[y!=1.].max())
    ax2.errorbar(x,y,marker='o',linewidth=4,label=name)
    h.close()
ax2.set_xlim([0.999,1.05])
ax2.set_ylim([ymin-0.05,ymax+0.05])
ax2.set_xlabel(r'$\lambda_{1}/\lambda_{2}$')
ax2.set_ylabel(r'$\xi^{ff,1D}_{normed}(1,2)$')
ax2.legend()
ax2.grid()

plt.subplots_adjust(hspace=0.4)

Fit results

The cell below will print out a table with the fit parameters for the auto results. If you want the cross results instead change the first line to

fi = sorted(glob.glob("xcf*.h5"))



In [ ]:

    
fi = sorted(glob.glob(mock['raw_path']+'/cf_z_2.2_2.4-exp.h5'))
#fi = [mock['raw_path']+'/cf_z_{}_{}-exp.h5'.format(round(zmin,3),round(zmin+0.2,3)) for zmin in sp.arange(2.2,3,0.2)]
ff = h5py.File(fi[0],'r')
cosmo_pars = ["bias_LYA","beta_LYA","bias_QSO","beta_QSO","ap","at"]
all_pars = [p.decode('utf-8') for p in ff['best fit'].attrs['list of free pars']]
bname0 = fi[0].replace(".h5","")
ff.close()

print("{:20}".format(""),end="")
for f in fi:
    print("{:^20}".format(os.path.basename(f).replace("-exp.h5","")),end="")
print("")

print("{:20}".format("zeff"),end="")
for f in fi:
    ff = h5py.File(f,'r')
    try:
        zeff = ff['best fit'].attrs['zeff']
    except:
        print("PROBLEM: {}".format(f))
        continue
    out = "{}".format(round(zeff,2))
    print("{:^20}".format(out),end="")
    ff.close()
print("")

for par in cosmo_pars:
    print("{:20}".format(par),end="")
    for f in fi:
        ff = h5py.File(f,'r')
        if par in ff['best fit'].attrs:
            p,dp = ff['best fit'].attrs[par]
            out = "{} +/- {}".format(round(p,3), round(dp,3))
            print("{:^20}".format(out),end="")
        ff.close()
    print("")
    
for par in all_pars:
    if par in cosmo_pars:
        continue
    print("{:20}".format(par[:20]),end="")
    for f in fi:
        ff = h5py.File(f,'r')
        bname = f.replace(".h5","")
        pa = par.replace(bname0,bname)
        if pa in ff['best fit'].attrs:
            p,dp = ff['best fit'].attrs[pa]
            out = "{} +/- {}".format(round(p,3), round(dp,3))
            print("{:^20}".format(out),end="")
        ff.close()
    print("")
print("{:20}".format("chi2/(ndata-npar)"),end="")
for f in fi:
    ff = h5py.File(f,'r')
    chi2 = ff['best fit'].attrs['fval']
    ndata = ff['best fit'].attrs['ndata']
    npar = ff['best fit'].attrs['npar']
    out = "{}/({}-{})".format(round(chi2,1),ndata,npar)
    print("{:^20}".format(out),end="")
    ff.close()
print("")
print("{:20}".format("probability"),end="")
for f in fi:
    ff = h5py.File(f,'r')
    chi2 = ff['best fit'].attrs['fval']
    ndata = ff['best fit'].attrs['ndata']
    npar = ff['best fit'].attrs['npar']
    proba = 1.-sp.stats.chi2.cdf(chi2,ndata-npar)
    out = "{}".format(round(proba,2))
    print("{:^20}".format(out),end="")
    ff.close()
print("")

Wedge plots

The cells below display wedge plots for a given file.

Change the value of base to the file you wish to display.



In [ ]:

    
base = 'cf_xcf_z_0_10-exp'
toPlot = {}
toPlot['Mock'] = { 'CORR':mock['raw_path']+'/'+base+'.fits', 
                  'FIT':mock['raw_path']+'/'+base+'.h5', 
                  'NAME':base }
#toPlot['Mock'] = { 'CORR':mock['raw_path']+'/'+base+'.fits', 
#                  'FIT':mock['raw_path']+'/cf_xcf_z_0_10-exp.h5', 
#                  'NAME':base }
#toPlot['Data'] = { 'CORR':data['cf'], 
#                  'FIT':data['fit_cf'], 
#                  'NAME': 'LYA(LYA)xLYA(LYA)'}



In [ ]:

    
mus=[0., 0.5, 0.8, 0.95, 1.]
for mumin,mumax in zip(mus[:-1],mus[1:]):
    for name,paths in toPlot.items():
        h = fitsio.FITS(paths['CORR'])
        ff = h5py.File(paths['FIT'],'r')
        fit = ff[paths['NAME']+'/fit'][...]
        ff.close()
        da = h[1]['DA'][:]
        co = h[1]['CO'][:]
        rpmin = h[1].read_header()['RPMIN']
        np = h[1].read_header()['NP']
        h.close()
        ff.close()
        b = wedgize.wedge(mumin=mumin,mumax=mumax,rpmin=rpmin,nrp=np,absoluteMu=True)
        r,d,c = b.wedge(da,co)
        r,f,_ = b.wedge(fit,co)
        plt.errorbar(r,d*r**2,yerr=sp.sqrt(c.diagonal())*r**2,fmt="o",label=r'$\mathrm{'+name+'}$')
        plt.plot(r,f*r**2)
    plt.ylabel(r"$r^2\xi(r)$")
    plt.xlabel(r"$r~[\mathrm{Mpc/h}]$")
    plt.title(r"${}<\mu<{}$".format(mumin,mumax),fontsize=16)
    plt.legend()
    plt.grid()
    plt.show()



In [ ]:

    
## draw contours if they are there
from picca.constants import cosmo
zeff = 2.4
#h_dr9 = 0.7
Om_fid = 0.3147
#cosmo_dr9 = cosmo(0.27)
#DH_dr9 = c/h_dr9/cosmo_dr9.hubble(zeff)
#DM_dr9 = cosmo_dr9.r_comoving(zeff)/h_dr9
#rs_dr9 = 149.66 ## Mpc

h = 0.6731
cosmo = cosmo(Om_fid)
rs_fid = 147.33 ## Mpc
c = 300000 ## speed of light in km/s
DH_fid = c/h/cosmo.hubble(zeff)
DM_fid = cosmo.r_comoving(zeff)/h
ff = h5py.File(toPlot['Mock']['FIT'],'r')
chi2_min = ff['best fit'].attrs['fval']
ap_best = ff['best fit'].attrs['ap']
at_best = ff['best fit'].attrs['at']
DH_best = ap_best*DH_fid
DM_best = at_best*DM_fid
ap_min = ff['chi2 scan/ap'].attrs['min']
ap_max = ff['chi2 scan/ap'].attrs['max']
nap = ff['chi2 scan/ap'].attrs['nb_bin']
ap = sp.linspace(ap_min,ap_max,nap)
DH = ap*DH_fid
at_min = ff['chi2 scan/at'].attrs['min']
at_max = ff['chi2 scan/at'].attrs['max']
nat = ff['chi2 scan/ap'].attrs['nb_bin']
at = sp.linspace(at_min,at_max,nat)
DM = at*DM_fid
index = ff['chi2 scan/result'].attrs['fval']
plt.contour(DM/rs_fid,DH/rs_fid,ff['chi2 scan/result/values'][...][:,index].reshape(nat,-1)-chi2_min,
           levels=[2.3,6,11.8],colors=['b','b','b'])
plt.errorbar([DM_best[0]/rs_fid],[DH_best[0]/rs_fid],
             xerr=[DM_best[1]/rs_fid],yerr=[DH_best[1]/rs_fid],fmt="o")
plt.plot([DM_fid/rs_fid],[DH_fid/rs_fid],"o",ms=10)
#plt.plot([DM_dr9/rs_dr9],[DH_dr9/rs_dr9],"o",ms=10)
plt.xlabel(r'$D_M/r_s$',fontsize=40)
plt.ylabel(r'$D_H/r_s$',fontsize=40)
plt.grid()
ff.close()

Residuals

The following cells plot the residuals = (data-fit)/sigma in 2D and their histogram.



In [ ]:

    
h = fitsio.FITS(mock['raw_path']+"/"+base+".fits")
rpmin = h[1].read_header()["RPMIN"]
rtmin = 0
rpmax = h[1].read_header()["RPMAX"]
rtmax = h[1].read_header()["RTMAX"]
nrp = h[1].read_header()["NP"]
nrt = h[1].read_header()["NT"]
da = h[1]['DA'][:]
rp = h[1]['RP'][:]
rt = h[1]['RT'][:]
r = sp.sqrt(rp**2+rt**2)
dda = sp.sqrt(h[1]['CO'][:].diagonal())
ff = h5py.File(toPlot['Mock']['FIT'],'r')
fit = ff[base+"/fit"][...]
plt.imshow(((da-fit)/dda).reshape(nrt,nrp),
           extent=[rtmin,rtmax,rpmin,rpmax],
           origin='lower',interpolation='nearest',aspect='auto')
plt.colorbar()
plt.show()
_=plt.hist((da-fit)/dda,bins=100)
print('std: {}'.format(sp.std((da-fit)/dda)))



In [ ]:

    
### Hubble's diagram
c = 300000. ## speed of light in km/s
rs_fid = 147.33 ## sound horizon in Mpc/h (Planck's)

z = sp.arange(0,4,0.1)
plt.plot(z,cosmo.hubble(z)/(1+z),"--")
plt.ylabel(r"$\dot{a}\frac{r_s}{r^{fid}_s}\mathrm{[kms^{-1}Mpc^{-1}]}$",fontsize=20)
plt.xlabel("redshift")
fi = ["cf_z_2.2_2.4-exp","cf_z_2.4_2.6-exp","cf_z_2.6_2.8-exp",
      "cf_z_2.8_3.0-exp","cf_z_3.0_3.2-exp","cf_z_3.2_5.0-exp"]
fi += ["xcf_z_0_2.2-exp","xcf_z_2.2_2.4-exp",'xcf_z_2.6_2.8-exp']
fi = [mock['raw_path']+"/"+f+".fits" for f in fi]

for f in fi:
    h = fitsio.FITS(f)
    rp = h[1]['RP'][:]
    rt = h[1]['RT'][:]
    z  = h[1]['Z'][:]
    r = sp.sqrt(rp**2+rt**2)
    w = (r>80) & (r<120)
    zeff = z[w].mean()
    try:
        ff = h5py.File(f.replace('.fits','.h5'),'r')
    except:
        continue
    ap,dap = ff['best fit'].attrs['ap']
    H = cosmo.hubble(zeff)/(1+zeff)/ap
    dH = H*dap/ap
    style="ro"
    if 'xcf' in f:
        style="m^"
    plt.errorbar([zeff],[H],yerr=[dH],fmt=style)
plt.title("Jim's plot",fontsize=16)
plt.grid()

Slices plots



In [ ]:

    
base = 'cf_z_2.65_3.05-exp'
toPlot = {}
toPlot['Mock'] = { 'CORR':mock['raw_path']+'/'+base+'.fits', 'FIT':mock['raw_path']+'/'+base+'.h5', 'NAME':base }
#toPlot['Data'] = { 'CORR':data['cf'], 'FIT':data['fit_cf'], 'NAME': 'LYA(LYA)xLYA(LYA)'}



In [ ]:

    
slices = sp.arange(10)
for s in slices:
    for name,paths in toPlot.items():
        h = fitsio.FITS(paths['CORR'])
        ff = h5py.File(paths['FIT'],'r')
        fit = ff[paths['NAME']+'/fit'][...]
        rp = h[1]['RP'][:]
        rt = h[1]['RT'][:]
        drt = h[1].read_header()['RTMAX'] / h[1].read_header()['NT']
        da = h[1]['DA'][:]
        er = sp.sqrt(sp.diag(h[1]['CO'][:]))
        h.close()
        ff.close()
        w = (rp>=s*drt) & (rp<(s+1)*drt)
        rp = rp[w]
        rt = rt[w]
        da = da[w]
        fit = fit[w]
        er = er[w]
        plt.errorbar(rt,da,yerr=er,fmt="o",label=r'$\mathrm{'+name+'}$')
        plt.plot(rt,fit)
        ff.close()
    plt.ylabel(r"$\xi(r)$")
    plt.xlabel(r"$r_{\parallel}~[\mathrm{Mpc/h}]$")
    plt.title(r"${}<r_\perp<{}$".format(int(s*drt),int((s+1)*drt)),fontsize=16)
    plt.legend()
    plt.grid()
    plt.xlim(0,25)
    plt.show()

The covariance matrices

Let's compare the covariance matrices of the data and of the mocks



In [ ]:

    
### The auto-correlation
toPlot = {}
toPlot['Mock'] = { 'CORR':mock['raw_path']+'/xcf_z_0_2.35-exp.fits'}
#toPlot['Data'] = { 'CORR':data['cf']}
for name,paths in toPlot.items():
    h = fitsio.FITS(paths['CORR'])
    nb = h[1]['NB'][:]
    co = sp.diag(h[1]['CO'][:])
    plt.plot(nb*co,label=r'$\mathrm{'+name+'}$')
plt.xlabel(r"$\mathrm{bin\,index}$")
plt.ylabel(r"$C_{AA} N_{b}$")
plt.legend()
plt.grid()
plt.show()



In [ ]:

    
### The cross-correlation
toPlot = {}
toPlot['Mock'] = { 'CORR':mock['raw_path']+'/xcf_z_2.65_3.05-exp.fits'}
#toPlot['Data'] = { 'CORR':data['xcf']}
for name,paths in toPlot.items():
    h = fitsio.FITS(paths['CORR'])
    nb = h[1]['NB'][:]
    co = sp.diag(h[1]['CO'][:])
    plt.plot(nb*co,label=r'$\mathrm{'+name+'}$')
plt.xlabel(r"$\mathrm{bin\,index}$")
plt.ylabel(r"$C_{AA} N_{b}$")
plt.legend()
plt.grid()

The correlation matrix

Let's look at the difference between the data and the mocks on the correlation matrix



In [ ]:

    
### The auto-correlation
toPlot = {}
toPlot['Mock'] = { 'CORR':mock['raw_path']+'/cf_z_2.65_3.05-exp.fits'}
#toPlot['Data'] = { 'CORR':data['cf']}

### Load
corr = {}
for name,paths in toPlot.items():
    corr[name] = {}
    h = fitsio.FITS(paths['CORR'])
    corr[name]['CO'] = h[1]['CO'][:]
    tvar = sp.diag(corr[name]['CO'])
    corr[name]['NBBIN'] = tvar.size
    corr[name]['NT'] = h[1].read_header()['NT']
    corr[name]['BINSIZE'] = h[1].read_header()['RTMAX']/h[1].read_header()['NT']
    tcor = corr[name]['CO']/sp.sqrt(tvar*tvar[:,None])
    corr[name]['COR'] = tcor.copy()
    h.close()

### Plot
ymin = 1.
ymax = -1.
for j in range(2):
    for name,tc in corr.items():
        c = tc['COR'].copy()
        a = sp.array( [ sp.mean(sp.diagonal(c, offset=(tc['NT']*i+j))) for i in range(tc['NBBIN']//tc['NT']) ])
        label = '\mathrm{'+name+': \, \Delta r_{\perp} = '+str(int(tc['BINSIZE']*j))+'\,h^{-1}\, \mathrm{Mpc}}'
        plt.plot(tc['BINSIZE']*sp.arange(tc['NBBIN']//tc['NT']),a,linewidth=4,marker='o',markersize=8, label=r'$'+label+'$')
        ymin = min(ymin,a.min())
        ymax = max(ymax,a[a!=1.].max())
plt.xlabel(r'$\Delta r_{\parallel} \, [h^{-1}\, \mathrm{Mpc}]$')
plt.ylabel(r'$Corr_{AB}$')
plt.xlim([-1.,100.])
plt.ylim([ymin-0.01,ymax+0.01])
plt.legend()
plt.grid()
plt.show()



In [ ]:

    
### The cross-correlation
toPlot = {}
toPlot['Mock'] = { 'CORR':mock['raw_path']+'/xcf_z_2.65_3.05-exp.fits'}
#toPlot['Data'] = { 'CORR':data['xcf']}

### Load
corr = {}
for name,paths in toPlot.items():
    corr[name] = {}
    h = fitsio.FITS(paths['CORR'])
    corr[name]['CO'] = h[1]['CO'][:]
    tvar = sp.diag(corr[name]['CO'])
    corr[name]['NBBIN'] = tvar.size
    corr[name]['NT'] = h[1].read_header()['NT']
    corr[name]['BINSIZE'] = h[1].read_header()['RTMAX']/h[1].read_header()['NT']
    tcor = corr[name]['CO']/sp.sqrt(tvar*tvar[:,None])
    corr[name]['COR'] = tcor.copy()
    h.close()

### Plot
ymin = 1.
ymax = -1.
for j in range(2):
    for name,tc in corr.items():
        c = tc['COR'].copy()
        a = sp.array( [ sp.mean(sp.diagonal(c, offset=(tc['NT']*i+j))) for i in range(tc['NBBIN']//tc['NT']) ])
        label = '\mathrm{'+name+': \, \Delta r_{\perp} = '+str(int(tc['BINSIZE']*j))+'\,h^{-1}\, \mathrm{Mpc}}'
        plt.plot(tc['BINSIZE']*sp.arange(tc['NBBIN']//tc['NT']),a,linewidth=4,marker='o',markersize=8, label=r'$'+label+'$')
        ymin = min(ymin,a.min())
        ymax = max(ymax,a[a!=1.].max())
plt.xlabel(r'$\Delta r_{\parallel} \, [h^{-1}\, \mathrm{Mpc}]$')
plt.ylabel(r'$Corr_{AB}$')
plt.xlim([-1.,100.])
plt.ylim([ymin-0.01,ymax+0.01])
plt.legend()
plt.grid()
plt.show()



In [ ]: