ML match for LOFAR and the combined PanSTARRS WISE catalogue: Get all sources above the threshold

In this notebook the maximum likelihood cross-match between the LOFAR HETDEX catalogue and the combined PansSTARRS WISE catalogue is computed. However, we take all the sources that have likelihood ratios above the threshold found in the first run.

Configuration

Load libraries and setup


In [1]:
import numpy as np
from astropy.table import Table, join
from astropy import units as u
from astropy.coordinates import SkyCoord, search_around_sky
from IPython.display import clear_output
import pickle
import os

In [2]:
from mltier1 import (get_center, get_n_m, estimate_q_m, Field, MultiMLEstimatorU,
                     parallel_process, get_sigma_all, get_q_m)

In [3]:
%load_ext autoreload

In [4]:
%autoreload

In [5]:
from IPython.display import clear_output

In [6]:
%pylab inline


Populating the interactive namespace from numpy and matplotlib

General configuration


In [7]:
save_intermediate = True
plot_intermediate = True

In [8]:
idp = "idata/final"

In [9]:
if not os.path.isdir(idp):
    os.makedirs(idp)

Area limits


In [10]:
# Busy week Edinburgh 2017
ra_down = 172.09
ra_up = 187.5833
dec_down = 46.106
dec_up = 56.1611

In [11]:
# Busy week Hatfield 2017
ra_down = 170.
ra_up = 190.
dec_down = 46.8
dec_up = 55.9

In [12]:
# Full field July 2017
ra_down = 160.
ra_up = 232.
dec_down = 42.
dec_up = 62.

In [13]:
field = Field(170.0, 190.0, 46.8, 55.9)

In [14]:
field_full = Field(160.0, 232.0, 42.0, 62.0)

Load data


In [15]:
combined_all = Table.read("pw.fits")

In [16]:
lofar_all = Table.read("data/LOFAR_HBA_T1_DR1_catalog_v0.9.srl.fits")

In [17]:
np.array(combined_all.colnames)


Out[17]:
array(['AllWISE', 'objID', 'ra', 'dec', 'raErr', 'decErr', 'W1mag',
       'W1magErr', 'i', 'iErr'],
      dtype='<U8')

In [18]:
np.array(lofar_all.colnames)


Out[18]:
array(['Source_Name', 'RA', 'E_RA', 'E_RA_tot', 'DEC', 'E_DEC',
       'E_DEC_tot', 'Peak_flux', 'E_Peak_flux', 'E_Peak_flux_tot',
       'Total_flux', 'E_Total_flux', 'E_Total_flux_tot', 'Maj', 'E_Maj',
       'Min', 'E_Min', 'PA', 'E_PA', 'Isl_rms', 'S_Code', 'Mosaic_ID',
       'Isl_id'],
      dtype='<U16')

Filter catalogues


In [19]:
lofar = field_full.filter_catalogue(lofar_all, colnames=("RA", "DEC"))

In [20]:
combined = field_full.filter_catalogue(combined_all, 
                               colnames=("ra", "dec"))

Additional data


In [21]:
combined["colour"] = combined["i"] - combined["W1mag"]

In [22]:
combined_aux_index = np.arange(len(combined))

Sky coordinates


In [23]:
coords_combined = SkyCoord(combined['ra'], 
                           combined['dec'], 
                           unit=(u.deg, u.deg), 
                           frame='icrs')

In [24]:
coords_lofar = SkyCoord(lofar['RA'], 
                       lofar['DEC'], 
                       unit=(u.deg, u.deg), 
                       frame='icrs')

Class of sources in the combined catalogue

The sources are grouped depending on the available photometric data.


In [25]:
combined_matched = (~np.isnan(combined["i"]) & ~np.isnan(combined["W1mag"])) # Matched i-W1 sources

In [26]:
combined_panstarrs = (~np.isnan(combined["i"]) & np.isnan(combined["W1mag"])) # Sources with only i-band

In [27]:
combined_wise =(np.isnan(combined["i"]) & ~np.isnan(combined["W1mag"])) # Sources with only W1-band

In [28]:
combined_i = combined_matched | combined_panstarrs
combined_w1 = combined_matched | combined_wise
#combined_only_i = combined_panstarrs & ~combined_matched
#combined_only_w1 = combined_wise & ~combined_matched

In [29]:
print("Total    - ", len(combined))
print("i and W1 - ", np.sum(combined_matched))
print("Only i   - ", np.sum(combined_panstarrs))
print("With i   - ", np.sum(combined_i))
print("Only W1  - ", np.sum(combined_wise))
print("With W1  - ", np.sum(combined_w1))


Total    -  26674548
i and W1 -  8196213
Only i   -  13454849
With i   -  21651062
Only W1  -  5023475
With W1  -  13219688

Colour categories

The colour categories will be used after the first ML match


In [30]:
colour_limits = [-0.5, 0.0, 0.5, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 2.75, 3.0, 3.5, 4.0]

In [31]:
# Start with the W1-only, i-only and "less than lower colour" bins
colour_bin_def = [{"name":"only W1", "condition": combined_wise},
                  {"name":"only i", "condition": combined_panstarrs},
                  {"name":"-inf to {}".format(colour_limits[0]), 
                   "condition": (combined["colour"] < colour_limits[0])}]

# Get the colour bins
for i in range(len(colour_limits)-1):
    name = "{} to {}".format(colour_limits[i], colour_limits[i+1])
    condition = ((combined["colour"] >= colour_limits[i]) & 
                 (combined["colour"] < colour_limits[i+1]))
    colour_bin_def.append({"name":name, "condition":condition})

# Add the "more than higher colour" bin
colour_bin_def.append({"name":"{} to inf".format(colour_limits[-1]), 
                       "condition": (combined["colour"] >= colour_limits[-1])})


/disk2/jsm/prog/anaconda/envs/py36/lib/python3.6/site-packages/astropy/table/column.py:928: RuntimeWarning: invalid value encountered in less
  return getattr(self.data, oper)(other)
/disk2/jsm/prog/anaconda/envs/py36/lib/python3.6/site-packages/astropy/table/column.py:928: RuntimeWarning: invalid value encountered in greater_equal
  return getattr(self.data, oper)(other)

In [32]:
combined["category"] = np.nan
for i in range(len(colour_bin_def)):
    combined["category"][colour_bin_def[i]["condition"]] = i

In [33]:
np.sum(np.isnan(combined["category"]))


Out[33]:
11

We get the number of sources of the combined catalogue in each colour category. It will be used at a later stage to compute the $Q_0$ values


In [34]:
numbers_combined_bins = np.array([np.sum(a["condition"]) for a in colour_bin_def])

In [35]:
numbers_combined_bins


Out[35]:
array([ 5023475, 13454849,   614840,   434024,   679553,   911508,
         654963,   774322,   830266,   804997,   713558,   580621,
         438821,   309428,   319368,    95816,    34128])

Maximum Likelihood


In [36]:
bin_list, centers, Q_0_colour, n_m, q_m = pickle.load(open("lofar_params.pckl", "rb"))

In [37]:
likelihood_ratio_function = MultiMLEstimator(Q_0_colour, n_m, q_m, centers)

ML match


In [38]:
radius = 15

In [39]:
lr_threshold = 0.36

In [40]:
selection = ~np.isnan(combined["category"]) # Avoid the dreaded sources with no actual data
catalogue = combined[selection]

In [41]:
def apply_ml(i, likelihood_ratio_function):
    idx_0 = idx_i[idx_lofar == i]
    d2d_0 = d2d[idx_lofar == i]
    
    category = catalogue["category"][idx_0].astype(int)
    mag = catalogue["i"][idx_0]
    mag[category == 0] = catalogue["W1mag"][idx_0][category == 0]
    
    lofar_ra = lofar[i]["RA"]
    lofar_dec = lofar[i]["DEC"]
    lofar_pa = lofar[i]["PA"]
    lofar_maj_err = lofar[i]["E_Maj"]
    lofar_min_err = lofar[i]["E_Min"]
    c_ra = catalogue["ra"][idx_0]
    c_dec = catalogue["dec"][idx_0]
    c_ra_err = catalogue["raErr"][idx_0]
    c_dec_err = catalogue["decErr"][idx_0]
    
    sigma, sigma_maj, sigma_min = get_sigma_all(lofar_maj_err, lofar_min_err, lofar_pa, 
                      lofar_ra, lofar_dec, 
                      c_ra, c_dec, c_ra_err, c_dec_err)

    lr_0 = likelihood_ratio_function(mag, d2d_0.arcsec, sigma, sigma_maj, sigma_min, category)
    
    chosen_index = (lr_0 >= lr_threshold)
    result = [combined_aux_index[selection][idx_0[chosen_index]], # Index
              (d2d_0.arcsec)[chosen_index],                        # distance
              lr_0[chosen_index]]                                  # LR
    return result

Run the cross-match


In [42]:
idx_lofar, idx_i, d2d, d3d = search_around_sky(
    coords_lofar, coords_combined[selection], radius*u.arcsec)

In [43]:
idx_lofar_unique = np.unique(idx_lofar)

In [44]:
total_sources = len(idx_lofar_unique)

Run the ML matching


In [45]:
def ml(i):
    return apply_ml(i, likelihood_ratio_function)

In [46]:
res = parallel_process(idx_lofar_unique, ml, n_jobs=8)


100%|██████████| 312K/312K [2:28:19<00:00, 35.0it/s]
100%|██████████| 311874/311874 [00:00<00:00, 335946.24it/s]

Selection and match


In [47]:
lofar_aux_index = np.arange(len(lofar))

In [48]:
lr = []
lr_dist = []
lr_index = []
lr_order = []
lr_lofar_index = []

for i, idx in enumerate(idx_lofar_unique):
    result = res[i]
    n = len(result[0])
    lofar_index = lofar_aux_index[idx]
    if n > 0:
        order = np.argsort(result[2])[::-1]
        lr.extend(result[2][order])
        lr_dist.extend(result[1][order])
        lr_index.extend(result[0][order])
        lr_order.extend(np.arange(n, dtype=int) + 1)
        lr_lofar_index.extend(np.ones(n, dtype=int)*lofar_index)
    else:
        lr.append(np.nan)
        lr_dist.append(np.nan)
        lr_index.append(np.nan)
        lr_order.append(np.nan)
        lr_lofar_index.append(lofar_index)

In [49]:
aux_table = Table()
aux_table['aux_index'] = lr_lofar_index
aux_table['lr'] = lr
aux_table['lr_dist'] = lr_dist
aux_table['lr_index'] = lr_index
aux_table['lr_order'] = lr_order

In [50]:
aux_table


Out[50]:
<Table length=325770>
aux_indexlrlr_distlr_indexlr_order
int64float64float64float64float64
01224.065615560.25639326602910243809.01.0
1359.3782117420.5538360488510206441.01.0
2nannannannan
31719.367030.21210260936310195667.01.0
4529.0020755270.30059205896110171052.01.0
50.5807901353512.2431127513810318938.01.0
6614.1764290910.18120160523210320755.01.0
7112.0098933241.252085439610216972.01.0
8270.3929880581.2060631891510339350.01.0
85.673261378661.0770411174710339351.02.0
...............
325688nannannannan
325689nannannannan
3256900.4850059509813.239329535414562507.01.0
3256900.37148168006110.044136090726521592.02.0
3256911.53333942250.35909311848426352886.01.0
3256910.7372858336278.3612553912426275082.02.0
3256921.41009859878.9418094902314393277.01.0
3256920.57962947161310.82082530926657452.02.0
3256932.808143562810.635546637214625420.01.0
3256930.5539881803610.88005361814652394.02.0

In [51]:
lofar["aux_index"] = lofar_aux_index

In [52]:
lofar_lr = join(lofar, aux_table, join_type='outer', keys='aux_index')

In [53]:
lofar_lr


Out[53]:
<Table masked=True length=339587>
Source_NameRAE_RAE_RA_totDECE_DECE_DEC_totPeak_fluxE_Peak_fluxE_Peak_flux_totTotal_fluxE_Total_fluxE_Total_flux_totMajE_MajMinE_MinPAE_PAIsl_rmsS_CodeMosaic_IDIsl_idaux_indexlrlr_distlr_indexlr_order
degarcsecarcsecdegarcsecarcsecmJy / beammJy / beammJy / beammJymJymJyarcsecarcsecarcsecarcsecdegdegmJy / beam
str24float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64float64str1str8int32int64float64float64float64float64
ILTJ122108.441+491340.94185.2851687710.08507008411740.15001604900849.22803976780.0940250084340.1552694366561.487187766460.05254471658030.302043118361.451677235090.09187271628870.3045246587616.196006627070.2289610535095.673333506820.191652766315153.93537616218.49088864580.0529464996362SP22Hetde18801224.065615560.25639326602910243809.01.0
ILTJ122107.851+491951.84185.2827108840.4739381350070.48978081992149.33106902030.4749781054510.4907872211110.4942069018920.06585487722850.1187707174760.8479029903640.09754907310810.1956358885979.127065658441.357809573366.769322202050.80797100346646.459299742923.10851528360.0616828474449SP22Hetde1961359.3782117420.5538360488510206441.01.0
ILTJ122108.572+493121.89185.2857163410.5338480436110.54796133934449.522749150.3672642549360.3874931337490.3574840037830.05325748129580.08915240788890.4382941872440.0874845562910.1238451427087.790351975891.338677666445.667385151870.73200124510667.376945563423.85298030040.0519686545886SP22Hetde2022nannannannan
ILTJ122104.027+490823.80185.2667785780.06806097276890.14106803947949.13994513470.06895996035280.1415039641852.185011938760.05613023008480.4405924302922.392682735230.09436714370260.4877523804866.309829407690.1628934770776.249772884270.159758583039158.418653032107.2140192580.0549563665118SP22Hetde20431719.367030.21210260936310195667.01.0
ILTJ122105.030+491440.01185.2709600990.5850200921610.59792675469549.24444927740.5084827834230.5232806482050.2918240133940.05529496395720.08039890068130.3509853858280.09068133001590.1146766465157.26057368761.504402305175.965408470451.03365648739125.16279153444.79340492030.0535216240678SP22Hetde2054529.0020755270.30059205896110171052.01.0
ILTJ122106.722+492420.20185.2780097980.4762052561720.49197494016949.40561165210.3635688237490.3839924287670.3084829367420.05049886645170.07972831619530.3063042700940.08795540022440.1071869612586.408389749091.124319436095.579656778460.85225127660385.396163661652.85709132480.0507702061441SP22Hetde20650.5807901353512.2431127513810318938.01.0
ILTJ122106.172+492908.04185.2757175930.1838507061660.22151518662649.48556726480.2571207149350.2852699735280.6418525095130.05327387110740.1389859385030.6717351609820.0913779267850.1624778463786.730494914640.6112689853175.599466758290.42473626149512.468222960221.55089921930.0530273173354SP22Hetde2106614.1764290910.18120160523210320755.01.0
ILTJ122104.649+492207.60185.2693720380.5154251090370.53002918671949.36877913570.379257757510.3988788567720.3497703338090.05444369465020.0886435973080.3581764972340.09554048579050.1194135693797.278086841821.3548139515.066782181920.659476070658122.17040566622.06081574590.0554672114959SP22Hetde2117112.0098933241.252085439610216972.01.0
ILTJ122107.312+493759.86185.2804652850.5748670406530.58799660728449.63329437380.810226831250.8195946643460.3012448839560.05699385557140.08293514802110.5281455870870.08419662274530.1350799086779.556252296682.068687028746.606534348191.09240387499154.58528771727.26744484460.0537575579074SP22Hetde2128270.3929880581.2060631891510339350.01.0
....................................................................................
ILTJ134709.873+494747.92206.79113882310.260886324310.262144787949.79664503510.450935543710.4521711250.1368574756550.08011583740390.08466254277216.594460347540.08176161062041.3214239486242.28222021824.767068007341.025807880524.001590995326.9549901434179.8464500530.202775700018SP206+509964325688nannannannan
ILTJ134632.301+485421.18206.6345887826.090500559126.0926204992848.90588377388.203236949068.204811023420.05973795908740.01969219543110.02303318282684.181357357890.01997156330020.83650991477459.097231329519.569554895842.645176032713.9940561139166.4299742355.22006068940.0595674064243SP206+509965325689nannannannan
ILTJ134454.662+491951.32206.2277597798.719767134468.7212479821949.330922745410.094966255210.09624540020.05787809419240.02526616957450.0277916224153.617934676550.02566724320860.72404202939261.661101231127.150647816536.498467123115.796325427435.864870958653.39797947760.0725644931663SP206+5099673256900.37148168006110.044136090726521592.02.0
ILTJ134454.662+491951.32206.2277597798.719767134468.7212479821949.330922745410.094966255210.09624540020.05787809419240.02526616957450.0277916224153.617934676550.02566724320860.72404202939261.661101231127.150647816536.498467123115.796325427435.864870958653.39797947760.0725644931663SP206+5099673256900.4850059509813.239329535414562507.01.0
ILTJ134426.185+492250.83206.1091049937.353377475717.35513343149.38078812448.146332510118.14791757770.06465931223960.02165236679160.0252201913965.241648829450.02191783915561.0485588632659.588952573720.003238604148.979277686716.361623726428.837173697982.74506913570.0702290708432SP206+5099683256910.7372858336278.3612553912426275082.02.0
ILTJ134426.185+492250.83206.1091049937.353377475717.35513343149.38078812448.146332510118.14791757770.06465931223960.02165236679160.0252201913965.241648829450.02191783915561.0485588632659.588952573720.003238604148.979277686716.361623726428.837173697982.74506913570.0702290708432SP206+5099683256911.53333942250.35909311848426352886.01.0
ILTJ134416.177+484518.63206.0674055014.279048231314.282065068848.75517619322.351015506712.356501941530.1974909680780.02409024572490.046264967718419.40045903580.02433426863653.8801681134483.806414976910.299554716842.20843359825.11029566712103.0569354912.44920366640.0860225482029SP206+5099703256920.57962947161310.82082530926657452.02.0
ILTJ134416.177+484518.63206.0674055014.279048231314.282065068848.75517619322.351015506712.356501941530.1974909680780.02409024572490.046264967718419.40045903580.02433426863653.8801681134483.806414976910.299554716842.20843359825.11029566712103.0569354912.44920366640.0860225482029SP206+5099703256921.41009859878.9418094902314393277.01.0
ILTJ134311.277+491857.55205.79698707910.077480360610.078761724949.315986730511.224477456111.22562789560.1989566721450.1084107896790.1154826810169.425761169040.1106763928511.888398318357.402015833431.732898253429.716301714815.9602674185139.28539771358.54525409840.274070509477SP206+5099713256932.808143562810.635546637214625420.01.0
ILTJ134311.277+491857.55205.79698707910.077480360610.078761724949.315986730511.224477456111.22562789560.1989566721450.1084107896790.1154826810169.425761169040.1106763928511.888398318357.402015833431.732898253429.716301714815.9602674185139.28539771358.54525409840.274070509477SP206+5099713256930.5539881803610.88005361814652394.02.0

Save combined catalogue


In [54]:
combined["lr_index"] = combined_aux_index.astype(float)

In [55]:
for col in ['lr', 'lr_dist', 'lr_index', 'lr_order']:
    lofar_lr[col].fill_value = np.nan

In [56]:
pwl = join(lofar_lr.filled(), combined, join_type='left', keys='lr_index')

In [57]:
len(pwl)


Out[57]:
339587

In [58]:
pwl_columns = pwl.colnames

In [59]:
for col in pwl_columns:
    fv = pwl[col].fill_value
    #print(col, fv)
    if (isinstance(fv, np.float64) and (fv != 1e+20)):
        print(col, fv)
        pwl[col].fill_value = 1e+20


dec 1.0
W1mag 1.0
i 1.0
colour 1.0
category 1.0

In [60]:
columns_save = ['Source_Name', 'RA', 'E_RA', 'E_RA_tot', 'DEC', 'E_DEC', 'E_DEC_tot',
 'Peak_flux', 'E_Peak_flux', 'E_Peak_flux_tot', 'Total_flux', 'E_Total_flux', 'E_Total_flux_tot',
 'Maj', 'E_Maj', 'Min', 'E_Min', 'PA', 'E_PA', 'Isl_rms', 'S_Code', 'Mosaic_ID', 'Isl_id',
 'AllWISE', 'objID', 'ra', 'dec', 'raErr', 'decErr',
 'W1mag', 'W1magErr', 'i', 'iErr', 'colour', 'category',
 'lr', 'lr_dist', 'lr_order']

In [61]:
pwl[columns_save].filled().write('lofar_multi_lr_pw.fits', format="fits")

In [ ]: