Galaxies in VVV

Classification with Machine Learning using stars

We use a catalog of galaxies identified in VVV in tiles d010 and d115, by Baravalle L.

To find out where the tiles are located we use the VVV map.

In these tiles, 574 objects were found with morphological, photometric and photochromatic properties typical of galaxies. 90 of them have been visually inspected and constitute a bona fide sample of galaxies in VVV.

Data analysis

First we load the required libraries


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from IPython.display import display

from astropy.io import ascii
from astropy.table import Table, Column

%matplotlib inline

In [2]:
from astroquery.irsa_dust import IrsaDust

I read the catalog of the remaining objects ("restos") of tile d010:


In [3]:
colnames = "ALPHA  DELTA  MAG_PSF_Ks  MAGERR_PSF_Ks  MAG_AUTO_Ks MAGERR_AUTO_Ks MAG_APER_Ks MAGERR_APER_Ks MAG_MODEL_Ks MAGERR_MODEL_Ks SPREAD_MODEL AMODEL_IMAGE BMODEL_IMAGE ELONGATION ELLIPTICITY A_IMAGE B_IMAGE KRON_RADIUS FLUX_RADIUS_02 FLUX_RADIUS_051 FLUX_RADIUS_08 SPHEROID_SERSICN CLASS_STAR MAG_PSF_H MAGERR_PSF_H MAG_AUTO_H MAGERR_AUTO_H MAG_APER_H MAGERR_APER_H MAG_PSF_J MAGERR_PSF_J MAG_AUTO_J MAGERR_AUTO_J MAG_APER_J MAGERR_APER_J C".split()
print(colnames)


['ALPHA', 'DELTA', 'MAG_PSF_Ks', 'MAGERR_PSF_Ks', 'MAG_AUTO_Ks', 'MAGERR_AUTO_Ks', 'MAG_APER_Ks', 'MAGERR_APER_Ks', 'MAG_MODEL_Ks', 'MAGERR_MODEL_Ks', 'SPREAD_MODEL', 'AMODEL_IMAGE', 'BMODEL_IMAGE', 'ELONGATION', 'ELLIPTICITY', 'A_IMAGE', 'B_IMAGE', 'KRON_RADIUS', 'FLUX_RADIUS_02', 'FLUX_RADIUS_051', 'FLUX_RADIUS_08', 'SPHEROID_SERSICN', 'CLASS_STAR', 'MAG_PSF_H', 'MAGERR_PSF_H', 'MAG_AUTO_H', 'MAGERR_AUTO_H', 'MAG_APER_H', 'MAGERR_APER_H', 'MAG_PSF_J', 'MAGERR_PSF_J', 'MAG_AUTO_J', 'MAGERR_AUTO_J', 'MAG_APER_J', 'MAGERR_APER_J', 'C']

In [4]:
d010 = ascii.read('./restos/RESTO_d010.cat', names=colnames)

In [5]:
d010


Out[5]:
<Table length=751862>
[Table preview: 36 columns, all float64, named as in colnames above; the first 10 and last 10 of the 751862 rows were displayed.]

We try the IRSA dust extinction interface using tables


In [54]:
coord_table = d010[['ALPHA', 'DELTA']]

rows_d010 = np.random.choice(len(coord_table), 20000)

submit_tab = coord_table[rows_d010]

submit_tab.add_column(Column(data=np.full(len(submit_tab), 2.0), name='size'))

submit_tab.write('extinction_tab_d010.dat', format='ipac', names=['ra', 'dec', 'size'], overwrite=True)

We uploaded the table, written in IPAC format, as a test, and it works.
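One caveat about the sampling cell above: `np.random.choice` draws with replacement by default, so the 20000 indices may contain repeats. A minimal sketch of drawing unique, reproducible indices follows; the seed value and the names `n_rows`, `n_sample` and `rows` are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

n_rows = 751862    # length of the d010 table shown above
n_sample = 20000   # sample size used in this notebook

# Seeding a RandomState makes the selection repeatable between sessions
# (the seed value 42 is arbitrary).
rng = np.random.RandomState(42)

# replace=False guarantees that no row index is drawn twice.
rows = rng.choice(n_rows, size=n_sample, replace=False)

# Sorting is optional; it just keeps the sample in catalog order.
rows.sort()
```

Keeping the indices fixed also matters later on, because the same rows have to be re-selected when the IRSA results are matched back to the catalog.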

The amount of data is huge (750k rows), which forces us to adopt a different strategy.

Let's try astroquery

astroquery is used to query astronomical databases, following the Astropy philosophy.

For this we create the function dered, which takes a row of a table and applies the extinction correction using its coordinates and a query to the IRSA database.


In [10]:
from retrying import retry

@retry(stop_max_attempt_number=7)
def av(obj):
    return IrsaDust.get_query_table(obj, section='ebv')['ext SandF mean']*3.1

def dered(row):
    obj = str(row['ALPHA'])+' '+str(row['DELTA'])
    
    av_SanF = av(obj)
    AJ=0.28*av_SanF
    AH=0.184*av_SanF
    AKs=0.118*av_SanF
    
    row['MAG_PSF_Ks_C']=row['MAG_PSF_Ks'] - AKs
    row['MAG_APER_Ks_C']=row['MAG_APER_Ks'] - AKs
    row['MAG_PSF_J_C']=row['MAG_PSF_J'] - AJ
    row['MAG_APER_J_C']=row['MAG_APER_J'] - AJ
    row['MAG_PSF_H_C']=row['MAG_PSF_H'] - AH
    row['MAG_APER_H_C']=row['MAG_APER_H'] - AH
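
Written out, the correction that `dered` applies is the following. This assumes, as the factor 3.1 in `av` suggests, that `ext SandF mean` is the Schlafly and Finkbeiner E(B-V) value and that $R_V = 3.1$; the band coefficients are the ones hard-coded in the function above.

```latex
\begin{aligned}
A_V &= 3.1 \, E(B-V)_{\mathrm{SandF}} \\
A_J &= 0.28 \, A_V, \qquad A_H = 0.184 \, A_V, \qquad A_{K_s} = 0.118 \, A_V \\
m_X^{C} &= m_X - A_X, \qquad X \in \{J, H, K_s\}
\end{aligned}
```

Here $m_X$ stands for either the PSF or the aperture magnitude in band $X$, and $m_X^{C}$ for the corresponding `_C` column.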

In [11]:
obj = str(test_table['ALPHA'][0])+' '+str(test_table['DELTA'][0])
av_SanF = IrsaDust.get_query_table(obj, section='ebv')['ext SandF mean']*3.1

We leave the correction of the small table running.


In [12]:
from log_progress import log_progress

In [13]:
test_table = d010[0:200]

test_table['MAG_PSF_Ks_C']  = np.zeros(len(test_table))
test_table['MAG_APER_Ks_C'] = np.zeros(len(test_table))
test_table['MAG_PSF_J_C']   = np.zeros(len(test_table))
test_table['MAG_APER_J_C']  = np.zeros(len(test_table))
test_table['MAG_PSF_H_C']   = np.zeros(len(test_table))
test_table['MAG_APER_H_C']  = np.zeros(len(test_table))

%time for arow in log_progress(test_table, every=1): dered(arow)


CPU times: user 16.9 s, sys: 228 ms, total: 17.2 s
Wall time: 5min 4s

In [14]:
test_table.write('corrected_resto_d010.dat', format='ipac')

We see that it takes far too long to process just 200 rows.
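
To put the measured time in perspective, the wall time above can be extrapolated to the full catalog. This is a rough estimate that assumes the per-row cost stays constant; the variable names are illustrative.

```python
# Measured above: 200 rows took 5 min 4 s of wall time.
rows_timed = 200
wall_seconds = 5 * 60 + 4          # 304 s

# Full d010 table length, as reported by the Table output.
total_rows = 751862

seconds_per_row = wall_seconds / float(rows_timed)   # 1.52 s per row
total_seconds = seconds_per_row * total_rows
total_days = total_seconds / 86400.0                 # 86400 seconds in a day

print("%.2f s per row, about %.1f days for the full table"
      % (seconds_per_row, total_days))
```

At roughly 1.5 s per row, the full tile would need on the order of 13 days of row-by-row queries.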

It is important to find out which step takes the longest.
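
The `%time` output above already hints at the answer: only about 17 s of the 5 min 4 s of wall time is CPU time, so most of the time is presumably spent waiting on the remote query. A minimal sketch for timing the two steps separately follows; `fake_query` and `fake_correction` are hypothetical stand-ins (the real steps would be `av` and the arithmetic in `dered`), used so that the example runs without network access.

```python
import time

def timed(func, *args):
    """Call func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.time()
    result = func(*args)
    return result, time.time() - start

def fake_query(coords):
    # Hypothetical stand-in for the remote IRSA query: sleeps to mimic
    # network latency and returns a made-up A_V value.
    time.sleep(0.05)
    return 1.0

def fake_correction(mag, a_v):
    # Stand-in for the local arithmetic done in dered (Ks coefficient).
    return mag - 0.118 * a_v

a_v, t_query = timed(fake_query, "204.0 -63.8")
mag_c, t_arith = timed(fake_correction, 17.0, a_v)

print("query: %.4f s, arithmetic: %.6f s" % (t_query, t_arith))
```

Wrapping the real `av` call with `timed` in the same way would show how much of each row's time goes to the query.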

test_table = d010

test_table['MAG_PSF_Ks_C']  = np.zeros(len(test_table))
test_table['MAG_APER_Ks_C'] = np.zeros(len(test_table))
test_table['MAG_PSF_J_C']   = np.zeros(len(test_table))
test_table['MAG_APER_J_C']  = np.zeros(len(test_table))
test_table['MAG_PSF_H_C']   = np.zeros(len(test_table))
test_table['MAG_APER_H_C']  = np.zeros(len(test_table))

%time for arow in log_progress(test_table, every=100): dered(arow)

test_table.write('corrected_resto_d010.dat', format='ipac')

Finally, in the cells above, 20000 sample objects were selected from tile d010 for the extinction correction. Now 20000 more objects will be selected from tile d115.


In [35]:
d115 = ascii.read('./restos/RESTO_d115.cat', names=colnames)

Below is the cell used to select the objects to correct. It is now frozen so that the file does not get overwritten.

coord_table = d115[['ALPHA', 'DELTA']]

rows_d115 = np.random.choice(len(coord_table), 20000)

submit_tab = coord_table[rows_d115]

submit_tab.add_column(Column(data=np.full(len(submit_tab), 2.0), name='size'))

submit_tab.write('extinction_tab_d115.dat', format='ipac', names=['ra', 'dec', 'size'], overwrite=True)
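
Instead of freezing the cell, one option is to guard the write so that an existing file is never replaced. A small sketch using only the standard library; `write_once` and the demo file name are hypothetical, not part of the original notebook.

```python
import os
import tempfile

def write_once(path, text):
    """Write text to path only if the file does not exist yet.

    Returns True if the file was written, False if it was left untouched.
    """
    if os.path.exists(path):
        return False
    with open(path, "w") as handle:
        handle.write(text)
    return True

# Demo in a temporary directory so that no real file is touched.
demo_path = os.path.join(tempfile.mkdtemp(), "extinction_tab_demo.dat")
first = write_once(demo_path, "ra dec size\n")            # file is created
second = write_once(demo_path, "this would overwrite\n")  # file is kept as is
```

In the notebook, the same idea would amount to checking `os.path.exists` before calling `submit_tab.write`.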

Correction of magnitudes

Now we can apply the correction, using the IRSA result tables:


In [57]:
d010 = ascii.read('./restos/RESTO_d010.cat', names=colnames)[rows_d010]
d115 = ascii.read('./restos/RESTO_d115.cat', names=colnames)[rows_d115]

In [75]:
exct_d010 = ascii.read('extinction_d010.tbl', format='ipac')
exct_d115 = ascii.read('extinction_d115.tbl', format='ipac')
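
The correction pairs rows by position, so it relies on the IRSA result tables listing the objects in the same order as the submitted samples. A quick sanity check is sketched here with small made-up arrays; the function `same_order` and its tolerance are assumptions, and in the notebook the inputs would be the coordinate columns of the sample and of the IRSA table (whose column names should be checked in the downloaded file).

```python
import numpy as np

def same_order(ra_a, dec_a, ra_b, dec_b, tol=1e-3):
    """Return True if both coordinate lists match row by row within tol degrees."""
    ra_a, dec_a = np.asarray(ra_a, dtype=float), np.asarray(dec_a, dtype=float)
    ra_b, dec_b = np.asarray(ra_b, dtype=float), np.asarray(dec_b, dtype=float)
    # Different lengths can never be aligned row by row.
    if ra_a.shape != ra_b.shape or dec_a.shape != dec_b.shape:
        return False
    return bool(np.all(np.abs(ra_a - ra_b) <= tol) and
                np.all(np.abs(dec_a - dec_b) <= tol))

# Made-up coordinates, for illustration only.
ra = [204.0, 204.1, 204.2]
dec = [-63.7, -63.8, -63.9]

ok = same_order(ra, dec, ra, dec)                      # identical order
shuffled = same_order(ra, dec, ra[::-1], dec[::-1])    # reversed order
```

If the check fails, the tables would need to be matched by coordinates before subtracting.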

The correction then takes the following form:


In [71]:
d115['MAG_PSF_Ks_C']=d115['MAG_PSF_Ks'] - exct_d115['AV_SandF']*0.118
d115['MAG_APER_Ks_C']=d115['MAG_APER_Ks'] - exct_d115['AV_SandF']*0.118

In [72]:
d115['MAG_PSF_J_C']=d115['MAG_PSF_J'] - exct_d115['AV_SandF']*0.28
d115['MAG_APER_J_C']=d115['MAG_APER_J'] - exct_d115['AV_SandF']*0.28

In [73]:
d115['MAG_PSF_H_C']=d115['MAG_PSF_H'] - exct_d115['AV_SandF']*0.184
d115['MAG_APER_H_C']=d115['MAG_APER_H'] - exct_d115['AV_SandF']*0.184

And for tile d010


In [76]:
d010['MAG_PSF_Ks_C']=d010['MAG_PSF_Ks'] - exct_d010['AV_SandF']*0.118
d010['MAG_APER_Ks_C']=d010['MAG_APER_Ks'] - exct_d010['AV_SandF']*0.118

In [77]:
d010['MAG_PSF_J_C']=d010['MAG_PSF_J'] - exct_d010['AV_SandF']*0.28
d010['MAG_APER_J_C']=d010['MAG_APER_J'] - exct_d010['AV_SandF']*0.28

In [78]:
d010['MAG_PSF_H_C']=d010['MAG_PSF_H'] - exct_d010['AV_SandF']*0.184
d010['MAG_APER_H_C']=d010['MAG_APER_H'] - exct_d010['AV_SandF']*0.184
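
The six assignments per tile all follow one pattern, so they can also be written as a loop over the bands. A minimal sketch, using a plain dict of NumPy arrays in place of the astropy table (an astropy `Table` accepts the same `table[name] = values` assignment); the helper `apply_extinction` and the sample numbers are illustrative assumptions, while the coefficients are those used above.

```python
import numpy as np

# A_band / A_V coefficients used throughout this notebook.
COEFFS = {'J': 0.28, 'H': 0.184, 'Ks': 0.118}

def apply_extinction(table, a_v, kinds=('PSF', 'APER')):
    """Add MAG_<kind>_<band>_C = MAG_<kind>_<band> - coeff * A_V for each band."""
    a_v = np.asarray(a_v, dtype=float)
    for band, coeff in COEFFS.items():
        for kind in kinds:
            name = 'MAG_%s_%s' % (kind, band)
            table[name + '_C'] = table[name] - coeff * a_v
    return table

# Tiny made-up example with two objects.
demo = {}
for band in COEFFS:
    for kind in ('PSF', 'APER'):
        demo['MAG_%s_%s' % (kind, band)] = np.array([17.0, 18.0])

demo = apply_extinction(demo, a_v=[1.0, 2.0])
```

In the notebook this would be called as, for example, `apply_extinction(d010, exct_d010['AV_SandF'])`.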

Now we save the tables


In [79]:
d010.write('d010_resto.dat', format='ipac')
d115.write('d115_resto.dat', format='ipac')

In [ ]: