A short introduction to Gaia Archive: ADQL & TAP by examples

Morgan Fouesneau

Notebook configuration

Below we will do some plotting. The following commands are making things a bit nicer. (all personal libraries are included with this notebook).


In [1]:
# Loading configuration
# Don't forget that mac has this annoying configuration that leads
# to limited number of figures/files
# ulimit -n 4096    <---- osx limits to 256
import warnings
warnings.catch_warnings()
warnings.simplefilter("ignore")

%pylab inline
%config InlineBackend.figure_format='retina'

import pylab as plt
import numpy as np
import figrc, setup_mpl
setup_mpl.theme()
setup_mpl.solarized_colors()


Populating the interactive namespace from numpy and matplotlib

Interogating Gaia Archive

Gaia Archive website: https://gea.esac.esa.int/archive/

The entry point is a TAP (Table Access Protocol) server.

TAP provides two operation modes, Synchronous and Asynchronous.

  • Synchronous: the response to the request will be generated as soon as the request received by the server.

  • Asynchronous: the server will start a job that will execute the request. The first response to the request is the required information (a link) to obtain the job status. Once the job is finished, the results can be retrieved.

Gaia Archive TAP service

Gaia Archive TAP server provides two access modes: public and authenticated

  • Public: this is the standard TAP access. A user can execute ADQL queries and upload tables to be used in a query 'on-the-fly' (these tables will be removed once the query is executed). The results are available to any other user and they will remain in the server for a limited space of time.

  • Authenticated: some functionalities are restricted to authenticated users only. The results are saved in a private user space and they will remain in the server for ever (they can be removed by the user). ADQL queries and results are saved in a user private area.

  • Cross-match operations: a catalogue cross-match operation can be executed. Cross-match operations results are saved in a user private area.

What is ADQL?

ADQL = Astronomical Data Query Language

ADQL has been developed based on SQL92 and supports a subset of the SQL grammar with extensions to support generic and astronomy specific operations.

In other words, ADQL is a SQL-like searching language improved with geometrical functions.

for more information see the IVOA documentation http://www.ivoa.net/documents/latest/ADQL.html

  • examples of SQL minimal queries
SELECT *
FROM "gaiadr1.tgas_source"
SELECT top 1000 ra, dec, phot_g_mean_mag AS mag  
FROM "gaiadr1.gaia_source"
ORDER BY mag

Basic Keywords

  • TOP limits the number of records to display
  • ORDER BY sorts records in ascending (ASC, default) or descending (DESC)
  • WHERE filters records according to logical expressions
  • IN, NOT IN operator that can determine whether a value is (not) within a given set
  • BETWEEN x AND y operator can determine whether a value is within a given interval
  • LIKE operator allows for a partial comparison, It uses wild cards % and _ ('percent' and 'underscore'). The wild card % replaces any string of characters, including the empty string. The underscore replaces exactly one character.
  • GROUP BY groups records by identical values (or set of values)
  • = or > or < or >= or <= or <>, different operators of logical comparisons
  • +, -, *, /, compute columns using mathematical operations
  • POWER(column_name, n) returns values raised to the power n. n must be a integer positive or negative.
  • SQRT(column_name) returns the square root of values.
  • CEILING(column_name) rounds up to the nearest integer value.
  • FLOOR(column name) rounds down to the next least integer value.
  • ABS(column_name) returns the absolute value.
  • AVG (column_name) this function returns the average value in a column for a group of data lines
  • COUNT (column_name) this function returns a count of rows from a reference column values if it is not NULL.
  • SUM(column_name) this function returns the sum of values in a column for a group of data lines.
  • MAX, MIN, return the largest or smallest value of a column for a group of data lines.
  • COS, SIN, TAN, ACOS, ASIN, ATAN of an angle in radian compute the trigonometric transformation

Geometries and geometrical functions

ADQL provides a set of 2D-functions and geometries or regions

A region is always attached to a coordinate System: FK4, FK5, ICRS, GALACTIC. The coordinates expressed in degree, can be constant or the result of a mathematical expression.

  • POINT('coordinate system', right ascension, declination) expresses a point source on the sky

  • CIRCLE('coordinate system',right ascension center, declination center, radius in degrees) expresses a circular region on the sky (a cone in space)

  • BOX('coordinate system', right ascension center, declination center, width, height) defines a centered box

  • POLYGON('coordinate system', coordinate point 1, coordinate point 2, coordinate point 3...) expresses a region on the sky with sides denoted by great circles passing through specified list of POINT objects.

  • DISTANCE(point1, point2) computes distance between two points.

  • CONTAINS(region1, region2) returns a boolean value : true if region2 contains region1, false otherwise.

  • INTERSECTS(region1, region2) returns a boolean value : true if region2 intersect region1, false otherwise.

Current python package

Some common code to send ADQL Queries to TAP services and notebook polishing

  • GaiaArchive is a shortcut from TAP_service to the interface with the Gaia Archive
  • TAPVizieR is a shortcut from TAP_service to the interface with TAP service of VizieR (CDS).
  • resolve interfaces CDS/Sesame name resolver to get positions of known objects by their names.
  • QueryStr is a polished string that parses an SQL syntax to make it look nicer (for notebook and console)
  • timeit a context manager/decorator that reports execution time (for notebook and console)

In [2]:
from tap import (GaiaArchive, TAPVizieR, resolve, QueryStr, timeit)

Quick Start: How to query TAP with this package?

Let's start by checking that we can access the data by requesting the first 5 sources in TGAS.

Synchronous mode

Get the service and submit the query. The result will be downloaded automatically.


In [3]:
gaia = GaiaArchive()

select a small number of rows in the gaia DR1 data table.

Note: we use QueryStr only for giving an easier reading. A string would work as well.


In [4]:
adql = QueryStr("""
select top 5 * from gaiadr1.gaia_source
""")


ADQL query

SELECT top 5 * FROM gaiadr1.gaia_source

In [5]:
from tap import GaiaArchive
selection = ','.join(['avg({0:s}) as avg_{0:s}'.format(k) 
                      for k in gaia.get_table_info('gaiadr1.tgas_source').keys() 
                      if ('error' in k) or ('corr' in k)])

data = GaiaArchive().query(QueryStr("""
select {0:s} from gaiadr1.tgas_source
""".format(selection)))
r = data.as_array().data
cor = np.array(eval("""
[
    [avg_ra_error,         avg_ra_dec_corr,       avg_ra_parallax_corr,    avg_ra_pmra_corr,       avg_ra_pmdec_corr],
    [avg_ra_dec_corr,      avg_dec_error,         avg_dec_parallax_corr,   avg_dec_pmra_corr,      avg_dec_pmdec_corr],
    [avg_ra_parallax_corr, avg_dec_parallax_corr, avg_parallax_error,      avg_parallax_pmra_corr, avg_parallax_pmdec_corr],
    [avg_ra_pmra_corr,     avg_dec_pmra_corr,     avg_parallax_pmra_corr,  avg_pmra_error,         avg_pmra_pmdec_corr],
    [avg_ra_pmdec_corr,    avg_dec_pmdec_corr,    avg_parallax_pmdec_corr, avg_pmra_pmdec_corr,    avg_pmdec_error]
]
""", {k:float(r[k]) for k in r.dtype.names}))

cov = cor.copy()

pars = r'$\alpha$ $\delta$ $\varpi$ $\mu_{\alpha\star}$ $\mu_\delta$'.split()

for k in range(len(cor)):
    val = cor[k, k]
    cov[k, :] *= val
    cov[:, k] *= val
    cov[k, k] /= val

# plot correlations
mcor = np.ma.array(cor)
for k in range(len(cor)): 
    cor[k, k] = np.nan
lim = np.max(abs(cor))
plt.matshow(cor, vmin=-lim, vmax=lim, cmap=plt.cm.RdBu_r)
for k, l in enumerate(pars):
    plt.text(k, k, l, ha='center', va='center')
plt.colorbar(shrink=0.7).set_label(r'correlation $\rho_{x,y}$')
plt.xticks([])
plt.yticks([])
plt.savefig('tgas_correlation.pdf', bbox_inches='tight')

plt.figure()
# plot correlations
for k in range(len(cor)): 
    cov[k, k] = np.nan
lim = np.max(abs(cov))
plt.matshow(cov, vmin=-lim, vmax=lim, cmap=plt.cm.RdBu_r)
pars = r'$\alpha$ $\delta$ $\varpi$ $\mu_{\alpha\star}$ $\mu_\delta$'.split()
for k, l in enumerate(pars):
    plt.text(k, k, l, ha='center', va='center')
plt.colorbar(shrink=0.7).set_label(r'covariance $\rho_{x,y}\sigma_x\sigma_y$')
plt.xticks([])
plt.yticks([]);
plt.savefig('tgas_covariance.pdf', bbox_inches='tight')


ADQL query

SELECT avg(ra_error) AS avg_ra_error,avg(dec_error) AS avg_dec_error,avg(parallax_error) AS avg_parallax_error,avg(pmra_error) AS avg_pmra_error,avg(pmdec_error) AS avg_pmdec_error,avg(ra_dec_corr) AS avg_ra_dec_corr,avg(ra_parallax_corr) AS avg_ra_parallax_corr,avg(ra_pmra_corr) AS avg_ra_pmra_corr,avg(ra_pmdec_corr) AS avg_ra_pmdec_corr,avg(dec_parallax_corr) AS avg_dec_parallax_corr,avg(dec_pmra_corr) AS avg_dec_pmra_corr,avg(dec_pmdec_corr) AS avg_dec_pmdec_corr,avg(parallax_pmra_corr) AS avg_parallax_pmra_corr,avg(parallax_pmdec_corr) AS avg_parallax_pmdec_corr,avg(pmra_pmdec_corr) AS avg_pmra_pmdec_corr,avg(phot_g_mean_flux_error) AS avg_phot_g_mean_flux_error FROM gaiadr1.tgas_source
<matplotlib.figure.Figure at 0x113def9e8>

Now we run the query.

Note: we use timeit in this notebook to indicate how long the operation gaia.query took.


In [6]:
timeit(gaia.query)(adql)


Execution time: 1.22 s

Out[6]:
<Table masked=True length=5>
solution_idsource_idrandom_indexref_epochrara_errordecdec_errorparallaxparallax_errorpmrapmra_errorpmdecpmdec_errorra_dec_corrra_parallax_corrra_pmra_corrra_pmdec_corrdec_parallax_corrdec_pmra_corrdec_pmdec_corrparallax_pmra_corrparallax_pmdec_corrpmra_pmdec_corrastrometric_n_obs_alastrometric_n_obs_acastrometric_n_good_obs_alastrometric_n_good_obs_acastrometric_n_bad_obs_alastrometric_n_bad_obs_acastrometric_delta_qastrometric_excess_noiseastrometric_excess_noise_sigastrometric_primary_flagastrometric_relegation_factorastrometric_weight_alastrometric_weight_acastrometric_priors_usedmatched_observationsduplicated_sourcescan_direction_strength_k1scan_direction_strength_k2scan_direction_strength_k3scan_direction_strength_k4scan_direction_mean_k1scan_direction_mean_k2scan_direction_mean_k3scan_direction_mean_k4phot_g_n_obsphot_g_mean_fluxphot_g_mean_flux_errorphot_g_mean_magphot_variable_flaglbecl_lonecl_lat
Time[Julian Years]Angle[deg]Angle[mas]Angle[deg]Angle[mas]Angle[mas]Angle[mas]Angular Velocity[mas/year]Angular Velocity[mas/year]Angular Velocity[mas/year]Angular Velocity[mas/year]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Angle[mas]Angle[mas^-2]Angle[mas^-2]Angle[deg]Angle[deg]Angle[deg]Angle[deg]Flux[e-/s]Flux[e-/s]Magnitude[mag]Dimensionless[see description]Angle[deg]Angle[deg]Angle[deg]Angle[deg]
int64int64int64float64float64float64float64float64float64float64float64float64float64float64float32float32float32float32float32float32float32float32float32float32int32int32int32int32int32int32float32float64float64boolfloat32float32float32int32int16boolfloat32float32float32float32float32float32float32float32int32float64float64float64objectfloat64float64float64float64
163537841078193356844868959154436504329053508942015.0264.181225100334642.80035161658690827.051642987603799812.9212181467864-------------0.077------------------43041020--3.27802694993215972.8301321572232419False1.69207480.036015101--212False0.868653120.625081360.664678040.8813076-59.995804-51.505093-43.121849-44.79404143218.431799983236263.069824345603493519.676480402386268NOT_AVAILABLE30.75931388422234419.806472281755649263.3040433974209130.355774143855047
163537841078193356844866091834246899203097520732015.0266.166037654863150.553828719978626487.62714500284310310.65430966160671311------------0.92455------------------89089000--0.00.0False1.00.6764009--219False0.468872820.59688860.311264490.8961063-69.038536-47.735455-32.330299-42.06253189885.145090759739672.61255068991689418.15723390135684NOT_AVAILABLE32.22595084103313218.294962138635221265.5652975229179431.007424683992905
1635378410781933568448795491694381696011098520822015.0267.431466489001763.37328718707933468.09160044667352274.1776617264829969------------0.84584999------------------45043020--4.92783600699944295.5765534150526719False1.69117820.018240357--227False0.500515760.272805240.284835220.85747027-106.40694-69.268906-8.6255636-41.57030144130.705539886876442.117329420076349120.234035075545016NOT_AVAILABLE33.24258203564476617.372733263801848267.0170763745372831.504289333914077
1635378410781933568448840506387190809610855574352015.0268.5771792129770.456868720226326918.52839580717644450.66047878586423325------------0.52380002------------------80080000--0.473144607233137950.4344997036814196False1.04095090.27349713--214False0.366657350.104740150.335500450.73903608-104.398527.750549-29.422886-44.95325177492.795058952687444.709258476623794818.793104202112058NOT_AVAILABLE34.17151145197274116.544154401270578268.3414814714799931.959485697633603
163537841078193356844881689781090949127241566432015.0266.93502677670210.454722586397579628.45013676705115020.52313455431024292------------0.92054999------------------65065000--0.544438932823388732.7908803781634761False1.21347821.1420876--216False0.359407720.747101660.261411520.91352218-62.16045-45.337482-33.693073-42.355057672029.02690776667583.366914582879530417.25655054776395NOT_AVAILABLE33.35552843567393217.971330490608551266.4302659101040831.851445954397114

Asynchronous mode

From the same service, use the query_async method. The job will be submitted and accessible later.

Why this mode would be prefered? Some services (incl. the Gaia Archive) limit strongly the queries using the synchronous mode. For example the Gaia Archive limits to 1 minute jobs, mostly to avoid comminucation issues. Read the documentation of the service you want to use to decide.

Below I use the async mode to redo the exact same query as before.


In [7]:
q = gaia.query_async(adql, silent=True)
q   # pretty print display


Out[7]:

ADQL Query

SELECT top 5 * FROM gaiadr1.gaia_source

One can interogate the service to know if the task is complete.


In [8]:
q.status


Out[8]:
'COMPLETED'

Finally we can download the result when available.

(The provided python interface has an option to wait or not for completion; keyword wait=True, default behavior)


In [9]:
q.get()


Out[9]:
<Table masked=True length=5>
solution_idsource_idrandom_indexref_epochrara_errordecdec_errorparallaxparallax_errorpmrapmra_errorpmdecpmdec_errorra_dec_corrra_parallax_corrra_pmra_corrra_pmdec_corrdec_parallax_corrdec_pmra_corrdec_pmdec_corrparallax_pmra_corrparallax_pmdec_corrpmra_pmdec_corrastrometric_n_obs_alastrometric_n_obs_acastrometric_n_good_obs_alastrometric_n_good_obs_acastrometric_n_bad_obs_alastrometric_n_bad_obs_acastrometric_delta_qastrometric_excess_noiseastrometric_excess_noise_sigastrometric_primary_flagastrometric_relegation_factorastrometric_weight_alastrometric_weight_acastrometric_priors_usedmatched_observationsduplicated_sourcescan_direction_strength_k1scan_direction_strength_k2scan_direction_strength_k3scan_direction_strength_k4scan_direction_mean_k1scan_direction_mean_k2scan_direction_mean_k3scan_direction_mean_k4phot_g_n_obsphot_g_mean_fluxphot_g_mean_flux_errorphot_g_mean_magphot_variable_flaglbecl_lonecl_lat
Time[Julian Years]Angle[deg]Angle[mas]Angle[deg]Angle[mas]Angle[mas]Angle[mas]Angular Velocity[mas/year]Angular Velocity[mas/year]Angular Velocity[mas/year]Angular Velocity[mas/year]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Angle[mas]Angle[mas^-2]Angle[mas^-2]Angle[deg]Angle[deg]Angle[deg]Angle[deg]Flux[e-/s]Flux[e-/s]Magnitude[mag]Dimensionless[see description]Angle[deg]Angle[deg]Angle[deg]Angle[deg]
int64int64int64float64float64float64float64float64float64float64float64float64float64float64float32float32float32float32float32float32float32float32float32float32int32int32int32int32int32int32float32float64float64boolfloat32float32float32int32int16boolfloat32float32float32float32float32float32float32float32int32float64float64float64objectfloat64float64float64float64
163537841078193356844868959154436504329053508942015.0264.181225100334642.80035161658690827.051642987603799812.9212181467864-------------0.077------------------43041020--3.27802694993215972.8301321572232419False1.69207480.036015101--212False0.868653120.625081360.664678040.8813076-59.995804-51.505093-43.121849-44.79404143218.431799983236263.069824345603493519.676480402386268NOT_AVAILABLE30.75931388422234419.806472281755649263.3040433974209130.355774143855047
163537841078193356844866091834246899203097520732015.0266.166037654863150.553828719978626487.62714500284310310.65430966160671311------------0.92455------------------89089000--0.00.0False1.00.6764009--219False0.468872820.59688860.311264490.8961063-69.038536-47.735455-32.330299-42.06253189885.145090759739672.61255068991689418.15723390135684NOT_AVAILABLE32.22595084103313218.294962138635221265.5652975229179431.007424683992905
1635378410781933568448795491694381696011098520822015.0267.431466489001763.37328718707933468.09160044667352274.1776617264829969------------0.84584999------------------45043020--4.92783600699944295.5765534150526719False1.69117820.018240357--227False0.500515760.272805240.284835220.85747027-106.40694-69.268906-8.6255636-41.57030144130.705539886876442.117329420076349120.234035075545016NOT_AVAILABLE33.24258203564476617.372733263801848267.0170763745372831.504289333914077
1635378410781933568448840506387190809610855574352015.0268.5771792129770.456868720226326918.52839580717644450.66047878586423325------------0.52380002------------------80080000--0.473144607233137950.4344997036814196False1.04095090.27349713--214False0.366657350.104740150.335500450.73903608-104.398527.750549-29.422886-44.95325177492.795058952687444.709258476623794818.793104202112058NOT_AVAILABLE34.17151145197274116.544154401270578268.3414814714799931.959485697633603
163537841078193356844881689781090949127241566432015.0266.93502677670210.454722586397579628.45013676705115020.52313455431024292------------0.92054999------------------65065000--0.544438932823388732.7908803781634761False1.21347821.1420876--216False0.359407720.747101660.261411520.91352218-62.16045-45.337482-33.693073-42.355057672029.02690776667583.366914582879530417.25655054776395NOT_AVAILABLE33.35552843567393217.971330490608551266.4302659101040831.851445954397114

Luckily, we obtain the same result as before.

Authenticate with your account

The current python package also allows you to authenticate. This is mostly relevant when using async queries, or user tables.

gaia.login("my_user_name")

The above will prompt for a password.

Note that the password can also be provided as argument if you need to script some queries. It will not be stored, only the relevant cookie will be conserved until the end of the session.

Callback previous jobs

In some cases (mostly when authenticated) you might want to download a previous job. This python package allows you to do so using recall_query, which returns an asynchronous result.


In [ ]:
qprime = gaia.recall_query(q.jobid)
qprime.get()

Make the luminosity function of the TGAS stars

In this example, we want to make the luminosity function of TGAS stars. Of course we could download all stars and do it on our computer but one can do it on the server side and only download the computed histogram.

hint: Building a luminosity function means that we want to select all stars in TGAS (with magnitude) and count them after grouping them per bin of magnitudes. Additionally one will want to sort the bins for plotting them in order.


In [9]:
adql = QueryStr("""
select 
    count(*) as n, 
    round(phot_g_mean_mag, 1) as val
from 
    gaiadr1.tgas_source 
group by val
order by val
""")


ADQL query

SELECT
    count(*) AS n,
    round(phot_g_mean_mag, 1) AS val
FROM
    gaiadr1.tgas_source
GROUP BY val
ORDER BY val

Run the query


In [10]:
data_tgas = timeit(gaia.query)(adql)


Execution time: 18.9 s

Note: Sometimes the Gaia Archive crashes... BUG! This is needed feedback to the service.

You can bypass the issue by using async queries or re-run until you give up...


In [12]:
data_tgas = gaia.query_async(adql).get()


Query Status: 303 Reason: 303
Location: http://gea.esac.esa.int/tap-server/tap/async/1479126879844O
Job id: 1479126879844O

Plot the histogram data.


In [13]:
plt.step(data_tgas['val'], data_tgas['n'], 
         lw=2, where='pre', label='TGAS')
plt.yscale('log')
plt.xlabel('G magnitude')
plt.ylabel('counts / mag')
figrc.hide_axis('top right'.split())


Note: Creating the luminosity function for the full DR1 required more than 1 min and thefore an async query, which is not handled by this simple query interface.

Stellar density map

Let's reproduce the stellar density map of TGAS stars with all computations on the server side.

Based on the histogram technique we used for the luminosity function, one can extend it to more than one dimension.


In [14]:
adql = QueryStr("""
select 
    count(*) as n, 
    round(l, 0) as x, 
    round(b, 0) as y
from 
    gaiadr1.tgas_source 
group by x, y
order by x, y
""")


ADQL query

SELECT
    count(*) AS n,
    round(l, 0) AS x,
    round(b, 0) AS y
FROM
    gaiadr1.tgas_source
GROUP BY x, y
ORDER BY x, y

Run the query


In [15]:
data = timeit(gaia.query)(adql)


Execution time: 25.7 s

Below you can see that the table is flat (i.e, no 2d matrix) and it contains a number $n$ for each pair $(x,y)$. Note also that empty bins are not included.


In [16]:
data


Out[16]:
<Table masked=True length=62938>
nxy
int64float64float64
10.0-88.0
10.0-85.0
10.0-84.0
10.0-83.0
10.0-82.0
10.0-81.0
20.0-80.0
30.0-79.0
40.0-78.0
10.0-77.0
.........
2360.077.0
3360.078.0
1360.079.0
4360.080.0
2360.081.0
4360.082.0
1360.086.0
1360.088.0
1360.089.0
1360.090.0

Skipping polishing the projection


In [17]:
from matplotlib.colors import LogNorm
l = np.arange(0, 360, 1)
b = np.arange(-90, 90, 1)
n = np.zeros((len(l), len(b)))
ix = np.digitize(data['x'], l)
iy = np.digitize(data['y'], b)
n[ix - 1, iy - 1] = data['n']
plt.pcolormesh(l, b, n.T, 
               cmap=plt.cm.viridis, 
               norm=LogNorm())
figrc.hide_axis('top right'.split())
plt.xlim(l.min(), l.max())
plt.ylim(b.min(), b.max());


Query around a position

Using the region objects, one can query around a point. Below is an example of cone-search, i.e., selecting stars within 2 degrees of an object:

select * from gaiadr1.gaia_sources
where contains(point('ICRS', ra, dec), circle('ICRS',10.6847083,41.26875,2) ) = 1

The where condition selects points from the data that are contained into the circle of given center and size.

Below we make the density map of stars around the same object. We can therefore combine the 2D-histogram above and the cone-search technique.

Any idea of what object is the center of this selection?


In [18]:
adql = QueryStr("""
select 
    count(*) as n, 
    round(l, 2) as latitude, 
    round(b, 2) as longitude
from 
    gaiadr1.gaia_source 
where 
    contains(point('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),
             circle('ICRS',10.6847083,41.26875,2) )=1  
group by latitude, longitude
order by latitude, longitude
""")


ADQL query

SELECT
    count(*) AS n,
    round(l, 2) AS latitude,
    round(b, 2) AS longitude
FROM
    gaiadr1.gaia_source
WHERE
    contains(point('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),
             circle('ICRS',10.6847083,41.26875,2) )=1
GROUP BY latitude, longitude
ORDER BY latitude, longitude

Run the query


In [19]:
data = timeit(gaia.query)(adql)


Execution time: 13.7 s

Finally we plot the map


In [20]:
plt.figure(figsize=(6,6))
plt.subplot(111, aspect=1)
plt.scatter(data['latitude'], data['longitude'], c=data['n'], 
            edgecolor='None', s=6, rasterized=True, norm=LogNorm(),
            cmap=plt.cm.magma, marker='s'
           )
plt.xlim(data['latitude'].min(), data['latitude'].max())
plt.ylim(data['longitude'].min(), data['longitude'].max())
plt.axis('off');


Let's proceed to query around another object for example M33

ADQL does not provide a name resolver function. So you'll have to figure that for M33 you need

circle('ICRS',23.4621,30.6599417,0.5)

Alternatively, this python package provides a name resolver function based on the CDS/Sesame service.

from tap import resolve
ra, dec = resolve('m33')

Below I show the latter option.


In [21]:
ra, dec = resolve('m33')
adql = QueryStr("""
select 
    count(*) as n, 
    round(l, 2) as latitude, 
    round(b, 2) as longitude
from 
    gaiadr1.gaia_source 
where 
    contains(point('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),
             circle('ICRS',{ra:f}, {dec:f},0.5) )=1  
group by latitude, longitude
order by latitude, longitude
""".format(ra=ra, dec=dec))
data = timeit(gaia.query)(adql)
plt.figure(figsize=(6,6))
plt.subplot(111, aspect=1)
plt.scatter(data['latitude'], data['longitude'], c=data['n'], 
            edgecolor='None', s=9, rasterized=True, norm=LogNorm(),
            cmap=plt.cm.magma, marker='s'
           )
plt.xlim(data['latitude'].min(), data['latitude'].max())
plt.ylim(data['longitude'].min(), data['longitude'].max())
plt.axis('off');


ADQL query

SELECT
    count(*) AS n,
    round(l, 2) AS latitude,
    round(b, 2) AS longitude
FROM
    gaiadr1.gaia_source
WHERE
    contains(point('ICRS',gaiadr1.gaia_source.ra,gaiadr1.gaia_source.dec),
             circle('ICRS',23.462100, 30.659942,0.5) )=1
GROUP BY latitude, longitude
ORDER BY latitude, longitude

Execution time: 922 ms

Color-magnitude diagram of TGAS

One of the first results published by the Gaia Consortium was the color-magnitude diagram (CMD) of the TGAS data, showing how good the parallaxes are.

Let's reproduce the figure.

Let's make a binned CMD with bins of $0.01$ mag and $0.05$ mag in color and magnitude, respectively. Additionally, we will need to use the parallax as distance measurements. As Bailer-Jones 2015 showed one must be careful, thus let's only consider stars with $\varpi/\sigma_\varpi > 5$. One could also filter on photometric signal-to-noise (in flux).


In [22]:
adql = QueryStr("""
select 
    count(*) as n,
    floor((hip.bt_mag - hip.vt_mag) / 0.01) * 0.01 as color,
    floor((gaia.phot_g_mean_mag + 5*log10(gaia.parallax)-10) / 0.05) * 0.05 as mag
from 
    gaiadr1.tgas_source as gaia
inner join 
    public.tycho2 as hip
    on gaia.hip = hip.hip
where 
    gaia.parallax / gaia.parallax_error >= 5 
    and (2.5/log(10)) * (gaia.phot_g_mean_flux_error / gaia.phot_g_mean_flux) <= 0.05
group by color, mag
""")
data = timeit(gaia.query)(adql)


ADQL query

SELECT
    count(*) AS n,
    floor((hip.bt_mag - hip.vt_mag) / 0.01) * 0.01 AS color,
    floor((gaia.phot_g_mean_mag + 5*log10(gaia.parallax)-10) / 0.05) * 0.05 AS mag
FROM
    gaiadr1.tgas_source AS gaia
INNER JOIN
    public.tycho2 AS hip
    ON gaia.hip = hip.hip
WHERE
    gaia.parallax / gaia.parallax_error >= 5
    AND (2.5/log(10)) * (gaia.phot_g_mean_flux_error / gaia.phot_g_mean_flux) <= 0.05
GROUP BY color, mag

Execution time: 6.42 s

Plotting the CMD is the only local operation.


In [23]:
plt.scatter(data['color'], data['mag'], c=data['n'], 
            edgecolor='None', s=1, rasterized=True, norm=LogNorm(),
            cmap=plt.cm.magma, marker='o'
           )
plt.xlim(data['color'].min(), data['color'].max())
plt.ylim(data['mag'].max(), data['mag'].min())
plt.xlabel('B-V (Hipparcos)')
plt.ylabel(r'G + 5 log($\varpi$) - 10')
figrc.hide_axis('top right'.split())


Using DR1 crossmatched catalogs: joining tables

Many surveys are already crossmatched by Gaia DPAC (Data Processing and Analysis Consortium). However the access may not be as trivial for many. It requires to join tables by some id values.

Getting TGAS and Tycho2 missing Ids

Note the use of left outer join, which adds to the left table (here gaia) the missing columns when and fills it in whenever possible.


In [24]:
adql = QueryStr("""
select top 10
        gaia.hip, gaia.tycho2_id, gaia.source_id,
        tycho2.bt_mag, tycho2.vt_mag, tycho2.e_bt_mag, tycho2.e_vt_mag
from 
        gaiadr1.tgas_source as gaia
left outer join
        public.tycho2 as tycho2 
        on gaia.tycho2_id = tycho2.id
""")
timeit(gaia.query)(adql)


ADQL query

SELECT top 10
        gaia.hip, gaia.tycho2_id, gaia.source_id,
        tycho2.bt_mag, tycho2.vt_mag, tycho2.e_bt_mag, tycho2.e_vt_mag
FROM
        gaiadr1.tgas_source AS gaia
LEFT OUTER JOIN
        public.tycho2 AS tycho2
        ON gaia.tycho2_id = tycho2.id

Execution time: 135 ms

Out[24]:
<Table masked=True length=10>
hiptycho2_idsource_idbt_magvt_mage_bt_mage_vt_mag
'mag''mag''mag''mag'
int32objectint64float32float32float32float32
--1000-1009-1449371484603810880012.76212.1570.2360.193
--1000-1016-1449283980658353331211.13110.6950.0579999980.061999999
--1000-1018-1449357572345745587212.22411.8490.1550.163
--1000-1043-1449411431235636556810.2749.20800020.0350.021
--1000-1068-1449351964836573900811.9510.6080.1060.048999999
--1000-108-1449352260330323276812.39112.3890.1850.21600001
--1000-1087-1449371604862894963211.52911.0210.0770.079999998
--1000-1092-1449286646973805593612.69612.1150.2480.20900001
--1000-111-1449370952028071052812.04811.4670.1210.104
--1000-1117-1449389087097856064011.35710.8410.0640000030.064000003

What happens if you do inner join instead?


In [25]:
adql = QueryStr("""
select top 10
    gaia.hip, gaia.tycho2_id, gaia.source_id,
    gaia.ra, gaia.ra_error, gaia.dec, gaia.dec_error, 
    gaia.parallax, gaia.parallax_error, gaia.pmra, gaia.pmra_error,
    gaia.pmdec, gaia.pmdec_error, gaia.ra_dec_corr, gaia.ra_parallax_corr,
    gaia.phot_g_n_obs, gaia.phot_g_mean_flux, gaia.phot_g_mean_flux_error,
    gaia.phot_g_mean_mag, gaia.phot_variable_flag, gaia.l, gaia.b,
    gaia.ecl_lon, gaia.ecl_lat, tycho2.bt_mag, tycho2.vt_mag,
    tycho2.e_bt_mag, tycho2.e_vt_mag, 
    allwise.allwise_oid, allwise.w1mpro,
    allwise.w1mpro_error, allwise.w2mpro, allwise.w2mpro_error,
    allwise.w3mpro, allwise.w3mpro_error, allwise.w4mpro,
    allwise.w4mpro_error, allwise.var_flag, allwise.w1mjd_mean,
    allwise.w2mjd_mean, allwise.w3mjd_mean, allwise.w4mjd_mean,
    allwise.w1gmag, allwise.w1gmag_error, allwise.w2gmag,
    allwise.w2gmag_error, allwise.w3gmag, allwise.w3gmag_error,
    allwise.w4gmag, allwise.w4gmag_error,
    tmass.tmass_oid, tmass.j_m, tmass.j_msigcom, tmass.h_m, tmass.h_msigcom,
    tmass.ks_m, tmass.ks_msigcom
from 
    gaiadr1.tgas_source as gaia
left outer join
    public.tycho2 as tycho2  
    on gaia.tycho2_id = tycho2.id
left outer join
    gaiadr1.allwise_best_neighbour as allwisexmatch  
    on gaia.source_id = allwisexmatch.source_id  
left outer join 
    gaiadr1.allwise_original_valid as allwise 
    on allwisexmatch.allwise_oid = allwise.allwise_oid  
left outer join 
    gaiadr1.tmass_best_neighbour as tmassxmatch
    on gaia.source_id = tmassxmatch.source_id  
left outer join
    gaiadr1.tmass_original_valid as tmass  
    on tmassxmatch.tmass_oid = tmass.tmass_oid
""")
data = timeit(gaia.query)(adql)


ADQL query

SELECT top 10
    gaia.hip, gaia.tycho2_id, gaia.source_id,
    gaia.ra, gaia.ra_error, gaia.dec, gaia.dec_error,
    gaia.parallax, gaia.parallax_error, gaia.pmra, gaia.pmra_error,
    gaia.pmdec, gaia.pmdec_error, gaia.ra_dec_corr, gaia.ra_parallax_corr,
    gaia.phot_g_n_obs, gaia.phot_g_mean_flux, gaia.phot_g_mean_flux_error,
    gaia.phot_g_mean_mag, gaia.phot_variable_flag, gaia.l, gaia.b,
    gaia.ecl_lon, gaia.ecl_lat, tycho2.bt_mag, tycho2.vt_mag,
    tycho2.e_bt_mag, tycho2.e_vt_mag,
    allwise.allwise_oid, allwise.w1mpro,
    allwise.w1mpro_error, allwise.w2mpro, allwise.w2mpro_error,
    allwise.w3mpro, allwise.w3mpro_error, allwise.w4mpro,
    allwise.w4mpro_error, allwise.var_flag, allwise.w1mjd_mean,
    allwise.w2mjd_mean, allwise.w3mjd_mean, allwise.w4mjd_mean,
    allwise.w1gmag, allwise.w1gmag_error, allwise.w2gmag,
    allwise.w2gmag_error, allwise.w3gmag, allwise.w3gmag_error,
    allwise.w4gmag, allwise.w4gmag_error,
    tmass.tmass_oid, tmass.j_m, tmass.j_msigcom, tmass.h_m, tmass.h_msigcom,
    tmass.ks_m, tmass.ks_msigcom
FROM
    gaiadr1.tgas_source AS gaia
LEFT OUTER JOIN
    public.tycho2 AS tycho2
    ON gaia.tycho2_id = tycho2.id
LEFT OUTER JOIN
    gaiadr1.allwise_best_neighbour AS allwisexmatch
    ON gaia.source_id = allwisexmatch.source_id
LEFT OUTER JOIN
    gaiadr1.allwise_original_valid AS allwise
    ON allwisexmatch.allwise_oid = allwise.allwise_oid
LEFT OUTER JOIN
    gaiadr1.tmass_best_neighbour AS tmassxmatch
    ON gaia.source_id = tmassxmatch.source_id
LEFT OUTER JOIN
    gaiadr1.tmass_original_valid AS tmass
    ON tmassxmatch.tmass_oid = tmass.tmass_oid

Execution time: 513 ms

Example

This example is from C. A. L. Bailer-Jones

TGAS apparently has the nasty property that it does not report both Hipparcos and Tycho-2 Id at the same time but only one of the 2. However, when you need/want to have both, this starts to get a little bit tricky.

Additionally CBJ wanted to do some selections on the parallax and proper motion for updating his stellar encounter study.

Below we select the various id values but fill the Tycho-2 ids for Hipparcos stars when available in the Tycho-2 data. Moreover, CBJ wants to filter stars based on their motion:

\begin{equation} \frac{1000 \cdot 4.74047 \cdot \sqrt{\mu_\alpha^2 + \mu_\delta^2} / \varpi^2}{ \sqrt{ (\mu_\alpha^2 + \mu_\delta^2) \cdot (4.74047/\varpi)^2 + 500^2} } < 10 \end{equation}

where $4.74047$ is the equivalent of $1$ AU/yr in km/s and a radial velocity $R_V=500$ km/s.


In [30]:
adql = QueryStr("""
select 
    tgas.tycho2_id, tycho2.id, tgas.source_id, tgas.phot_g_mean_mag,
    tgas.ra, tgas.dec, tgas.parallax, tgas.pmra, tgas.pmdec, tgas.ra_error,
    tgas.dec_error, tgas.parallax_error, tgas.pmra_error, tgas.pmdec_error,
    tgas.ra_dec_corr, tgas.ra_parallax_corr, tgas.ra_pmra_corr,
    tgas.ra_pmdec_corr, tgas.dec_parallax_corr, tgas.dec_pmra_corr,
    tgas.dec_pmdec_corr, tgas.parallax_pmra_corr, tgas.parallax_pmdec_corr,
    tgas.pmra_pmdec_corr
from 
    gaiadr1.tgas_source as tgas
left outer join 
    public.tycho2 as tycho2
    on tgas.hip = tycho2.hip
where ( (
            1000 * 4.74047 * sqrt(power(tgas.pmra, 2) 
            + power(tgas.pmdec, 2)) / power(tgas.parallax, 2)
         ) /
         ( sqrt(
                (power(tgas.pmra, 2) 
                + power(tgas.pmdec, 2)) * power(4.74047 / tgas.parallax, 2)
                + power(500, 2) ) )
        ) < 10
""")
data = timeit(gaia.query)(adql)


ADQL query

SELECT
    tgas.tycho2_id, tycho2.id, tgas.source_id, tgas.phot_g_mean_mag,
    tgas.ra, tgas.dec, tgas.parallax, tgas.pmra, tgas.pmdec, tgas.ra_error,
    tgas.dec_error, tgas.parallax_error, tgas.pmra_error, tgas.pmdec_error,
    tgas.ra_dec_corr, tgas.ra_parallax_corr, tgas.ra_pmra_corr,
    tgas.ra_pmdec_corr, tgas.dec_parallax_corr, tgas.dec_pmra_corr,
    tgas.dec_pmdec_corr, tgas.parallax_pmra_corr, tgas.parallax_pmdec_corr,
    tgas.pmra_pmdec_corr
FROM
    gaiadr1.tgas_source AS tgas
LEFT OUTER JOIN
    public.tycho2 AS tycho2
    ON tgas.hip = tycho2.hip
WHERE ( (
            1000 * 4.74047 * sqrt(power(tgas.pmra, 2)
            + power(tgas.pmdec, 2)) / power(tgas.parallax, 2)
         ) /
         ( sqrt(
                (power(tgas.pmra, 2)
                + power(tgas.pmdec, 2)) * power(4.74047 / tgas.parallax, 2)
                + power(500, 2) ) )
        ) < 10

Execution time: 48.5 s


In [31]:
data


Out[31]:
<Table masked=True length=117330>
tycho2_ididsource_idphot_g_mean_magradecparallaxpmrapmdecra_errordec_errorparallax_errorpmra_errorpmdec_errorra_dec_corrra_parallax_corrra_pmra_corrra_pmdec_corrdec_parallax_corrdec_pmra_corrdec_pmdec_corrparallax_pmra_corrparallax_pmdec_corrpmra_pmdec_corr
Magnitude[mag]Angle[deg]Angle[deg]Angle[mas]Angular Velocity[mas/year]Angular Velocity[mas/year]Angle[mas]Angle[mas]Angle[mas]Angular Velocity[mas/year]Angular Velocity[mas/year]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]
objectobjectint64float64float64float64float64float64float64float64float64float64float64float64float32float32float32float32float32float32float32float32float32float32
--958-1468-244642074242824753927.1198602385889878243.3276406875443513.5251517266491640.800006093117474178.74861089198765-420.799603828774540.131046494842446650.161178935335265090.310437239970455860.0646012574900593080.0579307186334810510.29621336-0.416217890.13098365-0.006937637-0.728720720.0590749120.035693608-0.095779195-0.054888006-0.16881736
--74-2361-132552529860581614087.369527691033113164.0060942737132170.45381198336562957.107176124651049424.354612773084156-20.9896185961261280.296443183474220990.0961505675871550250.351085470754071980.062906899259033380.0443697177467910140.49005944-0.608300090.168462170.062295858-0.15591720.0857815890.063546307-0.12095547-0.0469348580.028684257
--96-602-1328808275399876390410.51060062089623473.0245280871035486.475277204812374380.712650990387615153.81772481777199-305.570597285985510.261373980286629230.180541584322101460.331239625323669350.245146935086608980.147024347967397740.16450725-0.013415399-0.271410820.15436758-0.131500860.18181257-0.24545169-0.115104450.054411348-0.11440114
--642-261-12610127217209216010.95585003246107439.16182939412398412.20778309662089520.383739875496488240.67529562993337-64.4857291723089360.235827259018751540.208458976319333360.316852663851330430.19094776821155490.13636953348692929-0.37213090.43391946-0.070138283-0.16176727-0.71138328-0.0512979180.010662213-0.050761018-0.165127520.16369283
--635-26-12628392854120281610.80986117818229937.2278181917498912.08949678753522539.919961935565183-2.586432847257864281.6262924969435350.600395156039941310.468907502681810580.679341032001646080.185031268351239510.1081967345680231-0.509005840.560347620.059183244-0.15097594-0.70150846-0.0755498480.153094520.032147638-0.18398650.20139986
--705-228-1333816272550864652810.05780089877682283.0603063508127249.819909963624844877.77121731583712-180.0535819052561-216.916893829152770.401150720937413440.538750977250551320.613627737904397060.198529817350121870.10421063302057142-0.373498860.240062490.059844881-0.047105495-0.76504147-0.0356823620.178465570.065704599-0.136220480.052898854
--883-319-237310767103811269128.1492894201797128196.128152969051518.65318223167454195.169615505447686716.7913676417927070.153214564185626370.301332737054675560.199386057075085140.340025058644894240.0792809539965363270.049291718311859016-0.71896017-0.618442420.11187758-0.0714673250.47721675-0.0640675350.15949973-0.0603467340.020644929-0.4326793
--993-77-1448907013096703872011.093453539211243265.774402882478519.075098041226512517.64909121187484-32.322782317896724187.883410334017730.191908921427818650.200355392430541350.255579085897718440.187627989497596260.14448000991756382-0.0734813360.094857119-0.26464456-0.062460415-0.10070089-0.12583351-0.14554140.0494937750.014614867-0.15881599
--1176-118-1277094405562541414411.17866688441425355.4678059447344414.10681227965882723.097433789789051189.59831620509766-70.0237162869701420.352017424809906720.176421907547844730.398884800457741050.288989961104186570.159026214337860630.484702020.01193755-0.058665954-0.13205461-0.15349667-0.086680703-0.0935615-0.16123833-0.101706230.38748664
........................................................................
5867-270-1--515264704496324288010.60343620551592644.968316611370177-17.5000702988819517.402599094343214533.729324908746115-17.7339033160940890.200308205126076530.177158578059197050.219319924803418941.36128219071336990.91192106006231177-0.274558480.351454110.15453874-0.2122146-0.000127479670.24284515-0.407206240.17599767-0.3501786-0.86046755
5860-278-1--513304546705942848011.16412139694814338.266186307013285-17.3833037787559521.9774922198410643-0.037879962195114525-3.35204963025052870.193595235939135080.181202477045251390.24465519819717281.43207342193092590.89798867352525369-0.447716980.48197046-0.330750520.20729962-0.386933270.42750016-0.459983410.15902944-0.27701685-0.90613073
5894-233-1--509204344150898060811.98757721484579666.764372011830886-19.8111787245563464.33569444293373526.0191248770749803-1.70244090998145680.136976543284427740.191472887675661370.235647756148724130.981103428362115730.7809818155346101-0.278119120.18249577-0.0691531230.14320052-0.0782212990.42834297-0.603446540.45738769-0.36387622-0.75557935
5894-1920-1--50920571854045977609.766880156471135266.754995053429994-19.5700661915338724.825142245543943913.79167167269018913.623392892931610.123444234677288350.202265592395479250.253758421006097750.73271986053497740.64952147017063611-0.228048280.297185780.0326643960.00015251618-0.205088440.32589087-0.56592280.28827974-0.33773363-0.64709479
5893-1059-1--50921069383057390089.876584294130276866.513808706335112-19.5091325195237394.6914302182456424-4.37771284940416643.1773917253535080.151367873020267830.186220934375629890.26121166546684360.820860037146184270.75690803603805246-0.0623037520.25545049-0.012737771-0.0713734850.192104430.32080275-0.620452940.3567434-0.48555902-0.60528684
5893-1668-1--50921902950312563209.74541117195846665.380809466275082-19.75770100247619340.587814749808629-201.93506861940239-291.889176003031930.211120643508206580.26426095659669990.259890780378682131.32220643439515981.4249203796536405-0.70768625-0.38236192-0.556424740.545098250.595477340.75052881-0.83128220.67618316-0.71395391-0.9336803
5299-425-1--516193644074920537611.88675307045505252.71616317941902-11.8980247895467749.8755127020586908-2.8198575902724001-34.0142934897017920.248333756289793760.222695525661536470.289835438370309891.44007203159478841.0200575979695843-0.386440010.26966658-0.305075440.17098616-0.127324910.42667368-0.611367820.20748778-0.29711589-0.8360461
5861-1838-1--512486132097782118410.08196980774042833.692563652566562-20.7706109864894917.441026569260086151.264674341404856-8.54912202559467180.239772561662880550.195829619787528720.235884632927823720.72790221418304590.44240857853411397-0.667833150.34615254-0.885980070.70222366-0.539421560.73258668-0.71669334-0.459753810.17038192-0.73626155
5861-1966-1--512507167129619033611.18231495921344833.937992298003024-19.8530783955523022.8518608454983254-0.408348178375943674.81340786168501731.84769610368281680.996886634550417820.44740110689808964.62479953163716222.7479672934106851-0.97831637-0.58271301-0.995060920.98833430.491581170.98296231-0.980196890.54243255-0.59211057-0.98712897
6434-853-1--512514489189872217610.66904915674546939.845819460599643-24.0208913430120033.38853194494802389.62281541415688582.39664497187803070.251272224265509160.241189404239059770.28299417475033090.788174705668611250.7436670838360756-0.405875830.49761012-0.814176260.54049796-0.437838970.51121449-0.55641109-0.479658570.22281404-0.60499299

In [32]:
adql = QueryStr("""
select 
    tgas.tycho2_id, tycho2.id, tgas.source_id, tgas.phot_g_mean_mag,
    tgas.ra, tgas.dec, tgas.parallax, tgas.pmra, tgas.pmdec, tgas.ra_error,
    tgas.dec_error, tgas.parallax_error, tgas.pmra_error, tgas.pmdec_error,
    tgas.ra_dec_corr, tgas.ra_parallax_corr, tgas.ra_pmra_corr,
    tgas.ra_pmdec_corr, tgas.dec_parallax_corr, tgas.dec_pmra_corr,
    tgas.dec_pmdec_corr, tgas.parallax_pmra_corr, tgas.parallax_pmdec_corr,
    tgas.pmra_pmdec_corr
from 
    gaiadr1.tgas_source as tgas
left outer join 
    public.tycho2 as tycho2
    on tgas.hip = tycho2.hip
where ( (
            1000 * 4.74047 * sqrt(power(tgas.pmra, 2) 
            + power(tgas.pmdec, 2)) / power(tgas.parallax, 2)
         ) /
         ( sqrt(
                (power(tgas.pmra, 2) 
                + power(tgas.pmdec, 2)) * power(4.74047 / tgas.parallax, 2)
                + power(500, 2) ) )
        ) < 10
""")
gaia.query_async(adql).get()


ADQL query

SELECT
    tgas.tycho2_id, tycho2.id, tgas.source_id, tgas.phot_g_mean_mag,
    tgas.ra, tgas.dec, tgas.parallax, tgas.pmra, tgas.pmdec, tgas.ra_error,
    tgas.dec_error, tgas.parallax_error, tgas.pmra_error, tgas.pmdec_error,
    tgas.ra_dec_corr, tgas.ra_parallax_corr, tgas.ra_pmra_corr,
    tgas.ra_pmdec_corr, tgas.dec_parallax_corr, tgas.dec_pmra_corr,
    tgas.dec_pmdec_corr, tgas.parallax_pmra_corr, tgas.parallax_pmdec_corr,
    tgas.pmra_pmdec_corr
FROM
    gaiadr1.tgas_source AS tgas
LEFT OUTER JOIN
    public.tycho2 AS tycho2
    ON tgas.hip = tycho2.hip
WHERE ( (
            1000 * 4.74047 * sqrt(power(tgas.pmra, 2)
            + power(tgas.pmdec, 2)) / power(tgas.parallax, 2)
         ) /
         ( sqrt(
                (power(tgas.pmra, 2)
                + power(tgas.pmdec, 2)) * power(4.74047 / tgas.parallax, 2)
                + power(500, 2) ) )
        ) < 10
Query Status: 303 Reason: 303
Location: http://gea.esac.esa.int/tap-server/tap/async/1479127182720O
Job id: 1479127182720O
Out[32]:
<Table masked=True length=333180>
tycho2_ididsource_idphot_g_mean_magradecparallaxpmrapmdecra_errordec_errorparallax_errorpmra_errorpmdec_errorra_dec_corrra_parallax_corrra_pmra_corrra_pmdec_corrdec_parallax_corrdec_pmra_corrdec_pmdec_corrparallax_pmra_corrparallax_pmdec_corrpmra_pmdec_corr
Magnitude[mag]Angle[deg]Angle[deg]Angle[mas]Angular Velocity[mas/year]Angular Velocity[mas/year]Angle[mas]Angle[mas]Angle[mas]Angular Velocity[mas/year]Angular Velocity[mas/year]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]Dimensionless[see description]
objectobjectint64float64float64float64float64float64float64float64float64float64float64float64float32float32float32float32float32float32float32float32float32float32
--958-1468-244642074242824753927.1198602385889878243.3276406875443513.5251517266491640.800006093117474178.74861089198765-420.799603828774540.131046494842446650.161178935335265090.310437239970455860.0646012574900593080.0579307186334810510.29621336-0.416217890.13098365-0.006937637-0.728720720.0590749120.035693608-0.095779195-0.054888006-0.16881736
--74-2361-132552529860581614087.369527691033113164.0060942737132170.45381198336562957.107176124651049424.354612773084156-20.9896185961261280.296443183474220990.0961505675871550250.351085470754071980.062906899259033380.0443697177467910140.49005944-0.608300090.168462170.062295858-0.15591720.0857815890.063546307-0.12095547-0.0469348580.028684257
--96-602-1328808275399876390410.51060062089623473.0245280871035486.475277204812374380.712650990387615153.81772481777199-305.570597285985510.261373980286629230.180541584322101460.331239625323669350.245146935086608980.147024347967397740.16450725-0.013415399-0.271410820.15436758-0.131500860.18181257-0.24545169-0.115104450.054411348-0.11440114
--642-261-12610127217209216010.95585003246107439.16182939412398412.20778309662089520.383739875496488240.67529562993337-64.4857291723089360.235827259018751540.208458976319333360.316852663851330430.19094776821155490.13636953348692929-0.37213090.43391946-0.070138283-0.16176727-0.71138328-0.0512979180.010662213-0.050761018-0.165127520.16369283
--635-26-12628392854120281610.80986117818229937.2278181917498912.08949678753522539.919961935565183-2.586432847257864281.6262924969435350.600395156039941310.468907502681810580.679341032001646080.185031268351239510.1081967345680231-0.509005840.560347620.059183244-0.15097594-0.70150846-0.0755498480.153094520.032147638-0.18398650.20139986
--705-228-1333816272550864652810.05780089877682283.0603063508127249.819909963624844877.77121731583712-180.0535819052561-216.916893829152770.401150720937413440.538750977250551320.613627737904397060.198529817350121870.10421063302057142-0.373498860.240062490.059844881-0.047105495-0.76504147-0.0356823620.178465570.065704599-0.136220480.052898854
--883-319-237310767103811269128.1492894201797128196.128152969051518.65318223167454195.169615505447686716.7913676417927070.153214564185626370.301332737054675560.199386057075085140.340025058644894240.0792809539965363270.049291718311859016-0.71896017-0.618442420.11187758-0.0714673250.47721675-0.0640675350.15949973-0.0603467340.020644929-0.4326793
--993-77-1448907013096703872011.093453539211243265.774402882478519.075098041226512517.64909121187484-32.322782317896724187.883410334017730.191908921427818650.200355392430541350.255579085897718440.187627989497596260.14448000991756382-0.0734813360.094857119-0.26464456-0.062460415-0.10070089-0.12583351-0.14554140.0494937750.014614867-0.15881599
--1176-118-1277094405562541414411.17866688441425355.4678059447344414.10681227965882723.097433789789051189.59831620509766-70.0237162869701420.352017424809906720.176421907547844730.398884800457741050.288989961104186570.159026214337860630.484702020.01193755-0.058665954-0.13205461-0.15349667-0.086680703-0.0935615-0.16123833-0.101706230.38748664
........................................................................
----202211355263003148810.809417980891368295.2116953546075127.1355001570469841.18508043930527410.094647723443364196-1.08309780951148290.261875950948108340.384883875799660660.440527397545184330.131683954327468790.19988399431264031-0.58875716-0.147155780.0074138343-0.106396510.0010837732-0.10469290.026150052-0.076331846-0.081078418-0.23453073
----21602311796556330249.7948495862371043279.5968206149699463.5372795719469045.651389675567148-19.20597030086671717.2141893950993070.164889928289471190.218765380839996360.210437428498758630.293068295937150220.30402815260508692-0.010994269-0.28242409-0.046643503-0.0249308740.46829367-0.16849175-0.14975509-0.27645656-0.063228920.14222069
----548063600209393331211.446477304631612104.15399891384047-59.18137158818243623.424843588095328-150.78273904046455454.350286793427360.433858657056431230.446920202268785320.466651159136457380.226756097125051760.274235665090083740.141394990.370148960.0474236790.0227179750.238788020.073354833-0.0850038230.13153161-0.035147347-0.075632788
----625862809607953356810.679517992834739231.18923796930426-17.12843650908628833.146080976234572-319.46156015395218-201.679718872468950.313379042170381930.232852698209462830.376786521763898320.271977166636992840.19061893014805048-0.224848140.35710266-0.128490870.030083697-0.52006841-0.053378418-0.139153760.15462995-0.0079136845-0.14807144
----351784120486170048010.402337441827887182.79805471760096-19.96134314201502178.334858103802333-212.86048826743612-184.265656967065870.645952921074191560.457680577784492340.749602077222909390.123436351005466520.083204513649514519-0.17557824-0.276191890.187185620.00026196352-0.25005001-0.0115492420.21315245-0.033735294-0.090193689-0.31117707
----577664501069495142412.239973169418064266.32849154808196-78.5317911054801241.80408715882914582.33256856717833431.60366321591913130.254395881507159690.314721723637236160.344546016765732770.248972445648180140.273457144932466090.36303121-0.11208639-0.0970066120.085304782-0.302947940.11997031-0.13838637-0.03640610.111749040.048471533
----266479579433009651211.411825335942742349.130088637691246.995675402207151412.14805872544619117.631118317221926-58.6640070001905230.215054964038925210.127737889146436440.262685782045086910.316438606694532890.20876037445503348-0.128018780.47826239-0.24485251-0.44638935-0.47547111-0.13365172-0.05507322-0.36148632-0.489343760.57178384
----663702278939719385610.919085255653844283.21617362876736-57.13025381738318539.223923569807397-246.05303998875536-771.786176635448560.25863515034074880.289314539183041620.38399775375883760.225086446287195570.20224064951627124-0.334665270.30280566-0.14484498-0.0093929665-0.429123940.111826860.035479687-0.065175578-0.102702250.30287388
----107837430700599859210.386880096586156158.5339576339876373.8366859887843965.9706353380101298-17.15652431194281-21.25912113782260.182423748872677540.213040103018969810.244125117272061430.510457138310488220.570720254766031390.18692671-0.38935751-0.42439130.099707037-0.47588104-0.10381275-0.282606150.36046013-0.0144993350.42040604
----325032827347724262410.32353607668088856.848008296367631-1.97334930379312659.265918380724678180.4326672102344-274.095865103980490.313577801619313980.188533980696476170.331968534553695240.170013221388768180.12728117236514613-0.482520130.111799330.03517729-0.042679433-0.41038513-0.14983699-0.146418330.012961082-0.0508273950.29421562

As you can see this does not return the same number of entries but many more than the sync mode.

This can be for many reasons (esp. a bug of the service) but it could be that one of the sync mode limits was reached. Most likely the 1min time limit, but the reported time is still under that...

Testing TAPVizieR

The VizieR ADQL service (http://tapvizier.u-strasbg.fr/) a service is hosted by the CDS - Strasbourg allows access to any VizieR table and catalog. The same operations are provided through the ADQL protocols.

Below I reproduce all examples using the TAPVizieR Service.

Important note: column and table names in Vizier are not identical to the Gaia Archive ones

Testing interface


In [33]:
adql = QueryStr("""
select top 100
    gaia.source_id, gaia.hip,
    gaia.phot_g_mean_mag+5*log10(gaia.parallax)-10 as g_mag_abs_gaia,
    gaia.phot_g_mean_mag+5*log10(hip.plx)-10 as g_mag_abs_hip,
    hip."B-V"
from 
    "I/337/tgas" as gaia
inner join 
    "I/311/hip2" as hip
    on gaia.hip = hip.HIP
where
    gaia.parallax/gaia.parallax_error >= 5 
    and hip.Plx/hip.e_Plx >= 5 
    and hip."e_B-V" > 0.0 and hip."e_B-V" <= 0.05 
    and (2.5/log(10))*(gaia.phot_g_mean_flux_error/gaia.phot_g_mean_flux) <= 0.05
""")

vizier = TAPVizieR()
result = timeit(vizier.query)(adql)
result


ADQL query

SELECT top 100
    gaia.source_id, gaia.hip,
    gaia.phot_g_mean_mag+5*log10(gaia.parallax)-10 AS g_mag_abs_gaia,
    gaia.phot_g_mean_mag+5*log10(hip.plx)-10 AS g_mag_abs_hip,
    hip."B-V"
FROM
    "I/337/tgas" AS gaia
INNER JOIN
    "I/311/hip2" AS hip
    ON gaia.hip = hip.HIP
WHERE
    gaia.parallax/gaia.parallax_error >= 5
    AND hip.Plx/hip.e_Plx >= 5
    AND hip."e_B-V" > 0.0 AND hip."e_B-V" <= 0.05
    AND (2.5/log(10))*(gaia.phot_g_mean_flux_error/gaia.phot_g_mean_flux) <= 0.05

Execution time: 1.11 s

Out[33]:
<Table masked=True length=100>
source_id [1]hip [1]g_mag_abs_gaia [1]g_mag_abs_hip [1]B-V [1]
magmagmag
int64int32float64float64float64
468777414722050649668121.851376786541.900175972990.962
468778507361163814471424.83971588475.249158466640.739
468802761900614694472562.640618885252.636132370420.389
468821711296328448078143.653492752113.566090113350.59
468829253259226726411135.208917248575.32884193090.743
4688376164190784384698-0.361463990472-0.3674874961641.326
468839970061505702419533.912508512533.788555405290.594
46886545467923339528794.054577556934.021242896030.551
468875607982164505617794.973703342585.13828010060.769
...............
469849740924302784079731.811146796222.409005118920.578
469855499616493734482994.301827441134.483149126260.588
469882420471464460879542.591519196882.496798218690.956
469895006443629120088132.248647373952.78880005490.44
469909434097742041694714.033093804253.89804722680.669
469910214063790745693083.100073262173.06120596330.423
469912265340336512096862.992063363643.111255072320.418
4699595890078393088102621.282327390021.566352515011.026
469987767429261171299530.5362455415970.9866473739541.252
4700068233401598208108663.942666794763.285830573640.524

TGAS Luminosity function


In [34]:
adql = QueryStr("""
select 
    count(*) as n, 
    round(gaia.phot_g_mean_mag, 1) as val
from 
    "I/337/tgas" as gaia
group by val
order by val
""")
data_tgas = timeit(vizier.query)(adql)

# Apparently some parsing issues in TAPVizieR...
n = [int(k) for k in data_tgas['n']]
vals = [float(k) for k in data_tgas['val']]
#
plt.step(vals, n, lw=2, where='pre', label='TGAS')
plt.yscale('log')
plt.xlabel('G magnitude')
plt.ylabel('counts / mag')
figrc.hide_axis('top right'.split())


ADQL query

SELECT
    count(*) AS n,
    round(gaia.phot_g_mean_mag, 1) AS val
FROM
    "I/337/tgas" AS gaia
GROUP BY val
ORDER BY val

Execution time: 4.48 s

TGAS density map in l,b coordinates


In [35]:
adql = QueryStr("""
select 
    count(*) as n, 
    round(glon, 0) as x, 
    round(glat, 0) as y
from 
    "I/337/tgas"
group by x, y
order by x, y
""")
data = timeit(vizier.query)(adql)

from matplotlib.colors import LogNorm
dx = [float(k) for k in data['x']]
dy = [float(k) for k in data['y']]
dn = [int(k) for k in data['n']]
l = np.arange(0, 360, 1)
b = np.arange(-90, 90, 1)
n = np.zeros((len(l), len(b)))
ix = np.digitize(dx, l)
iy = np.digitize(dy, b)
n[ix - 1, iy - 1] = dn
plt.pcolormesh(l, b, n.T, 
               cmap=plt.cm.viridis, 
               norm=LogNorm())
figrc.hide_axis('top right'.split())
plt.xlim(l.min(), l.max())
plt.ylim(b.min(), b.max());


ADQL query

SELECT
    count(*) AS n,
    round(glon, 0) AS x,
    round(glat, 0) AS y
FROM
    "I/337/tgas"
GROUP BY x, y
ORDER BY x, y

Execution time: 22.5 s

M31 cone-search


In [36]:
adql = QueryStr("""
select
    count(*) as n, 
    round(ra, 2) as latitude, 
    round(dec, 2) as longitude
from 
    "I/337/gaia"
where 
    contains(point('ICRS', ra, dec),
             circle('ICRS',10.6847083,41.26875,2) )=1  
group by latitude, longitude
order by latitude, longitude
""")
data = timeit(vizier.query)(adql)

lat = [float(k) for k in data['latitude']]
lon = [float(k) for k in data['longitude']]
n = [int(k) for k in data['n']]

plt.figure(figsize=(6,6))
plt.subplot(111, aspect=1)
plt.scatter(lat, lon, c=n,
            edgecolor='None', s=6, rasterized=True, norm=LogNorm(),
            cmap=plt.cm.magma, marker='s'
           )
plt.xlim(min(lat), max(lat))
plt.ylim(min(lon), max(lon))
plt.axis('off');


ADQL query

SELECT
    count(*) AS n,
    round(ra, 2) AS latitude,
    round(dec, 2) AS longitude
FROM
    "I/337/gaia"
WHERE
    contains(point('ICRS', ra, dec),
             circle('ICRS',10.6847083,41.26875,2) )=1
GROUP BY latitude, longitude
ORDER BY latitude, longitude

Execution time: 4.51 s

M33 cone search


In [37]:
adql = QueryStr("""
select 
    count(*) as n, 
    round(ra, 2) as latitude, 
    round(dec, 2) as longitude
from 
    gaiadr1.gaia_source 
where 
    contains(point('ICRS', ra, dec),
             circle('ICRS',23.4621,30.6599417,0.5) )=1  
group by latitude, longitude
order by latitude, longitude
""")
data = timeit(gaia.query)(adql)

lat = [float(k) for k in data['latitude']]
lon = [float(k) for k in data['longitude']]
n = [int(k) for k in data['n']]

plt.figure(figsize=(6,6))
plt.subplot(111, aspect=1)
plt.scatter(lat, lon, c=n,
            edgecolor='None', s=9, rasterized=True, norm=LogNorm(),
            cmap=plt.cm.magma, marker='s'
           )
plt.xlim(min(lat), max(lat))
plt.ylim(min(lon), max(lon))
plt.axis('off');


ADQL query

SELECT
    count(*) AS n,
    round(ra, 2) AS latitude,
    round(dec, 2) AS longitude
FROM
    gaiadr1.gaia_source
WHERE
    contains(point('ICRS', ra, dec),
             circle('ICRS',23.4621,30.6599417,0.5) )=1
GROUP BY latitude, longitude
ORDER BY latitude, longitude

Execution time: 920 ms

TGAS CMD


In [38]:
adql = QueryStr("""
select
    count(*) as n,
    floor((hip."B-V") / 0.01) * 0.01 as color,
    floor((gaia.phot_g_mean_mag + 5*log10(gaia.parallax)-10) / 0.05) * 0.05 as mag
from 
    "I/337/tgas" as gaia
inner join 
    "I/311/hip2" as hip
    on gaia.hip = hip.HIP
where
    gaia.parallax/gaia.parallax_error >= 5 
    and hip."e_B-V" > 0.0 and hip."e_B-V" <= 0.05 
    and hip.Plx/hip.e_Plx >= 5 
    and (2.5/log(10))*(gaia.phot_g_mean_flux_error/gaia.phot_g_mean_flux) <= 0.05
group by color, mag
""")
data = timeit(vizier.query)(adql)
color = [float(k) for k in data['color']]
mag = [float(k) for k in data['mag']]
n = [int(k) for k in data['n']]
plt.scatter(color, mag, c=n, 
            edgecolor='None', s=2, rasterized=True, norm=LogNorm(),
            cmap=plt.cm.magma, marker='s'
           )
plt.xlim(min(color), max(color))
plt.ylim(max(mag), min(mag))
plt.xlabel('B-V (Hipparcos)')
plt.ylabel(r'G + 5 log($\varpi$) - 10')
figrc.hide_axis('top right'.split())


ADQL query

SELECT
    count(*) AS n,
    floor((hip."B-V") / 0.01) * 0.01 AS color,
    floor((gaia.phot_g_mean_mag + 5*log10(gaia.parallax)-10) / 0.05) * 0.05 AS mag
FROM
    "I/337/tgas" AS gaia
INNER JOIN
    "I/311/hip2" AS hip
    ON gaia.hip = hip.HIP
WHERE
    gaia.parallax/gaia.parallax_error >= 5
    AND hip."e_B-V" > 0.0 AND hip."e_B-V" <= 0.05
    AND hip.Plx/hip.e_Plx >= 5
    AND (2.5/log(10))*(gaia.phot_g_mean_flux_error/gaia.phot_g_mean_flux) <= 0.05
GROUP BY color, mag

Execution time: 9.46 s

Other example

Based on Watkins et al 2016.

Paper Abstract They perform a systematic search for Galactic globular cluster (GC) stars in the Tycho-Gaia Astrometric Solution (TGAS) catalog that formed part of Gaia Data Release 1 (DR1), and identify 5 members of NGC 104 (47 Tucanae), 1 member of NGC 5272 (M 3), 5 members of NGC 6121 (M 4), 7 members of NGC 6397, and 2 members of NGC 6656 (M 22).

By taking a weighted average of the member stars, fully accounting for the correlations between parameter estimates, they estimate the parallax (and, hence, distance) and proper motion (PM) of the GCs. This provides a homogeneous PM study of multiple GCs based on an astrometric catalogue with small and well-controlled systematic errors, and yields random PM errors that are similar to existing measurements. Detailed comparison to the available Hubble Space Telescope (HST) measurements generally shows excellent agreement, validating the astrometric quality of both TGAS and HST.

By contrast, comparison to ground-based measurements shows that some of those must have systematic errors exceeding the random errors. Our parallax estimates have uncertainties an order of magnitude larger than previous studies, but nevertheless imply distances consistent with previous estimates. By combining our PM measurements with literature positions, distances and radial velocities, we measure Galactocentric space motions for the clusters and find that these are also in good agreement with previous orbital analyses.

Our results highlight the future promise of Gaia for the determining accurate distances and PMs of Galactic GCs, which will provide crucial constraints on the near end of the cosmic distance ladder, and provide accurate GC orbital histories.

method For each cluster, they calculate the likelihood of each nearby stars of being a cluster member by decomposing this likelihood into a positional part $L_{\alpha\delta,i}$ and a motion part $L_{\varpi\mu,i}$.

On the one hand, stars close to the cluster centre of the cluster are more likely to be members than stars near to the $2\,R_{tidal}$ boundary, so we also calculate the likelihood of a star $i$ with coordinates $(\alpha_i, \delta_i)$ being a member of the GC with centre $(\alpha_{GC}, \delta_{GC})$ as,

\begin{eqnarray} L_{\alpha\delta,i} &=& p(\alpha_i, \delta_i | \alpha_{GC}, \delta_{GC}, \sigma) &=& \exp\left[-\frac{1}{2\sigma^2} \left((\alpha_i - \alpha_{GC})^2 + (\delta_i - \delta_{GC})^2\right) \right]. \end{eqnarray}

where they use $\sigma = \frac{1}{2} R_{tidal}$ to account for the approximate extent of the cluster. The uncertainties on the cluster centre coordinates and on the positions of the stars are negligible compared to the extent of the cluster, so we neglect the measurement errors in this calculation.

On the other hand, for a star $i$ with parallax and PM measurements $m_i = (\varpi, \mu_\alpha, \mu_\delta)$, and covariance $C_i$, they ask what is the likelihood $L_{\varpi\mu,i}$ that this star is a member of a GC with measurements $m_{GC}$ and covariance $C_{GC}$,

\begin{eqnarray} L_{\varpi\mu,i} &=& p(m_i | C_i, m_{GC}, C_{GC})\\ &=& \frac{1}{ \left[(2\pi)^3 |\det(C_{i} + C_{GC})|\right]^{1/2}}\,\exp\left[-\frac{1}{2} (m_i - m_{GC})^T \cdot (C_{i} + C_{GC})^{-1} \cdot (m_i - m_{GC}) \right]. \end{eqnarray}

The above is a standard 3-dimensional Gaussian.

To construct the GC covariance matrix $C_{GC}$, they assume that the errors are uncorrelated, so the diagonal terms are the squared uncertainties on the parallax and PM measurements and the off-diagonal elements are zero. They further add the GC velocity dispersion (H96) in quadrature to the PM terms to account for the expected spread in velocities. \begin{eqnarray} C_{GC} = \left[\begin{matrix} \sigma^2_\varpi & 0 & 0\\ 0 & \mu^2_\alpha + \sigma^2_v & 0 \\ 0 & 0 & \mu^2_\delta + \sigma^2_v \end{matrix}\right] \end{eqnarray}

additionally, (they do not explain but worth mentioning) the covariance matrix for the stars $C_i$ is given by \begin{eqnarray} C_{i} = \left[\begin{matrix} \sigma^2_\varpi & \rho_{\varpi,\mu\alpha}\sigma_\varpi\sigma_{\mu\alpha} & \rho_{\varpi,\mu\delta}\sigma_\varpi\sigma_{\mu\delta}\\ \rho_{\varpi,\mu\alpha}\sigma_\varpi\sigma_{\mu\alpha} & \sigma^2_{\mu,\alpha} & \rho_{\mu\alpha,\mu\delta}\sigma_{\mu,\alpha}\sigma_{\mu,\delta} \\ \rho_{\varpi,\mu\delta}\sigma_\varpi\sigma_{\mu\delta} & \rho_{\mu\alpha,\mu\delta}\sigma_{\mu,\alpha}\sigma_{\mu,\delta} & \sigma^2_{\mu,\delta} \end{matrix}\right] \end{eqnarray}

Finally, they compute the full likelihood as the product of the two pieces, $L_i = L_{\varpi\mu,i} \cdot L_{\alpha\delta,i}$ and they keep all stars with $\ln L_i > −11$ as possible cluster members.

Cluster $\alpha$ $\delta$ $c$ $R_{\rm core}$ $R_{\rm tidal}$ $D$ $\mu_\alpha$ $\mu_\delta$ $E$(B-V) [Fe/H] $v_{\rm r}$ $\sigma_{v}$
(deg) (deg) (arcmin) (arcmin) (kpc) (mas/yr) (mas/yr) (dex) (km/s) (km/s)
NGC 104 6.02 -72.08 2.07 0.360 42.296 4.5 5.63 $\pm$ 0.21 -2.73 $\pm$ 0.29 0.04 -0.72 -18.0 $\pm$ 0.1 11.0 $\pm$ 0.3
NGC 288 13.19 -26.58 0.99 1.350 13.193 8.9 4.67 $\pm$ 0.22 -5.60 $\pm$ 0.35 0.03 -1.32 -45.4 $\pm$ 0.2 2.9 $\pm$ 0.3
NGC 3201 154.40 -46.41 1.29 1.300 25.348 4.9 5.28 $\pm$ 0.32 -0.98 $\pm$ 0.33 0.24 -1.59 494.0 $\pm$ 0.2 5.0 $\pm$ 0.2
NGC 4372 186.44 -72.66 1.30 1.750 34.917 5.8 -6.49 $\pm$ 0.33 3.71 $\pm$ 0.32 0.39 -2.17 72.3 $\pm$ 1.2 $\dots$
NGC 4590 189.87 -26.74 1.41 0.580 14.908 10.3 -3.76 $\pm$ 0.66 1.79 $\pm$ 0.62 0.05 -2.23 -94.7 $\pm$ 0.2 2.5 $\pm$ 0.4
NGC 4833 194.89 -70.88 1.25 1.000 17.783 6.6 -8.11 $\pm$ 0.35 -0.96 $\pm$ 0.34 0.32 -1.85 200.2 $\pm$ 1.2 $\dots$

A bit of code

First, one can surely code the position likelihood in ADQL, which will also allow us to filter already stars that are too far to be member candidates.

Second, one can potentially implement the other likelihood, but it requires to code the dot-product and matrix inversion manually. This is feasible as we only manipulate 3x3 matrices. But let just not complicate our task, we will still download some data afterall. We can however compute as much as possible on the server side.

The query below shows how to implement for instance the computation of the covariance matrices and its determinant on top of the rest.


In [27]:
from tap import (GaiaArchive, QueryStr, timeit)

def get_tgas_stars(center_ra, center_dec, Rtidal, parallax, parallaxerr, 
                   mualpha, mualphaerr, mudelta, mudeltaerr, s_v):
    """ (sync)Query the database for a particular position and cluster properties. 
    
    Parameters
    ----------
    center_ra: float
        RA of the cluster center
    center_dec: float
        Dec of the cluster center
    Rtidal: float
        tidal radius of the cluster (in degrees)
    parallax: float
        mean parallax of the cluster in mas (1 mas <-> 1kpc)
    parallaxerr: float
        uncertainty on the cluster parallax
    mualpha: float
        mean proper motion of the cluster along RA (in mas/yr)
    mualphaerr: float
        mean proper motion uncertainty of the cluster along RA (in mas/yr)
    mudelta: float
        mean proper motion of the cluster along Dec (in mas/yr)
    mudeltaerr: float
        mean proper motion uncertainty of the cluster along Dec (in mas/yr)
    s_v: float
        internal velocity dispersion
    
    Returns
    -------
    data: Table
        entries from the query
    """
    adql = QueryStr("""
    select *, 
        q.a * (q.d * q.f - q.e * q.e) - q.b * (q.b * q.f - q.c * q.e) + q.c * q.e * (q.b - q.c) as det_pm_cov
    from (
        select 
            gaia.source_id, gaia.ra, gaia.dec, gaia.parallax, gaia.pmra, gaia.pmdec,
            gaia.phot_g_mean_mag as G_mag, tycho2.bt_mag, tycho2.vt_mag, parallax_error,
            pmra_error, pmdec_error, pmra_pmdec_corr, parallax_pmra_corr, parallax_pmdec_corr,
            (-0.5 / power({Rtidal:f},2) * (power({center_ra:+f} - gaia.ra , 2) + power({center_dec:+f}-gaia.dec, 2))) as lnl_alpha_delta,
            (-0.5 / power({Rtidal:f},2) * distance(point('icrs', gaia.ra, gaia.dec), point('icrs', {center_ra:+f}, {center_dec:+f}))) as lnl_alpha_delta2,
            power({s_gcparallax:f}, 2) + power(gaia.parallax_error, 2) as a,
            gaia.parallax_pmra_corr * gaia.parallax_error * gaia.pmra_error as b,
            gaia.parallax_pmdec_corr * gaia.parallax_error * gaia.pmdec_error as c,
            power({s_gcmualpha:f},2) + power({s_gcv:f}, 2) + power(gaia.pmra_error, 2) as d,
            gaia.pmra_pmdec_corr * gaia.pmra_error * gaia.pmdec_error as e,
            power({s_gcmudelta:f},2) + power({s_gcv:f}, 2) + power(gaia.pmdec_error, 2) as f,
            (gaia.parallax - {gc_parallax:f}) as delta_parallax,
            (gaia.pmra + (-1) * {gc_pmra:f}) as delta_pmalpha,
            (gaia.pmdec + (-1) * {gc_pmdec:f}) as delta_pmdec
        from 
            gaiadr1.tgas_source as gaia
        inner join 
            public.tycho2 as tycho2
            on gaia.tycho2_id = tycho2.id
        where 
            contains(point('ICRS', gaia.ra, gaia.dec),
                     circle('ICRS',{center_ra:f}, {center_dec:f}, {size:f}) )=1 
        ) as q
""".format(center_ra=center_ra, center_dec=center_dec, Rtidal=Rtidal, size=3 * Rtidal,
           gc_parallax=parallax, gc_pmra=mualpha, gc_pmdec=mudelta, 
           s_gcparallax=parallaxerr, s_gcmualpha=mualphaerr, s_gcmudelta=mudeltaerr, s_gcv=s_v
           ))
    gaia = GaiaArchive()
    return timeit(gaia.query)(adql)

def get_gaia_density(center_ra, center_dec, size):
    """ Query the database and produce a density map from the full DR1 catalog """
    adql = QueryStr("""
select 
    count(*) as n, 
    round(ra, 2) as latitude, 
    round(dec, 2) as longitude
from 
    gaiadr1.gaia_source as gaia
where 
    contains(point('ICRS', gaia.ra, gaia.dec),
             circle('ICRS',{0:f}, {1:f}, {2:f}) )=1  
group by latitude, longitude
order by latitude, longitude
""".format(center_ra, center_dec, size))
    gaia = GaiaArchive()
    return timeit(gaia.query)(adql)

Below I code the second likelihood in python for convenience.


In [28]:
def add_lnl_mu(recs):
    """ Adding the motion likelihood and final likelihood values """
    lnl_mu = np.zeros(len(recs), dtype=float)
    for k, data in enumerate(recs):
        a = data['a'] 
        b = data['b']
        c = data['c']
        d = data['d']
        e = data['e']
        f = data['f']
      
        cov = np.array(((a, b, c),
                        (b, d, e),
                        (c, e, f)))
        invcov = np.linalg.inv(cov)
        det_cov = np.linalg.det(2 * np.pi * invcov)
        m = np.array((data['delta_parallax'], data['delta_pmalpha'], data['delta_pmdec']))
        lnl_mu[k] = - 0.5 * np.log(abs(det_cov)) - 0.5 * m.T @ (invcov @ m)
    from astropy.table import Column
    if 'lnl_mu' in ngc104.keys():
        recs.remove_column('lnl_mu')
        recs.remove_column('lnl')
        recs.remove_column('lnl2')
    recs.add_column(Column(lnl_mu, name='lnl_mu'))    
    recs.add_column(Column(recs['lnl_alpha_delta'] + lnl_mu, name='lnl'))
    recs.add_column(Column(recs['lnl_alpha_delta2'] + lnl_mu, name='lnl2'))
    return recs[np.argsort(recs['lnl'][::-1])]

We make also some figures


In [29]:
def fig_plot(data, ngc104, **kwargs):
    """ Plot some figures """
    from matplotlib.colors import LogNorm
    lat = [float(k) for k in data['latitude']]
    lon = [float(k) for k in data['longitude']]
    n = [int(k) for k in data['n']]
    
    members = kwargs.pop("members", None)

    plt.figure(figsize=(10,4))
    plt.subplot(121)
    plt.scatter(lat, lon, c=n,
                edgecolor='None', s=12, rasterized=True, norm=LogNorm(),
                cmap=plt.cm.magma, marker='s', alpha=0.4
               )
    plt.plot(ngc104['ra'], ngc104['dec'], 'o', mfc='b', mec='0.8', mew=2)
    if members is not None:
        plt.plot(members['ra'], members['dec'], 'o', mfc='r', mec='0.8', mew=2)
    plt.xlim(min(lat), max(lat))
    plt.ylim(min(lon), max(lon))
    plt.axis('off');

    plt.subplot(122)
    plt.plot(ngc104['bt_mag']-ngc104['vt_mag'], ngc104['vt_mag'], 'o')
    if members is not None:
        plt.plot(members['bt_mag']-members['vt_mag'], members['vt_mag'], 'ro')
    plt.ylim(plt.ylim()[::-1])
    plt.xlabel('B$_T$ - V$_T$')
    plt.ylabel('V$_T$')
    figrc.hide_axis('top right'.split())
    plt.tight_layout()

    plt.figure(figsize=(10, 5))
    ax = plt.subplot(221)
    plt.plot(ngc104['pmra'], ngc104['parallax'], 'o', **kwargs)
    if members is not None:
        plt.plot(members['pmra'], members['parallax'], 'ro', **kwargs)
    plt.ylabel(r'$\varpi$ [mas]')
    figrc.hide_axis('top right'.split())

    ax = plt.subplot(223, sharex=ax)
    plt.plot(ngc104['pmra'], ngc104['pmdec'], 'o', **kwargs)
    if members is not None:
        plt.plot(members['pmra'], members['pmdec'], 'ro', **kwargs)
    plt.xlabel(r'$\mu_\alpha$ [mas/yr]')
    plt.ylabel(r'$\mu_\delta$ [mas/yr]')
    figrc.hide_axis('top right'.split())

    ax = plt.subplot(224, sharey=ax)
    plt.plot(ngc104['parallax'], ngc104['pmdec'], 'o', **kwargs)
    if members is not None:
        plt.plot(members['parallax'], members['pmdec'], 'ro', **kwargs)
    plt.xlabel(r'$\varpi$ [mas]')
    figrc.hide_axis('top right'.split())

    plt.tight_layout(h_pad=0, w_pad=0)

Let's look at NGC 104, a.k.a. 47 Tuc

We can use the values from the referenced paper to define our cluster properties and run our query.


In [30]:
# 1 milliarcsec = 1kpc
ra, dec, Rtidal, parallax, parallaxerr, mualpha, mualphaerr, mudelta, mudeltaerr, s_v = (
    6.02, -72.08, 0.360, 1. / 4.5, 0, 5.63,  0.21, -2.73, 0.29, 0.0
)
data = get_gaia_density(ra, dec, 3 * Rtidal)
ngc104 = get_tgas_stars(ra, dec, Rtidal, parallax, parallaxerr, mualpha, mualphaerr, mudelta, mudeltaerr, s_v)


ADQL query

SELECT
    count(*) AS n,
    round(ra, 2) AS latitude,
    round(dec, 2) AS longitude
FROM
    gaiadr1.gaia_source AS gaia
WHERE
    contains(point('ICRS', gaia.ra, gaia.dec),
             circle('ICRS',6.020000, -72.080000, 1.080000) )=1
GROUP BY latitude, longitude
ORDER BY latitude, longitude

Execution time: 4.44 s

ADQL query

SELECT *,
        q.a * (q.d * q.f - q.e * q.e) - q.b * (q.b * q.f - q.c * q.e) + q.c * q.e * (q.b - q.c) AS det_pm_cov
    FROM (
        SELECT
            gaia.source_id, gaia.ra, gaia.dec, gaia.parallax, gaia.pmra, gaia.pmdec,
            gaia.phot_g_mean_mag AS G_mag, tycho2.bt_mag, tycho2.vt_mag, parallax_error,
            pmra_error, pmdec_error, pmra_pmdec_corr, parallax_pmra_corr, parallax_pmdec_corr,
            (-0.5 / power(0.360000,2) * (power(+6.020000 - gaia.ra , 2) + power(-72.080000-gaia.dec, 2))) AS lnl_alpha_delta,
            (-0.5 / power(0.360000,2) * distance(point('icrs', gaia.ra, gaia.dec), point('icrs', +6.020000, -72.080000))) AS lnl_alpha_delta2,
            power(0.000000, 2) + power(gaia.parallax_error, 2) AS a,
            gaia.parallax_pmra_corr * gaia.parallax_error * gaia.pmra_error AS b,
            gaia.parallax_pmdec_corr * gaia.parallax_error * gaia.pmdec_error AS c,
            power(0.210000,2) + power(0.000000, 2) + power(gaia.pmra_error, 2) AS d,
            gaia.pmra_pmdec_corr * gaia.pmra_error * gaia.pmdec_error AS e,
            power(0.290000,2) + power(0.000000, 2) + power(gaia.pmdec_error, 2) AS f,
            (gaia.parallax - 0.222222) AS delta_parallax,
            (gaia.pmra + (-1) * 5.630000) AS delta_pmalpha,
            (gaia.pmdec + (-1) * -2.730000) AS delta_pmdec
        FROM
            gaiadr1.tgas_source AS gaia
        INNER JOIN
            public.tycho2 AS tycho2
            ON gaia.tycho2_id = tycho2.id
        WHERE
            contains(point('ICRS', gaia.ra, gaia.dec),
                     circle('ICRS',6.020000, -72.080000, 1.080000) )=1
        ) AS q

Execution time: 301 ms

We also add the second likelihood and final one. Finally we filter stars to keep those with lnl > 11.


In [31]:
result = add_lnl_mu(ngc104)
fields = ['source_id', 'ra', 'dec', 'parallax', 'lnl_alpha_delta', 'lnl_alpha_delta2', 'lnl_mu', 'lnl', 'lnl2']
members = result[result['lnl'] > -11]
members.sort('lnl')
members[fields][::-1]


Out[31]:
<Table masked=True length=8>
source_idradecparallaxlnl_alpha_deltalnl_alpha_delta2lnl_mulnllnl2
Angle[deg]Angle[deg]Angle[mas]
int64float64float64float64float64float64float64float64float64
46896444165011328006.2170101137593949-71.9364569423815540.37507341579041814-0.22923454596428225-0.60150036729073375-2.7988054087-3.02803995467-3.40030577599
46896384378994351365.5745638831857649-72.1034332008032950.58021017218719462-0.76760204113592645-0.53610627647358955-4.52626678339-5.29386882452-5.06237305986
46896450006166822406.0141150444826863-71.9298223895785470.7569972004848331-0.087144858693433566-0.57943138406236783-5.61636849493-5.70351335362-6.19579987899
46896203303174031365.9132796741373417-72.2777566738294720.27418891704465148-0.19481840276361445-0.77328553777998654-5.85211566785-6.04693407061-6.62540120563
46898328453018443526.0936958914206301-71.8912983836157761.0583841942748684-0.15833095848109333-0.73330595273316934-6.13607645112-6.2944074096-6.86938240385
46896235944924821766.2653565131119864-72.158812515007810.18046400711992883-0.25621616916793633-0.42062056968602829-6.19642464456-6.45264081373-6.61704521425
46895958318239703045.4133921110160106-72.4140626122376860.67526452113793323-1.8501966044470644-1.4731690619749811-4.82298628575-6.6731828902-6.29615534773
46900240228883594247.304409361221766-71.8154101929599020.1092560087723143-6.6347036002454525-1.8438481503016266-3.9692286786-10.6039322788-5.8130768289

Interestingly, Watkins et al. claim only 5 candidates for this cluster, and we obtain 8. It would be interesting to understand the differences...

They find also stars with source_id:

  • 4689620330317403136
  • 4690024022888359424
  • 4689638437899435136
  • 4689645000616682240
  • 4689595831823970304

but do not find

  • 4689644416501132800
  • 4689832845301844352
  • 4689623594492482176

The following figure shows the DR1 density with on top the TGAS stars in blue, and member candidates in red.


In [32]:
fig_plot(data, ngc104, members=members, ms=4, alpha=0.5)


Correcting the distance-based likelihood


In [33]:
members = result[result['lnl2'] > -11]
members.sort('lnl2')
members[fields][::-1]


Out[33]:
<Table masked=True length=8>
source_idradecparallaxlnl_alpha_deltalnl_alpha_delta2lnl_mulnllnl2
Angle[deg]Angle[deg]Angle[mas]
int64float64float64float64float64float64float64float64float64
46896444165011328006.2170101137593949-71.9364569423815540.37507341579041814-0.22923454596428225-0.60150036729073375-2.7988054087-3.02803995467-3.40030577599
46896384378994351365.5745638831857649-72.1034332008032950.58021017218719462-0.76760204113592645-0.53610627647358955-4.52626678339-5.29386882452-5.06237305986
46900240228883594247.304409361221766-71.8154101929599020.1092560087723143-6.6347036002454525-1.8438481503016266-3.9692286786-10.6039322788-5.8130768289
46896450006166822406.0141150444826863-71.9298223895785470.7569972004848331-0.087144858693433566-0.57943138406236783-5.61636849493-5.70351335362-6.19579987899
46895958318239703045.4133921110160106-72.4140626122376860.67526452113793323-1.8501966044470644-1.4731690619749811-4.82298628575-6.6731828902-6.29615534773
46896235944924821766.2653565131119864-72.158812515007810.18046400711992883-0.25621616916793633-0.42062056968602829-6.19642464456-6.45264081373-6.61704521425
46896203303174031365.9132796741373417-72.2777566738294720.27418891704465148-0.19481840276361445-0.77328553777998654-5.85211566785-6.04693407061-6.62540120563
46898328453018443526.0936958914206301-71.8912983836157761.0583841942748684-0.15833095848109333-0.73330595273316934-6.13607645112-6.2944074096-6.86938240385

In [34]:
fig_plot(data, ngc104, members=members, ms=4, alpha=0.5)


The likelihood values based on spherical distance are significantly different from their definition. However, for this particular cluster we do not change the result by correcting their likelihood.

A little bit of crazy: make it all in ADQL query

Below I added the calculations of the second likelihood directly in ADQL. Why? Actually just for the fun of it...

As you can see, this is (i) tedious for a simple 3x3 matrix only, (ii) not very readable. Of course it gives the same results directly with lnl and therefore one can filter directly.

But let's be honest, it's faster to download a bit more and finish the calculations locally.

This example is mostly to show that one can implement complex variables and if necessary embed one query into another for more complex calculations.


In [45]:
def get_tgas_stars_full(center_ra, center_dec, Rtidal, parallax, parallaxerr, 
                        mualpha, mualphaerr, mudelta, mudeltaerr, s_v):
    """ (sync)Query the database for a particular position and cluster properties. 
    
    Parameters
    ----------
    center_ra: float
        RA of the cluster center
    center_dec: float
        Dec of the cluster center
    Rtidal: float
        tidal radius of the cluster (in degrees)
    parallax: float
        mean parallax of the cluster in mas (1 mas <-> 1kpc)
    parallaxerr: float
        uncertainty on the cluster parallax
    mualpha: float
        mean proper motion of the cluster along RA (in mas/yr)
    mualphaerr: float
        mean proper motion uncertainty of the cluster along RA (in mas/yr)
    mudelta: float
        mean proper motion of the cluster along Dec (in mas/yr)
    mudeltaerr: float
        mean proper motion uncertainty of the cluster along Dec (in mas/yr)
    s_v: float
        internal velocity dispersion
    
    Returns
    -------
    data: Table
        entries from the query
    """
    adql = QueryStr("""
    select 
        r.source_id, r.ra, r.dec, r.parallax, r.pmra, r.pmdec,
        r.G_mag, r.bt_mag, r.vt_mag, r.parallax_error,
        r.pmra_error, r.pmdec_error, r.pmra_pmdec_corr, 
        r.parallax_pmra_corr, r.parallax_pmdec_corr,
        r.lnl_alpha_delta, r.lnl_alpha_delta2, r.lnl_mu,
        (- 0.5 * log(abs(r.det_pm_cov)) + r.lnl_mu + r.lnl_alpha_delta) as lnl,
        (- 0.5 * log(abs(r.det_pm_cov)) + r.lnl_mu + r.lnl_alpha_delta2) as lnl2
    from (
        select *, 
            q.a * (q.d * q.f - q.e * q.e) - q.b * (q.b * q.f - q.c * q.e) + q.c * q.e * (q.b - q.c) as det_pm_cov,
            -0.5 * (delta_parallax*(delta_parallax*(-(q.a*q.d - power(q.b, 2))*(-q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) + q.c/q.a)*(q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) - q.c/q.a)/(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d) + 1.0/q.a + power(q.b, 2)/(power(q.a, 2)*(q.d - power(q.b, 2)/q.a))) + delta_pmalpha*((q.e - q.b*q.c/q.a)*(q.a*q.d - power(q.b, 2))*(-q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) + q.c/q.a)/((q.d - power(q.b, 2)/q.a)*(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d)) - q.b/(q.a*(q.d - power(q.b, 2)/q.a))) - delta_pmdec*(q.a*q.d - power(q.b, 2))*(-q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) + q.c/q.a)/(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d)) + delta_pmalpha*(delta_parallax*(-(q.e - q.b*q.c/q.a)*(q.a*q.d - power(q.b, 2))*(q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) - q.c/q.a)/((q.d - power(q.b, 2)/q.a)*(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d)) - q.b/(q.a*(q.d - power(q.b, 2)/q.a))) + delta_pmalpha*(1.0/(q.d - power(q.b, 2)/q.a) + power(q.e - q.b*q.c/q.a, 2)*(q.a*q.d - power(q.b, 2))/(power(q.d - power(q.b, 2)/q.a, 2)*(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d))) - delta_pmdec*(q.e - q.b*q.c/q.a)*(q.a*q.d - power(q.b, 2))/((q.d - power(q.b, 2)/q.a)*(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d))) + delta_pmdec*(delta_parallax*(q.a*q.d - power(q.b, 2))*(q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) - q.c/q.a)/(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d) - delta_pmalpha*(q.e - q.b*q.c/q.a)*(q.a*q.d - power(q.b, 2))/((q.d - power(q.b, 2)/q.a)*(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d)) + delta_pmdec*(q.a*q.d - power(q.b, 2))/(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d))) as lnl_mu
        from (
            select 
                gaia.source_id, gaia.ra, gaia.dec, gaia.parallax, gaia.pmra, gaia.pmdec,
                gaia.phot_g_mean_mag as G_mag, tycho2.bt_mag, tycho2.vt_mag, parallax_error,
                pmra_error, pmdec_error, pmra_pmdec_corr, parallax_pmra_corr, parallax_pmdec_corr,
                (-0.5 / power({Rtidal:f},2) * (power({center_ra:+f} - gaia.ra , 2) + power({center_dec:+f}-gaia.dec, 2))) as lnl_alpha_delta,
                (-0.5 / power({Rtidal:f},2) * distance(point('icrs', gaia.ra, gaia.dec), point('icrs', {center_ra:+f}, {center_dec:+f}))) as lnl_alpha_delta2,
                power({s_gcparallax:f}, 2) + power(gaia.parallax_error, 2) as a,
                gaia.parallax_pmra_corr * gaia.parallax_error * gaia.pmra_error as b,
                gaia.parallax_pmdec_corr * gaia.parallax_error * gaia.pmdec_error as c,
                power({s_gcmualpha:f},2) + power({s_gcv:f}, 2) + power(gaia.pmra_error, 2) as d,
                gaia.pmra_pmdec_corr * gaia.pmra_error * gaia.pmdec_error as e,
                power({s_gcmudelta:f},2) + power({s_gcv:f}, 2) + power(gaia.pmdec_error, 2) as f,
                (gaia.parallax - {gc_parallax:f}) as delta_parallax,
                (gaia.pmra + (-1) * {gc_pmra:f}) as delta_pmalpha,
                (gaia.pmdec + (-1) * {gc_pmdec:f}) as delta_pmdec
            from 
                gaiadr1.tgas_source as gaia
            left outer join
                public.tycho2 as tycho2
                on gaia.tycho2_id = tycho2.id
            where 
                contains(point('ICRS', gaia.ra, gaia.dec),
                         circle('ICRS',{center_ra:f}, {center_dec:f}, {size:f}) )=1 
            ) as q
        ) as r
""".format(center_ra=center_ra, center_dec=center_dec, Rtidal=Rtidal, size=3 * Rtidal,
           gc_parallax=parallax, gc_pmra=mualpha, gc_pmdec=mudelta, 
           s_gcparallax=parallaxerr, s_gcmualpha=mualphaerr, s_gcmudelta=mudeltaerr, s_gcv=s_v
           ))
    gaia = GaiaArchive()
    return timeit(gaia.query)(adql)

In [46]:
# 1 milliarcsec = 1kpc
ra, dec, Rtidal, parallax, parallaxerr, mualpha, mualphaerr, mudelta, mudeltaerr, s_v = (
    6.02, -72.08, 0.360, 1. / 4.5, 0, 5.63,  0.21, -2.73, 0.29, 0.0
)
ngc104 = get_tgas_stars_full(ra, dec, Rtidal, parallax, parallaxerr, 
                             mualpha, mualphaerr, mudelta, mudeltaerr, 
                             s_v)


ADQL query

SELECT
        r.source_id, r.ra, r.dec, r.parallax, r.pmra, r.pmdec,
        r.G_mag, r.bt_mag, r.vt_mag, r.parallax_error,
        r.pmra_error, r.pmdec_error, r.pmra_pmdec_corr,
        r.parallax_pmra_corr, r.parallax_pmdec_corr,
        r.lnl_alpha_delta, r.lnl_alpha_delta2, r.lnl_mu,
        (- 0.5 * log(abs(r.det_pm_cov)) + r.lnl_mu + r.lnl_alpha_delta) AS lnl,
        (- 0.5 * log(abs(r.det_pm_cov)) + r.lnl_mu + r.lnl_alpha_delta2) AS lnl2
    FROM (
        SELECT *,
            q.a * (q.d * q.f - q.e * q.e) - q.b * (q.b * q.f - q.c * q.e) + q.c * q.e * (q.b - q.c) AS det_pm_cov,
            -0.5 * (delta_parallax*(delta_parallax*(-(q.a*q.d - power(q.b, 2))*(-q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) + q.c/q.a)*(q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) - q.c/q.a)/(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d) + 1.0/q.a + power(q.b, 2)/(power(q.a, 2)*(q.d - power(q.b, 2)/q.a))) + delta_pmalpha*((q.e - q.b*q.c/q.a)*(q.a*q.d - power(q.b, 2))*(-q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) + q.c/q.a)/((q.d - power(q.b, 2)/q.a)*(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d)) - q.b/(q.a*(q.d - power(q.b, 2)/q.a))) - delta_pmdec*(q.a*q.d - power(q.b, 2))*(-q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) + q.c/q.a)/(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d)) + delta_pmalpha*(delta_parallax*(-(q.e - q.b*q.c/q.a)*(q.a*q.d - power(q.b, 2))*(q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) - q.c/q.a)/((q.d - power(q.b, 2)/q.a)*(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d)) - q.b/(q.a*(q.d - power(q.b, 2)/q.a))) + delta_pmalpha*(1.0/(q.d - power(q.b, 2)/q.a) + power(q.e - q.b*q.c/q.a, 2)*(q.a*q.d - power(q.b, 2))/(power(q.d - power(q.b, 2)/q.a, 2)*(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d))) - delta_pmdec*(q.e - q.b*q.c/q.a)*(q.a*q.d - power(q.b, 2))/((q.d - power(q.b, 2)/q.a)*(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d))) + delta_pmdec*(delta_parallax*(q.a*q.d - power(q.b, 2))*(q.b*(q.e - q.b*q.c/q.a)/(q.a*(q.d - power(q.b, 2)/q.a)) - q.c/q.a)/(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d) - delta_pmalpha*(q.e - q.b*q.c/q.a)*(q.a*q.d - power(q.b, 2))/((q.d - power(q.b, 2)/q.a)*(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d)) + delta_pmdec*(q.a*q.d - power(q.b, 2))/(q.a*q.d*q.f - q.a*power(q.e, 2) - power(q.b, 2)*q.f + 2*q.b*q.c*q.e - power(q.c, 2)*q.d))) AS lnl_mu
        FROM (
            SELECT
                gaia.source_id, gaia.ra, gaia.dec, gaia.parallax, gaia.pmra, gaia.pmdec,
                gaia.phot_g_mean_mag AS G_mag, tycho2.bt_mag, tycho2.vt_mag, parallax_error,
                pmra_error, pmdec_error, pmra_pmdec_corr, parallax_pmra_corr, parallax_pmdec_corr,
                (-0.5 / power(0.360000,2) * (power(+6.020000 - gaia.ra , 2) + power(-72.080000-gaia.dec, 2))) AS lnl_alpha_delta,
                (-0.5 / power(0.360000,2) * distance(point('icrs', gaia.ra, gaia.dec), point('icrs', +6.020000, -72.080000))) AS lnl_alpha_delta2,
                power(0.000000, 2) + power(gaia.parallax_error, 2) AS a,
                gaia.parallax_pmra_corr * gaia.parallax_error * gaia.pmra_error AS b,
                gaia.parallax_pmdec_corr * gaia.parallax_error * gaia.pmdec_error AS c,
                power(0.210000,2) + power(0.000000, 2) + power(gaia.pmra_error, 2) AS d,
                gaia.pmra_pmdec_corr * gaia.pmra_error * gaia.pmdec_error AS e,
                power(0.290000,2) + power(0.000000, 2) + power(gaia.pmdec_error, 2) AS f,
                (gaia.parallax - 0.222222) AS delta_parallax,
                (gaia.pmra + (-1) * 5.630000) AS delta_pmalpha,
                (gaia.pmdec + (-1) * -2.730000) AS delta_pmdec
            FROM
                gaiadr1.tgas_source AS gaia
            LEFT OUTER JOIN
                public.tycho2 AS tycho2
                ON gaia.tycho2_id = tycho2.id
            WHERE
                contains(point('ICRS', gaia.ra, gaia.dec),
                         circle('ICRS',6.020000, -72.080000, 1.080000) )=1
            ) AS q
        ) AS r

Execution time: 450 ms


In [47]:
fields = ['source_id', 'ra', 'dec', 'parallax', 'lnl_alpha_delta', 'lnl_alpha_delta2', 'lnl_mu', 'lnl', 'lnl2']
members = result[result['lnl'] > -11]
members.sort('lnl')
members[fields][::-1]


Out[47]:
<Table masked=True length=8>
source_idradecparallaxlnl_alpha_deltalnl_alpha_delta2lnl_mulnllnl2
Angle[deg]Angle[deg]Angle[mas]
int64float64float64float64float64float64float64float64float64
46896444165011328006.2170101137593949-71.9364569423815540.37507341579041814-0.22923454596428225-0.60150036729073375-2.7988054087-3.02803995467-3.40030577599
46896384378994351365.5745638831857649-72.1034332008032950.58021017218719462-0.76760204113592645-0.53610627647358955-4.52626678339-5.29386882452-5.06237305986
46896450006166822406.0141150444826863-71.9298223895785470.7569972004848331-0.087144858693433566-0.57943138406236783-5.61636849493-5.70351335362-6.19579987899
46896203303174031365.9132796741373417-72.2777566738294720.27418891704465148-0.19481840276361445-0.77328553777998654-5.85211566785-6.04693407061-6.62540120563
46898328453018443526.0936958914206301-71.8912983836157761.0583841942748684-0.15833095848109333-0.73330595273316934-6.13607645112-6.2944074096-6.86938240385
46896235944924821766.2653565131119864-72.158812515007810.18046400711992883-0.25621616916793633-0.42062056968602829-6.19642464456-6.45264081373-6.61704521425
46895958318239703045.4133921110160106-72.4140626122376860.67526452113793323-1.8501966044470644-1.4731690619749811-4.82298628575-6.6731828902-6.29615534773
46900240228883594247.304409361221766-71.8154101929599020.1092560087723143-6.6347036002454525-1.8438481503016266-3.9692286786-10.6039322788-5.8130768289

In [48]:
fig_plot(data, ngc104, members=members, ms=4, alpha=0.5)


Some comment for the authors of Watkins et al.

  • The position likelihood is incorrect. It should be a sherical distance not a cartesian one. This gives more weight on RA than DEC variations.
  • differences in the number of members come from a bug in their paper table generation (private communication with L. Watkins). Bug apart they add more filtering later on which does not change their conclusions.
  • They should provide the exact conversion they use to convert $\sigma_v$, $\sigma_r$ into their covariance, I am puzzled by the statistical treatment (here I ignore these terms without affecting the results apparently).

In [ ]:
table1 = gaia
table2 = list of objects (ra, dec, size, name ...)

select ra, dec, source_id
from gaia 
where contain(point(gaia.ra, gaia.dec), circle(table2.ra, table2.dec, 1))
and table2.name = bla