Marvin Queries are a tool for remotely querying the MaNGA dataset on global and local galaxy properties and retrieving only the results you want. Let's learn the basics of constructing a query and test drive some of the more advanced features that are unique to the Marvin-tools version of querying.
In [1]:
# Python 2/3 compatibility
from __future__ import print_function, division, absolute_import
# import matplotlib just in case
import matplotlib.pyplot as plt
# this line tells the notebook to plot matplotlib static plots in the notebook itself
%matplotlib inline
# this line does the same thing but makes the plots interactive
#%matplotlib notebook
In [2]:
# Import the config and set to remote. Let's query MPL-5 data
from marvin import config
# by default the mode is set to 'auto', but let's set it explicitly to remote.
config.mode = 'remote'
# by default, Marvin uses the latest MPL but let's set it explicitly to MPL-5
config.setRelease('MPL-5')
# By default the API will query using the Utah server, at api.sdss.org/marvin2. See the config.sasurl attribute.
config.sasurl
# If you are using one of the two local ngrok Marvins, you need to switch the SAS URL to one of our ngrok ids.
# Uncomment the following lines and replace the ngrokid with the provided string
#ngrokid = 'ngrok_number_string'
#config.switchSasUrl('local', ngrokid=ngrokid)
#print(config.sasurl)
Out[2]:
In [2]:
# this is the Query tool
from marvin.tools.query import Query
The Marvin Query object lets you specify a search condition as a string. It will construct the necessary SQL syntax for you, send it to the database at Utah using the Marvin API, and return the results. The Query accepts the condition through the searchfilter keyword argument.
Let's try searching for all galaxies with a redshift < 0.1.
In [4]:
# the string search condition
my_search = 'z < 0.1'
The above string search condition is in a pseudo-natural language format. It is natural language in that you type what you mean to say, and pseudo because it must still follow the standard SQL WHERE condition syntax. This syntax generally takes the form parameter_name operator value.
Marvin is smart enough to figure out which database table a parameter_name belongs to if and only if that name is unique. If it is not, you must specify the database table name along with the parameter name, in the form table.parameter_name. Most MaNGA global properties come from the NASA-Sloan Atlas (NSA) catalog used for target selection, so the database table name is nsa and the full parameter_name for redshift is nsa.z.
If a parameter name is not unique, Marvin will return an error asking you to fine-tune it by using the full table.parameter_name form.
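The resolution rule described above can be sketched in plain Python. This is only an illustration of the idea, not Marvin's implementation: the `SCHEMA` dictionary, the `survey` table, and the `resolve` helper are all hypothetical, and nothing here talks to the Marvin API.

```python
# Hypothetical, heavily trimmed schema: table name -> column names.
# The real MaNGA database has many more tables and columns.
SCHEMA = {
    "nsa": ["z", "sersic_mass", "sersic_n"],
    "ifu": ["name"],
    "cube": ["ra", "dec", "plateifu"],
    "survey": ["name"],  # made-up table that also has a "name" column
}


def resolve(name):
    """Return the full table.parameter_name for a short or full name."""
    if "." in name:
        # Already fully qualified: just check that it exists.
        table, column = name.split(".", 1)
        if column not in SCHEMA.get(table, []):
            raise KeyError("unknown parameter: " + name)
        return name
    matches = [table + "." + name
               for table, columns in SCHEMA.items() if name in columns]
    if not matches:
        raise KeyError("unknown parameter: " + name)
    if len(matches) > 1:
        # Not unique: the caller has to say which table they mean.
        raise ValueError("ambiguous parameter {!r}; use one of {}".format(name, matches))
    return matches[0]


print(resolve("sersic_mass"))  # unique, so the table is inferred
print(resolve("nsa.z"))        # already fully qualified
```

In this toy schema, `resolve("name")` raises a `ValueError` because both `ifu` and the made-up `survey` table have a `name` column, which mirrors the error Marvin returns for non-unique names.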
In [5]:
# the search condition using the full parameter name
my_search = 'nsa.z < 0.1'
# Let's setup the query. This will not run it automatically.
q = Query(searchfilter=my_search)
print(q)
Running the query produces a Marvin Results object (r):
In [6]:
# To run the query
r = q.run()
If a query returns fewer than 1000 results, Marvin will return the entire set. For queries that return more than 1000, Marvin will paginate the results and return only the first 100 by default. (This can be modified with the limit keyword.)
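As a rough mental model of that behaviour, here is a plain-Python sketch. The `first_page` function and its `threshold` argument are hypothetical names used only for illustration. The text does not say what happens at exactly 1000 results, so the sketch arbitrarily treats that case as paginated.

```python
def first_page(rows, threshold=1000, limit=100):
    """Return every row for small result sets, else only the first `limit` rows."""
    if len(rows) < threshold:
        return rows
    return rows[:limit]


small = list(range(250))    # stand-in for a query with 250 results
large = list(range(5000))   # stand-in for a query with 5000 results

print(len(first_page(small)))  # all 250 rows come back
print(len(first_page(large)))  # only the first 100 rows come back
```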
In [7]:
# Print result counts
print('total', r.totalcount)
print('returned', r.count)
For informational and debugging purposes, it can be useful to see the raw SQL of your query and its runtime. If your query times out or crashes, the Marvin team will need both pieces of information to diagnose the problem.
In [8]:
# See the raw SQL
print(r.showQuery())
In [9]:
# See the runtime of your query. This produces a Python datetime.timedelta object showing days, seconds, microseconds
print('timedelta', r.query_runtime)
# See the total time in seconds
print('query time in seconds:', r.query_runtime.total_seconds())
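If the three-part `timedelta` representation is unfamiliar, this standalone snippet shows how it relates to `total_seconds()`. The runtime value is made up; it is not taken from a real query.

```python
from datetime import timedelta

# A made-up runtime of two and a half seconds
runtime = timedelta(seconds=2, microseconds=500000)

# The three stored components: days, seconds, microseconds
print(runtime.days, runtime.seconds, runtime.microseconds)

# total_seconds() folds all three components into a single float
print(runtime.total_seconds())
```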
Query results are stored in r.results. This is a Python list object and can be indexed and sliced like any other list. Since we have 100 results, let's only look at 10 for brevity.
In [10]:
# Show the results.
r.results[0:10]
Out[10]:
We will learn how to use the features of our Results object a little bit later, but first let's revise our search to see how more complex search queries work.
Let's add to our previous search to find only galaxies with M$_\star$ > 3 $\times$ 10$^{11}$ M$_\odot$.
Let's use the Sersic profile determination for stellar mass, which is the sersic_mass parameter of the nsa table, so its full search parameter designation will be nsa.sersic_mass. Since it's unique, you can also just use sersic_mass.
Adding multiple search criteria is as easy as writing them out the way you want them. In this case, we want to AND the two criteria. You can also combine criteria with OR and NOT.
In [11]:
# my new search
new_search = 'nsa.z < 0.1 and nsa.sersic_mass > 3e11'
In [12]:
config.setRelease('MPL-5')
q2 = Query(searchfilter=new_search)
r2 = q2.run()
In [13]:
print(r2.totalcount)
r2.results
Out[13]:
Let's say we are interested in galaxies with redshift < 0.1 and stellar mass > 3e11, or 19-fiber IFUs with an NSA Sersic index < 2. We can compound multiple criteria together using parentheses. Use parentheses to help set the order of precedence. Without parentheses, the order is NOT > AND > OR.
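Python's own boolean operators follow the same NOT > AND > OR order, so a few lines of plain Python are enough to see why the parentheses matter. This is only an analogy with made-up truth values; it does not call Marvin.

```python
a, b, c = True, False, False

# Without parentheses, "and" binds tighter than "or"
implicit = a or b and c
and_first = a or (b and c)    # what the line above actually means
or_first = (a or b) and c     # a different grouping, different answer

print(implicit, and_first, or_first)

x, y = False, True

# "not" binds tighter than both "and" and "or"
not_first = not x or y        # parsed as (not x) or y
not_whole = not (x or y)      # parentheses change the result

print(not_first, not_whole)
```

The first line prints `True True False` and the second prints `True False`: the unparenthesised forms agree with the NOT > AND > OR grouping, and moving the parentheses changes the answer.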
To find 19-fiber IFUs, we'll use the name parameter of the ifu table, which means the full search parameter is ifu.name. However, ifu.name returns the IFU design name, such as 1901, so we need to set the value to 19*, where the * acts as a wildcard.
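To get a feel for what a pattern like 19* selects, the sketch below applies a shell-style wildcard to a few example design names. It assumes Marvin's `*` behaves like a trailing glob; how the wildcard is translated into SQL is not covered here, and the list of names is just illustrative.

```python
from fnmatch import fnmatchcase

# Example IFU design names: fiber count followed by a two-digit index
ifu_names = ["1901", "1902", "3701", "6104", "12701", "12705"]

# Keep only the names that start with "19"
matches = [name for name in ifu_names if fnmatchcase(name, "19*")]
print(matches)  # ['1901', '1902']
```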
In [14]:
# new search
new_search = '(z<0.1 and nsa.sersic_logmass > 11.47) or (ifu.name=19* and nsa.sersic_n < 2)'
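The earlier search used nsa.sersic_mass > 3e11, while this one uses nsa.sersic_logmass > 11.47. Assuming sersic_logmass is the base-10 logarithm of sersic_mass (an assumption based on the parameter name), the two thresholds are nearly the same cut, as a quick check shows:

```python
import math

mass_limit = 3e11                        # stellar mass threshold in solar masses
log_mass_limit = math.log10(mass_limit)  # the same threshold in log10 units

print(round(log_mass_limit, 3))  # 11.477
```

The value 11.47 in the search string is therefore slightly below log10(3e11), so it is a marginally looser cut than 3e11 itself.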
In [15]:
q3 = Query(searchfilter=new_search)
r3 = q3.run()
In [16]:
r3.results[0:5]
Out[16]:
Often you want to run a query and return parameters that you didn't explicitly search on. For instance, you want to find galaxies below a redshift of 0.1 and would like to know their RA and DECs.
This is as easy as specifying the returnparams keyword option in Query with either a string (for a single parameter) or a list of strings (for multiple parameters).
In [3]:
my_search = 'nsa.z < 0.1'
q = Query(searchfilter=my_search, returnparams=['cube.ra', 'cube.dec'])
r = q.run()
r.results[0:5]
Out[3]:
So far we have seen queries on global galaxy properties. These queries returned a list of galaxies satisfying the search criteria. We can also perform queries on spaxel regions within galaxies.
Let's find all spaxels from galaxies with a redshift < 0.1 that have H-alpha emission line flux > 30.
DAP properties are in a table called spaxelprop. The DAP-derived H-alpha emission line Gaussian flux is called emline_gflux_ha_6564. Since this parameter is unique, you can specify either emline_gflux_ha_6564 or spaxelprop.emline_gflux_ha_6564.
In [11]:
spax_search = 'nsa.z < 0.1 and emline_gflux_ha_6564 > 30'
In [12]:
q4 = Query(searchfilter=spax_search, returnparams=['emline_sew_ha_6564', 'emline_gflux_hb_4862', 'stellar_vel'])
r4 = q4.run()
In [13]:
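# print both values; a notebook cell only displays its last bare expression
print('total', r4.totalcount)
print('query time in seconds:', r4.query_runtime.total_seconds())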
Out[13]:
Spaxel queries will return a list of all spaxels satisfying your criteria. By default spaxel queries will return the galaxy information, and spaxel x and y.
In [15]:
r4.results[0:5]
Out[15]:
In [16]:
# We have a large number of spaxel query results, but from how many actual galaxies?
plateifu = r4.getListOf('plateifu')
print('# unique galaxies', len(set(plateifu)))
print(set(plateifu))
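Beyond counting unique galaxies, it can be handy to see how many matching spaxels each galaxy contributes. A minimal sketch with `collections.Counter` follows. The plateifu values are made up and stand in for the list returned by `r4.getListOf('plateifu')`.

```python
from collections import Counter

# Made-up plateifu values, one entry per matching spaxel
plateifu = [
    "8485-1901", "8485-1901", "7443-12701",
    "8485-1901", "7443-12701", "8549-3703",
]

spaxels_per_galaxy = Counter(plateifu)

# Number of distinct galaxies, same as len(set(plateifu))
print(len(spaxels_per_galaxy))

# Galaxy contributing the most spaxels, with its count
print(spaxels_per_galaxy.most_common(1))
```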
Once you have a set of query Results, you can easily convert your results into Marvin objects in your workflow. Depending on your result parameters, you can convert to Marvin Cubes, Maps, Spaxels, ModelCubes, or RSS. Let's convert our Results to Marvin Cubes. Note: Depending on the number of results, this conversion step may take a long time. Be careful!
In [17]:
# Convert to Cubes. For brevity, let's convert only the first object.
r4.convertToTool('cube', limit=1)
In [18]:
print(r4.objects)
cube = r4.objects[0]
In [19]:
# From a cube, now we can do all things from Marvin Tools, like get a MaNGA MAPS object
maps = cube.getMaps()
print(maps)
# get an emission line sew map
em=maps.getMap('emline_sew', channel='ha_6564')
# plot it
em.plot()
# .. and a stellar velocity map
st=maps.getMap('stellar_vel')
# plot it
st.plot()
Out[19]:
Or, since our results are from a spaxel query, we can convert to Marvin Spaxels.
In [20]:
# let's convert to Marvin Spaxels. Again, for brevity, let's only convert the first two.
r4.convertToTool('spaxel', limit=2)
print(r4.objects)
In [21]:
# Now we can do all the Spaxel things, like plot
spaxel = r4.objects[0]
spaxel.spectrum.plot()
Out[21]:
You can also convert your query results into other formats, like an Astropy Table or a FITS file.
In [29]:
r4.toTable()
Out[29]:
In [34]:
r4.toFits('my_r4_results_2.fits')
In Queries you must specify a parameter_name or table.parameter_name. However, to make it a bit easier, we have created table shortcuts and parameter name shortcuts for a few parameters (more to be added).
There are many parameters to search with, and you can retrieve a list of the ones available to query. Please note that while many parameters in the list can technically be queried on, they have not been thoroughly tested, and some may not make sense to query on. We cannot guarantee what will happen. If you find a parameter that should be queryable and does not work, please let us know.
In [4]:
# retrieve the list
allparams = q.get_available_params()
allparams
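The full list is long, so filtering it is usually the first step. The sketch below assumes the list is a flat sequence of table.parameter_name strings, which may not match the exact return type of get_available_params. The `example_params` list is a made-up subset built from names used earlier in this tutorial.

```python
# Made-up subset standing in for the real parameter list
example_params = [
    "nsa.z", "nsa.sersic_mass", "cube.ra", "cube.dec",
    "spaxelprop.emline_gflux_ha_6564", "spaxelprop.stellar_vel",
]

# Find every parameter related to the H-alpha line
ha_params = [p for p in example_params if "ha_6564" in p]
print(ha_params)

# Group parameter names by their table
by_table = {}
for full_name in example_params:
    table, column = full_name.split(".", 1)
    by_table.setdefault(table, []).append(column)
print(sorted(by_table))
```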
Out[4]:
Now let's play around with the web. Go to https://sas.sdss.org/marvin2