In [17]:
import warnings
warnings.simplefilter('ignore')
This tutorial explores some basics of how to handle results of your Marvin Query. Much of this information can also be found in the Marvin Results documentation.
Our first step is to generate a query. Let's perform a simple metadata query to look for all galaxies with a redshift < 0.1. Let's also return the absolute magnitude g-r color and the Elliptical Petrosian half-light radius. This step assumes familiarity with Marvin Queries. To learn how to write queries, please see the Marvin Query documentation or the Marvin Query Tutorial.
In [1]:
# set up and run the query
from marvin.tools.query import Query
q = Query(search_filter='nsa.z < 0.1', return_params=['absmag_g_r', 'nsa.elpetro_th50_r'])
r = q.run()
In [2]:
# repr the results
r
Out[2]:
Our query runs and indicates a total count of 4275 results. By default, queries that return more than 1000 rows are automatically paginated into sets (or chunks) of 100 rows, indicated by count=100. The number of rows a query returns can be changed using the limit keyword argument to Query. The results are stored in the results attribute.
In [3]:
# look at the results
r.results
Out[3]:
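As a quick check on the pagination described above, the following sketch works out how many pages of 100 rows are needed to cover 4275 results. It is plain arithmetic on the two numbers quoted in the text and does not call Marvin; the variable names are illustrative only.

```python
import math

total_count = 4275   # total rows matched by the query (from the text above)
chunk = 100          # rows per page (from the text above)

# number of pages needed to step through every row
n_pages = math.ceil(total_count / chunk)

# rows on the final, partially filled page
last_page_rows = total_count - (n_pages - 1) * chunk

print(n_pages, last_page_rows)
```

With these numbers, stepping through the full result set takes 43 pages, the last of which holds 75 rows.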
A ResultSet contains a list of tuple rows with some default parameters like mangaid and plateifu, plus any parameters used in the Query search_filter or requested with the return_params keyword. The redshift, g-r color, and half-light radius have been returned. We can look at all the columns available using the columns attribute.
In [4]:
# look at the columns returned by your results
r.columns
Out[4]:
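To picture what "a list of tuple rows" with named columns looks like, here is a minimal sketch using the standard library's namedtuple. The Row class is a hypothetical stand-in, not Marvin's actual row type, and the values are made-up placeholders rather than real query output.

```python
from collections import namedtuple

# Hypothetical stand-in for one result row; the field names mirror the
# columns discussed above, but this is not Marvin's actual row class.
Row = namedtuple('Row', ['mangaid', 'plateifu', 'z', 'absmag_g_r', 'elpetro_th50_r'])

# Made-up placeholder values, for illustration only.
rows = [
    Row('id-a', 'plate-a', 0.02, 1.1, 5.0),
    Row('id-b', 'plate-b', 0.05, 1.6, 7.5),
]

columns = list(Row._fields)   # the column names, in order
first = rows[0]

print(columns)
print(first.z, first[2])      # attribute and positional access give the same value
```

Each row behaves like an ordinary tuple, while the field names play the role of the column names.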
In [5]:
# get the next set of results
n = r.getNext()
In [6]:
# look at page 2
r.results
Out[6]:
In [7]:
# get the previous set
p = r.getPrevious()
To extend your results and keep them, use the extendSet method. By default, extending a set grabs the next page of 100 results (defined by r.chunk) and appends it to the existing set of results. Rerunning extendSet continues to append results until you've retrieved them all. To avoid running extendSet multiple times, you can use the loop method, which loops over all pages, appending the data until you've retrieved all the results.
In [8]:
# extend the set by one page
r.extendSet()
r
Out[8]:
We now have 200 results out of the 4275. For results with a small number of total counts, you can attempt to retrieve all of the results with the getAll method. Currently this method is limited to returning results containing 500,000 rows or rows with 25 columns.
There are several options for getting all of the results:
- Use the getAll method to attempt to retrieve all the results in one request.
- Use the loop method to loop over all the pages to extend/append the results together.
- Rerun the Query using a new limit to retrieve all the results.
Note: A bug was recently found in getAll, so it might not work. Instead, we will rerun the query using a large limit to return all the results.
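The page-by-page approach (extend once, or loop until done) can be pictured with a toy model. ToyResults below is a hypothetical stand-in written for this sketch; it is not Marvin code, and its extend and loop methods only mimic the behavior described in the text for extendSet and loop.

```python
class ToyResults:
    """Hypothetical stand-in for a paginated result object (not Marvin code)."""

    def __init__(self, all_rows, chunk=100):
        self._all_rows = all_rows          # pretend these live on a remote server
        self.chunk = chunk
        self.totalcount = len(all_rows)
        self.results = all_rows[:chunk]    # start with the first page only

    def extend(self):
        """Append the next page to the rows already held."""
        start = len(self.results)
        self.results = self.results + self._all_rows[start:start + self.chunk]

    def loop(self):
        """Keep extending until every row has been retrieved."""
        while len(self.results) < self.totalcount:
            self.extend()


toy = ToyResults(list(range(4275)), chunk=100)
toy.extend()
after_one_extend = len(toy.results)
toy.loop()
after_loop = len(toy.results)
print(after_one_extend, after_loop)
```

In this toy model, one extend call brings the held rows from 100 to 200, and loop continues until all 4275 are in hand.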
In [11]:
# get all the results
# r.getAll()
# rerun the query
q = Query(search_filter='nsa.z < 0.1', return_params=['absmag_g_r', 'nsa.elpetro_th50_r'], limit=5000)
r = q.run()
r
Out[11]:
We now have all the results. We can extract columns of data by indexing the results list using the column name. Let's extract the redshift and color.
In [33]:
# extract individual columns of data
redshift = r.results['nsa.z']
color = r.results['absmag_g_r']
You can convert the results to a variety of formats using the toXXX methods. Common formats are FITS, Astropy Table, Pandas Dataframe, JSON, or CSV. Only the FITS and CSV conversions will write the output to a file. Astropy Tables and Pandas Dataframes have more options for writing out your dataset to a file. Let's convert to Pandas Dataframe.
In [36]:
# convert the marvin results to a Pandas dataframe
df = r.toDF()
df.head()
Out[36]:
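For a sense of what a row-to-CSV conversion amounts to, here is a standard-library sketch that writes a header line followed by one line per row. It is not how Marvin implements its converters; the column names and rows are made-up placeholders, and an in-memory buffer stands in for an output file.

```python
import csv
import io

# Hypothetical column names and made-up rows, for illustration only.
columns = ['mangaid', 'z', 'absmag_g_r']
rows = [('id-a', 0.02, 1.1), ('id-b', 0.05, 1.6)]

buffer = io.StringIO()                       # in-memory stand-in for an output file
writer = csv.writer(buffer, lineterminator='\n')
writer.writerow(columns)                     # header line
writer.writerows(rows)                       # one line per result row

csv_text = buffer.getvalue()
print(csv_text)
```

Replacing the buffer with an open file handle would write the same text to disk.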
You can also convert the data into Marvin objects using the convertToTool method. This will attempt to convert each result row into its corresponding Marvin object. The default conversion is to a Cube object. Converted objects are stored in the r.objects attribute. Let's convert our results to cubes. Depending on the number of results, this may take a while, so let's limit our conversion to 5. Once converted, we have Marvin Tools at our disposal.
In [37]:
# convert the top 5 to cubes
r.convertToTool('cube', limit=5)
In [38]:
# look at the objects
r.objects
Out[38]:
You can quickly plot the full set of results using the plot method. plot accepts two string column names and will attempt to create a scatter plot, a hex-binned plot, or a scatter-density plot, depending on the total number of results. The plot method returns the matplotlib Figure and Axes objects, as well as a dictionary of histogram information for each column. The Results.plot method uses the underlying plot utility function, which offers more custom plotting options. Let's plot g-r color versus redshift. Regardless of the number of results you currently have loaded, the plot method will automatically retrieve all the results before plotting.
In [18]:
# make a scatter plot
fig, ax, histdata = r.plot('z', 'absmag_g_r')
By default, it will also plot histograms of each column. This can be turned off by setting with_hist=False.
In [29]:
# make only a scatter plot
fig, ax = r.plot('z', 'absmag_g_r', with_hist=False)
We can also quickly plot a histogram of a single column of data using the hist method, which uses an underlying hist utility function.
In [39]:
histdata, fig, ax = r.hist('absmag_g_r')
In [40]:
# download the DRP datacube files from the results
# r.download()