In [ ]:
from marvin import config
config.mode = 'local'  # or 'remote'
config.switchSasUrl('local')
config.setRelease('MPL-4')
In [ ]:
from marvin.tools.query import Query, Results, doQuery
# make a query
myquery = 'nsa.sersic_logmass > 10.3 AND nsa.z < 0.1'
q = Query(searchfilter=myquery)
# run a query
r = q.run()
Let's look at the Marvin Results object. We can see how many results were returned with r.count and r.totalcount.
In [ ]:
print(r)
print('Total count', r.totalcount)
print('Page count', r.count)
Queries returning more than 1000 results are paginated into chunks of 100. For anything less than 1000, the query returns everything at once. r.totalcount is the total number of results matching the query, and r.count is the number returned in the current page.
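The pagination rule described above can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic, not Marvin's code: the function names are hypothetical, the totals are made-up inputs, and the treatment of a total of exactly 1000 (returned in full) is an assumption, since the text does not say which way that case goes.

```python
def first_page_size(total, threshold=1000, chunk=100):
    """Number of rows on the first page under the rule described above."""
    # Assumption: a total of exactly `threshold` is still returned in full.
    return chunk if total > threshold else total


def number_of_pages(total, threshold=1000, chunk=100):
    """Number of pages needed to see every row."""
    if total <= threshold:
        return 1
    return -(-total // chunk)  # ceiling division


# Made-up totals, for illustration only
print(first_page_size(250), number_of_pages(250))
print(first_page_size(1282), number_of_pages(1282))
```

In these terms, the first value corresponds to what r.count reports on the first page, while r.totalcount corresponds to the `total` argument.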
The results from your query are stored in the .results attribute, as a list of NamedTuples. These are like regular tuples, except that each field has a name (like dictionary key names).
In [ ]:
r.results
You can access specific values of the results through tuple indexing or via the named attribute, but this is not recommended in general.
In [ ]:
res = r.results[0]
print('single row', res)
print('mangaid', res[0])
print('mangaid', res.mangaid)
# what are the names
print('names', res.keys())
print(res.sersic_logmass)
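If named tuples are new to you, here is a minimal standalone sketch using the standard library's collections.namedtuple. The Row type and its values are hypothetical placeholders rather than real query output, and the row type Marvin returns may offer additional methods beyond what is shown here.

```python
from collections import namedtuple

# Hypothetical row type and placeholder values, for illustration only
Row = namedtuple('Row', ['mangaid', 'plateifu', 'z'])
row = Row(mangaid='id-a', plateifu='plate-ifu-a', z=0.05)

# Index access and attribute access return the same value
by_index = row[0]
by_name = row.mangaid
print(by_index == by_name)

# A standard-library namedtuple lists its field names via _fields
print(Row._fields)

# _asdict() gives a dictionary view of one row
print(row._asdict()['z'])
```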
But be careful: names using the full table.parameter syntax cannot be accessed via the named attribute. This syntax is returned when two parameters with non-unique names are returned, like ifu.name and bintype.name. Instead we recommend using the Marvin Results getListOf and getDictOf methods.
In [ ]:
# if you want to retrieve a list of a single parameter, use getListOf
mangaid = r.getListOf('mangaid')
print(mangaid)
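The reason dotted names cannot be used as attributes is that a name like ifu.name is not a valid Python identifier. The sketch below shows this with a standard-library namedtuple, and then pulls a single column out of dictionary rows by key, which is roughly the idea behind asking for a list of one parameter. The rows and the list_of helper are hypothetical stand-ins, not Marvin's getListOf.

```python
from collections import namedtuple

# A dotted name cannot be a namedtuple field (or any attribute name)
try:
    namedtuple('Row', ['mangaid', 'ifu.name'])
    dotted_allowed = True
except ValueError:
    dotted_allowed = False
print('dotted field allowed:', dotted_allowed)

# Hypothetical rows keyed by full table.parameter names
rows = [
    {'cube.mangaid': 'id-a', 'ifu.name': 'ifu-1', 'bintype.name': 'bin-x'},
    {'cube.mangaid': 'id-b', 'ifu.name': 'ifu-2', 'bintype.name': 'bin-y'},
]


def list_of(rows, name):
    """Collect one parameter from every row (hypothetical helper)."""
    return [row[name] for row in rows]


print(list_of(rows, 'ifu.name'))
print(list_of(rows, 'bintype.name'))
```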
To see what columns are available, use r.columns and r.coltoparam.
In [ ]:
# these are the column names in the results
print('columns', r.columns)
# this is a mapping between the column and full parameter name, see also r.paramtocol for the inverse
print('full parameter names', r.coltoparam)
print('parameter keys', r.coltoparam.keys())
print('parameter values', r.coltoparam.values())
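The relationship between a column-to-parameter mapping and its inverse can be sketched with two ordinary dictionaries. The mapping below is a hypothetical example with made-up contents; it is not the actual content of r.coltoparam for this query.

```python
# Hypothetical mapping from short column names to full parameter names
col_to_param = {
    'mangaid': 'cube.mangaid',
    'z': 'nsa.z',
    'sersic_logmass': 'nsa.sersic_logmass',
}

# Inverting it gives the full-name -> column-name direction
param_to_col = {param: col for col, param in col_to_param.items()}

print(col_to_param['z'])
print(param_to_col['nsa.z'])
```

The inversion only works cleanly when every full parameter name maps to a distinct column name, which is the case in this toy example.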
If you want to retrieve the results as a list of dictionaries or a dictionary of lists, use getDictOf.
In [ ]:
# by default, getDictOf returns a list of dictionaries, that you can iterate over
mylist = r.getDictOf()
print(mylist)
print('mangaid', mylist[0]['cube.mangaid'], mylist[1]['cube.mangaid'])
You can change the format returned using the format_type keyword. format_type='dictlist' returns a dictionary of lists, whereas by default getDictOf returns a list of dictionaries.
In [ ]:
mydict = r.getDictOf(format_type='dictlist')
print(mydict)
print('keys', mydict.keys())
print('mangaid', mydict['cube.mangaid'])
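To make the difference between the two layouts concrete, here is a small standalone sketch that converts between a list of dictionaries and a dictionary of lists. The helper names and the sample rows are hypothetical, and this is not how Marvin builds its output; it only shows that both layouts carry the same information.

```python
def to_dictlist(rows):
    """List of dictionaries -> dictionary of lists."""
    keys = list(rows[0]) if rows else []
    return {key: [row[key] for row in rows] for key in keys}


def to_listdict(columns):
    """Dictionary of lists -> list of dictionaries."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


# Made-up rows, for illustration only
rows = [
    {'cube.mangaid': 'id-a', 'nsa.z': 0.03},
    {'cube.mangaid': 'id-b', 'nsa.z': 0.07},
]

columns = to_dictlist(rows)
print(columns['cube.mangaid'])
print(to_listdict(columns) == rows)
```

The list-of-dictionaries layout is convenient when you loop over rows, while the dictionary-of-lists layout is convenient when you want a whole column at once.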
In [ ]:
# get the next set of results
r.getNext()
In [ ]:
# get only the next 10 results
r.getNext(chunk=10)
In [ ]:
# get the previous 20 results
r.getPrevious(chunk=20)
In [ ]:
# get a subset of results giving the starting index and number limit
# total results
print('total', r.totalcount)
# let's get a subset of 10 rows starting at 300
r.getSubset(300, limit=10)
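The paging calls above move a window over the full result set. The toy class below mimics that idea over an in-memory list so you can see how the window shifts. It is a hypothetical sketch under assumed semantics (next starts where the current page ends, previous ends where the current page starts); Marvin's own behaviour may differ in its details.

```python
class Pager:
    """Toy paging cursor over a list (not Marvin's implementation)."""

    def __init__(self, items, chunk=100):
        self.items = list(items)
        self.chunk = chunk
        self.start = 0
        self.end = min(chunk, len(self.items))

    def page(self):
        return self.items[self.start:self.end]

    def get_next(self, chunk=None):
        # Assumed: the next page starts where the current one ends
        size = chunk or self.chunk
        self.start = min(self.end, len(self.items))
        self.end = min(self.start + size, len(self.items))
        return self.page()

    def get_previous(self, chunk=None):
        # Assumed: the previous page ends where the current one starts
        size = chunk or self.chunk
        self.end = self.start
        self.start = max(self.end - size, 0)
        return self.page()

    def get_subset(self, start, limit=10):
        # Jump straight to an arbitrary window
        self.start = min(start, len(self.items))
        self.end = min(self.start + limit, len(self.items))
        return self.page()


pager = Pager(range(1000))
next_page = pager.get_next()            # rows 100..199
next_ten = pager.get_next(chunk=10)     # rows 200..209
prev_twenty = pager.get_previous(chunk=20)  # rows 180..199
subset = pager.get_subset(300, limit=10)    # rows 300..309
print(next_page[0], next_ten[0], prev_twenty[0], subset[0])
```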
In [ ]:
# let's sort by redshift. Default is in ascending order
r.sort('z')
# or in descending order
r.sort('nsa.z', order='desc')
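For comparison, the same ascending and descending ordering on plain Python rows looks like this. The rows are made-up placeholders, and the sketch uses the built-in sorted function rather than anything from Marvin.

```python
from operator import itemgetter

# Made-up rows, for illustration only
rows = [
    {'mangaid': 'id-a', 'z': 0.07},
    {'mangaid': 'id-b', 'z': 0.03},
    {'mangaid': 'id-c', 'z': 0.05},
]

# Ascending order is the default
ascending = sorted(rows, key=itemgetter('z'))

# reverse=True gives descending order
descending = sorted(rows, key=itemgetter('z'), reverse=True)

print([row['mangaid'] for row in ascending])
print([row['mangaid'] for row in descending])
```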
Once you have a set of results, you may want to work with them using Marvin Tools. You can easily convert to Marvin Tools using the method r.convertToTool. This method lets you convert to Marvin Cubes, Spaxels, Maps, RSS, or ModelCube objects. Note: You must have the necessary parameters to initialize a particular Marvin object.
In [ ]:
# See some results
r.results[0:3]
# Let's convert our results to Marvin Cube objects
r.columns
r.convertToTool('cube')
# Your new objects are stored as a list in your results called objects
r.objects
In [ ]:
# We strongly recommend saving to a Marvin pickle file (.mpf), so that you can restore the Results object later
r.save('results.mpf')
restored = Results.restore('results.mpf')
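Since the .mpf file is described as a Marvin pickle file, the save and restore round trip can be pictured with the standard pickle module. The sketch below pickles a hypothetical dictionary standing in for a results object; what Marvin actually stores in the file is not shown in this notebook, so treat the contents here as an assumption.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the state of a results object
state = {
    'searchfilter': 'nsa.sersic_logmass > 10.3 AND nsa.z < 0.1',
    'rows': [('id-a', 0.03), ('id-b', 0.07)],
}

# Write to a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'example.mpf')
    with open(path, 'wb') as handle:
        pickle.dump(state, handle)
    with open(path, 'rb') as handle:
        restored_state = pickle.load(handle)

print(restored_state == state)
```

As with any pickle file, only load files you created yourself or that come from a source you trust.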
In [ ]:
# Saving to CSV, JSON, xlsx, txt, or FITS
df = r.toDataFrame()
df.to_csv('results.csv')
df.to_json('results.json')
df.to_excel('results.xlsx')
table = r.toTable()
table.write('results.txt', format='ascii')
r.toFits('results.fits')
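If you only need CSV or JSON and would rather not go through pandas, the standard library is enough once you have rows as dictionaries (for example, a list of dictionaries like the one getDictOf returns). The rows below are made-up placeholders, and the sketch writes to in-memory strings instead of files.

```python
import csv
import io
import json

# Made-up rows, for illustration only
rows = [
    {'mangaid': 'id-a', 'z': 0.03},
    {'mangaid': 'id-b', 'z': 0.07},
]

# CSV: one header line, then one line per row
csv_buffer = io.StringIO()
writer = csv.DictWriter(csv_buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = csv_buffer.getvalue()

# JSON: the list of dictionaries serializes directly
json_text = json.dumps(rows)

print(csv_text)
print(json_text)
```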
In [ ]:
%matplotlib inline
df = r.toDataFrame()
df.plot.scatter('nsa.sersic_logmass', 'nsa.z')