SDSS Data Release 12 (DR12) is described at the SDSS-III website and in the survey paper by Alam et al. (2015).
We will use the SDSS DR12 SQL query interface. For help designing queries, the sample queries page is invaluable, and you will probably want to check out the links to the "schema browser" at some point as well. Notice the "check syntax only" button on the SQL query interface: this is very useful for debugging SQL queries.
Small test queries can be executed directly in the browser. Larger ones (those returning more than a few tens of thousands of objects, or requiring a lot of processing) should be submitted via the CasJobs system. Try the browser first, and move to CasJobs when you need to.
In [10]:
%load_ext autoreload
%autoreload 2
In [11]:
import numpy as np
import SDSS
import pandas as pd
import matplotlib
%matplotlib inline
In [12]:
objects = "SELECT top 10000 \
ra, \
dec, \
type, \
dered_u as u, \
dered_g as g, \
dered_r as r, \
dered_i as i, \
petroR50_i AS size \
FROM PhotoObjAll \
WHERE \
((type = '3' OR type = '6') AND \
ra > 185.0 AND ra < 185.2 AND \
dec > 15.0 AND dec < 15.2)"
print(objects)
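If you want to vary the sky patch or the row limit, one option is to build the same query from a small function. This is only a sketch: `box_query` and its parameters are hypothetical names introduced here for illustration, not part of the `SDSS` module used in this notebook, and the column list is simply copied from the query above.

```python
def box_query(ra_min, ra_max, dec_min, dec_max, limit=10000):
    """Build the query above for an arbitrary RA/Dec box (hypothetical helper)."""
    # Casting to int/float keeps stray strings out of the SQL text.
    return f"""
SELECT TOP {int(limit)}
    ra, dec, type,
    dered_u AS u, dered_g AS g, dered_r AS r, dered_i AS i,
    petroR50_i AS size
FROM PhotoObjAll
WHERE (type = '3' OR type = '6')
  AND ra > {float(ra_min)} AND ra < {float(ra_max)}
  AND dec > {float(dec_min)} AND dec < {float(dec_max)}
""".strip()


# Same patch of sky as the hand-written query.
query = box_query(185.0, 185.2, 15.0, 15.2)
print(query)
```

A triple-quoted string avoids the trailing backslashes, and the printed text can be pasted straight into the browser interface to try the "check syntax only" button.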
In [13]:
# Download data. This can take a while...
sdssdata = SDSS.select(objects)
sdssdata
Out[13]:
Notice:
- Some values are large and negative, indicating a problem with the automated measurement routine. We will need to deal with these.
- Sizes are "effective radii" in arcseconds. The typical resolution ("point spread function" effective radius) in an SDSS image is around 0.7".
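Both points can be checked quickly with pandas. The sketch below uses a made-up toy table rather than the real download, and the flag value in it is an assumption chosen for illustration; the only thing it relies on is that bad measurements show up as non-positive numbers. The column name `resolved` and the constant `PSF_RADIUS` are likewise introduced here, not taken from the catalog.

```python
import pandas as pd

# Toy stand-in for the downloaded table; all values are invented.
toy = pd.DataFrame({
    "g":    [18.2, -9999.0, 20.1, 19.5],
    "r":    [17.6, 17.9, -9999.0, 19.0],
    "size": [1.3, 0.6, 2.4, -9999.0],
})

# Flag failed measurements: here, anything that is not positive.
bad = toy <= 0
print(bad.sum())  # number of bad entries per column

# Keep only rows where every measurement is usable.
clean = toy[~bad.any(axis=1)]

# Compare sizes with the typical PSF radius quoted above (arcseconds).
PSF_RADIUS = 0.7
clean = clean.assign(resolved=clean["size"] > PSF_RADIUS)
print(clean)
```

Counting the bad entries per column before dropping rows shows how much of the sample each cut costs.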
Let's save this download for further use.
In [14]:
!mkdir -p downloads
sdssdata.to_csv("downloads/SDSSobjects.csv")
Visualizing data in more than two dimensions is, in general, difficult.
Looking at all possible 1- and 2-dimensional histograms and scatter plots helps a lot.
Color coding can bring in a 3rd dimension (and even a 4th). Interactive plots and movies are also well worth thinking about.
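As a minimal sketch of how color coding works in matplotlib, the snippet below maps a few invented values onto a color map through a fixed range. The range limits and the choice of the `jet` color map are illustrative assumptions; values outside the range simply take the color at the nearest end of the map.

```python
import numpy as np
from matplotlib import cm, colors

# Invented values; two of them fall outside the chosen range.
values = np.array([-2.0, -1.0, 1.0, 3.0, 5.0])

# Map the range [vmin, vmax] linearly onto [0, 1] ...
norm = colors.Normalize(vmin=-1.0, vmax=3.0)

# ... and then onto colors. Each value becomes one RGBA row.
mapper = cm.ScalarMappable(norm=norm, cmap="jet")
rgba = mapper.to_rgba(values)
print(rgba)
```

Choosing a narrow range spends the whole color map on the values of interest, at the cost of saturating everything outside it.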
Here we'll follow a multi-dimensional visualization example due to Josh Bloom at UC Berkeley:
In [15]:
# We'll use astronomical g-r color as the colorizer, and then plot
# position, magnitude, size and color against each other.
data = pd.read_csv("downloads/SDSSobjects.csv",usecols=["ra","dec","u","g",\
"r","i","size"])
# Filter out objects with bad magnitude or size measurements:
data = data[(data["u"] > 0) & (data["g"] > 0) & (data["r"] > 0) & (data["i"] > 0) & (data["size"] > 0)]
# Log size, and g-r color, will be more useful:
data['log_size'] = np.log10(data['size'])
data['g-r_color'] = data['g'] - data['r']
# Drop the things we're not so interested in:
del data['u'], data['g'], data['r'], data['size']
data.head()
Out[15]:
In [7]:
# Get ready to plot:
pd.set_option('display.max_columns', None)
# !pip install --upgrade seaborn
import seaborn as sns
sns.set()
In [8]:
def plot_everything(data,colorizer,vmin=0.0,vmax=10.0):
    # Truncate the color map to retain contrast between faint objects.
    norm = matplotlib.colors.Normalize(vmin=vmin, vmax=vmax)
    cmap = matplotlib.cm.jet
    m = matplotlib.cm.ScalarMappable(norm=norm, cmap=cmap)
    # scatter_matrix lives in pd.plotting in current pandas releases.
    plot = pd.plotting.scatter_matrix(data, alpha=0.2,figsize=[15,15],color=m.to_rgba(data[colorizer]))
    return

plot_everything(data,'g-r_color',vmin=-1.0, vmax=3.0)
In [9]:
zoom = data.copy()
del zoom['ra'],zoom['dec'],zoom['g-r_color']
plot_everything(zoom,'i',vmin=15.0, vmax=21.5)