A First Look at the SDSS Photometric "Galaxy" Catalog

  • The Sloan Digital Sky Survey imaged over 10,000 sq degrees of sky (about 25% of the total), automatically detecting, measuring and cataloging millions of "objects".
  • While the primary data products of the SDSS were (and still are) its spectroscopic surveys, the photometric survey provides an important testing ground for dealing with pure imaging surveys like the one being carried out by DES and the one planned with LSST.
  • Let's download part of the SDSS photometric object catalog and explore it.

SDSS Data Release 12 (DR12) is described at the SDSS-III website and in the survey paper by Alam et al. (2015).

We will use the SDSS DR12 SQL query interface. For help designing queries, the sample queries page is invaluable, and you will probably want to check out the links to the "schema browser" at some point as well. Notice the "check syntax only" button on the SQL query interface: this is very useful for debugging SQL queries.

Small test queries can be executed directly in the browser. Larger ones (those involving more than a few tens of thousands of objects, or a lot of processing) should be submitted via the CasJobs system. Try the browser first, and move to CasJobs when you need to.


In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from __future__ import print_function
import numpy as np
import SDSS
import pandas as pd
import matplotlib
%matplotlib inline

In [3]:
objects = "SELECT top 10000 \
ra, \
dec, \
type, \
dered_u as u, \
dered_g as g, \
dered_r as r, \
dered_i as i, \
petroR50_i AS size \
FROM PhotoObjAll \
WHERE \
((type = '3' OR type = '6') AND \
 ra > 185.0 AND ra < 185.2 AND \
 dec > 15.0 AND dec < 15.2)"
print(objects)


SELECT top 10000 ra, dec, type, dered_u as u, dered_g as g, dered_r as r, dered_i as i, petroR50_i AS size FROM PhotoObjAll WHERE ((type = '3' OR type = '6') AND  ra > 185.0 AND ra < 185.2 AND  dec > 15.0 AND dec < 15.2)
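The backslash continuations in the query string are easy to get wrong, since one missing backslash breaks the string. Here is a minimal sketch of an alternative that uses only the Python standard library: write the query as a triple-quoted string and collapse the whitespace before sending it. The names `objects_multiline` and `collapse_whitespace` are made up for this illustration, and the WHERE clause is regrouped slightly but should select the same objects.

```python
# Triple-quoted strings can span lines without trailing backslashes.
objects_multiline = """
SELECT TOP 10000
    ra, dec, type,
    dered_u AS u, dered_g AS g, dered_r AS r, dered_i AS i,
    petroR50_i AS size
FROM PhotoObjAll
WHERE (type = '3' OR type = '6')
  AND ra > 185.0 AND ra < 185.2
  AND dec > 15.0 AND dec < 15.2
"""


def collapse_whitespace(sql):
    """Join a multi-line query into one line (hypothetical helper)."""
    # str.split() with no arguments splits on any run of whitespace,
    # including newlines, and drops leading/trailing whitespace.
    return " ".join(sql.split())


objects_oneline = collapse_whitespace(objects_multiline)
print(objects_oneline)
```

The one-line string can then be passed along in the same way as `objects`.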

In [4]:
# Download data. This can take a while...
sdssdata = SDSS.select(objects)
sdssdata


Out[4]:
ra dec type u g r i size
0 185.174133 15.004124 3 22.33496 22.19454 21.89461 21.10469 0.709057
1 185.189163 15.197551 6 26.11512 24.50371 26.04464 22.25306 0.237942
2 185.184024 15.105112 3 22.76294 22.10217 21.36093 20.89776 1.442636
3 185.184024 15.105112 3 22.76275 22.10201 21.36074 20.89749 1.441834
4 185.174176 15.086321 3 21.97472 22.25791 21.50554 21.03117 0.681753
5 185.174167 15.004113 3 23.30358 22.75329 22.23895 21.51213 0.929715
6 185.173507 15.100482 3 23.71084 21.22972 20.45069 20.15299 1.236944
7 185.173507 15.100482 3 23.71184 21.22768 20.44868 20.15103 1.246297
8 185.160872 15.183245 3 23.64891 22.01288 20.66436 19.77663 0.968503
9 185.160481 15.051130 3 22.65050 21.69220 20.84294 20.68277 0.758833
10 185.160278 15.015190 3 22.29378 21.86032 21.21288 21.14824 1.048619
11 185.161335 15.115801 3 21.06481 30.03448 18.55439 19.92333 2.449451
12 185.160320 15.061328 3 26.74979 24.20751 24.21772 24.12367 4.501909
13 185.149363 15.150214 3 22.84525 22.76656 22.59401 24.48730 -9999.000000
14 185.183629 15.045231 3 25.79914 24.94180 22.71320 21.33978 1.082858
15 185.189804 15.074775 6 24.70587 23.79190 22.71208 21.63833 1.070796
16 185.189732 15.173083 3 23.47996 23.34071 21.85003 21.74128 1.304138
17 185.173431 15.053419 6 24.64098 25.99376 22.49798 25.44639 -9999.000000
18 185.173215 15.041334 3 24.06297 23.92913 22.13200 21.15222 1.117936
19 185.160857 15.183192 3 24.20175 23.12813 20.92296 19.97423 0.961196
20 185.130090 15.053850 3 21.38045 25.63755 24.65242 25.89001 -9999.000000
21 185.129952 15.099690 6 21.83577 25.91430 25.32569 24.18635 -9999.000000
22 185.189930 15.147503 3 24.78864 22.77318 22.18535 24.98346 -9999.000000
23 185.150040 15.139869 3 22.98555 23.96383 21.72278 22.41098 0.300402
24 185.173911 15.132231 6 25.03276 24.76488 22.28400 22.83933 0.355917
25 185.161093 15.145196 3 25.64366 22.19277 24.04657 24.04233 0.585439
26 185.174231 15.130506 6 25.05994 23.66190 22.36292 21.43403 0.829801
27 185.160761 15.152551 6 25.23600 23.71532 21.92325 21.20704 0.580546
28 185.173358 15.151139 6 23.57170 22.67396 21.30098 21.68457 0.605273
29 185.129822 15.035232 3 22.51110 23.34988 22.55728 21.36899 0.892731
... ... ... ... ... ... ... ... ...
2447 185.073593 15.100585 6 16.31131 15.32675 14.89447 14.72004 0.616982
2448 185.112753 15.058770 3 26.34808 22.42831 25.11665 21.96970 0.856860
2449 185.112670 15.104398 3 23.39239 22.70518 21.37488 21.09438 1.079672
2450 185.112659 15.014775 3 24.44075 23.70547 22.76173 21.60184 0.775546
2451 185.113416 15.017082 3 23.56576 23.11698 22.18479 21.83790 0.912609
2452 185.113416 15.017083 3 23.56593 23.63780 22.18460 21.55571 0.942204
2453 185.040560 15.186559 6 25.54833 21.76430 20.30897 19.17608 0.740849
2454 185.003163 15.050728 6 23.10736 23.06618 22.56425 22.34448 1.218857
2455 185.003012 15.173349 3 21.37295 20.46574 19.55855 19.18560 1.508568
2456 185.073604 15.100571 6 16.29427 15.31606 14.91252 14.70557 0.728097
2457 185.073611 15.100571 6 16.29181 15.31575 14.91162 14.70550 0.728044
2458 185.112903 15.028303 3 26.32390 23.79241 21.83684 20.67531 0.818312
2459 185.112652 15.104430 3 23.77350 23.61073 21.66300 20.93991 1.850223
2460 185.113340 15.078588 3 23.07026 21.40395 20.72423 20.88052 0.539137
2461 185.113289 15.073502 3 25.53489 25.16925 23.20912 21.47431 0.700088
2462 185.113380 15.182983 3 20.92631 20.15413 19.22689 18.91954 1.521851
2463 185.113380 15.182991 3 19.57985 18.62855 18.18661 17.89426 3.695539
2464 185.073237 15.005219 6 23.27776 23.47816 22.27221 22.32788 2.337857
2465 185.072705 15.050040 6 21.42534 20.95000 21.02919 21.20917 0.524798
2466 185.040358 15.056847 3 23.04855 21.80625 26.61385 24.43346 -9999.000000
2467 185.003017 15.173347 3 21.61085 20.52976 19.62304 19.33310 1.170892
2468 185.002573 15.199076 6 20.89789 18.41478 17.05010 16.41570 0.633420
2469 185.072852 15.162776 3 22.13252 22.49815 22.21279 21.80203 1.201037
2470 185.073128 15.108100 6 22.35243 21.82195 21.40913 20.97308 0.687657
2471 185.072817 15.080475 3 24.05563 22.03536 20.35034 19.78883 0.902125
2472 185.002557 15.199069 6 20.94635 18.49170 17.06518 16.39022 0.682359
2473 185.002557 15.199069 6 20.94669 18.49215 17.06533 16.39035 0.682359
2474 185.002567 15.199066 6 20.90245 18.49307 17.06329 16.39084 0.681720
2475 185.002573 15.199076 6 20.89792 18.41478 17.05010 16.41569 0.633584
2476 185.002590 15.199080 6 20.90619 18.42412 17.05487 16.41843 0.631534

2477 rows × 8 columns

Notice:

  • Some values are large and negative (-9999), indicating a problem with the automated measurement routine. We will need to deal with these.
  • Sizes are half-light ("effective") radii in arcseconds: here, the i-band Petrosian radius petroR50_i. The typical resolution ("point spread function" effective radius) in an SDSS image is around 0.7".
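One way to deal with the large negative values is to turn them into missing values, so that pandas can count and drop them explicitly. The sketch below uses four rows copied from the table above. Treating exactly -9999 as the "measurement failed" flag is an assumption based on what appears in this download, not a statement about the full catalog.

```python
import numpy as np
import pandas as pd

# Four rows copied from the table above: i-band magnitude and size.
sample = pd.DataFrame(
    {"i": [21.10469, 22.25306, 24.48730, 25.44639],
     "size": [0.709057, 0.237942, -9999.0, -9999.0]},
    index=[0, 1, 13, 17],
)

SENTINEL = -9999.0  # assumed flag value for a failed measurement

# Replace the flag with NaN, then count and drop the affected rows.
masked = sample.replace(SENTINEL, np.nan)
n_bad = int(masked["size"].isna().sum())
clean = masked.dropna()

print("Flagged size measurements:", n_bad)
print(clean)
```

Masking first and dropping second keeps a record of how many objects were lost, which is worth knowing before interpreting any plots.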

Let's save this download for further use.


In [5]:
!mkdir -p downloads
sdssdata.to_csv("downloads/SDSSobjects.csv")
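A small aside on the saved file: by default `to_csv` also writes the DataFrame index as an unnamed first column. The sketch below shows the effect on a two-row toy table, written to an in-memory buffer rather than to disk. The table and variable names are illustrative only.

```python
import io
import pandas as pd

# Toy table with two positions copied from the download above.
table = pd.DataFrame({"ra": [185.174133, 185.189163],
                      "dec": [15.004124, 15.197551]})

# Default behaviour: the index is written as an extra, unnamed column.
with_index = io.StringIO()
table.to_csv(with_index)
with_index.seek(0)
cols_with = pd.read_csv(with_index).columns.tolist()

# index=False leaves it out.
without_index = io.StringIO()
table.to_csv(without_index, index=False)
without_index.seek(0)
cols_without = pd.read_csv(without_index).columns.tolist()

print(cols_with)
print(cols_without)
```

Either choice works, as long as the reading code picks out the columns it wants by name.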

Visualizing Data in N-dimensions

This is, in general, difficult.

Looking at all possible 1- and 2-dimensional histograms and scatter plots helps a lot.
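To get a feel for how many panels "all possible" means, here is a small sketch that enumerates them for the seven measured columns downloaded above (leaving out the type flag). It only counts panels and makes no plots.

```python
from itertools import combinations

columns = ["ra", "dec", "u", "g", "r", "i", "size"]

# One 1-D histogram per column...
n_hist = len(columns)

# ...and one 2-D scatter plot per unordered pair of columns.
pairs = list(combinations(columns, 2))
n_scatter = len(pairs)

print("1-D histograms:", n_hist)
print("2-D scatter plots:", n_scatter)
print("First few pairs:", pairs[:3])
```

The number of pairs grows as N(N-1)/2, so it pays to drop or combine columns before plotting.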

Color coding can bring in a 3rd dimension (and even a 4th). Interactive plots and movies are also well worth thinking about.

Here we'll follow a multi-dimensional visualization example due to Josh Bloom at UC Berkeley:


In [6]:
# We'll use astronomical g-r color as the colorizer, and then plot
# position, magnitude, size and color against each other.

data = pd.read_csv("downloads/SDSSobjects.csv",
                   usecols=["ra", "dec", "u", "g", "r", "i", "size"])

# Filter out objects with bad magnitude or size measurements:
data = data[(data["u"] > 0) & (data["g"] > 0) & (data["r"] > 0) & (data["i"] > 0) & (data["size"] > 0)]

# Log size, and g-r color, will be more useful:
data['log_size'] = np.log10(data['size'])
data['g-r_color'] = data['g'] - data['r']

# Drop the things we're not so interested in:
del data['u'], data['g'], data['r'], data['size']

data.head()


Out[6]:
           ra        dec         i  log_size  g-r_color
0  185.174133  15.004124  21.10469 -0.149319    0.29993
1  185.189163  15.197551  22.25306 -0.623528   -1.54093
2  185.184024  15.105112  20.89776  0.159157    0.74124
3  185.184024  15.105112  20.89749  0.158915    0.74127
4  185.174176  15.086321  21.03117 -0.166373    0.75237

In [7]:
# Get ready to plot:
pd.set_option('display.max_columns', None)
# seaborn is only used here to restyle the plots, so carry on without it
# if it is not installed.
# !pip install --upgrade seaborn
try:
    import seaborn as sns
    sns.set()
except ImportError:
    print("seaborn not found: using default matplotlib styling")

In [8]:
def plot_everything(data,colorizer,vmin=0.0,vmax=10.0):
    # Truncate the color map to retain contrast between faint objects.
    norm = matplotlib.colors.Normalize(vmin=vmin, vmax=vmax)
    cmap = matplotlib.cm.jet
    m = matplotlib.cm.ScalarMappable(norm=norm, cmap=cmap)
    # scatter_matrix lives in pandas.plotting in newer pandas releases;
    # the old top-level pd.scatter_matrix alias has been removed.
    plot = pd.plotting.scatter_matrix(data, alpha=0.2, figsize=[15,15],
                                      color=m.to_rgba(data[colorizer]))
    return

plot_everything(data,'g-r_color',vmin=-1.0, vmax=3.0)
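The vmin and vmax arguments decide which range of the colorizer is spread across the color map. The sketch below is a hand-written illustration of that mapping, not matplotlib's own code: values are rescaled linearly, and the clamping to [0, 1] mimics the net effect of out-of-range points being drawn in the end colors of the map. The function name `normalized_position` is made up for this example.

```python
def normalized_position(value, vmin, vmax):
    """Fraction of the way along the color map for `value`, clamped to [0, 1]."""
    fraction = (value - vmin) / (vmax - vmin)
    return min(1.0, max(0.0, fraction))


# A few g-r colors, some of them outside the plotted range of -1.0 to 3.0.
for color in (-1.54, 0.30, 1.0, 3.5):
    print(color, normalized_position(color, vmin=-1.0, vmax=3.0))
```

Choosing a narrow range spends the available colors on the bulk of the objects, at the cost of lumping outliers together at either end.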


Size-magnitude

Let's zoom in and look at the objects' (log) sizes and magnitudes.


In [9]:
zoom = data.copy()
del zoom['ra'],zoom['dec'],zoom['g-r_color']
plot_everything(zoom,'i',vmin=15.0, vmax=21.5)


Q: What features do you notice in this plot?

Talk to your neighbor for a minute or two about all the things that might be going on, and be ready to point things out to the class.

