Examples of using the speclite package to perform basic operations on spectral data accessed with the bossdata package. To keep the examples small, we use data from a single BOSS plate (6641 observed on MJD 56383) and show how to work with both the individual spec-lite files and the combined spPlate file (see here for details on the different data products).
In [1]:
%pylab inline
In [2]:
import speclite
print(speclite.version.version)
In [3]:
import bossdata
print(bossdata.__version__)
In [4]:
finder = bossdata.path.Finder()
mirror = bossdata.remote.Manager()
Get a list of sky spectra on plate 6641:
In [5]:
spAll = bossdata.meta.Database(lite=True)
In [6]:
sky_table = spAll.select_all(where='PLATE=6641 and OBJTYPE="SKY"')
print('Found {0} sky fibers for plate 6641.'.format(len(sky_table)))
Plot a stacked spectrum:
In [7]:
def plot_stack(data, truncate_percentile):
valid = data['ivar'] > 0
wlen = data['wavelength'][valid]
flux = data['flux'][valid]
dflux = data['ivar'][valid]**(-0.5)
plt.figure(figsize=(12,5))
plt.fill_between(wlen, flux, lw=0, color='red')
plt.errorbar(wlen, flux, dflux, color='black', alpha=0.5, ls='None', capthick=0)
plt.xlim(np.min(wlen), np.max(wlen))
plt.ylim(0, np.percentile(flux + dflux, truncate_percentile))
plt.xlabel('Wavelength ($\AA$)')
plt.ylabel('Flux $10^{-17}$ erg/(s cm$^2 \AA$)')
plt.tight_layout();
Loop over all sky spectra on the plate. The necessary spec-lite files will be automatically downloaded, if necessary, which will take several minutes.
In [8]:
spec_sky = None
for row in sky_table:
filename = finder.get_spec_path(plate=row['PLATE'], mjd=row['MJD'], fiber=row['FIBER'], lite=True)
spectrum = bossdata.spec.SpecFile(mirror.get(filename))
data = spectrum.get_valid_data(include_sky=True, use_ivar=True, fiducial_grid=True)
spec_sky = speclite.accumulate(spec_sky, data, data_out=spec_sky, join='wavelength',
add=('flux', 'sky'), weight='ivar')
spec_sky['flux'] += spec_sky['sky']
In [9]:
plot_stack(spec_sky, truncate_percentile=97.5)
Accumulate the sky spectra from a Plate file, which will be automatically downloaded if necessary.
In [10]:
plate_sky = None
filename = finder.get_plate_spec_path(plate=6641, mjd=56383)
plate = bossdata.plate.PlateFile(mirror.get(filename))
plate_data = plate.get_valid_data(sky_table['FIBER'], include_sky=True, use_ivar=True, fiducial_grid=True)
for data in plate_data:
plate_sky = speclite.accumulate(plate_sky, data, data_out=plate_sky, join='wavelength',
add=('flux', 'sky'), weight='ivar')
plate_sky['flux'] += plate_sky['sky']
In [11]:
plot_stack(plate_sky, truncate_percentile=97.5)
Get a list of sky spectra on plate 6641, observed on MJD 56383:
In [12]:
DR12Q = bossdata.meta.Database(finder, mirror, quasar_catalog=True)
In [13]:
qso_table = DR12Q.select_all(where='PLATE=6641 and ZWARNING=0', what='PLATE,MJD,FIBER,Z_VI')
print('Found {0} QSO targets for plate 6641.'.format(len(qso_table)))
Plot the redshift distribution of the selected quasars:
In [14]:
plt.hist(qso_table['Z_VI'], bins=25);
plt.xlabel('Redshift z')
plt.ylabel('Quasars')
plt.tight_layout();
Loop over all quasar spectra on the plate. The necessary spec-lite files will be automatically downloaded, if necessary, which will take several minutes.
In [15]:
fiducial_grid = np.arange(1000.,3000.)
rest_frame, resampled, spec_qso = None, None, None
for row in qso_table:
filename = finder.get_spec_path(plate=row['PLATE'], mjd=row['MJD'], fiber=row['FIBER'], lite=True)
spectrum = bossdata.spec.SpecFile(mirror.get(filename))
data = spectrum.get_valid_data(use_ivar=True, fiducial_grid=True)
rest_frame = speclite.redshift(z_in=row['Z_VI'], z_out=0, data_in=data, data_out=rest_frame, rules=[
dict(name='wavelength', exponent=+1),
dict(name='flux', exponent=-1),
dict(name='ivar', exponent=+2)])
resampled = speclite.resample(rest_frame, x_in='wavelength', x_out=fiducial_grid, y=('flux', 'ivar'),
data_out=resampled)
spec_qso = speclite.accumulate(spec_qso, resampled, data_out=spec_qso, join='wavelength',
add='flux', weight='ivar')
In [16]:
plot_stack(spec_qso, truncate_percentile=99.5)
In [17]:
filename = finder.get_plate_spec_path(plate=6641, mjd=56383)
plate = bossdata.plate.PlateFile(mirror.get(filename))
plate_data = plate.get_valid_data(qso_table['FIBER'], use_ivar=True, fiducial_grid=True)
zorder = np.argsort(qso_table['Z_VI'])
Transform each spectrum to its quasar rest frame. We perform this operation in place (re-using the memory of the input array) and in parallel on all spectra.
In [18]:
z_in = qso_table['Z_VI'][:,np.newaxis]
plate_data = speclite.redshift(z_in=z_in, z_out=0, data_in=plate_data, data_out=plate_data, rules=[
dict(name='wavelength', exponent=+1),
dict(name='flux', exponent=-1),
dict(name='ivar', exponent=+2)
])
Resample each spectrum to a uniform rest wavelength grid and stack them together to calculate the mean rest-frame quasar spectrum. The resample() and accumulate() operations re-use the same memory for each input spectrum, so this loop has fixed (small) memory requirements, independent of the number of spectra being stacked.
In [19]:
resampled, plate_qso = None, None
for data in plate_data:
resampled = speclite.resample(data, x_in='wavelength', x_out=fiducial_grid, y=('flux', 'ivar'), data_out=resampled)
plate_qso = speclite.accumulate(spec_qso, resampled, data_out=plate_qso, join='wavelength',
add='flux', weight='ivar')
In [20]:
plot_stack(plate_qso, truncate_percentile=99.5)