Welcome to the LIGO data visualization tutorial!

Installation

Please make sure you have GWpy installed before you begin!

Only execute the below cell if you have not already installed GWpy


In [ ]:
#! python3 -m pip install gwpy

Jess notes: the following was produced using a python3.7 kernel. All code dependencies should be installed via the installation instructions above except for python; careful with your python paths here.

Learning goals

With this tutorial, you will learn how to:

  • Use basic GWOSC tools to query for LIGO-Virgo observing run times and GW event times
  • Download public LIGO (and Virgo) data from the GWOSC with GWpy
  • Plot a LIGO h(t) time series with GWpy
  • Make a spectral density plot of a time series
  • Design a filter in the frequency domain
  • Apply and characterize a whitening filter
  • Visualize LIGO data with spectrograms

This tutorial borrows from the excellent work of Duncan Macleod, Jonah Kanner, Alex Nitz, and others involved in the 2018 GWOSC webcourse - you can find many more great examples there.

Let's get started!


Using GWOSC tools

Here we'll use tools from the Gravitational Wave Open Science Center (GWOSC), namely the gwosc python module, which we should have already installed via our GWpy install.

First, let's see what's in GWOSC open datasets:


In [ ]:
from gwosc.datasets import find_datasets
find_datasets()

Here we see data for:

  • GWTC-1 confident events: GW150914, GW151012, GW151226, GW170104, GW170608, GW170729, GW170809, GW170814, GW170817, GW170818, and GW170823
  • GWTC-1 marginal events: 151008, 151012A, 151116, 161202, 161217, 170208, 170219, 170405, 170412, 170423, 170616, 170630, 170705, 170720
  • Observing runs: Advanced LIGO and Advanced Virgo observing runs O1 and O2 at different sampling rates, as well as past science runs S5 and S6 (pre 2010).
  • And background data for an event included in the GWTC-1 catalog, but not in the O2 data release: BKGW170608_16KHZ_R1

Observing runs

Knowing this, let's try to query for the start and end of LIGO-Virgo observing runs. Currently, data from the first two observing runs are available via the GWOSC: O1 and O2.

gwosc.datasets.run_segment will return the start and end GPS times for an observing run given a dataset tag. Let's try O1:


In [ ]:
from gwosc.datasets import run_segment
print(run_segment('O1'))

And O2:


In [ ]:
print(run_segment('O2'))

Exercise

There's an error here! Can you find a way to fix the cell above and print the start and end times for the second LIGO-Virgo observing run (O2)?

</span>



In [ ]:
# complete

Querying for data around detected events

We can also query for event releases (data on the order of minutes around the event time) specifically by using find_datasets and specifying the event type.


In [ ]:
from gwosc.datasets import find_datasets
events = find_datasets(type='event')
print(events)

Querying event GPS times

If we want to know the GPS time of a particular event, we can grab that with event_gps. Let's try that for the first binary neutron star detection, GW170817.


In [ ]:
from gwosc.datasets import event_gps
GW170817gps = event_gps('GW170817')
print(GW170817gps)

Detector tags

We can also filter our results by which gravitational wave detectors were active during the time of the event.

LIGO-Livingston = 'L1'
LIGO-Hanford = 'H1'
Virgo = 'V1'
KAGRA = 'K1'
GEO600 = 'G1'

For example, let's see which events LIGO-Livingston was active for:


In [ ]:
find_datasets(type='event', detector= # complete

Exercises for using the GWOSC tools

  • How many events were detected during O2?
  • Which O2 event releases include data for the Virgo detector? </span>

In [ ]:
# complete

Importing open LIGO data

We can use GWpy to download a LIGO time series from the GWOSC with fetch_open_data. Let's try to grab some data around a detected event, say GW150914 (the very first direct gravitational wave detection).

Note: The first time you import `gwpy.timeseries`, matplotlib may try to import some extra fonts and that can take a couple minutes.


In [ ]:
# Import
from gwpy.timeseries import TimeSeries

In [ ]:
# Grab the event gps time with GWOSC tools
GW150914gps = event_gps('GW150914')
print(GW150914gps)

In [ ]:
# Set a start and end time around the event time (in units of seconds) to download LIGO data 
window = 15
start = GW150914gps - window
end = GW150914gps + window

# Check that this looks sane
print('start time GPS = '+str(start))
print('end time GPS = '+str(end))

Now that we've identified a time range of interest, we can grab some data. Let's choose the LIGO-Hanford detector (H1).


In [ ]:
# Grab H1 data around the time of interest from the GWOSC
data = TimeSeries.fetch_open_data('H1', start, end, verbose=True)
print(data)

Requiring verbose=True gives us details on the data download (and helps with debugging if needed).

The downloaded file is not stored permanently! If you run this cell again it will be downloaded again. However, you can use cache=True to store the file on your computer if you like.


Plot a LIGO time series

We can use GWpy to plot the data we downloaded from the GWOSC by calling the plot() method of our data object.
Jess note: This may take 2-3 minutes to render the first time.


In [ ]:
plot = data.plot()

This is the real LIGO data (one of two detectors) used to make the first direct detection of GWs. Do you spot the signal yet?

Let's add a title and y-axis labels to make it more clear what we're looking at.


In [ ]:
plot = data.plot(
    title='LIGO-Hanford Observatory data for GW150914',
    ylabel='Strain amplitude'
)

Nice.


Make a spectral density plot

What does the data containing the signal look like in the frequency domain?

We can then call the .asd() method to calculate the amplitude spectral density and tranform our TimeSeries into a FrequencySeries.

Syntax: asd(FFT_length_in_seconds, FFT_overlap_in_seconds, default_time_window='hann', default_FFT_method='welch')


In [ ]:
gw140914asd = data.asd(5, 2)
print(gw140914asd)

Now we can make a plot of our FrequencySeries with the same .plot() method we used for a TimeSeries.


In [ ]:
plot = gw140914asd.plot()
ax = plot.gca()
ax.set_xlim(5, 2000)

Exercise

How does this plot change with different FFT lengths and overlap? Try an FFT length of 15 seconds with 7 seconds of overlap and an FFT length of 2 with 1 second of overlap. What do you notice? Feel free to try other combinations as well.

</span>


In [ ]:
# complete

Filter design in the frequency domain

Now that we have a sense of what frequency content is dominating our noise, let's see if we can dig out the basic shape of our signal with a few simple filters.

First, since we know our signal duration is short (on the order of miliseconds), let's zoom way in on the time series and see if we can spot it.


In [ ]:
## Make a new plot, this time zoomed in around the time of GW150914
zoomplot = data.plot(
    title='LIGO-Hanford Observatory data for GW150914',
    ylabel='Strain amplitude'
)
ax = zoomplot.gca()
ax.set_xlim( # select a window that starts 200 ms before the event and ends 200 ms afer
ax.set_ylabel='Strain amplitude'

Not yet... Let's get rid of some of that dominating low frequency noise and see if we can see our signal any better.

Apply a highpass filter

Use the data.highpass method to apply a highpass filter to supress signals below 20 Hz in the above data.


In [ ]:
## apply a highpass filter with a corner frequency of 20 Hz 
gw140914hp = data.highpass( # frequency for the highpass filter

## make a plot of the resulting time series
tsplot = gw140914hp.plot(
    title='LIGO-Hanford Observatory data for GW150914',
    ylabel='Strain amplitude'
)
ax = tsplot.gca()
ax.set_xlim(GW150914gps-0.2, GW150914gps+0.2)

Still no. What's left? What does our highpassed data look like in the frequency domain?

Make an ASD of the data that has been passed through the highpass filter. Use an FFT length of 5 seconds and an overlap of 2 seconds.


In [ ]:
gw140914hpasd = gw140914hp.asd( # complete
    
## make a plot of that ASD! 
asdplot = gw140914hpasd.plot()
ax = asdplot.gca()
ax.set_xlim(5, 2000)

We've really supressed that low frequency noise! But there's still too much noise at other frequencies.

Apply a bandpass filter

Because we've already identified our signal as a binary black hole system (two equal-ish mass black holes of roughly 30 solar masses each), we know our signal has frequency content (within our detector's sensitive range) between 50 and 250 Hz.

Let's apply a bandpass filter to look for excess power in that critical frequency range.

Use the data.bandpass method to apply a bandpass filter with corner frequencies of 50 and 250 Hz.


In [ ]:
## apply a bandpass filter to the highpassed data with a corner frequencies of 50 and 250 Hz 
gw140914bp = gw140914hp.bandpass( # complete

## make a plot of the resulting time series
tsplot = gw140914bp.plot(
    title='LIGO-Hanford Observatory data for GW150914',
    ylabel='Strain amplitude'
)
ax = tsplot.gca()
ax.set_xlim(GW150914gps-0.2, GW150914gps+0.2)
ax.set_ylim(-0.2e-20, 0.2e-20)

Starting to look promising, but check out that strong unrelated sinusoid. What the heck is that?

To the frequency domain! (Read: let's make a spectrum and check it out.)

Make an ASD of the data that has been passed through the highpass filter. Use an FFT length of 5 seconds and an overlap of 2 seconds.


In [ ]:
gw140914bpasd = gw140914bp.asd( # complete

## make a plot of that ASD 
asdplot = gw140914bpasd.plot()
ax = asdplot.gca()
ax.set_xlim(5, 2000)

Wow, check out that strong line; this is the 60 Hz AC power line!

Let's get rid of it with a notch filter.

Apply a notch filter

Use the data.notch method to apply a notch filter that will supress the signal at 60 Hz from nearby AC power lines.


In [ ]:
gw140914n = gw140914bp.notch( # complete

## make a plot of the resulting time series
tsplot = gw140914n.plot(
    title='LIGO-Hanford Observatory data for GW150914',
    ylabel='Strain amplitude'
)
ax = tsplot.gca()
ax.set_xlim(GW150914gps-0.2, GW150914gps+0.2)
ax.set_ylim(-0.1e-20, 0.1e-20)

There it is! Nice work! We can put that on a T-shirt.


Exercise

Repeat this for LIGO-Livingston detector data around the time of GW150914. Design your own set of simple filters for LIGO-Livingston, and plot the filtered LIGO-Livingston and LIGO-Hanford time series data overlaid. Can you estimate the difference in signal arrival times between detectors?

</span>

Note: the LIGO-Virgo analyses always use whitening; not the procedure above.



In [ ]:
# complete

Apply and characterize a whitening filter

It's much easier to spot excess power in the data if we can weight it proportionally by the consistent frequency contributions from the noise its embedded in. (Whitening is a critical step for matched filtering and gravitational wave search algorithms that look for coherent excess power across a gravitational wave detector network.)

We can use GWpy to whiten our data for the time of GW150914 with an FFT length of 5 seconds and an overlap of 2 seconds.


In [ ]:
whitened_gw150914 = data.whiten(5,2)
plot = whitened_gw150914.plot(
    title='Whitened LIGO Hanford Observatory data for GW150914',
    ylabel='Strain amplitude',
    xlim=(GW150914gps-0.2, GW150914gps+0.2)
)

Hmm, that looks like it still has some high frequency noise left.

What does the whitened data look like in the frequency domain?


In [ ]:
whitened_gw150914_asd = whitened_gw150914.asd()
plot = whitened_gw150914_asd.plot()
ax = plot.gca()
ax.set_xlim(5, 2000)

Not perfectly "white" yet. How would you improve it?


Exercise

Compare the ASD of your whitened LIGO-Hanford data to the ASD of your data with the simple filter set (highpass, bandpass, notch) for LIGO-Hanford applied. Which frequencies still stand out in which?

</span>



In [ ]:
# complete

Visualize LIGO data with spectrograms

A great way to visualize how the frequency content of our data is changing over time is with whitened or "normalized" spectrograms.

Let's try it for GW150914.

We can apply the spectrogram2() method in GWpy to our whitened data, and set some other formatting variables to make the plot look nice:


In [ ]:
specgram = whitened_gw150914.spectrogram2(fftlength=1/16., overlap=15/256.) ** (1/2.)
plot = specgram.plot(norm='log', cmap='viridis', yscale='log')
ax = plot.gca()
ax.set_title('LIGO-Hanford strain data around GW150914')
ax.set_xlim(GW150914gps-0.5, GW150914gps+0.5)
ax.set_ylim(15,1000)
ax.colorbar(label=r'Strain ASD [1/$\sqrt{\mathrm{Hz}}$]')

There is something there! It looks like it's sweeping up in frequency over time, but it's still hard to make out the details.


The Q transform

To zoom in even further, we can employ a muli-resolution technique called the Q transform.


In [ ]:
qspecgram = data.q_transform(outseg=(GW150914gps-0.2, GW150914gps+0.2))
plot = qspecgram.plot()
ax = plot.gca()
ax.set_xscale('seconds')
ax.set_yscale('log')
ax.set_ylim(20, 500)
ax.set_ylabel('Frequency [Hz]')
ax.grid(True, axis='y', which='both')
ax.colorbar(cmap='viridis', label='Normalized energy')

Beautiful. Nice work!


Challenge

Generate a frequency vs. time plot of LIGO data around the binary neutron star event (GW170817) where you can clearly see the signal track. (Hints: consider using LIGO-Livingston data, a Q transform, and note that binary neutron star signals last for 10s of seconds!) </span>



In [ ]:
# complete

Challenge 2

Do the same thing for GW170817, but this time using data from LIGO-Hanford. Can you estimate the time delay between the two detectors after running this procedure?

</span>



In [ ]:
# complete