```
In [1]:
```%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import scipy.signal
import scipy.stats
import matplotlib.patches

```
In [2]:
```waterfall = np.load('/mnt/perseids2018/data.npz')['waterfall']
waterfall_linear = 10**(0.1*waterfall)
waterfall.shape

```
Out[2]:
```

First we plot the average power spectral density. Note the GRAVES pings on the centre of the graph.

```
In [3]:
```average_psd = 10*np.log10(np.average(waterfall_linear, axis = 0))
plt.plot(average_psd)
plt.title('Average power spectral density')
plt.ylabel('Power spectral density (dB)')
plt.xlabel('FFT bin')
plt.ylim([-120,-108]);

```
```

```
In [4]:
```span = 10
centre = waterfall.shape[1]//2 - 3
signal_bins = np.arange(centre-span, centre+span)
plt.plot(signal_bins, average_psd[signal_bins])
plt.title('Average power spectral density')
plt.ylabel('Power spectral density (dB)')
plt.xlabel('FFT bin')
plt.ylim([-120,-108]);

```
```

```
In [5]:
```noise_bins_left = np.arange(12,39)
noise_bins_right = np.arange(57,246)
noise_bins = np.concatenate((noise_bins_left, noise_bins_right))
plt.plot(noise_bins_left, average_psd[noise_bins_left])
plt.plot(noise_bins_right, average_psd[noise_bins_right])
plt.title('Average power spectral density')
plt.ylabel('Power spectral density (dB)')
plt.xlabel('FFT bin')
plt.ylim([-120,-108]);

```
```

We now calculate the SNR in dB units usin the signal bins and noise bins chosen above.

```
In [6]:
```signal = np.sum(waterfall_linear[:,signal_bins], axis=1)
signal_plus_noise = np.sum(waterfall_linear[:,noise_bins], axis=1)
snr = 10*np.log10(signal/(signal_plus_noise-signal))

We plot the SNR versus time. We see strong spikes corresponding to meteor scatter pings.

```
In [7]:
```time = np.arange(waterfall.shape[0]) * 256 / 4e3
plt.plot(time, snr)
plt.title('SNR')
plt.xlabel('Time (s)')
plt.ylabel('SNR (dB)');

```
```

```
In [8]:
```plt.hist(snr, bins=1000)
plt.yscale('log')
plt.title('SNR histogram')
plt.ylabel('Number of samples')
plt.xlabel('SNR (dB)');

```
```

To chose the threshold we set a desired probability of false acquisition and obtain the threshold from there. In this case the null hypothesis corresponds to no meteor scatter signal being present. Thus, under the null hypothesis we assume that all the signal and noise bins are independent random variables whose distribution is a chi-squared with 2 degrees of freedom.

We denote by $n$ the number of noise bins and by $k$ the number of signal bins. Then the noise and signal power are distributed as chi-squared distributions with $2n$ and $2k$ degrees of freedom respectively.

The SNR is the quotient of two chi-squared distributions, and so it is distributed as $k/n$ times an F-distribution with parameters $2k$ and $2n$.

```
In [9]:
```k = signal_bins.size
n = noise_bins.size - signal_bins.size
def threshold(pfa):
return 10*np.log10(k/n*scipy.stats.f.ppf(1-pfa, dfn=2*k, dfd=2*n))
pfa = np.logspace(-9,-1)
plt.semilogx(pfa, threshold(pfa))
plt.title('Probability of false aquisition and threshold')
plt.ylabel('Threshold (dB)')
plt.xlabel('Probability of false acquisition');

```
```

Since there are roughly $10^5$ samples, we choose a probability of false acqusition of $10^{-7}$.

```
In [10]:
```alpha = threshold(1e-6)
alpha

```
Out[10]:
```

We now get the number of detections.

```
In [11]:
```detections = np.where(snr > alpha)[0]
detections.size

```
Out[11]:
```

```
In [12]:
```beta = alpha - 1
detections2 = np.where(snr > beta)[0]
detections2.size

```
Out[12]:
```

```
In [13]:
```cluster_jump_seconds = 2
cluster_jump = cluster_jump_seconds * 4e3 / 250
marks = list()
right_mark = -np.inf
while np.any(detections > right_mark):
left_mark = detections[detections > right_mark][0]
right_mark = left_mark
while np.any(detections2[detections2> right_mark]) and \
detections2[detections2 > right_mark][0] < right_mark + cluster_jump:
right_mark = detections2[detections2 > right_mark][0]
marks.append((left_mark, right_mark))
len(marks)

```
Out[13]:
```

Now we plot each of the detections to a file.

```
In [16]:
```margin_seconds = 5
margin = margin_seconds * 4e3 / 250
for j,m in enumerate(marks):
l = m[0]
r = m[1]
start = int(np.clip(l - margin, 0, waterfall.shape[0]-1))
end = int(np.clip(r + margin, 0, waterfall.shape[0]-1))
plt.imsave('/tmp/ping_{:0{size}d}'.format(j, size=int(np.ceil(np.log10(len(marks))))),\
waterfall[start:end,::-1].T, vmin = -120, vmax = -90);