Work in Progress -- starting to add commentary and tidy up
I connected a BMP180 temperature and pressure sensor to a Raspberry Pi and have it running in my study.
I have been using this notebook to look at the data as it is generated.
The code uses the Adafruit Python library to extract data from the sensor.
I find plotting the data is a good way to take an initial look at it.
So, time for some pandas and matplotlib.
In [35]:
# Tell matplotlib to plot in line
%matplotlib inline
# import pandas
import pandas
# seaborn magically adds a layer of goodness on top of Matplotlib
# mostly this is just changing matplotlib defaults, but it does also
# provide some higher level plotting methods.
import seaborn
# Tell seaborn to set things up
seaborn.set()
In [36]:
# just check where I am
!pwd
In [37]:
infile = '../files/light.csv'
In [38]:
!scp 192.168.0.133:Adafruit_Python_BMP/light.csv .
!mv light.csv ../files
In [39]:
data = pandas.read_csv(infile, index_col='date', parse_dates=['date'])
In [40]:
data.describe()
Out[40]:
In [41]:
# Let's look at the temperature data
data.temp.plot()
Out[41]:
Looks like we have some bad data here. For the first few days things look OK though. To start, let's look at the good bit of the data.
In [42]:
data[:4500].plot(subplots=True)
Out[42]:
That looks good. So for the first 4500 samples the data looks clean.
The pressure and sealevel_pressure plots have the same shape.
The sealevel_pressure is just the pressure recording adjusted for altitude.
Actually, since I am not telling the software what my altitude is
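To see why the two pressure plots share a shape, here is a minimal sketch of the altitude adjustment. It assumes the standard barometric formula given in the BMP180 datasheet; the function name, the made-up reading and the 50 m altitude are illustrative and not taken from the sensor script.

```python
def sealevel_pressure(pressure_pa, altitude_m=0.0):
    """Convert a station pressure (Pa) to its sea-level equivalent.

    Assumes the standard barometric formula from the BMP180 datasheet.
    """
    return pressure_pa / (1.0 - altitude_m / 44330.0) ** 5.255


reading = 101325.0  # made-up reading in pascals
print(sealevel_pressure(reading))        # altitude 0: the reading is unchanged
print(sealevel_pressure(reading, 50.0))  # 50 m up: slightly higher
```

For a fixed altitude the adjustment is a constant scale factor, so the adjusted series can only ever be a stretched copy of the raw one. With an altitude of zero the two are identical.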
It is a bit of a mystery what is causing the bad data after this.
One possibility is that I have a separate process talking to the sensor, which I am running in a console just so I can see the current figures.
I am running this with the Linux watch command. I used the default parameters, so it is running every 2 seconds.
I am wondering if the sensor code, or the hardware itself, misbehaves when one process polls the sensor whilst another is already reading it.
I am now (11am BDA time July 3rd) running the monitor script with watch -n 600 so it only polls every 10 minutes. Will see if that improves things.
So, let's see if we can filter out the bad data.
In [43]:
data.temp.plot()
Out[43]:
In [44]:
# All the good temperature readings appear to be in the 25C - 32C range,
# so keep only readings inside generous bounds (15C to 50C) around that.
data.temp[(data.temp < 50.0) & (data.temp > 15.0)].plot()
Out[44]:
That looks good. You can see 8 days of temperatures rising through the day and then falling at night. There is only a couple of degrees difference here in Bermuda at present.
On the third day, the one with the dip in temperature, I believe there was a thunderstorm or two which cooled things off temporarily.
I really need to get a humidity sensor working to go with this.
Now let's see if we can spot the outliers and filter them out.
In [45]:
def spot_outliers(series):
    """Compare the change between consecutive samples to the standard deviation.

    If the change is bigger than that, assume it is an outlier.
    Note that there will be two bad deltas, since the sample after the
    bad one will be bad too.
    """
    delta = series - series.shift()
    # Use the standard deviation of the argument rather than the global data
    return delta.abs() > series.std()

outliers = spot_outliers(data)
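To make the "two bad deltas" point in the docstring concrete, here is a small sketch on made-up numbers. The function is restated so the block runs on its own, comparing against the standard deviation of the series it is given; the readings are invented, not taken from light.csv.

```python
import pandas


def spot_outliers(series):
    """Flag samples whose change from the previous sample exceeds one std."""
    delta = series - series.shift()
    return delta.abs() > series.std()


# Made-up readings with one obviously bad sample at position 3
temps = pandas.Series([27.0, 27.1, 27.2, 150.0, 27.3, 27.4])
flags = spot_outliers(temps)
print(flags.tolist())
# [False, False, False, True, True, False]
```

The bad sample is flagged, and so is the good one straight after it, because the jump back down is just as large as the jump up. Masking with ~flags therefore drops one good reading for every bad one.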
In [46]:
# Plot temperature
data[~outliers].temp.plot()
Out[46]:
In [47]:
data[~outliers].altitude.plot()
Out[47]:
In [48]:
data[~outliers].plot(subplots=True)
Out[48]:
In [49]:
data[~outliers].sealevel_pressure.plot()
Out[49]:
In [50]:
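def smooth(data, thresh=None):
    """Return sample-to-sample deltas with jumps of thresh or more set to zero.

    With no thresh, the standard deviation of each column is used.
    """
    if thresh is None:
        sds = data.std()
    else:
        sds = thresh
    delta = data - data.shift()
    # Boolean mask of acceptable deltas; the leading NaN counts as bad
    good = delta.abs() < sds
    print(delta[good].describe())
    return delta.where(good, 0.0)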
In [51]:
smooth(data).temp.cumsum().plot()
Out[51]:
In [52]:
smooth(data).describe()
Out[52]:
In [53]:
start = data[['temp', 'altitude']].iloc[0]
(smooth(data, 5.0).cumsum()[['temp', 'altitude']] + start).plot(subplots=True)
Out[53]:
Bingo! We have clean plots. Of course the irony is that I also seem to have found the problem with the bad data I was getting: don't have two processes querying these sensors at the same time, at least not with the current software. So the recent data no longer needs this smoothing.
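The delta-and-cumsum trick is easier to follow on a tiny example. This is a minimal sketch on made-up numbers, using the same 5.0 threshold as the last cell; the series is invented, not taken from light.csv.

```python
import pandas

# Made-up readings: a slow rise with one spike at position 3
temps = pandas.Series([27.0, 27.1, 27.2, 150.0, 27.3, 27.4])

delta = temps - temps.shift()
good = delta.abs() < 5.0        # small steps only; the leading NaN is not good
steps = delta.where(good, 0.0)  # big jumps and the leading NaN become 0
rebuilt = steps.cumsum() + temps.iloc[0]
print(rebuilt.round(1).tolist())
```

The spike disappears because both the jump up and the jump back down are zeroed. The cost is a small drift: the rebuilt series ends at 27.3 rather than 27.4, since the genuine 0.1 rise that happened across the spike is thrown away along with the two bad deltas.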
So the daily rise and fall of temperature is pretty clear. There is only a 2C spread most days.
The pressure plot is more interesting. Over the last week or so it has been generally high, but there is an interesting wave feature.
The other day I was at the Bermuda Weather Service and mentioned this to Ian Currie, who immediately pointed out that air pressure is tidal.
So, my next plan is to dig out scikit-learn and some lunar data, maybe using astropy, and see if we can fit a model to the pressure data for the tidal component.
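As a starting point for that plan, here is a rough sketch of fitting a single wave of known period to a pressure series. It uses plain NumPy least squares rather than scikit-learn or astropy, runs on synthetic data rather than the sensor readings, and the 12-hour period, the function name and all the numbers are assumptions made for illustration.

```python
import numpy as np


def fit_harmonic(hours, pressure, period_hours=12.0):
    """Least-squares fit of a mean level plus one sinusoid of a given period.

    Returns the fitted mean and the amplitude of the sinusoid.
    """
    omega = 2.0 * np.pi / period_hours
    design = np.column_stack([
        np.ones_like(hours),
        np.cos(omega * hours),
        np.sin(omega * hours),
    ])
    coeffs, *_ = np.linalg.lstsq(design, pressure, rcond=None)
    mean, a, b = coeffs
    return mean, np.hypot(a, b)


# Synthetic stand-in for the pressure column: a 1015 hPa mean with a
# 1.5 hPa wave every 12 hours, sampled every 10 minutes for 8 days.
hours = np.arange(0.0, 8 * 24, 1.0 / 6.0)
pressure = 1015.0 + 1.5 * np.cos(2.0 * np.pi * hours / 12.0 - 0.7)

mean, amplitude = fit_harmonic(hours, pressure)
print(round(mean, 3), round(amplitude, 3))
```

Writing the wave as a cosine plus a sine keeps the fit linear, so the unknown phase does not need a nonlinear solver. A different period, such as one derived from lunar data, could be passed as period_hours to compare how much of the wave each one explains.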