The general idea of the project is to get a handle on how the house heats and cools so that we can better program the thermostat.
To gather data, I've assembled and programmed 5 probes using inexpensive hardware (Wemos D1 Mini ESP8266 Wifi boards and SHT30 temperature/humidity sensors). The intent is to move the probes around the house to help us tune the thermostat.
Here, I look at several hours of data from an initial test run. The probes were colocated, but not identically oriented.
The initial version of software connected probes to the house WiFi at startup, but not if WiFi dropped out. And there was a hiccup, and all of the probes stopped reporting. Fortunately, I was able to get a decent data sample.
The probes report temperature and humidity readings every 30 seconds or so, along with the probe's WiFi MAC address. The web server on a spare laptop collects the data, adds a timestamp, and appends to a .csv file. (Eventually, data will go into a database, but flat files are fine for getting started.)
Here's what we're starting with
In [1]:
!head -5 temps.csv
Some prelimaries. Import code, and configure chart sizes to be larger than the default.
In [2]:
%matplotlib inline
import matplotlib
matplotlib.rcParams['figure.figsize'] = (12, 5)
import pandas as pd
Load the .csv into a pandas DataFrame, adding column names.
In [3]:
df = pd.read_csv('temps.csv', header=None, names=['time', 'mac', 'f', 'h'], parse_dates=[0])
df.head()
Out[3]:
A quick plot to get a rough idea of how the sensors differ.
In [4]:
df.plot();
Aside from a wider spread in sensor values than I'd like, and higher temperatures (the room wasn't that hot!), this is roughly what I expected for the temperature pattern. It was a hot, humid day. The bedroom starts off warm, cools when I turned on A/C at 9pm, then oscillates during the night as the A/C kicks in on a scheduled setting.
I didn't know what to expect for humidity.
To get per-sensor plots, the data needs to be reorganized so that each probe is in a different column. This'll need to be done for temperature and humidity independently. Temperature first, since that's what I'm interested in.
In [5]:
per_sensor_f = df.pivot(index='time', columns='mac', values='f')
per_sensor_f.head()
Out[5]:
This is roughly what's needed, except for the NaN (missing) values. Resampling the data into 2 minute buckets deals with those.
In [6]:
downsampled_f = per_sensor_f.resample('2T').mean()
downsampled_f.head()
Out[6]:
In [7]:
downsampled_f.plot();
The first thing that jumps out is that one of the sensor ~5 degrees lower than the others. The SHT30 sensors are inexpensive; it might be a manufacturing problem, or I might have damaged one while soldering on the headers. (Or maybe it's the sane one, and the other four are measuring hot.)
There also seems to be a 20-30 minute warmup period. I suspect here that a probe, being basically a small computer with stuff attached, is generating its own heat, and the chart shows the slow warm-up. That might also explain why temperatures were higher than expected.
Let's try adding 5F to the suspect sensor's temperature reading to bring it in line with the others.
In [8]:
downsampled_f['5C:CF:7F:33:F7:F8'] += 5.0
downsampled_f.plot();
That looks promising.
Next, reorganize the data so that we plot humidity. I'm not as interested in humidity, since it's not as easily controlled, but hey, it's data!
In [9]:
per_sensor_h = df.pivot(index='time', columns='mac', values='h')
downsampled_h = per_sensor_h.resample('2T').mean()
downsampled_h.plot();
That same sensor is the outlier. Eyeballing the graph, that sensor's humidity reading looks high by about 9 units.
In [10]:
downsampled_h['5C:CF:7F:33:F7:F8'] -= 9.0
downsampled_h.plot();
Also promising.
First, fix the probes to reconnect to the WiFi if the connection drops.
Next, build a physical test harness so that the sensors are aligned and getting approximately identical airflow (and are somewhat isolated from any heat generated by the CPU/WiFi chipset). An excuse to reach for foam core board and the hot glue gun!
Then, gather another dataset, and use that to calculate per-sensor adjustments. I'll calibrate against a known-good thermometer if I can lay my hands on one.
I might replace the errant sensor, but the lead time for sourcing the parts is a nuisance. If naive math is sufficient to fix values, I'll go with that.