Our Home Control System (HCS) collects a lot of data over the course of each day and this is captured in an HCS log. One objective of this project is to work out how much of this captured data is actually useful. The other objective is to work out the clearest and most useful ways to visualise this data. The initial focus is on a web presentation but we are also looking at display on mobile devices.
The log data is captured in a text file and includes data from all of the wired and wireless sensors in our home. The data can be split into the following basic groups:
- Occupancy and motion data from PIR sensors and door contact sensors.
- Temperature data from the temperature sensors in each room.
- Environmental data from weather sensors, twilight sensor, etc.
- Digital images captured on IP network cameras (time stamped).
- User actions and received commands, such as SMS messages sent, lights switched on/off, etc.
Our HCS log is created in a folder structure, within a root folder called 'Logs'. Beneath this is a folder for the year, then a folder for the month, and within this folder is a log file for each day of the month. These have a filename of the form '2012_01_18_HCS.txt'. Each log file is a plain text file with one timestamped entry per line. A typical snippet of the file looks like this:
- 21:22:44 PIR Dining = On
- 21:22:48 PIR Kitchen = On
- 21:22:48 PIR Dining = On
- 21:22:54 PIR Kitchen = On
- 21:22:54 PIR Dining = On
- 21:23:00 PIR Dining = On
- 21:23:05 PIR Dining = On
- 21:23:31 PIR Dining = On
- 21:24:27 PIR Kitchen = On
- 21:24:31 PIR Kitchen = On
- 21:24:35 PIR Landing = On
- 21:24:41 PIR Kitchen = On
- 21:24:43 PIR Kitchen = On
- 21:24:46 PIR Dining = On
- 21:25:08 PIR Dining = On
- 21:25:11 PIR Dining = On
- 21:25:16 PIR Dining = On
- 21:25:20 PIR Dining = On
Each line starts with a character that defines the type of entry. You will notice that most (all in this case) of the events are PIR sensor activations. To keep the log file size down, we do not log the PIR sensor turning off.
There is no such thing as a typical day, but these two days give an example of the volume of data collected. We have written some simple Java code to filter and parse the log data.
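As an illustration of the layout and line format described above, here is a minimal sketch in Python (not the Java code mentioned in this article). The year and month folder names ('2012', '01'), the exact line shape, and the names `log_path`, `parse_line` and `LogEntry` are all assumptions made for the example; only the filename form and the sample lines come from the log itself.

```python
import re
from datetime import date, time
from pathlib import PurePosixPath
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    entry_type: str  # leading character of the line, e.g. "-"
    at: time         # time of day of the event
    kind: str        # e.g. "PIR"
    name: str        # e.g. "Dining"
    state: str       # e.g. "On"


# Assumed line shape, inferred from the snippet:
# "<type> HH:MM:SS <kind> <name> = <state>"
LINE_RE = re.compile(r"^(\S) (\d{2}):(\d{2}):(\d{2}) (\S+) (.+?) = (.+)$")


def log_path(day: date, root: str = "Logs") -> PurePosixPath:
    """Build the path of a day's log file (folder names are assumed)."""
    filename = f"{day:%Y_%m_%d}_HCS.txt"
    return PurePosixPath(root) / f"{day:%Y}" / f"{day:%m}" / filename


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None if the line does not fit the assumed shape."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry_type, hh, mm, ss, kind, name, state = match.groups()
    try:
        at = time(int(hh), int(mm), int(ss))
    except ValueError:  # e.g. an out-of-range hour
        return None
    return LogEntry(entry_type, at, kind, name, state)


print(log_path(date(2012, 1, 18)))
print(parse_line("- 21:22:44 PIR Dining = On"))
```

Returning None for unrecognised lines lets a caller skip entry types that this sketch does not model.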
Sunday 22nd Jan 2012
There were 3173 PIR sensor activations combined and a peak of 21 events in a single minute. From all the PIR sensors in our house, there were 530 minutes in the day where 1 or more PIR activations occurred.
From the dining room PIR sensor alone, there were 1247 activation events and those events occurred in 341 of the 1440 minutes in the day. The peak was 15 activations in any one minute.
Given the minimal amount of storage required, it makes sense to do the processing at the end of each day, so that it is cached and ready to use in any graphical presentation.
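The kind of end-of-day summary quoted above (total activations, minutes with activity, and peak per minute) can be sketched as follows. This is an illustrative Python version, not the original Java code; the function names are hypothetical, and the sample times are the dining room entries from the log snippet shown earlier.

```python
from collections import Counter


def minute_counts(times):
    """Count events per minute of the day (0..1439) from "HH:MM:SS" strings."""
    counts = Counter()
    for t in times:
        hh, mm, _ss = t.split(":")
        counts[int(hh) * 60 + int(mm)] += 1
    return counts


def summarise(counts):
    """Return (total events, minutes with at least one event, peak per minute)."""
    total = sum(counts.values())
    active_minutes = len(counts)
    peak = max(counts.values(), default=0)
    return total, active_minutes, peak


# Dining room activation times from the log snippet.
dining = [
    "21:22:44", "21:22:48", "21:22:54", "21:23:00", "21:23:05", "21:23:31",
    "21:24:46", "21:25:08", "21:25:11", "21:25:16", "21:25:20",
]
print(summarise(minute_counts(dining)))  # (11, 4, 4)
```

Storing only the per-minute counts keeps the cached result small: at most 1440 integers per sensor per day.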
One way to represent occupancy for a room/zone would be to have a 24-hour timeline with a pixel representing activity for each minute during the day. There doesn't seem much point in going down to a finer grain view of occupancy as far as time is concerned. This means we would need 24 hours × 60 minutes = 1440 pixels. Whilst this might work on a PC screen, it is unlikely to be much use on a mobile device.
For each minute, we could add up the sensor events, and initial analysis shows that we never see more than about 25 per zone, per minute. We are not sure if this adds value, but variation of intensity may convey more information. The problem with this approach, though, is that one single event in a day could be really important, so we don't want a presentation that makes this one event hard to spot.
For plotting on an iPhone or iPod, the timescale could be compressed by a factor of 2 and plotted across 720 pixels.
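One way to compress the 1440-minute timeline to 720 pixels without hiding a lone event is to keep the maximum of each pair of minutes rather than the average. The sketch below is only an illustration of that idea in Python; the choice of maximum and the name `compress` are assumptions, not something taken from the HCS code.

```python
def compress(per_minute, factor=2):
    """Merge each run of `factor` minutes into one pixel, keeping the maximum.

    Using the maximum (rather than the mean) means a single event in an
    otherwise quiet period still shows up at full strength.
    """
    if len(per_minute) % factor != 0:
        raise ValueError("timeline length must be a multiple of factor")
    return [
        max(per_minute[i:i + factor])
        for i in range(0, len(per_minute), factor)
    ]


timeline = [0] * 1440          # one slot per minute of the day
timeline[5] = 1                # a lone event at 00:05
timeline[21 * 60 + 22] = 3     # three events at 21:22
timeline[21 * 60 + 23] = 3     # three events at 21:23

pixels = compress(timeline)
print(len(pixels), pixels[2], pixels[641])  # 720 1 3
```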
At the start and end of each day we log the current temperature cached for each sensor. This guarantees that we have data points that span a full 24-hour period, were we to plot room temperature over the course of each day as a graph.
Graphics & Graphs
We found a number of interesting sites that enable data to be displayed as graphs:
This graph shows the Dining Room PIR data from Sunday 22nd Jan 2012. It is a scatter chart plotted out over 24 hours and the vertical scale is the number of PIR detections in each minute.
This was our first attempt at visualising the data. There are a number of things we have inferred from this initial view:
- You can clearly see someone stayed up beyond midnight.
- The vertical scale, i.e. the number of events in each minute, doesn't really add anything.
- Just showing one room in isolation doesn't help much. If we are going to display data like this to provide an overview of daily activity, then it needs to show all rooms and sensors.
- The data would be much more visible if it was plotted as vertical lines (a bar chart).
Google Chart Tools is pretty good, but it doesn't quite have the flexibility to plot and draw information as we require it.
Highcharts has proved to be a much more useful and flexible tool for the views we are trying to create. There are a number of things it does that have made it much more suitable:
- It is very configurable in terms of style and layout.
- It can easily plot bar charts with time data at irregular intervals. This is a 'must have' feature for the kind of data we are displaying.
- It can easily plot bar charts against a time axis.
- We can add a dummy zero value at 00:00:00 and 23:59:59, to force the X-axis to cover a whole day (or month).
- It is very quick and can produce graphics that work on both web pages and mobile devices.
- It can plot dynamic data generated from PHP code and delivered as JSON or XML. We have written scripts to generate the raw data in this format.
- It is easy to incorporate multiple plots onto one web page.
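To illustrate the dummy zero values and JSON delivery mentioned in the list above, here is a minimal sketch that builds one day's series as `[x, y]` pairs, with x in milliseconds since the Unix epoch, which is the form Highcharts datetime axes expect. It is written in Python rather than the PHP we actually use, treats log times as UTC for simplicity, and the name `series_for_day` is hypothetical.

```python
import json
from datetime import datetime, timezone


def series_for_day(day, counts_by_time):
    """Build [[ms_since_epoch, value], ...] spanning the whole of `day`.

    `day` is a (year, month, day) tuple; `counts_by_time` maps zero-padded
    "HH:MM:SS" strings to values. Times are treated as UTC (an assumption).
    """
    def to_ms(hms):
        hh, mm, ss = (int(part) for part in hms.split(":"))
        moment = datetime(*day, hh, mm, ss, tzinfo=timezone.utc)
        return int(moment.timestamp() * 1000)

    points = dict(counts_by_time)
    # Dummy zero values force the X-axis to cover the full day.
    points.setdefault("00:00:00", 0)
    points.setdefault("23:59:59", 0)
    # Zero-padded times sort correctly as plain strings.
    return [[to_ms(t), value] for t, value in sorted(points.items())]


series = series_for_day((2012, 1, 22), {"21:22:00": 3, "21:25:00": 4})
print(json.dumps(series))
```

`setdefault` is used so that a real reading at exactly 00:00:00 or 23:59:59 is not overwritten by the dummy value.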
This is the same data as above plotted out using Highcharts:
The above plots have been really useful and we have learnt a lot about how our house is used and works. We can already see how to improve the efficiency of devices and how we can save energy too.
Summary & Learning
- The number of PIR activations per minute doesn't add a huge amount of value, but it does add something when plotted as vertical lines instead of points. To stop single values becoming too insignificant, we have capped the activations per minute at 4.
- Door contact sensor data can't be plotted usefully as a line. It needs to be plotted as vertical lines with a clear distinction between open and closed (e.g. as values of 1 for open and -1 for closed).
- We have found it more useful to plot PIR sensors over a fixed 24-hour period, as this gives clearer reference points.
- For door contact sensors and similar devices we have found it better to use a positive value to show open/on and a negative value to show closed/off.
- Because door contacts and devices generate fewer points to plot, we have found it useful to plot them along a dynamic time scale. This gives much better granularity on the time axis. Having said that, we stuck with the 24-hour view as it enables better comparison of events with PIR activations.
- We have used colour coding to more clearly differentiate between PIR sensor plots and contacts/devices.
- Through analysis we have learnt that the optimum extension time (time lights stay on from last PIR activation) for our kitchen worktop convenience lights is 6 minutes. This means they don't go on and off whilst we are using the kitchen to cook or eat (it is a kitchen diner).
- The web presentation we are using, based on Highcharts, can be set to use 100% screen width and thus adapts well to different PCs/monitors. This is working very well and is clear and easy to read.
- We have split the data into four main sections, to provide simple and easily understandable 'views'. These are 'perimeter', to understand if any external doors and sensors have been activated/opened, 'downstairs', 'upstairs' and 'devices'. The latter is for testing and analysis.
- This visualisation of the data has been really useful and we have learnt that we can quickly spot 'unexpected' events and also manage to interpret them. In some cases it is possible to link events and use the pattern and sequence to identify family members.
- The web presentation we are using, based on Highcharts, doesn't work well on an iPhone 4 screen. It is too difficult to read and something about the Highcharts graphs makes the pinch/zoom very slow to react.
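Two of the plotting rules from the summary above, the cap of 4 on PIR activations per minute and the +1/-1 values for contacts and devices, can be sketched as below. This is an illustrative Python version only; the state strings ('Open', 'Closed', 'On', 'Off') and the function names are assumptions, since the log format for door contacts is not shown here.

```python
PIR_CAP = 4  # maximum plotted activations per minute, as stated in the summary


def capped(count, cap=PIR_CAP):
    """Limit a per-minute PIR count so single events stay visible on the plot."""
    return min(count, cap)


def contact_value(state):
    """Map a contact/device state to +1 (open/on) or -1 (closed/off).

    The accepted state strings are assumed, not taken from the real log.
    """
    normalised = state.strip().lower()
    if normalised in ("open", "on"):
        return 1
    if normalised in ("closed", "off"):
        return -1
    raise ValueError(f"unrecognised state: {state!r}")


print([capped(n) for n in (1, 3, 15)])                     # [1, 3, 4]
print([contact_value(s) for s in ("Open", "Closed", "On")])  # [1, -1, 1]
```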