Analyzing Wood Burning Stoves with FLaNK Stack - Part 2: Analytics
Part 1: https://www.datainmotion.dev/2020/01/analyzing-wood-burning-stoves-with.html
See:
https://shop.pimoroni.com/products/sgp30-air-quality-sensor-breakout
- Sensirion SGP30 TVOC and eCO2 sensor
- TVOC sensing from 0-60,000 ppb (parts per billion)
- CO2 sensing from 400 to 60,000 ppm (parts per million)
Running the fire, I can see that I am getting higher CO2 production than normal.
Since I stored my data in Kudu tables, it's easy to analyze with Impala and Hue.
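For reference, the Kudu table behind these queries looks roughly like the sketch below. The column list is inferred from the JSON records further down, so the exact DDL from Part 1 may differ; note that the sensor readings arrive as strings, which is why every query casts them.

create table gassensors
(
  uuid string,
  systemtime string,
  equivalentco2ppm string,   -- stored as string, cast to double for math
  totalvocppb string,        -- stored as string, cast to double for math
  host string,
  ipaddress string,
  cpu double,
  memory double,
  primary key (uuid)
)
partition by hash (uuid) partitions 4
stored as kudu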
select equivalentco2ppm, totalvocppb, systemtime
from gassensors
order by cast(equivalentco2ppm as double) desc
select avg( cast( equivalentco2ppm as double) ) CO2PPM
from gassensors
The average was 493 ppm.
Now that we have some time series data, I can start feeding it to some standard machine learning algorithms and have CML and a data scientist give me some analytics and help me determine where I may want an alert.
Up to 400 is considered normal.
400 to 1,000 is typical of occupied locations with air exchange.
Once you get over 1,000 you start getting drowsy and feeling other noticeable effects.
Over 2,000 you can get headaches, which is a concern. Over 5,000 you should remove yourself from the situation.
select appx_median(cast(equivalentco2ppm as double)) median, min(cast(equivalentco2ppm as double)) min,
max(cast(equivalentco2ppm as double)) max, avg(cast(equivalentco2ppm as double)) avg,
stddev(cast(equivalentco2ppm as double)) standarddev,
stddev_pop(cast(equivalentco2ppm as double)) standardpop
from gassensors
Let's start setting alerts at various levels.
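As a first pass, a simple CASE expression in Impala can bucket each reading into those levels (just a sketch; the cut-offs are the guideline numbers above, not calibrated alarm thresholds, and the label names are my own):

select systemtime,
       cast(equivalentco2ppm as double) as co2ppm,
       case
         when cast(equivalentco2ppm as double) >= 5000 then 'LEAVE THE AREA'
         when cast(equivalentco2ppm as double) >= 2000 then 'HEADACHE RISK'
         when cast(equivalentco2ppm as double) >= 1000 then 'DROWSY'
         else 'NORMAL'
       end as co2alertlevel
from gassensors
order by systemtime desc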
We can also look at the indoor air quality.
As a baseline for the sensor, in an empty ventilated room my numbers are:
{"uuid": "sgp30_uuid_glv_20200123132631", "ipaddress": "192.168.1.221", "runtime": "0", "host": "garden3", "host_name": "garden3", "macaddress": "dc:a6:32:32:98:20", "end": "1579785991.7173052", "te": "0.0261075496673584", "systemtime": "01/23/2020 08:26:31", "cpu": 53.5, "diskusage": "109138.7 MB", "memory": 46.5, "equivalentco2ppm": " 412", "totalvocppb": " 6", "id": "20200123132631_dec207f1-9234-4bee-ad38-a0256629c976"}
{"uuid": "sgp30_uuid_snt_20200123132633", "ipaddress": "192.168.1.221", "runtime": "0", "host": "garden3", "host_name": "garden3", "macaddress": "dc:a6:32:32:98:20", "end": "1579785993.7479923", "te": "0.02589273452758789", "systemtime": "01/23/2020 08:26:33", "cpu": 55.6, "diskusage": "109137.0 MB", "memory": 46.5, "equivalentco2ppm": " 403", "totalvocppb": " 5", "id": "20200123132633_3bd5fb39-d6b2-4f23-8904-0ada862ede2b"}
{"uuid": "sgp30_uuid_uha_20200123132635", "ipaddress": "192.168.1.221", "runtime": "0", "host": "garden3", "host_name": "garden3", "macaddress": "dc:a6:32:32:98:20", "end": "1579785995.7779448", "te": "0.025917768478393555", "systemtime": "01/23/2020 08:26:35", "cpu": 51.1, "diskusage": "109135.3 MB", "memory": 46.5, "equivalentco2ppm": " 406", "totalvocppb": " 3", "id": "20200123132635_0412f445-9b8c-43a8-b34a-a5466f914be7"}
{"uuid": "sgp30_uuid_wau_20200123132637", "ipaddress": "192.168.1.221", "runtime": "0", "host": "garden3", "host_name": "garden3", "macaddress": "dc:a6:32:32:98:20", "end": "1579785997.8079107", "te": "0.02591681480407715", "systemtime": "01/23/2020 08:26:37", "cpu": 58.7, "diskusage": "109133.5 MB", "memory": 47.1, "equivalentco2ppm": " 406", "totalvocppb": " 13", "id": "20200123132637_73f069d9-0beb-4d06-a638-2bd92e50ece7"}
{"uuid": "sgp30_uuid_lse_20200123132639", "ipaddress": "192.168.1.221", "runtime": "0", "host": "garden3", "host_name": "garden3", "macaddress": "dc:a6:32:32:98:20", "end": "1579785999.83777", "te": "0.025897502899169922", "systemtime": "01/23/2020 08:26:39", "cpu": 53.1, "diskusage": "109131.6 MB", "memory": 46.5, "equivalentco2ppm": " 410", "totalvocppb": " 1", "id": "20200123132639_1aa392fe-0eb7-4332-9631-83ac5838e153"}
Very low parts per billion, between 1 and 13, with nothing changing in the static room; that looks like roughly a 10 ppb margin of error. We can run some queries in Hue for better stats.
Let's look at some data over time for TVOC.
select appx_median(cast(totalvocppb as double)) median, min(cast(totalvocppb as double)) min,
max(cast(totalvocppb as double)) max, avg(cast(totalvocppb as double)) avg,
stddev(cast(totalvocppb as double)) standarddev,
stddev_pop(cast(totalvocppb as double)) standardpop
from gassensors
So what's a good TVOC? On average we are below the range of potential irritation of 120 - 1,200 ppb. We do have some variance due to sensor capabilities and lack of professional calibration. The median and average numbers look good. The maximum is a bit disturbing, but it could be sensor error, warm-up time, or other data quality issues. We'll have to dive deeper into the numbers.
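One way to dig into that maximum is to pull the highest TVOC readings and see whether they are isolated spikes or a sustained climb (a sketch; 1,200 ppb is simply the top of the irritation range mentioned above):

select systemtime, totalvocppb, equivalentco2ppm
from gassensors
where cast(totalvocppb as double) > 1200
order by cast(totalvocppb as double) desc
limit 25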
Next we can look at PM 2.5 values. I need to crowd-source some science here.
We had 3,500+ records with TVOC readings over 120 ppb.
select count(*)
from gassensors
where cast(totalvocppb as double) > 120
I can see the number of records and the readings climb as the fire burns and we add more cherry wood.
select systemtime, equivalentco2ppm, totalvocppb
from gassensors
where cast(totalvocppb as double) > 120
order by systemtime asc
I should also note that the time series data is coming in every 2 seconds.
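A quick sanity check on that rate is to count readings per minute; at one reading every 2 seconds we should see roughly 30 rows per minute (a sketch using Impala's trunc on the parsed timestamp):

select trunc(to_timestamp(systemtime, 'MM/dd/yyyy HH:mm:ss'), 'MI') as minute,
       count(*) as readings
from gassensors
group by trunc(to_timestamp(systemtime, 'MM/dd/yyyy HH:mm:ss'), 'MI')
order by minute desc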
select to_timestamp(systemtime, 'MM/dd/yyyy HH:mm:ss') as eventtime,
       extract(to_timestamp(systemtime, 'MM/dd/yyyy HH:mm:ss'), 'MINUTE') as minute,
       cast(totalvocppb as double) as TVOC,
       cast(equivalentco2ppm as double) as CO2PPM
from gassensors
order by systemtime desc
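Since 2-second readings are noisy, we could also roll them up to per-minute averages before charting or alerting on them (again just a sketch using the same timestamp parsing):

select trunc(to_timestamp(systemtime, 'MM/dd/yyyy HH:mm:ss'), 'MI') as minute,
       avg(cast(totalvocppb as double)) as avg_tvoc,
       avg(cast(equivalentco2ppm as double)) as avg_co2ppm,
       max(cast(equivalentco2ppm as double)) as max_co2ppm
from gassensors
group by trunc(to_timestamp(systemtime, 'MM/dd/yyyy HH:mm:ss'), 'MI')
order by minute desc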
Resources