
IoT Series: MiNiFi Agent on Raspberry Pi 4 with Enviro+ Hat For Environmental Monitoring and Analytics




Summary:  Our powerful edge device streams environmental sensor readings while also performing edge analytics with deep learning libraries and an enhanced edge VPU.   We can perform complex running calculations on sensor data locally on the device before making potentially expensive network calls.   We can also decide when to send data based on heuristics, machine learning or simple if-then logic.

Use Case:   Monitor Environment.   Act Local, Report Global.


Stack:   FLaNK


Category:   AI, IoT, Edge2AI, Real-Time Streaming, Sensors, Cameras, Telemetry.


Hardware:  Intel Movidius Neural Compute Stick 2 (NCS 2) VPU, Pimoroni Enviro Plus pHAT, Raspberry Pi 4 (4GB Edition).


Software:  Python 3 + Java + MiNiFi Agents + Cloudera Edge Flow Manager (CEM/EFM) + Apache NiFi.   Using the Mm... FLaNK Stack.


Keywords:  Edge2AI, CV, AI, Deep Learning, Cloudera, NiFi, Raspberry Pi, Sensors, IoT, IIoT, Devices, Java, Agents, FLaNK Stack, VPU, Movidius.








Open Source Assets:  https://github.com/tspannhw/minifi-enviroplus


I am running a Python script that continuously streams sensor data to MQTT, where it is picked up by MiNiFi agents or NiFi.   For development I just run the Python script with a shell script and nohup.


enviro.sh

#!/bin/bash
python3 /opt/demo/enviro.py


nohup ./enviro.sh &
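
enviro.py itself lives in the GitHub repo above; a minimal sketch of that kind of publisher, assuming the Pimoroni bme280/ltr559/enviroplus libraries and paho-mqtt are installed (the broker address and topic here are placeholders), looks like this.   The field names match the example record shown next.

# enviro.py -- minimal publisher sketch (assumptions noted above)
import json
import time
import paho.mqtt.client as mqtt
from bme280 import BME280        # temperature / pressure / humidity
from ltr559 import LTR559       # light / proximity
from enviroplus import gas      # oxidising / reducing / NH3 readings

bme280 = BME280()
ltr559 = LTR559()

client = mqtt.Client()
client.connect("localhost", 1883, 60)   # placeholder broker

while True:
    reading = {
        "temperature": str(bme280.get_temperature()),
        "pressure": str(bme280.get_pressure()),
        "humidity": str(bme280.get_humidity()),
        "lux": str(ltr559.get_lux()),
        "proximity": str(ltr559.get_proximity()),
        "gas": str(gas.read_all()),
    }
    client.publish("enviro", payload=json.dumps(reading))
    time.sleep(0.5)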


Example Enviro Plus pHAT Sensor Data


{
  "uuid" : "rpi4_uuid_xki_20191220215721",
  "ipaddress" : "192.168.1.243",
  "host" : "rp4",
  "host_name" : "rp4",
  "macaddress" : "dc:a6:32:03:a6:e9",
  "systemtime" : "12/20/2019 16:57:21",
  "cpu" : 0.0,
  "diskusage" : "46958.1 MB",
  "memory" : 6.3,
  "id" : "20191220215721_938f2137-5adb-4c22-867d-cdfbce6431a8",
  "temperature" : "33.590520852237226",
  "pressure" : "1032.0433707836428",
  "humidity" : "7.793797584651376",
  "lux" : "0.0",
  "proximity" : "0",
  "gas" : "Oxidising: 3747.82 Ohms\nReducing: 479652.17 Ohms\nNH3: 60888.05 Ohms"
}



We are also running a standard MiNiFi Java Agent 0.6 that runs a Python application for sensor reads, edge AI with Intel's OpenVINO, and some other analytics.


test.sh


#!/bin/bash

# Timestamp used to name the captured image
DATE=$(date +"%Y-%m-%d_%H%M")

# Load the OpenVINO environment variables
source /opt/intel/openvino/bin/setupvars.sh

# Capture a frame from the webcam, then run inference on it
fswebcam -q -r 1280x720 --no-banner /opt/demo/images/$DATE.jpg
python3 -W ignore /opt/intel/openvino/build/test.py /opt/demo/images/$DATE.jpg 2>/dev/null


test.py

https://github.com/tspannhw/minifi-enviroplus/blob/master/test.py



Example OpenVINO Data


{"host": "rp4", "cputemp": "67", "ipaddress": "192.168.1.243", "endtime": "1577194586.66", "runtime": "0.00", "systemtime": "12/24/2019 08:36:26", "starttime": "12/24/2019 08:36:26", "diskfree": "46889.0", "memory": "15.1", "uuid": "20191224133626_55157415-1354-4137-8472-424779645fbe", "image_filename": "20191224133626_9317880e-ee87-485a-8627-c7088df734fc.jpg"}


In our flow we convert the JSON to Apache Avro; as you can see, the Avro schema is embedded in the records.







The flow is very simple: consume MQTT messages from our broker on the topic our field sensors publish to.   We also ingest MiNiFi events through standard Apache NiFi HTTP(S) Site-to-Site (S2S).   We route images to our image processor and sensor data straight to Kudu tables.





Now that the data is stored in Apache Kudu, we can do our analytics.




Easy to Run an MQTT Broker



References / Demo Info:

https://subscription.packtpub.com/book/application_development/9781787287815/1/ch01lvl1sec12/installing-a-mosquitto-broker-on-macos


Run Mosquitto MQTT on Local Machine (RPI, Mac, Win10, ...)


On macOS: brew install mosquitto


The default config file is /usr/local/etc/mosquitto/mosquitto.conf


To have launchd start mosquitto now and restart at login:
  brew services start mosquitto


Or, if you don't want/need a background service you can just run:
  mosquitto -c /usr/local/etc/mosquitto/mosquitto.conf


For Python, we need pip3 install paho-mqtt.
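
A quick way to confirm the broker is up before wiring in agents: a minimal paho-mqtt subscriber sketch (the host and topic are placeholders for whatever you configured above).

# mqtt_test.py -- print everything arriving on one topic
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    print(msg.topic, msg.payload.decode())

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883, 60)   # placeholder broker
client.subscribe("enviro")              # placeholder topic
client.loop_forever()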


Run Sensors on Device that pushes to MQTT


Python pushes a continuous stream of sensor data to MQTT


MiNiFi Agent Reads Broker


Send to Kafka and/or NiFi


Example Image Grabbed From Webcam in Dark Office (It's Christmas Eve!)








When ready, we can push to a CDP Data Warehouse in AWS.



With CDP, it's very easy to have a data environment in many clouds to store my live sensor data.



I can now use this data in Kudu tables from Cloudera Data Science Workbench for real data science, machine learning and insights.




What do we do with all of this data?   Check back soon for real-time analytics and dashboarding.

Using GrovePi with Raspberry Pi and MiNiFi Agents for Data Ingest to Parquet, Kudu, ORC, Kafka, Hive and Impala



Source Code:  https://github.com/tspannhw/minifi-grove-sensors

Acquiring sensor data from Grove sensors is easy using a GrovePi Hat and some compatible sensors.


Just before my talk at the Future of Data Meetup @ Bell Works in Holmdel, NJ, I thought I should ingest some data from a Grove sensor interface.

It's so easy a sleeping cat could do it.




So what does this device look like?  



I have a temperature and humidity sensor on there.




The ultrasonic distance sensor is on there too; that's for the next article.




Let's do this with minimal RAM.




That's a 64GB hard drive underneath in the white case with the RPI.





I need more data and BACON.



We design our MiNiFi agent flow in CEM/EFM:  run the sensor script and grab its JSON data stream.
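
A minimal sketch of that kind of sensor script, assuming the grovepi library is installed and a DHT temperature/humidity sensor on digital port 4 (both assumptions; adjust to your wiring):

# grove.py -- print one JSON record for the agent flow to pick up
import json
import time
import grovepi

DHT_PORT = 4   # digital port the DHT sensor is plugged into
DHT_TYPE = 0   # 0 = blue DHT11 module, 1 = white DHT22

[temperature, humidity] = grovepi.dht(DHT_PORT, DHT_TYPE)
print(json.dumps({
    "temperature": str(temperature),
    "humidity": str(humidity),
    "systemtime": time.strftime("%m/%d/%Y %H:%M:%S"),
}))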


Apache NiFi 1.9.2 / CFM 1.0 Received HTTPS S2S Events From MiNiFi Agent




A simple flow to query and convert our JSON data, then store it to Kudu and HDFS (ORC) as well as push it to Kafka with a schema.




Let's read that Kafka message and store it as Parquet; we will push to MQTT and JMS in the next article.   This is our universal proxy/gateway.



We could infer a schema and not save it.   But saving the schema to the Schema Registry makes SMM, Kafka, NiFi and other tools schema-aware, so they can automagically query and convert between CSV/JSON/XML/Avro/Parquet and more.
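
A sketch of what that registered schema could look like in Avro. The field names come from the table DDL below; the record name and namespace are illustrative, mirroring the Parquet dump at the end of this post:

{
  "type" : "record",
  "name" : "grove",
  "namespace" : "org.apache.nifi",
  "fields" : [
    {"name" : "uuid", "type" : "string"},
    {"name" : "end", "type" : "string"},
    {"name" : "humidity", "type" : "string"},
    {"name" : "systemtime", "type" : "string"},
    {"name" : "runtime", "type" : "string"},
    {"name" : "cpu", "type" : "double"},
    {"name" : "id", "type" : "string"},
    {"name" : "te", "type" : "string"},
    {"name" : "host", "type" : "string"},
    {"name" : "macaddress", "type" : "string"},
    {"name" : "temperature", "type" : "string"},
    {"name" : "diskusage", "type" : "string"},
    {"name" : "memory", "type" : "double"},
    {"name" : "ipaddress", "type" : "string"},
    {"name" : "host_name", "type" : "string"}
  ]
}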

Let's store the data in Parquet files on HDFS with an Impala table on top.   In Apache NiFi 1.10 there is a ParquetRecordSetWriter we can use for this.



Before we push to Kafka, let's create a topic for it with Cloudera SMM.



Let's build an Impala table for that Kudu data.



We can query our tables with ease as data is rapidly added.





Let's Examine the Parquet Files that NiFi Generated
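
One quick way to inspect a file locally is with pyarrow (a sketch, assuming pyarrow is installed and a file has been pulled out of HDFS with hdfs dfs -get; the file name is a placeholder):

# inspect_parquet.py -- print the schema and row count of one Parquet file
import pyarrow.parquet as pq

pf = pq.ParquetFile("part-0001.parquet")   # placeholder file name
print(pf.schema_arrow)                     # column names and types
print(pf.metadata.num_rows, "rows")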





Let's query that Parquet data with Impala in Hue.



Let's monitor that data in Kafka with Cloudera SMM.






That was easy: from device to enterprise cloud data store(s) with enterprise messaging, security, governance, lineage, data catalog, SDX, monitoring and more.   How easily can you ingest IoT data, query it mid-stream and store it in multiple data stores?   It took longer to write the article than to do the project and code.   All graphical, Single Sign On, multiple schemas/versions/data types/engines, multiple OSs, edge, cloud and laptop.   Easy.

Table DDL


CREATE EXTERNAL TABLE IF NOT EXISTS grovesensors2 
(humidity STRING, uuid STRING, systemtime STRING, runtime STRING, cpu DOUBLE, id STRING, te STRING, host STRING, `end` STRING, 
macaddress STRING, temperature STRING, diskusage STRING, memory DOUBLE, ipaddress STRING, host_name STRING) 
STORED AS ORC
LOCATION '/tmp/grovesensors';

CREATE TABLE grovesensors (uuid STRING, `end` STRING, humidity STRING, systemtime STRING, runtime STRING, cpu DOUBLE, id STRING, te STRING,
host STRING,
macaddress STRING, temperature STRING, diskusage STRING, memory DOUBLE, ipaddress STRING, host_name STRING,
PRIMARY KEY (uuid, `end`)
)
PARTITION BY HASH PARTITIONS 16
STORED AS KUDU
TBLPROPERTIES ('kudu.num_tablet_replicas' = '1');

hdfs dfs -mkdir -p /tmp/grovesensors
hdfs dfs -mkdir -p /tmp/groveparquet

CREATE EXTERNAL TABLE grove_parquet
(
  diskusage STRING,
  memory DOUBLE,
  host_name STRING,
  systemtime STRING,
  macaddress STRING,
  temperature STRING,
  humidity STRING,
  cpu DOUBLE,
  uuid STRING,
  ipaddress STRING,
  host STRING,
  `end` STRING,
  te STRING,
  runtime STRING,
  id STRING
)
STORED AS PARQUET
LOCATION '/tmp/groveparquet/';
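
A couple of quick sanity-check queries against those tables (columns and limits are illustrative):

SELECT `end`, temperature, humidity, cpu, memory
FROM grovesensors
ORDER BY `end` DESC
LIMIT 10;

SELECT COUNT(*) FROM grove_parquet;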

Parquet Format



message org.apache.nifi.grove {
  optional binary diskusage (STRING);
  optional double memory;
  optional binary host_name (STRING);
  optional binary systemtime (STRING);
  optional binary macaddress (STRING);
  optional binary temperature (STRING);
  optional binary humidity (STRING);
  optional double cpu;
  optional binary uuid (STRING);
  optional binary ipaddress (STRING);
  optional binary host (STRING);
  optional binary end (STRING);
  optional binary te (STRING);
  optional binary runtime (STRING);
  optional binary id (STRING);
}








Updating Machine Learning Models At The Edge With Apache NiFi and MiNiFi


Yes, we have bidirectional communication with MiNiFi agents from Apache NiFi via Site-to-Site (S2S) over HTTPS.   This means I can push in anything I want to the agent, including commands, files and updates.

I can also transmit data to edge agents via MQTT, REST and Kafka amongst other options.


NiFi Ready To Send and Receive Messages From Other NiFi Nodes, Clusters and MiNiFi Agents


Our NiFi flow is consuming Kafka and MQTT Messages, as well as reading updated model files and generating integration test sensor data.



MiNiFi Agents Have Downloaded The Model and Anything Else We Send to It



It's easy to configure MQTT message consumption in CEM; we just need the broker (with port) and, if you wish, a topic to filter on.




To listen for files/models, you can easily add a REST endpoint to proxy in data of your choice, with or without SSL.


Here's an example curl command to test that REST API:

curl -d '{"key1":"value1", "key2":"value2"}' -H "Content-Type: application/json" -X POST http://ec2-3-85-54-189.compute-1.amazonaws.com:8899/upload


We can generate JSON IoT-style data for integration tests with ease using GenerateFlowFile:
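
For example, a custom-text payload like this (values are illustrative, modeled on the sensor records above) works well:

{
  "uuid" : "test_uuid_20191224133626",
  "ipaddress" : "192.168.1.243",
  "host" : "rp4",
  "systemtime" : "12/24/2019 08:36:26",
  "cpu" : 0.0,
  "memory" : 15.1,
  "temperature" : "33.59",
  "humidity" : "7.79"
}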


Let's grab updated models from my Data Science server when they change:


I can read Kafka messages and send them to MiNiFi agents as well.




So I pushed a TFLite model, but ONNX, PMML, Docker or Pickle are all options.
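
On the agent side, a minimal sketch of reloading the pushed model when the file changes, assuming tflite_runtime is installed and MiNiFi writes the file to a known path (both assumptions; the path is a placeholder):

# model_reload.py -- swap in a fresh TFLite interpreter when a new model lands
import os
import time
import tflite_runtime.interpreter as tflite

MODEL_PATH = "/opt/demo/models/model.tflite"  # placeholder drop location
last_mtime = 0.0
interpreter = None

while True:
    mtime = os.path.getmtime(MODEL_PATH)
    if mtime > last_mtime:
        # MiNiFi dropped a new model file; reload and reallocate tensors
        interpreter = tflite.Interpreter(model_path=MODEL_PATH)
        interpreter.allocate_tensors()
        last_mtime = mtime
    time.sleep(5)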