Edge Processing with Jetson Nano Part 2 - Apache NiFi Flow

Edge Data Processing with Jetson Nano Part 2 - Apache NiFi - Process, Route, Transform, Store





Apache NiFi Flow to Process Data



We route images from the webcams, logs from the runs and JSON sensor readings to the appropriate processors.  We also convert JSON to Avro for storage in Hadoop or S3, while running queries on the data to check the temperatures of the device.   TensorFlow and Apache MXNet run on the images in-stream as they pass through Apache NiFi.
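
The routing decision described above can be sketched in plain Python. This is only an illustration of the idea, not the NiFi configuration used in the flow: the relationship names ("images", "sensor_json", "logs", "unmatched") and the file extensions checked are assumptions, while `filename` and `mime.type` are standard flow file attributes.

```python
def route(attributes):
    """Pick a destination for a flow file from its attributes.

    The destination names are hypothetical labels, not taken from the flow.
    """
    mime = attributes.get("mime.type", "")
    name = attributes.get("filename", "").lower()
    if mime.startswith("image/") or name.endswith((".jpg", ".jpeg", ".png")):
        return "images"
    if mime == "application/json" or name.endswith(".json"):
        return "sensor_json"
    if mime.startswith("text/") or name.endswith(".log"):
        return "logs"
    return "unmatched"


print(route({"filename": "image_bei_20190719182103.jpg"}))
```

In NiFi itself this kind of branching is usually expressed with routing processors and Expression Language rather than code.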

Example Device and Deep Learning Data



Logs Returned From the Device



Push Some Results to Slack





Edge Data Processing with Jetson Nano Part 1 - Deploy, Setup and Ingest





















Configuring Execution of the Image Capture and Jetson Nano Classify Python Script



Configuring Tailing of the JSON Log



Configuring Image Acquisition from the File Directory




Configuring the Remote Connection to NiFi






Example CEM Events





Simple NiFi Flow to Receive Remote Events


The Apache NiFi server receives annotated images as well as JSON packets.


JSON Data Packet Example

{"uuid": "nano_uuid_kwo_20190719182103", "ipaddress": "192.168.1.254", "top1pct": 32.6171875, "top1": "desktop computer", "cputemp": "32.5", "gputemp": "31.5", "gputempf": "89", "cputempf": "90", "runtime": "5", "host": "jetsonnano", "filename": "/opt/demo/images/image_bei_20190719182103.jpg", "imageinput": "/opt/demo/images/2019-07-19_1421.jpg", "host_name": "jetsonnano", "macaddress": "de:07:5a:27:1e:7f", "end": "1563560468.7867181", "te": "4.806252717971802", "systemtime": "07/19/2019 14:21:08", "cpu": 55.8, "diskusage": "5225.1 MB", "memory": 57.5, "id": "20190719182103_fcaa94d4-7629-423a-b76e-714168e64677"}
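
As a minimal sketch of what a downstream check on this packet might look like, the Python below parses a trimmed copy of the record and flags it when either temperature is above a limit. Note that `cputemp` and `gputemp` arrive as strings, so they need converting first. The 60.0 degree limit and the `too_hot` helper are assumptions made for illustration; they are not part of the original flow.

```python
import json

# Trimmed copy of the packet above; only the fields used here are kept.
packet = (
    '{"uuid": "nano_uuid_kwo_20190719182103", "top1": "desktop computer", '
    '"top1pct": 32.6171875, "cputemp": "32.5", "gputemp": "31.5"}'
)


def too_hot(record, limit_c=60.0):
    """Return True when the CPU or GPU reading exceeds limit_c.

    limit_c is a hypothetical threshold chosen for this example.
    """
    cpu = float(record["cputemp"])  # readings are JSON strings
    gpu = float(record["gputemp"])
    return max(cpu, gpu) > limit_c


record = json.loads(packet)
print(record["top1"], too_hot(record))
```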


Notes

It was very easy to set up a simple flow to execute our Deep Learning classification and data acquisition with Apache NiFi, MiNiFi and Cloudera EFM.  We can now do something with the data, like push it to the cloud.


Philadelphia Open Crime Data on Phoenix / HBase

This is an update to a previous article on accessing Philadelphia Open Crime Data and storing it in Apache Phoenix on HBase.

Updates to Spring Boot, Phoenix and Zeppelin make for a cleaner experience.

I also added a way to grab years of historical policing data.

All of the NiFi, Zeppelin and source code is here:   https://github.com/tspannhw/phillycrime-springboot-phoenix




Part 1: https://community.hortonworks.com/articles/54947/reading-opendata-json-and-storing-into-phoenix-tab.html












We convert JSON to Phoenix Upserts.
We push JSON records to HBase with PutHBaseRecord.
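
To make the JSON-to-upsert step concrete, here is a small Python sketch that turns one JSON record into a parameterized Phoenix UPSERT statement plus its values. It is an illustration under assumptions, not the processor the flow uses: the `to_upsert` helper and the identifier check are made up for this example, and the record is a shortened version of the example data.

```python
import json
import re

# Only plain SQL identifiers are accepted for table and column names.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def to_upsert(table, record):
    """Build a parameterized UPSERT and its value list from one JSON record."""
    for name in [table, *record]:
        if not IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    columns = ", ".join(record)
    marks = ", ".join("?" for _ in record)
    sql = f"UPSERT INTO {table} ({columns}) VALUES ({marks})"
    return sql, list(record.values())


record = json.loads(
    '{"dc_dist": "18", "dc_key": "200918067518", "dispatch_date": "2009-10-02"}'
)
sql, params = to_upsert("phillycrime", record)
print(sql)
print(params)
```

Keeping the values as parameters rather than inlining them into the SQL string avoids quoting problems with fields such as location_block.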


Query Phoenix at the Command Line, Super Fast SQL



Resources




Example Data
{"dc_dist":"18",
"dc_key":"200918067518",
"dispatch_date":"2009-10-02",
"dispatch_date_time":"2009-10-02T14:24:00.000",
"dispatch_time":"14:24:00",
"hour":"14",
"location_block":"S 38TH ST / MARKETUT ST",
"psa":"3",
"text_general_code":"Other Assaults",
"ucr_general":"800"}


Create a Phoenix Table

/usr/hdp/current/phoenix-client/bin/sqlline.py localhost:2181:/hbase-unsecure

CREATE TABLE phillycrime (dc_dist varchar,
dc_key varchar not null primary key,dispatch_date varchar,dispatch_date_time varchar,dispatch_time varchar,hour varchar,location_block varchar,psa varchar,
text_general_code varchar,ucr_general varchar);


Add NiFi / Spring Boot Connectivity to Phoenix
org.apache.phoenix.jdbc.PhoenixDriver
jdbc:phoenix:localhost:2181:/hbase-unsecure
/usr/hdp/3.1/phoenix/phoenix-client.jar
/usr/hdp/3.1/hbase/lib/hbase-client.jar
/etc/hbase/conf/hbase-site.xml



Run spring boot …

Powering Edge AI with the Powerful Jetson Nano

NVIDIA Jetson Nano Deep Learning Edge Device


Nano The Cat





Hardware:
Jetson Nano developer kit. Built around a 128-core Maxwell GPU and quad-core ARM A57 CPU running at 1.43 GHz and coupled with 4GB of LPDDR4 memory! This is power at the edge. I now have a favorite new device.

You need to add some kind of USB WiFi adapter if you are not hardwired to Ethernet. This is cheap and easy: I added a tiny $15 WiFi adapter and was off to the races.


Operating System:
Ubuntu 18.04

Library Setup:


sudo apt-get update -y
sudo apt-get install git cmake -y
sudo apt-get install libatlas-base-dev gfortran -y
sudo apt-get install libhdf5-serial-dev hdf5-tools -y

sudo apt-get install python3-dev -y
sudo apt-get install libcv-dev libopencv-dev -y
sudo apt-get install fswebcam -y
sudo apt-get install libv4l-dev -y
sudo apt-get install python-opencv -y
pip3 install psutil
pip2 install psutil
pip3.6 install easydict -U
pip3.6 install scikit-learn -U
pip3.6 install opencv-python -U --user
pip3.6 install numpy -U
pip3.6 install mxnet -U
pip3.6 install mxnet-mkl -U
pip3.6 install gluoncv --upgrade
sudo apt-get install libhdf5-serial-dev hdf5-tools libhdf5-dev zlib1g-dev zip libjpeg8-dev -y
sudo apt-get install python3-pip -y
sudo pip3 install -U pip
sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu
sudo nvpmodel -q --verbose
pip3 install numpy
pip3 install keras
git clone https://github.com/dusty-nv/jetson-inference
cd jetson-inference
git submodule update --init
tegrastats
pip3 install -U jetson-stats

Source:
https://github.com/tspannhw/iot-device-install
https://github.com/tspannhw/minifi-jetson-nano

IoT Setup

Download MiNiFi 0.6.0 Source from Cloudera and Build.
Download MiNiFi Java Agent (Binary)  and Unzip.

Follow these instructions.

On a Server

We want to hook up to EFM to make flow development, deployment, management and monitoring of MiNiFi agents trivial.   Download NiFi Registry.    You will also need Apache NiFi.

For a good walkthrough and hands-on demonstration see this workshop.

See these cool Jetson Nano Projects:  https://developer.nvidia.com/embedded/community/jetson-projects

Monitor Status
https://github.com/rbonghi/jetson_stats

Example Flow

It's easy to add MiNiFi Java or CPP agents to the Jetson Nano.   I did a custom MiNiFi CPP 0.6.0 build for Jetson.  I built a quick flow to run the jetson-inference imagenet-console CPP binary on an image captured from a compatible Logitech USB webcam with fswebcam.   I store the images in /opt/demo/images and pass them on the command line to the CPP console as a proof of concept.

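#!/bin/bash

# Timestamp used to name the captured image, e.g. 2019-07-01_1405
DATE=$(date +"%Y-%m-%d_%H%M")

# Capture one 1280x720 frame from the USB webcam without the banner overlay
fswebcam -q -r 1280x720 --no-banner "/opt/demo/images/$DATE.jpg"

# Classify the captured image and write the result image next to it
/opt/demo/jetson-inference/build/aarch64/bin/imagenet-console "/opt/demo/images/$DATE.jpg" "/opt/demo/images/out_$DATE.jpg"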
==
imagenet-console
  args (3):  0 [/opt/demo/jetson-inference/build/aarch64/bin/imagenet-console]  1 [/opt/demo/images/2019-07-01_1405.jpg]  2 [/opt/demo/images/out_2019-07-01_1405.jpg]


imageNet -- loading classification network model from:
         -- prototxt     networks/googlenet.prototxt
         -- model        networks/bvlc_googlenet.caffemodel
         -- class_labels networks/ilsvrc12_synset_words.txt
         -- input_blob   'data'
         -- output_blob  'prob'
         -- batch_size   2

[TRT]  TensorRT version 5.0.6
[TRT]  detected model format - caffe  (extension '.caffemodel')
[TRT]  desired precision specified for GPU: FASTEST
[TRT]  requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT]  native precisions detected for GPU:  FP32, FP16
[TRT]  selecting fastest native precision for GPU:  FP16
[TRT]  attempting to open engine cache file /opt/demo/jetson-inference/build/aarch64/bin/networks/bvlc_googlenet.caffemodel.2.1.GPU.FP16.engine
[TRT]  loading network profile from engine cache... /opt/demo/jetson-inference/build/aarch64/bin/networks/bvlc_googlenet.caffemodel.2.1.GPU.FP16.engine
[TRT]  device GPU, /opt/demo/jetson-inference/build/aarch64/bin/networks/bvlc_googlenet.caffemodel loaded
[TRT]  device GPU, CUDA engine context initialized with 2 bindings
[TRT]  binding -- index   0
               -- name    'data'
               -- type    FP32
               -- in/out  INPUT
               -- # dims  3
               -- dim #0  3 (CHANNEL)
               -- dim #1  224 (SPATIAL)
               -- dim #2  224 (SPATIAL)
[TRT]  binding -- index   1
               -- name    'prob'
               -- type    FP32
               -- in/out  OUTPUT
               -- # dims  3
               -- dim #0  1000 (CHANNEL)
               -- dim #1  1 (SPATIAL)
               -- dim #2  1 (SPATIAL)
[TRT]  binding to input 0 data  binding index:  0
[TRT]  binding to input 0 data  dims (b=2 c=3 h=224 w=224) size=1204224
[cuda]  cudaAllocMapped 1204224 bytes, CPU 0x100e30000 GPU 0x100e30000
[TRT]  binding to output 0 prob  binding index:  1
[TRT]  binding to output 0 prob  dims (b=2 c=1000 h=1 w=1) size=8000
[cuda]  cudaAllocMapped 8000 bytes, CPU 0x100f60000 GPU 0x100f60000
device GPU, /opt/demo/jetson-inference/build/aarch64/bin/networks/bvlc_googlenet.caffemodel initialized.
[TRT]  networks/bvlc_googlenet.caffemodel loaded
imageNet -- loaded 1000 class info entries
networks/bvlc_googlenet.caffemodel initialized.












Performance Testing Apache NiFi - Part 1 - Loading Directories of CSV


I am running a lot of different flows on different Apache NiFi configurations to get some performance numbers in different situations.

One situation I thought of was accessing directories of CSV files over HTTP.  Fortunately there's some really nice data available from NOAA (https://www.ncei.noaa.gov/data/global-hourly/access/2019/).

Example Flow:  NOAA





In this example performance testing flow I use my LinkProcessor to grab all of the links to CSV files on the HTTP download site.  I then split this JSON list into individual records and pull out the URL.   If it's a valid URL ending in .csv, then I call InvokeHTTP to download the CSV.   I then query the CSV for all the records (SELECT *) and for a count (SELECT COUNT(*)).   As part of this, the records are written to JSON.
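
The "split the list, pull out the URL, keep only valid .csv links" steps can be sketched in Python as follows. This is a rough stand-in for the flow logic, not the LinkProcessor itself: the `link` field name and the example.com URLs are assumptions, since the processor's actual output format is not shown here.

```python
import json
from urllib.parse import urlparse


def csv_urls(links_json):
    """Return the http(s) URLs from a JSON list of links that end in .csv."""
    urls = []
    for item in json.loads(links_json):
        url = item.get("link", "")  # "link" is a hypothetical field name
        parts = urlparse(url)
        is_web = parts.scheme in ("http", "https") and bool(parts.netloc)
        if is_web and parts.path.lower().endswith(".csv"):
            urls.append(url)
    return urls


links = json.dumps([
    {"link": "https://example.com/data/a.csv"},
    {"link": "https://example.com/data/"},
    {"link": "ftp://example.com/b.csv"},
])
print(csv_urls(links))
```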



In this example we grab a specific CSV file and get 739 records.


This CSVReader uses Jackson to parse the CSV files and figures out fields from the header.
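
For readers who want to see header-driven parsing outside NiFi, here is a small Python analogue. It uses the standard library csv module rather than Jackson, so it only mirrors the idea: field names come from the first row, every record becomes a JSON object, and the record count falls out for free. The three-column sample is a cut-down, assumed excerpt in the style of the NOAA files.

```python
import csv
import io
import json


def csv_to_json(text):
    """Parse CSV text using its header row; return (json_text, record_count)."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return json.dumps(rows), len(rows)


# Quoted fields such as TMP contain commas, so a real CSV parser is needed.
sample = (
    "STATION,DATE,TMP\n"
    '16541099999,2019-01-07T05:55:00,"+0030,1"\n'
    '16541099999,2019-01-07T06:55:00,"+0030,1"\n'
)
body, count = csv_to_json(sample)
print(count)
print(body)
```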



I pull out the URL returned from the Link Processor.



This is my JSON Record Set Writer. It doesn't include a schema since I never built one.



I am looking at some performance stats for my NiFi instance, which has 31GB of JVM heap.  I stay below 32GB because at that size the JVM can no longer use compressed 32-bit object pointers, so every object reference takes 64 bits instead.









In this flow I generate unique JSON files in mass quantities at about 250 bytes each, merge them together, compress them, then push them to a file system.   This is to see how many records I can push.
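
The generate, merge and compress steps can be approximated in a few lines of Python for a quick local comparison. This is a sketch under assumptions, not the NiFi flow: the record layout (an `id` plus a padding `payload`), the newline-delimited merge and the use of gzip are all choices made for this example.

```python
import gzip
import json
import uuid


def make_record(target_bytes=250):
    """Build one unique JSON record padded to target_bytes."""
    record = {"id": str(uuid.uuid4()), "payload": ""}
    base = len(json.dumps(record).encode("utf-8"))
    record["payload"] = "x" * max(0, target_bytes - base)
    return json.dumps(record)


def merge_and_compress(records):
    """Join records one per line and gzip the result."""
    merged = "\n".join(records).encode("utf-8")
    return gzip.compress(merged)


records = [make_record() for _ in range(1000)]
blob = merge_and_compress(records)
print(len(records), "records,", len(blob), "compressed bytes")
```

Because the padding is a run of identical characters, it compresses far better than real sensor data would, so treat the compressed size as a lower bound rather than a realistic figure.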




QueryRecord is easy on CSV files even with no known schema.



The Results of the recordCount query:


I can also test with really fast multithreaded calls to the popular btc.com Bitcoin exchange REST API.


Even encrypting and compressing won't slow me down.






Example Translated Data Segment
[{"STATION":"16541099999","DATE":"2019-01-07T05:55:00","SOURCE":"4","LATITUDE":"39.6666667","LONGITUDE":"9.4333333","ELEVATION":"645.0","NAME":"PERDASDEFOGU, IT","REPORT_TYPE":"FM-15","CALL_SIGN":"99999","QUALITY_CONTROL":"V020","WND":"330,1,N,0010,1","CIG":"99999,9,9,Y","VIS":"999999,9,9,9","TMP":"+0030,1","DEW":"+0020,1","SLP":"99999,9","MA1":null,"MD1":null,"REM":null},{"STATION":"16541099999","DATE":"2019-01-07T06:55:00","SOURCE":"4","LATITUDE":"39.6666667","LONGITUDE":"9.4333333","ELEVATION":"645.0","NAME":"PERDASDEFOGU, IT","REPORT_TYPE":"FM-15","CALL_SIGN":"99999","QUALITY_CONTROL":"V020","WND":"330,1,N,0010,1","CIG":"99999,9,9,Y","VIS":"999999,9,9,9","TMP":"+0030,1","DEW":"+0030,1","SLP":"99999,9","MA1":null,"MD1":null,"REM":null},{"STATION":"16541099999","DATE":"2019-01-07T07:55:00","SOURCE":"4","LATITUDE":"39.6666667","LONGITUDE":"9.4333333","ELEVATION":"645.0","NAME":"PERDASDEFOGU, IT","REPORT_TYPE":"FM-15","CALL_SIGN":"99999","QUALITY_CONTROL":"V020","WND":"300,1,N,0010,1","CIG":"99999,9,9,Y","VIS":"999999,9,9,9","TMP":"+0030,1","DEW":"+0020,1","SLP":"99999,9","MA1":null,"MD1":null,"REM":null},{"STATION":"16541099999","DATE":"2019-01-07T09:55:00","SOURCE":"4","LATITUDE":"39.6666667","LONGITUDE":"9.4333333","ELEVATION":"645.0","NAME":"PERDASDEFOGU, IT","REPORT_TYPE":"FM-15","CALL_SIGN":"99999","QUALITY_CONTROL":"V020","WND":"280,1,N,0026,1","CIG":"99999,9,9,Y","VIS":"999999,9,9,9","TMP":"+0070,1","DEW":"+0050,1","SLP":"99999,9","MA1":null,"MD1":null,"REM":null},{"STATION":"16541099999","DATE":"2019-01-07T10:55:00","SOURCE":"4","LATITUDE":"39.6666667","LONGITUDE":"9.4333333","ELEVATION":"645.0","NAME":"PERDASDEFOGU, IT","REPORT_TYPE":"FM-15","CALL_SIGN":"99999","QUALITY_CONTROL":"V020","WND":"260,1,N,0046,1","CIG":"99999,9,9,Y","VIS":"999999,9,9,9","TMP":"+0080,1