Monitoring Mac Laptops With Apache NiFi and osquery
The other way is to pass a SQL query to the osquery interpreter (a la osqueryi --json "SELECT * FROM $1") and get the query results back as JSON.
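As a rough illustration of the interpreter route, the sketch below shells out to osqueryi with the --json flag from the command above and parses what comes back. The helper names (build_command, parse_rows, run_query) and the 30 second timeout are assumptions made for this example, and it presumes osqueryi is installed and on the PATH.

```python
import json
import subprocess


def build_command(sql, binary="osqueryi"):
    """Build the argument list for a one-shot osqueryi query with JSON output."""
    return [binary, "--json", sql]


def parse_rows(stdout):
    """Parse osqueryi JSON output, which is expected to be an array of row objects."""
    rows = json.loads(stdout)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of rows")
    return rows


def run_query(sql, binary="osqueryi", timeout=30):
    """Run one query and return its rows; raises if osqueryi exits with an error."""
    completed = subprocess.run(
        build_command(sql, binary),
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,
    )
    return parse_rows(completed.stdout)
```

Calling run_query("SELECT pid, name FROM processes LIMIT 5") should then hand back a list of dictionaries, one per row.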
We can tail the main results file (/var/log/osquery/osqueryd.results.log) and send the JSON on to be used at scale as events. We can also grab any and all osquery logs, such as INFO, WARN and ERROR, via osquery.+.
We then turn the JSON osquery records into records that can be used for routing, queries and aggregates. Ultimately we push them to Impala/Kudu for rich Cloudera Visual Apps, and to Kafka as schema-aware Avro for use in Kafka Connect and as a live continuous query feed to Flink SQL streaming analytic applications.
We could also have osquery push directly to Kafka, but since I am often disconnected from a Kafka server, in offline mode, or just want a local buffer for these events, let's use Apache NiFi, which can run as a single 2 GB node on my machine. I can also do local processing of the data and some local alerting if needed.
Once you have the data from one or a million machines, you can do log aggregation, anomaly detection, predictive maintenance or whatever else you might need. Sending this data to Cloudera Data Platform in AWS or Azure, with CML and Visual Apps on hand to store, analyze, report, query, build apps, build pipelines and ultimately build production machine learning flows, makes this a simple example of how to take any data and bring it into a full data platform.
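In the flow itself NiFi does the tailing, but the shape of the data is easy to see outside NiFi too. The sketch below is a stand-in, not the flow: it assumes the results log holds one JSON object per line, and the function names (parse_result_lines, read_results) are made up for this example.

```python
import json


def parse_result_lines(lines):
    """Yield one dict per well-formed JSON line, skipping blanks and partial writes."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # A line that is still being written may be incomplete; skip it.
            continue
        if isinstance(record, dict):
            yield record


def read_results(path="/var/log/osquery/osqueryd.results.log"):
    """Read the results log at the path mentioned above and yield parsed events."""
    with open(path, encoding="utf-8") as handle:
        yield from parse_result_lines(handle)
```

Each yielded dictionary can then be routed, filtered or aggregated in much the same way a record would be inside the flow.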
Again, these types of ingests are so easy in Apache NiFi.
Step 1: schedule when we want these calls. There is a limit of 1,000 calls per hour, so let's keep it to 4 calls a minute for each of the three REST endpoints, which works out to 720 calls per hour.
Let's get information on the satellites right above me.
We set parameters for: your latitude, your longitude, your apikey and then just change up bits of the REST URL. Note for this one we are using SSL, so make sure you have an SSL context.
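To make the parameter idea concrete, here is a small sketch of building several REST URLs from the same latitude, longitude and API key. Everything specific in it is hypothetical: the base URL, the endpoint names and the apiKey query parameter are placeholders for illustration, not the real service's paths, so swap in the values from the API documentation you are using.

```python
from urllib.parse import quote, urlencode


def build_url(base_url, endpoint, latitude, longitude, api_key, extra=()):
    """Join the path segments and append the API key as a query parameter."""
    segments = (endpoint, latitude, longitude, *extra)
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    return f"{base_url.rstrip('/')}/{path}?{urlencode({'apiKey': api_key})}"


def build_all(base_url, endpoints, latitude, longitude, api_key):
    """Build one URL per endpoint name, sharing the same location and key."""
    return {
        name: build_url(base_url, name, latitude, longitude, api_key)
        for name in endpoints
    }
```

In NiFi the same effect comes from referencing the parameters inside the URL property, so only the endpoint part differs between the three calls.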
Now we have three streams of JSON data that have lat and long, so we can plot them on a map with Cloudera Visual Apps, storing our data in Impala tables in Kudu.
I have given this one a test run. It has all the features you like in a Jetson, with just 2 GB less RAM and 2 fewer USB ports. This is a very affordable device for building cool apps.
128-core NVIDIA Maxwell™ GPU
64-bit Quad-core ARM A57 (1.43 GHz)
2 GB 64-bit LPDDR4 (25.6 GB/s bandwidth)
Gigabit Ethernet
1x USB 3.0 Type A port, 2x USB 2.0 Type A ports, 1x USB 2.0 Micro-B
HDMI
WiFi
GPIOs, I2C, I2S, SPI, PWM, UART
1x MIPI CSI-2 connector
MicroSD Connector
12-pin header (Power and related signals, UART)
100mm x 80mm x 29mm
USB-C Port for Power
Depending on where or how you buy the package, you may need to buy a power supply and a USB WiFi adapter.
All of my existing workloads have been working fine on the 2 GB version, with a very nice cost saving. The setup is easy and the system is fast. I highly recommend giving it a try to anyone looking for a quick way to do Edge AI and other edge workloads. This could be a decent machine for learning. I hooked mine up to a monitor, keyboard and mouse and could use it right away for edge development and also as a basic desktop. Nice work! I might need to get 11 more of these. These will run MiNiFi agents, Python and Deep Learning classifications with ease.
NVIDIA didn't stop with the ultimate low-cost edge device; they have some serious enterprise updates as well:
Cloudera supercharges the Enterprise Data Cloud with NVIDIA
There seems to be a ton more news coming at this virtual event, so I recommend attending and watching for more detailed posts on new things coming out.
Using DJL.AI For Deep Learning BERT Q&A in NiFi DataFlows
Introduction:
I will be talking about this processor at ApacheCon @ Home 2020 in my "Apache Deep Learning 301" talk with Dr. Ian Brooks.
Sometimes you want your Deep Learning Easy and in Java, so let's do that with DJL in a custom Apache NiFi processor running in CDP Data Hubs. This one does BERT QA.
To use the processor, feed in a paragraph to analyze via the paragraph parameter in the NiFi processor. Also feed in a question, like Why? or something very specific, like asking for the date of an event.
The pretrained model is a BERT QA model using PyTorch. The NiFi processor source:
Make sure you have 1-2 GB of extra RAM for your NiFi instance for each DJL processor you run. If you have a lot of text, add more nodes and/or RAM. Make sure you have at least 8 cores per Deep Learning process. I prefer JDK 11 for this.
The pretrained model is a DistilBERT model trained by HuggingFace using PyTorch.
After seeing Caito Scherr's amazing talk, I want to build up some useful dashboards. My first step is exploring all the available APIs in my CSA/Flink environment. The easiest way to discover them was to turn on the Developer Console in Chrome while using the Flink Dashboard, which is a great dashboard in its own right. However, it does not present some of the key metrics customers are asking about in a format that is easy for end users to read.
{"jobs":[{"jid":"7c01884b74ff981a896307c4a06f2b15","name":"default: select * from itemprice","state":"CANCELED","start-time":1599576455857,"end-time":1599576486876,"duration":31019,"last-modification":1599576486876,"cluster":null,"tasks":{"total":1,"created":0,"scheduled":0,"deploying":0,"running":0,"finished":0,"canceling":0,"canceled":1,"failed":0,"reconciling":0}},{"jid":"faeb308856db337ce628af5fea24b895","name":"default: insert into krogerprices\nselect upc,updatedate,itemdescription,brandname,CAST(price as float) as price, UUID() as uuid\nfrom itemprice\nwhere originstore = 'kroger'","state":"CANCELED","start-time":1599674296089,"end-time":1599766705456,"duration":92409367,"last-modification":1599766705456,"cluster":null,"tasks":{"total":1,"created":0,"scheduled":0,"deploying":0,"running":0,"finished":0,"canceling":0,"canceled":1,"failed":0,"reconciling":0}},{"jid":"5d6ae4f72ab9fca3cea28ba6d4905ca7","name":"default: select * from krogerprices","state":"CANCELED","start-time":1599576795487,"end-time":1599579517485,"duration":2721998,"last-modification":1599579517485,"cluster":null,"tasks":{"total":1,"created":0,"scheduled":0,"deploying":0,"running":0,"finished":0,"canceling":0,"canceled":1,"failed":0,"reconciling":0}},{"jid":"ec80a32b6ab59d96f649f5b3e493ec67","name":"Streaming WordCount","state":"FINISHED","start-time":1599571302659,"end-time":1599571318768,"duration":16109,"last-modification":1599571318768,"cluster":null,"tasks":{"total":5,"created":0,"scheduled":0,"deploying":0,"running":0,"finished":5,"canceling":0,"canceled":0,"failed":0,"reconciling":0}},{"jid":"ad949e727a8c0267c9f2550c6a9b6000","name":"default: select * from 
itemprice","state":"CANCELED","start-time":1599676984684,"end-time":1599677004620,"duration":19936,"last-modification":1599677004620,"cluster":null,"tasks":{"total":1,"created":0,"scheduled":0,"deploying":0,"running":0,"finished":0,"canceling":0,"canceled":1,"failed":0,"reconciling":0}},{"jid":"8c1a6903b81e7b926b7105720e24aee8","name":"default: insert into krogerprices\nselect upc,updatedate,itemdescription,brandname,CAST(price as float) as price, UUID() as uuid\nfrom itemprice\nwhere originstore = 'kroger'","state":"CANCELED","start-time":1599576540243,"end-time":1599582887998,"duration":6347755,"last-modification":1599582887998,"cluster":null,"tasks":{"total":1,"created":0,"scheduled":0,"deploying":0,"running":0,"finished":0,"canceling":0,"canceled":1,"failed":0,"reconciling":0}},{"jid":"4cb8a5983b0bd3a14fe90618e17e2488","name":"default: select * from krogerprices","state":"CANCELED","start-time":1599674323425,"end-time":1599676592438,"duration":2269013,"last-modification":1599676592438,"cluster":null,"tasks":{"total":1,"created":0,"scheduled":0,"deploying":0,"running":0,"finished":0,"canceling":0,"canceled":1,"failed":0,"reconciling":0}},{"jid":"01988557ccd71cbab899ded9babab606","name":"default: insert into krogerprices\nselect upc,updatedate,itemdescription,brandname,CAST(price as float) as price, UUID() as uuid\nfrom itemprice\nwhere originstore = 'kroger'","state":"RUNNING","start-time":1599673893701,"end-time":-1,"duration":103791030,"last-modification":1599673903811,"cluster":{"url":"http://ec2-3-86-165-80.compute-1.amazonaws.com:8088/proxy/application_1599570933443_0003/","originalUrl":"http://ec2-3-86-165-80.compute-1.amazonaws.com:35981","id":"application_1599570933443_0003","hostAndPort":"ec2-3-86-165-80.compute-1.amazonaws.com:35981"},"tasks":{"total":1,"created":0,"scheduled":0,"deploying":0,"running":1,"finished":0,"canceling":0,"canceled":0,"failed":0,"reconciling":0}},{"jid":"7c7932678f193f51c32cd3a2ebff6d59","name":"default: select * from 
itemprice","state":"CANCELED","start-time":1599573232967,"end-time":1599576425840,"duration":3192873,"last-modification":1599576425840,"cluster":null,"tasks":{"total":1,"created":0,"scheduled":0,"deploying":0,"running":0,"finished":0,"canceling":0,"canceled":1,"failed":0,"reconciling":0}}]}
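A response like the one above is already most of a dashboard. As a minimal sketch, the function below rolls a jobs payload up into counts by state, the names of running jobs and the longest-running job ID. It assumes only the fields visible in the JSON above (jid, name, state, duration); the SAMPLE value is a trimmed stand-in with shortened job names, not the full response, and summarize_jobs is a name made up for this example.

```python
from collections import Counter

# Trimmed stand-in for the jobs payload shown above.
SAMPLE = {
    "jobs": [
        {"jid": "ec80a32b6ab59d96f649f5b3e493ec67", "name": "Streaming WordCount",
         "state": "FINISHED", "duration": 16109},
        {"jid": "01988557ccd71cbab899ded9babab606", "name": "default: insert into krogerprices",
         "state": "RUNNING", "duration": 103791030},
    ]
}


def summarize_jobs(payload):
    """Roll a jobs payload up into a few dashboard-friendly numbers."""
    jobs = payload.get("jobs", [])
    by_state = Counter(job.get("state", "UNKNOWN") for job in jobs)
    running = [job.get("name", "") for job in jobs if job.get("state") == "RUNNING"]
    longest = max(jobs, key=lambda job: job.get("duration", 0), default=None)
    return {
        "total": len(jobs),
        "by_state": dict(by_state),
        "running": running,
        "longest_jid": longest["jid"] if longest else None,
    }


summary = summarize_jobs(SAMPLE)
```

The same function works on the full response once it is loaded with json.loads, and the resulting dictionary is small enough to push straight to a dashboard tile.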
2020-09-11 13:37:42,586 INFO org.apache.flink.yarn.YarnResourceManager - Disconnect job manager 9c2e3f25dae1d548dc730941d6484cbb@akka.tcp://flink@ec2-3-86-165-80.compute-1.amazonaws.com:33566/user/jobmanager_7 for job 67a2de8fb291333bbd90b334f8f83def from the resource manager.
2020-09-11 13:37:42,586 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Stopping ZooKeeperLeaderElectionService ZooKeeperLeaderElectionService{leaderPath='/leader/67a2de8fb291333bbd90b334f8f83def/job_manager_lock'}.
2020-09-11 13:37:42,591 INFO org.apache.flink.runtime.jobmanager.ZooKeeperJobGraphStore - Removed job graph 67a2de8fb291333bbd90b334f8f83def from ZooKeeper.
2020-09-09 17:58:17,456 INFO org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer - Starting FlinkKafkaInternalProducer (1/1) to produce into default topic krogerprices
2020-09-09 17:58:17,465 INFO org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase - Consumer subtask 0 has no restore state.
2020-09-09 17:58:17,475 INFO org.apache.flink.kafka.shaded.org.apache.kafka.clients.consumer.ConsumerConfig - ConsumerConfig values:
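Log lines like these can also feed a dashboard once they are split into fields. The sketch below assumes the layout seen above (timestamp, level, logger class, a dash, then the message); the regular expression and the parse_log_line name are this example's own, and continuation lines that do not start with a timestamp simply return None.

```python
import re
from datetime import datetime

# timestamp, level, logger, " - ", message
LOG_PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+(\w+)\s+(\S+)\s+-\s+(.*)$"
)


def parse_log_line(line):
    """Split one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    timestamp, level, logger, message = match.groups()
    return {
        "timestamp": datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S,%f"),
        "level": level,
        "logger": logger,
        "message": message,
    }


example = parse_log_line(
    "2020-09-09 17:58:17,465 INFO org.apache.flink.streaming.connectors.kafka."
    "FlinkKafkaConsumerBase - Consumer subtask 0 has no restore state."
)
```

From there, counting entries by level or by logger class is a one-liner with collections.Counter.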