Posts

Showing posts from February, 2020

New and Improved: It's NiFi

Apache NiFi 1.11.3
http://nifi.apache.org/download.html

If you have downloaded anything after NiFi 1.10, please upgrade now. This release has some major improvements and some fixes.

Release note highlights can be found here:
https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.11.3

I am running this now in Anaheim, and it's no Mickey Mouse upgrade.   It's fast and nice.

Some of the more recent upgrades:

https://www.datainmotion.dev/2019/11/exploring-apache-nifi-110-parameters.html

Parameters, stateless execution, and the ability to download a flow as JSON are worth the price of install.



See some more NiFi 1.11 features here:   https://www.datainmotion.dev/2020/02/edgeai-google-coral-with-coral.html


EdgeAI: Google Coral with Coral Environmental Sensors and TPU With NiFi and MiNiFi (Updated EFM)

Building MiNiFi IoT Apps with the new Cloudera EFM 

It is very easy to build a drag-and-drop EdgeAI application with EFM and then push it to all your MiNiFi agents.

Cloudera Edge Management CEM-1.1.1
Download the newest CEM today!
https://www.cloudera.com/downloads/cdf/cem.html
https://docs.cloudera.com/cem/1.1.1/release-notes/topics/cem-whats-new.html

NiFi Flow Receiving From MiNiFi Java Agent

In my CDP-DC cluster, I consume Kafka messages sent from my remote NiFi gateway, publish alerts to Kafka, and push records to Apache HBase and Apache Kudu. We filter our data with Streaming SQL.

We can use SQL to route, create aggregates like averages, choose a subset of fields and limit the data returned. Powered by Apache Calcite, Streaming SQL in NiFi is a game changer for record data types, including CSV, XML, Avro, Parquet, JSON and Grokable text. Read and write different formats and convert …
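To make the kinds of queries mentioned above concrete, here is a minimal sketch. It is not NiFi itself: it uses Python's built-in sqlite3 as a stand-in for the Calcite engine, and the sensor fields (host, temperature, humidity) and sample rows are made up for illustration. The table is named FLOWFILE only to mirror how NiFi's record query processor refers to the incoming records.

```python
import sqlite3

# Hypothetical sensor records; field names and values are illustrative only.
records = [
    ("coral-1", 24.5, 41.0),
    ("coral-1", 25.1, 43.5),
    ("coral-2", 31.2, 38.0),
    ("coral-2", 30.8, 37.5),
    ("coral-3", 22.0, 50.0),
]

conn = sqlite3.connect(":memory:")
# Named FLOWFILE to resemble the table name used in NiFi record queries.
conn.execute("CREATE TABLE FLOWFILE (host TEXT, temperature REAL, humidity REAL)")
conn.executemany("INSERT INTO FLOWFILE VALUES (?, ?, ?)", records)

# Route + subset of fields + limit: keep only the two warmest readings above 25.
hot = conn.execute(
    "SELECT host, temperature FROM FLOWFILE "
    "WHERE temperature > 25 ORDER BY temperature DESC LIMIT 2"
).fetchall()

# Aggregate: average temperature per host.
averages = conn.execute(
    "SELECT host, AVG(temperature) FROM FLOWFILE GROUP BY host ORDER BY host"
).fetchall()

print(hot)
print(averages)
```

In a real flow, each SELECT would be its own dynamic property on the query processor, and every query result would be routed to its own relationship.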

Connecting Apache NiFi to Apache Atlas For Data Governance At Scale in Streaming

Once connected, you can see NiFi and Kafka flowing to Atlas.

You must add the Atlas reporting task to your NiFi cluster.



Add a ReportLineageToAtlas task under Controller Settings / Reporting Tasks. You must add the URL for Atlas, the authentication method and, if it is basic, the username and password.




You also need to set the Atlas configuration directory, the NiFi URL to use, and the Lineage Strategy (Complete Path).

Another example with an AWS hosted NiFi and Atlas:


IMPORTANT NOTE: Keep your Atlas Default Cluster Name consistent with other applications. For Cloudera clusters, the name cm is usually a great option or the default.
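The settings listed above can be kept as a small checklist before you start the reporting task. The sketch below is only an illustration: the key names are hypothetical paraphrases of the steps in this post, not the exact property names shown in the NiFi UI, and the host names are placeholders.

```python
# Hypothetical key names paraphrased from the steps above; they are NOT the
# exact property names of the ReportLineageToAtlas reporting task.
REQUIRED = [
    "atlas_url",
    "authentication_method",
    "atlas_conf_dir",
    "nifi_url",
    "lineage_strategy",
    "default_cluster_name",
]


def missing_settings(settings):
    """Return the names of settings that are absent or empty."""
    missing = [key for key in REQUIRED if not settings.get(key)]
    # Basic authentication additionally needs a username and password.
    if str(settings.get("authentication_method", "")).lower() == "basic":
        missing += [key for key in ("username", "password") if not settings.get(key)]
    return missing


# Placeholder values for illustration only.
example = {
    "atlas_url": "http://atlas.example.com:21000",
    "authentication_method": "basic",
    "username": "admin",
    "atlas_conf_dir": "/etc/atlas/conf",
    "nifi_url": "http://nifi.example.com:8080/nifi",
    "lineage_strategy": "Complete Path",
    "default_cluster_name": "cm",
}

print(missing_settings(example))  # the password is not set in this example
```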


You can now see the lineage state:



Configure Atlas to Be Enabled and Have Kafka

Have the Atlas service enabled in the NiFi configuration.


Example Configuration

You must have access to Atlas Application Properties.

/etc/atlas/conf


atlas-application.properties
#Generated by Apache NiFi ReportLineageToAtlas ReportingTask at 2020-02-21T17:18:28.493Z
#Fri Feb 21 17:18:28 UTC …

Example SMM Notification Email

Notification id: 12f61ec2-11a3-45ba-b7bb-2416d8a1b076,
Root resource name: ANY,
Root resource type: CONSUMER,
Created timestamp: Tue Jan 07 21:13:45 UTC 2020 : 1578431625199,
Last updated timestamp: Mon Jan 13 13:09:38 UTC 2020 : 1578920978293,
State: RAISED,

Message: Alert policy : "ALERT IF ( ANY CONSUMER MILLISECONDS_LAPSED_SINCE_CONSUMER_WAS_ACTIVE >= 1200 )" has been evaluated to true
Condition : "MILLISECONDS_LAPSED_SINCE_CONSUMER_WAS_ACTIVE>=1200" has been evaluated to true for following CONSUMERS
- CONSUMER = "tensorflow-nifi-aws-client" had following attribute values
  * MILLISECONDS_LAPSED_SINCE_CONSUMER_WAS_ACTIVE = 308208428
- CONSUMER = "atlas" had following attribute values
  * MILLISECONDS_LAPSED_SINCE_CONSUMER_WAS_ACTIVE = 596819269
- CONSUMER = "nifi-gassensor-aws-client" had following attribute values
  * MILLISECONDS_LAPSED_SINCE_CONSUMER_WAS_ACTIVE = 310692570
- CONSUMER = "NIFI-TEST-GROU…
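The raw millisecond values in an alert like this are hard to read at a glance. As a rough sketch (the regular expression is my own guess at the message layout based on this one email, not a documented SMM format), you can pull out each consumer and convert its idle time to days, and convert the epoch timestamps to UTC dates:

```python
import re
from datetime import datetime, timezone

# Part of the alert text from the email above, for the three complete entries.
message = (
    'CONSUMER = "tensorflow-nifi-aws-client" had following attribute values '
    "* MILLISECONDS_LAPSED_SINCE_CONSUMER_WAS_ACTIVE = 308208428 "
    '- CONSUMER = "atlas" had following attribute values '
    "* MILLISECONDS_LAPSED_SINCE_CONSUMER_WAS_ACTIVE = 596819269 "
    '- CONSUMER = "nifi-gassensor-aws-client" had following attribute values '
    "* MILLISECONDS_LAPSED_SINCE_CONSUMER_WAS_ACTIVE = 310692570"
)

# Assumed layout: consumer name in quotes, then the attribute and its value.
PATTERN = re.compile(
    r'CONSUMER = "([^"]+)" had following attribute values\s+'
    r"\* MILLISECONDS_LAPSED_SINCE_CONSUMER_WAS_ACTIVE = (\d+)"
)

MS_PER_DAY = 24 * 60 * 60 * 1000


def idle_days(text):
    """Map each consumer name to the number of days since it was last active."""
    return {name: int(ms) / MS_PER_DAY for name, ms in PATTERN.findall(text)}


def from_epoch_millis(ms):
    """Convert an epoch timestamp in milliseconds to a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


for consumer, days in idle_days(message).items():
    print(f"{consumer}: idle for {days:.2f} days")

print(from_epoch_millis(1578431625199).strftime("%Y-%m-%d %H:%M:%S"))
```

Read this way, the three consumers have been idle for roughly three and a half to seven days, far beyond the 1200 millisecond threshold in the alert policy.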

Apache Atlas for Monitoring Edge2AI IoT Flows
