Producing and Consuming Pulsar messages with Apache NiFi

Thanks to Pulsar committer and author David Kjerrumgaard, we have a brand-new, advanced record processor for Apache NiFi 1.14/1.15 that consumes and produces messages with StreamNative Cloud or any other Apache Pulsar cluster. I recommend using the latest Apache NiFi 1.15 with Apache Pulsar 2.8.1.

Official First NAR Release



Pulsar Summit

Pulsar Summit Europe 2021 is taking place virtually on October 6. Sessions feature industry experts from the Apache Pulsar PMC, CleverCloud, and Databricks. You’ll learn about the latest Pulsar project updates and technology. Register today to save your seat.

Building Bad Titles For Talks

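# textgenrnn wraps a character-level RNN for generating text.
from textgenrnn import textgenrnn

textgen = textgenrnn()

# Train for one epoch on tim.txt; sample titles are printed when the epoch ends.
textgen.train_from_file('tim.txt', num_epochs=1)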



Example Run

tspann@Timothys-MBP code % python3.7

/Users/tspann/Library/Python/3.7/lib/python/site-packages/tensorflow/python/keras/optimizer_v2/ UserWarning: The `lr` argument is deprecated, use `learning_rate` instead.

  "The `lr` argument is deprecated, use `learning_rate` instead.")

2021-08-02 10:40:28.146481: I tensorflow/core/platform/] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA

To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

69 texts collected.

Training on 2,506 character sequences.

2021-08-02 10:40:28.710370: I tensorflow/compiler/mlir/] None of the MLIR Optimization Passes are enabled (registered 2)

19/19 [==============================] - 6s 143ms/step - loss: 1.8994


Temperature: 0.2


Apache Streaming Streaming Station Stack

First Anti-Tatto Stack (A File State Stack Pack And Pussions

A Stack of Apache Stack


Temperature: 0.5


Cloud Dead Folk Streaming And Analance Art Past Flink

Into Apache Space Trades Channel Stack

Push Lake Station


Temperature: 1.0


Batt-Indunes Means Stgut

Sometimes time page

I real-posts, UIP Puming this reaction

Real-Timobitman with Apache and Flire
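
The temperature values in the run above control how adventurous the sampling is: low values stick to the most likely next character, high values let unlikely characters through. Here is a minimal sketch of one common way to apply a temperature to a probability distribution. It is not taken from textgenrnn's source, and the characters and probabilities are made up for illustration.

```python
import math
import random


def apply_temperature(probs, temperature):
    """Rescale a probability distribution (all values > 0) by a temperature."""
    # Divide the log-probabilities by the temperature, then renormalize.
    logits = [math.log(p) / temperature for p in probs]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample_char(chars, probs, temperature, rng=random):
    """Draw one character from the temperature-adjusted distribution."""
    adjusted = apply_temperature(probs, temperature)
    return rng.choices(chars, weights=adjusted, k=1)[0]


# Hypothetical next-character probabilities, for illustration only.
chars = ["a", "e", "k", "z"]
probs = [0.6, 0.25, 0.1, 0.05]

for t in (0.2, 0.5, 1.0):
    print(t, [round(p, 3) for p in apply_temperature(probs, t)])
```

At temperature 1.0 the distribution is unchanged. At 0.2 nearly all of the weight lands on the most likely character, which fits the repetitive "Stack" titles in the low-temperature samples.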

Note on installing on a Mac:

pip3 install git+git://


Apache NiFi 101:   Introduction and Best Practices

Cracking the Nut, Solving Edge AI with Apache Tools and Frameworks

FLANK Stack for Cloud Data Lakes

FLIP Stack for Cloud Data Lakes

Lightning Introduction to FLaNK

Pack Your Bags, We’re Going on a Data Journey!

Real-Time Streaming in Azure

Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp

Using the Mm FLaNK Stack for Edge AI (Flink, NiFi, Kafka, Kudu)

Utilizing Apache Kafka, Apache NiFi and MiNiFi for EdgeAI IoT at Scale

Real-Time Streaming in Any and All Clouds, Hybrid and Beyond

Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar)

Using Apache NiFi with Apache Pulsar for Fast Data On-Ramp

Hail Hydrate! From Stream to Lake with Pulsar and Friends

Continuous SQL with Kafka and Flink

FLiP Stack for Cloud Data Lakes









Cloud Enterprise Data Platforms

Hybrid Cloud

Streaming with Flink, Kafka, NiFi

AI at the Edge with Microcontrollers and Small Devices

Voice Data In Queries

Event Handler as a Service (Automatic Kafka Message Reading)

More Powerful Parameter Based Modular Streaming

Cloud First For Big Data

Log Handling Moves to MiNiFi

Full AI At The Edge with Deployable Models

More Powerful Edge TPU/GPU/VPU

Kafka is everywhere

Open Source UI Driven Event Engines

FLaNK Stack gains popularity

FLINK Everywhere

Real-Time Stock Processing

Edge to AI:  Analytics from the Edge

Utilizing Apache NiFi for IoT

Let's Build A Simple Ingest To Cloud Datawarehouse with Low Code

Learning the Basics of Apache NiFi for IoT

Introduction to Flank Stack

Introduction to Flip Stack

Introduction to Pulsar

Apache Deep Learning 101

Big Data DevOps

Automating Social Media

Accessing Feeds from Etherdelta on Trades

Vision Thing

Deep Dive into Apache NiFi

Apache NiFi : Ingesting Enterprise Data at Scale

Continuous SQL with Pulsar and Flink

Apache NiFi Deep Dive 300

Smart Transit:  Real-time Transit Information with FLiP

Build in the Cloud

Streaming SQL and Data Flow

Real-Time Streaming Pipelines with FLaNK

Real-Time Streaming Pipelines with FLiP

Apache NiFi DevOps

Flink SQL for Continuous SQL & ETL

Next-Gen Apache NiFi

Ask the Experts

Hello, NiFi

Using Apache MXNet in Production Deep Learning Streaming Pipelines

From Stream to Lake

Upcoming Apache Pulsar and Apache Flink Talks - ApacheCon Asia and ApacheCon 2021

ApacheCon Asia 2021



StreamNative - David Kjerrumgaard's Talk

ENGLISH SESSION 2021-08-08 15:30 GMT+8

In this talk I will present a technique for deploying machine learning models to provide real-time predictions using Apache Pulsar Functions. In order to provide a prediction in real-time, the model usually receives a single data point from the caller, and is expected to provide an accurate prediction within a few milliseconds.

Throughout this talk, I will demonstrate the steps required to productionize a fully trained ML model that predicts the delivery time for a food delivery service based upon real-time traffic information, the customer's location, and the restaurant that will be fulfilling the order.
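
To give a feel for the shape of such a function, here is a minimal sketch. It is not the code from the talk. It assumes a language-native Python Pulsar Function, which is a module exposing `process(input)`, and it assumes the message payload arrives as a JSON string. The feature names, weights, and bias are hypothetical stand-ins for a real trained model.

```python
import json

# Hypothetical coefficients standing in for a trained model. A real function
# would load a serialized model once, outside of process().
WEIGHTS = {"distance_km": 2.5, "traffic_index": 4.0, "prep_minutes": 1.0}
BIAS_MINUTES = 5.0


def predict_minutes(features):
    """Score a single data point with the stand-in linear model."""
    return BIAS_MINUTES + sum(
        WEIGHTS[name] * float(features[name]) for name in WEIGHTS
    )


def process(input):
    """Entry point called once per message; returns the prediction as JSON."""
    order = json.loads(input)
    estimate = predict_minutes(order)
    return json.dumps(
        {"order_id": order.get("order_id"), "eta_minutes": round(estimate, 1)}
    )
```

Keeping the model in module scope means each message pays only for the scoring call, which matters when a prediction is expected within a few milliseconds.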


David Kjerrumgaard: David is the author of “Pulsar in Action”

StreamNative - Tim Spann's Talk

ENGLISH SESSION 2021-08-08 14:50 GMT+8

Today, data is being generated from devices and containers living at the edge of networks, clouds and data centers. We need to run business logic, analytics and deep learning at the edge before we start our real-time streaming flows. Fortunately, using the FLiP and FLaNK stacks, we can do this with ease! Streaming AI-powered analytics from the edge to the data center is now a simple use case. With MiNiFi we can ingest the data, do data checks and cleansing, run machine learning and deep learning models, and route our data in real time to Apache NiFi and Apache Pulsar for further transformations and processing. Apache Flink will provide our advanced streaming capabilities, fed in real time via Apache Pulsar topics. Apache MXNet models will run both at the edge and in our data centers via Apache NiFi and MiNiFi.

Tools: Apache Flink, Apache Pulsar, Apache NiFi, MiNiFi, Apache MXNet


Timothy Spann: Tim Spann is a Developer Advocate at StreamNative where he works with Apache NiFi, MiNiFi, Kafka, Apache Flink, Apache MXNet, TensorFlow, Apache Spark, big data, the IoT, machine learning, and deep learning. Tim has over a decade of experience with the IoT, big data, distributed computing, streaming technologies, and Java programming. Previously, he was a senior solutions architect at AirisData and a senior field engineer at Pivotal. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton on big data, the IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as IoT Fusion, Strata, ApacheCon, DataWorks Summit Berlin, DataWorks Summit Sydney, and Oracle Code NYC. He holds a BS and MS in computer science.

ApacheCon Global 2021

StreamNative Talks

Tuesday 17:10 UTC - Apache NiFi Deep Dive 300 - Tim Spann
Tuesday 18:00 UTC - Apache Deep Learning 302 - Tim Spann
Wednesday 15:00 UTC - Smart Transit: Real-Time Transit Information with FLaNK - Tim Spann
Wednesday 17:10 UTC - Cracking the Nut, Solving Edge AI with Apache Tools and Frameworks - Tim Spann
Thursday 14:10 UTC - Apache NiFi 101: Introduction and Best Practices - Tim Spann

Apache Flink and Apache Pulsar

A New FLiP!

As some have noticed, I have left Cloudera. It has been an incredible journey. I joined Hortonworks in April of 2016, and then we merged with Cloudera in 2019. This was my first article on Apache NiFi. I got to grow with Apache NiFi as it grew from 1.0 to 1.14 during my time! A lot of things changed and evolved, and the tech grew so much.

I got to do my first major conference talks at DataWorks Summit, which will always be one of my favorite event series ever. I am excited to be involved with Pulsar Summit and many other conferences now.

My Final Tallies at Hortonworks/Cloudera:
11 videos on my YouTube channel
1,719 members in Future of Data Meetup Princeton, up from 0
Over 48 Meetup events around the world
Over 230K blog views
Over 192 blog articles
344 DZone articles for 3 million views
Spoke at over 41 conferences
Hosted one Mardi Gras at a client; it was awesome
60 SlideShares
266 GitHub repos

I got to work with some of the best tech people in the world and also the best people. I really enjoyed the community and the teamwork.

Reports from 2017, 2018, 2019, 2020

I am really excited at what we are doing at StreamNative with Apache Pulsar. I still get to work with the amazing ASF open source community and all the great Streaming friends with Apache Flink and Apache NiFi. I am working on a FLiP Stack to demonstrate some cool apps you can build with Flink, Pulsar and Friends. Stay tuned. I will remain involved in the Apache NiFi community and I have a talk on Apache NiFi at ApacheCon later this year.

Please join me for new streaming adventures with Apache Pulsar, Apache Flink and the FLiP(N) Stack!


Upcoming Events 2021

ApacheCon Asia - 06-August-2021

Scenic City Summit - 24-September-2021

ApacheCon 2021 - 21-September-2021 to 23-September-2021

Tuesday 17:10 UTC - Apache NiFi Deep Dive 300

Tuesday 18:00 UTC - Apache Deep Learning 302 

Wednesday 15:00 UTC - Smart Transit: Real-Time Transit Information with FLaNK 

Wednesday 17:10 UTC - Cracking the Nut, Solving Edge AI with Apache Tools and Frameworks 

Thursday 14:10 UTC - Apache NiFi 101: Introduction and Best Practices 

Big Data Conference EU - 28-September-2021 to 29-September-2021

API World - 26-October-2021 to 28-October-2021

NiFi on Cloudera Data Platform Upgrade - April 2021

CFM 2.1.1 on CDP 7.1.6

There is a new Cloudera release of Apache NiFi now with SAML support.

Apache NiFi
Apache NiFi Registry 

For changes:

Get your download on:

To start researching for the future, take a look at some of the technical preview features around Easy Rules engine and handlers.

Make sure you use the latest possible JDK 8, such as 8u282 or newer, as older builds have some bugs.
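
If you want to automate that check across nodes, here is a small sketch that compares a JDK 8 version string against the 8u282 floor. It assumes the string looks like `1.8.0_282`, the usual JDK 8 format in `java -version` output. The function names are mine and are not part of any NiFi or Cloudera tooling.

```python
import re

MIN_JDK8_UPDATE = 282  # from the 8u282 recommendation


def jdk8_update(version):
    """Return the update number from a string like '1.8.0_282', else None."""
    match = re.fullmatch(r"1\.8\.0_(\d+)(?:-.*)?", version.strip())
    return int(match.group(1)) if match else None


def is_recent_jdk8(version, minimum=MIN_JDK8_UPDATE):
    """True only for JDK 8 builds at or above the minimum update."""
    update = jdk8_update(version)
    return update is not None and update >= minimum
```

Anything that is not recognizably JDK 8 returns False, so the check fails closed instead of guessing.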

Size your cluster correctly!  Make sure you have at least 3 nodes.