
Year and Decade End Report: 201*

A Year in Big Data 2019

This has been an amazing year for Big Data, Streaming, Tech, The Cloud, AI, IoT and everything in between.   I got to witness the merger of the two Big Data giants into one unstoppable cloud data machine from the inside.   The sum of the new Cloudera is far greater than just Hortonworks + Cloudera.   It has been a great year working with amazing engineers, leaders, clients, partners, prospects, community members, data scientists, analysts, marketing mavens and everyone I have gotten to see this year.   It's been busy traveling the globe spreading the good word and solving some tough problems.   

In 2019, Edge2AI became something we could teach to newbies and implement in a single day. The combination of MiNiFi + NiFi + Kafka + Kudu + Cloud is unstoppable. Once we added Flink later this year, the FLaNK stack became amazing. I see amazing stuff for this in the 2020s. I got to use new projects like Kudu (awesome), Impala, Cloudera Manager and new tools from the Data in Motion team. Streams Messaging Manager became the best way to manage, monitor, create, alert on and use Kafka across clusters anywhere. This is now my favorite way to demo anything. So much transparency, awesome. Having the power of Apache Flink makes just about any problem solvable, even those that scale to thousands of nodes. Running just one node of Flink has been awesome. I am a Squirrel Dev now!

Strata, DataWorks Summit and NoSQL Day were awesome, but working with charities and non-profits solving real world problems was amazing. Helping at Nethope is the highlight of my professional year. I am so thankful to the Cloudera Foundation for having me help. I am really impressed with the Cloudera Foundation, Nethope and everyone involved. I am hoping to speak at a few different conferences in 2020, but we'll see where Edge2AI takes me.

There's a lot to wrap up for 2019, so I have put most of it after this break.



For DZone Articles See Here:   https://www.linkedin.com/pulse/my-top-articles-2019-timothy-spann/

It was a great year talking with the tech people of the Princeton area.

Blockchain

Deep Learning

Edge2AI
Enterprise IoT

At other events, I got to join some amazing colleagues spreading the word about cool open source technology.


The Cloudera Forum for CDP in Minneapolis. I got to help out on the amazing launch of Data in Motion on Cloudera.


I got to lead all-day workshops in NYC, Boston and Washington DC.   I also got to talk about and do demos on NiFi and Phoenix for Cloudera Now online and some followup webinars.   It was great working with Milind, Paul, John and my local Cloudera crew.   Also the DIM Field Team has been amazing, big shout outs to Dan, Abdelkrim, Andre and Vasilis.   Andre made workshops so easy it's insane:   https://github.com/asdaraujo/edge2ai-workshop#lab_1.   Dan's whoville and https://github.com/Chaffelson/nipyapi are next gen tools.   

I got to speak at a few awesome conferences this year.

Dataworks Summit Barcelona (https://www.slideshare.net/bunkertor/the-edge-to-ai-deep-dive-barcelona-meetup-march-2019, https://www.datainmotion.dev/2019/03/barcelona-dataworks-summit-march-2019.html)



NoSQL Day DC (https://www.datainmotion.dev/2019/05/dataworks-summit-dc-2019-report.html) See my joint talk on Phoenix and NiFi.   I was lucky to have Henry Sowell lead the HBase/Phoenix presentation with me.



Dataworks Summit DC (https://dzone.com/articles/dataworks-summit-and-nosql-day-review-2019-washing) See my talks on NiFi, Blockchain and Deep Learning. I found a great way to have more amazing people speak at conferences: I had them added as co-speakers to all my talks. Mehul and John did an awesome job. Speaking is much better with a smarter buddy.



Strata NYC (https://github.com/tspannhw/StrataNYC2019). I helped with some hands-on training sessions on Data in Motion. We had most of the DIM Field team helping out on our multiple hands-on courses. We were lucky to have unofficial DIM Field superstar Purnima do a lot of Kafka work for us.



Nethope Conference Puerto Rico
(https://nethopeglobalsummit2019.pathable.co/meetings/rmZ5yXXoJo4EfcaY7)



I wrote a few articles this year:


Here are my slides from 2019:

A Decade End Wrap Up 2010-2019

Blockchain and Cryptocurrency blew up and then blew up. There are a few solid use cases, though, and the similarities between early Hadoop and Blockchain are interesting. You can't count this stuff out. From HPE to Pivotal to AirisData to Hortonworks to Cloudera, it's been an awesome ride doing Big Data, Spring, Java, Data in Motion, Streaming, Cloud, PaaS, Microservices, Containers, Kafka, NiFi, Spring XD, NodeJS.

I started posting to DZone in 2012 about MongoDB, Spring, NodeJS and such.

https://dzone.com/articles/testing-spring-data-mongodb
https://www.slideshare.net/bunkertor/redis-for-security-data-securityscorecard-jvm-redis-usage
https://www.slideshare.net/bunkertor/brownbag001-spring-ioc-from-2012

I did a ton of Spring stuff on https://github.com/nxbdi and https://github.com/tspannatpivotal.

What's Coming in 2020

  • Cloud Enterprise Data Platforms
  • Hybrid Cloud
  • Streaming with Flink, Kafka, NiFi
  • AI at the Edge with Microcontrollers and Small Devices
  • Voice Data In Queries
  • Event Handler as a Service (Automatic Kafka Message Reading)
  • More Powerful Parameter Based Modular Streaming 
  • Cloud First For Big Data
  • Log Handling Moves to MiNiFi
  • Full AI At The Edge with Deployable Models
  • More Powerful Edge TPU/GPU/VPU
  • Kafka is everywhere
  • Open Source UI Driven Event Engines
  • FLaNK Stack gains popularity
  • Flink Everywhere

