Showing posts with label APACHENIFI. Show all posts

FLaNK Stack Weekly for 28 August 2023

 

28-August-2023

FLiPN-FLaNK Stack Weekly

Tim Spann @PaaSDev

https://www.threads.net/@tspannhw

https://medium.com/@tspann/subscribe

Get your new Apache NiFi for Dummies!

https://www.cloudera.com/campaign/apache-nifi-for-dummies.html

https://ossinsight.io/analyze/tspannhw

The 25th was my daughter's birthday, so it was a good weekend. Lots of great things are coming. AI Dev Day in NYC was amazing: over 200 people attended, and the speakers were so good that I actually learned a few things about LLMs, vector databases and AI processing. I also got to work with a video crew on some upcoming short items. If you are interested in particular articles, videos, slides or demos, please reach out.


CODE + COMMUNITY

Please join my meetup group NJ/NYC/Philly/Virtual.

http://www.meetup.com/futureofdata-princeton/

https://www.meetup.com/futureofdata-newyork/

https://www.meetup.com/futureofdata-philadelphia/

**This is Issue #100**

https://github.com/tspannhw/FLiPStackWeekly

https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

https://www.cloudera.com/solutions/dim-developer.html

My latest talk, on NiFi, Kafka, Flink and LLMs, will be streamed on September 13th.

https://www.cloudera.com/about/events/cloudera-now-cdp.html?utm_medium=email&utm_source=newsletter&keyplay=ALL&utm_campaign=FY24-Q3_AMER_Cloudera_Now_XP_WkEmail&cid=701Hr0000025Vu6IAE

Releases

NiFi 1.23.2

Recent Talk

https://www.slideshare.net/bunkertor/aidevday-datainmotion-to-supercharge-ai

https://www.linkedin.com/feed/update/urn:li:activity:7100451771470249984/

Articles

https://medium.com/@tspann/streaming-llm-with-apache-nifi-huggingface-ad2f0d367468

https://kevinbtalbert.github.io/nifi/nifi-splunk/

https://thenewstack.io/comparing-different-vector-embeddings/

https://www.schemastore.org/json/

https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/python/table/udfs/vectorized_python_udfs/

https://medium.com/cloudera-inc/consume-slacks-events-api-with-cloudera-flow-management-49fed7c2a531

https://newsletter.victordibia.com/p/practical-steps-to-reduce-hallucination?utm_campaign=post&utm_medium=web

http://www.tidepool.so/2023/08/17/why-you-probably-dont-need-to-fine-tune-an-llm/

https://dzone.com/articles/integration-testing-of-non-blocking-retries-with-s

https://thenewstack.io/what-do-java-developers-think-of-the-rise-of-genai/

https://medium.com/cloudera-inc/building-an-effective-nifi-flow-queryrecord-cca5ba51afd5

https://medium.com/@deephavendatalabs/a-high-performance-csv-reader-with-type-inference-4bf2e4baf2d1

https://www.alibabacloud.com/blog/all-you-need-to-know-about-pyflink_600306

https://flink.apache.org/2023/08/04/announcing-three-new-apache-flink-connectors-the-new-connector-versioning-strategy-and-externalization/

Events

https://attend.cloudera.com/ameropendatalakehousewithcdpon?lid=7vxyhds3tlv7

Sept 21, 2023: Sao Paulo, Brazil. Evolve https://br.cloudera.com/about/events/evolve/sao-paulo.html

October 7-10, 2023: Halifax, CA. Community over Code. https://communityovercode.org/

October 8, 2023: Streaming Track, Room 102 https://communityovercode.org/schedule/#Oct8 https://communityovercode.org/schedule-list/#SG007 https://communityovercode.org/schedule-list/#SG011

October 10, 2023: Internet of Things Track, Room 109 https://communityovercode.org/schedule/#Oct10 https://communityovercode.org/schedule-list/#IOT001

October 18, 2023: 2-Hours to Data Innovation: Data Flow https://www.cloudera.com/about/events/hands-on-lab-series-2-hours-to-data-innovation.html

November 1, 2023: Open Source Finance Forum. Virtual. https://events.linuxfoundation.org/open-source-finance-forum-new-york/

November 2, 2023: Evolve. NYC https://www.cloudera.com/about/events/evolve/new-york.html#register

November 7, 2023: XtremeJ 2023. Virtual. https://xtremej.dev/2023/schedule/

November 8, 2023: Flink Forward, Seattle. https://www.flink-forward.org/seattle-2023

November 22, 2023: Big Data Conference. Hybrid
https://bigdataconference.eu/ https://events.pinetool.ai/3079/#sessions/101077

Cloudera Events https://www.cloudera.com/about/events.html

More Events: https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

Code

Tools

Tool to validate Avro Schemas Online! http://avro.tarantool.org/#
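
If you would rather do a quick local sanity check before pasting a schema into an online validator, here is a minimal sketch in plain Python. It is not how the tool above works and it is not a full Avro validator: it only checks that the text is valid JSON and that a top-level record has a name and fields with names and types. The function name `check_record_schema` and the `Sensor` schema in the usage note are made up for illustration.

```python
import json

# Primitive type names defined by the Avro specification.
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def check_record_schema(schema_text):
    """Return a list of problems found in an Avro record schema (empty if none).

    Deliberately shallow (an assumption of this sketch): only the top-level
    record is inspected, and non-string field types such as unions or nested
    definitions are accepted without looking inside them.
    """
    try:
        schema = json.loads(schema_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]

    if not isinstance(schema, dict) or schema.get("type") != "record":
        return ["top-level schema must be a JSON object with type 'record'"]

    problems = []
    if not isinstance(schema.get("name"), str):
        problems.append("record is missing a string 'name'")

    fields = schema.get("fields")
    if not isinstance(fields, list):
        problems.append("record is missing a 'fields' array")
        return problems

    for index, field in enumerate(fields):
        if not isinstance(field, dict) or not isinstance(field.get("name"), str):
            problems.append(f"field {index} is missing a string 'name'")
            continue
        if "type" not in field:
            problems.append(f"field '{field['name']}' is missing 'type'")
        elif isinstance(field["type"], str) and field["type"] not in PRIMITIVES:
            # A bare string may also name a previously defined type; this
            # sketch flags it anyway to keep the check simple.
            problems.append(
                f"field '{field['name']}' has unknown primitive type '{field['type']}'"
            )
    return problems
```

Calling it on a hypothetical `Sensor` record with a `string` field and a `["null", "double"]` union field returns an empty list. A real Avro library or the online tool is still the way to go for anything beyond a first pass.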

© 2020-2023 Tim Spann

FLaNK Stack Weekly for 10 July 2023

 

10-July-2023

FLiPN-FLaNK Stack Weekly

Tim Spann @PaaSDev

https://www.threads.net/@tspannhw

CODE + COMMUNITY

Please join my meetup group NJ/NYC/Philly/Virtual.

http://www.meetup.com/futureofdata-princeton/

https://www.meetup.com/futureofdata-newyork/

https://www.meetup.com/futureofdata-philadelphia/

**This is Issue #93**

https://github.com/tspannhw/FLiPStackWeekly

https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

Videos

https://www.youtube.com/watch?v=7mbxJxjGj3w&ab_channel=PierreVillard

https://www.youtube.com/watch?v=kvJx8vQnCNE

https://www.youtube.com/watch?v=CPK6gWPgrzg

Talks

https://www.slideshare.net/bunkertor/meetup-streaming-data-pipeline-development-258709707

https://www.slideshare.net/bunkertor/big-data-fest-building-modern-data-streaming-apps

https://www.youtube.com/live/1xFha8va7pg?feature=share

Articles

https://dzone.com/articles/streaming-change-data-capture-data-two-ways

Documentation

Events

https://attend.cloudera.com/ameropendatalakehousewithcdpon?lid=7vxyhds3tlv7

July 19, 2023: 2-Hours to Data Innovation: Data Flow https://www.cloudera.com/about/events/hands-on-lab-series-2-hours-to-data-innovation.html

October 18, 2023: 2-Hours to Data Innovation: Data Flow https://www.cloudera.com/about/events/hands-on-lab-series-2-hours-to-data-innovation.html

Cloudera Events https://www.cloudera.com/about/events.html

More Events: https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

Code

https://github.com/cloudera/CML_AMP_LLM_Chatbot_Augmented_with_Enterprise_Data/tree/main

NiFi Code

https://github.com/georgevetticaden/evernote-ai-chatbot

Tools


FLaNK Stack Weekly for 26 June 2023

 

26-June-2023

FLiPN-FLaNK Stack Weekly

Tim Spann @PaaSDev

My friend wrote an awesome new book on streaming. I highly recommend picking up a copy!

https://leanpub.com/streamprocessingwithapacheflink/c/ucQ5dLcZYAo2

Join me in person for steak & stack or virtually for FLaNK Stack

https://www.meetup.com/futureofdata-princeton/events/292976004/


Wednesday, June 28, 2023, 6:00 PM to 8:00 PM EDT. The Capital Grille, 310 W Wisconsin Ave, Milwaukee, WI

Also live streamed to YouTube.

This will be a hybrid event with a Zoom option. The in-person event will be in Milwaukee.

In this interactive session, Tim will lead participants through how to best build streaming data pipelines. He will cover how to build applications from some common use cases and highlight tips, tricks, best practices and patterns. He will show how to build the easy way and then dive deep into the underlying open source technologies including Apache NiFi, Apache Flink, Apache Kafka and Apache Iceberg.

If you wish to follow along, please download open source projects beforehand. You can also download this helpful streaming platform: https://docs.cloudera.com/csp-ce/latest/installation/topics/csp-ce-installing-ce.html

All source code and slides will be shared for those interested in building their own FLaNK Apps. https://www.flankstack.dev/

https://www.thecapitalgrille.com/locations/wi/milwaukee/milwaukee/8027


Hardware For FLaNK

The amazing team at Ampere Computing sent us a 2U Mt Jade.

https://amperecomputing.com/en/systems/altra/2u-mt-jade-2s-nvme

We will be running some AI, IoT, MiNiFi, NiFi, Kafka, Flink, Pulsar, Spark, Iceberg, Ozone, HBase, Kudu, Hive, Impala, Jupyter and more workloads here.

Updates

CDF-PC 2.5 on CDP Public Cloud

https://docs.cloudera.com/dataflow/cloud/deploy-flows/topics/cdf-flow-deployment-autoscaling.html

New Advanced UIs:

  • The Flow Designer now supports the advanced configuration UI for UpdateAttribute.
  • The Flow Designer now supports the advanced configuration UI for JoltTransformJson.
  • New Canvas navigation: The Flow Designer now supports Birdseye and Zoom controls.
  • New troubleshooting: The Flow Designer now supports Processor Diagnostics with an active Test Session.
  • Multi-Select: The Flow Designer now supports multi-selection on the canvas and bulk actions for Start, Stop, Enable, Disable, Move, Change parent group, Copy/Paste, and Delete.

New ReadyFlows for this release:

  • CDW Ingest
  • CDP Kafka to Snowflake
  • Slack to S3
  • Updated Confluent Cloud to Snowflake using new Snowpipe processors

CODE + COMMUNITY

Please join my meetup group NJ/NYC/Philly/Virtual.

http://www.meetup.com/futureofdata-princeton/

https://www.meetup.com/futureofdata-newyork/

https://www.meetup.com/futureofdata-philadelphia/

**This is Issue #91**

You may notice a jump in issue numbers. LinkedIn says we had 89 already, so I am assuming two other articles got assimilated. I will go with this, since 90 is a better number.

https://github.com/tspannhw/FLiPStackWeekly

https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

Courses

https://www.cloudera.com/about/training/courses/apache-nifi-anti-patterns.html

Videos

https://www.youtube.com/watch?v=H1SYOuLcUTI&ab_channel=Ververica

https://www.youtube.com/watch?app=desktop&v=8cZJ9CyLYyI

Conference Videos

Hail Hydrate! From Stream to Lake https://www.youtube.com/watch?v=IBpqa8re--o&ab_channel=PowerShell.org

Articles

https://medium.com/@tspann/ingesting-events-into-dockerized-ibm-db2-jdbc-with-apache-nifi-f0ca452d1351

https://a16z.com/2023/06/20/emerging-architectures-for-llm-applications/

https://dzone.com/articles/apache-nifi-10-cheatsheet

https://www.linkedin.com/posts/excalidraw_re-keying-a-kafka-topic-activity-7077942003837100033-KfnM/

https://medium.com/@tspann/functions-anywhere-faas-ee92ecedb248

Events

June 26-28, 2023: NLIT Summit. Milwaukee.
https://www.fbcinc.com/e/nlit/default.aspx

June 28, 2023: NiFi Meetup. Milwaukee and Hybrid. https://www.meetup.com/futureofdata-princeton/events/292976004/


July 19, 2023: 2-Hours to Data Innovation: Data Flow https://www.cloudera.com/about/events/hands-on-lab-series-2-hours-to-data-innovation.html

October 18, 2023: 2-Hours to Data Innovation: Data Flow https://www.cloudera.com/about/events/hands-on-lab-series-2-hours-to-data-innovation.html

Cloudera Events https://www.cloudera.com/about/events.html

More Events: https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

Code

https://github.com/polyzos/stream-processing-with-apache-flink

NiFi Code

Tools
