FLaNK Stack Weekly for 28 August 2023

28-August-2023

FLiPN-FLaNK Stack Weekly

Tim Spann @PaaSDev

https://www.threads.net/@tspannhw

https://medium.com/@tspann/subscribe

Get your new Apache NiFi for Dummies!

https://www.cloudera.com/campaign/apache-nifi-for-dummies.html

https://ossinsight.io/analyze/tspannhw

The 25th was my daughter's birthday, so it was a good weekend. Lots of great things are coming. AI Dev Day in NYC was amazing: over 200 people attended, and the many speakers were so good that I actually learned a few things about LLMs, vector databases and AI processing. I also got to work with a video crew on some upcoming short items. If you are interested in particular articles, videos, slides or demos, please reach out.

CODE + COMMUNITY

Please join my meetup group NJ/NYC/Philly/Virtual.

http://www.meetup.com/futureofdata-princeton/

https://www.meetup.com/futureofdata-newyork/

https://www.meetup.com/futureofdata-philadelphia/

**This is Issue #100**

https://github.com/tspannhw/FLiPStackWeekly

https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

https://www.cloudera.com/solutions/dim-developer.html

My latest talk, on NiFi, Kafka, Flink and LLMs, will stream on September 13th.

https://www.cloudera.com/about/events/cloudera-now-cdp.html?utm_medium=email&utm_source=newsletter&keyplay=ALL&utm_campaign=FY24-Q3_AMER_Cloudera_Now_XP_WkEmail&cid=701Hr0000025Vu6IAE

Releases

NiFi 1.23.2

Recent Talk

https://www.slideshare.net/bunkertor/aidevday-datainmotion-to-supercharge-ai

https://www.linkedin.com/feed/update/urn:li:activity:7100451771470249984/

Articles

https://medium.com/@tspann/streaming-llm-with-apache-nifi-huggingface-ad2f0d367468

https://kevinbtalbert.github.io/nifi/nifi-splunk/

https://thenewstack.io/comparing-different-vector-embeddings/

https://www.schemastore.org/json/

https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/python/table/udfs/vectorized_python_udfs/

https://medium.com/cloudera-inc/consume-slacks-events-api-with-cloudera-flow-management-49fed7c2a531

https://newsletter.victordibia.com/p/practical-steps-to-reduce-hallucination?utm_campaign=post&utm_medium=web

http://www.tidepool.so/2023/08/17/why-you-probably-dont-need-to-fine-tune-an-llm/

https://dzone.com/articles/integration-testing-of-non-blocking-retries-with-s

https://thenewstack.io/what-do-java-developers-think-of-the-rise-of-genai/

https://medium.com/cloudera-inc/building-an-effective-nifi-flow-queryrecord-cca5ba51afd5

https://medium.com/@deephavendatalabs/a-high-performance-csv-reader-with-type-inference-4bf2e4baf2d1

https://www.alibabacloud.com/blog/all-you-need-to-know-about-pyflink_600306

https://flink.apache.org/2023/08/04/announcing-three-new-apache-flink-connectors-the-new-connector-versioning-strategy-and-externalization/

Events

https://attend.cloudera.com/ameropendatalakehousewithcdpon?lid=7vxyhds3tlv7

September 21, 2023: Evolve. Sao Paulo, Brazil. https://br.cloudera.com/about/events/evolve/sao-paulo.html

October 7-10, 2023: Halifax, Canada. Community over Code. https://communityovercode.org/

October 8, 2023: Streaming Track, Room 102 https://communityovercode.org/schedule/#Oct8 https://communityovercode.org/schedule-list/#SG007 https://communityovercode.org/schedule-list/#SG011

October 10, 2023: Internet of Things Track, Room 109 https://communityovercode.org/schedule/#Oct10 https://communityovercode.org/schedule-list/#IOT001

October 18, 2023: 2-Hours to Data Innovation: Data Flow https://www.cloudera.com/about/events/hands-on-lab-series-2-hours-to-data-innovation.html

November 1, 2023: Open Source Finance Forum. Virtual. https://events.linuxfoundation.org/open-source-finance-forum-new-york/

November 2, 2023: Evolve. NYC https://www.cloudera.com/about/events/evolve/new-york.html#register

November 7, 2023: XtremeJ 2023. Virtual. https://xtremej.dev/2023/schedule/

November 8, 2023: Flink Forward, Seattle. https://www.flink-forward.org/seattle-2023

November 22, 2023: Big Data Conference. Hybrid
https://bigdataconference.eu/ https://events.pinetool.ai/3079/#sessions/101077

Cloudera Events https://www.cloudera.com/about/events.html

More Events: https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

Code

Tools

Tool to validate Avro Schemas Online! http://avro.tarantool.org/#

© 2020-2023 Tim Spann

FLaNK Stack Weekly for 21 August 2023

21-August-2023

FLiPN-FLaNK Stack Weekly

Tim Spann @PaaSDev

https://www.threads.net/@tspannhw

https://medium.com/@tspann/subscribe

Get your new Apache NiFi for Dummies!

https://www.cloudera.com/campaign/apache-nifi-for-dummies.html

https://ossinsight.io/analyze/tspannhw

CODE + COMMUNITY

Please join my meetup group NJ/NYC/Philly/Virtual.

http://www.meetup.com/futureofdata-princeton/

https://www.meetup.com/futureofdata-newyork/

https://www.meetup.com/futureofdata-philadelphia/

**This is Issue #99**

https://github.com/tspannhw/FLiPStackWeekly

https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

https://www.cloudera.com/solutions/dim-developer.html

Releases

NiFi 1.23.1 https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version1.23.1

Throwback

https://medium.com/@tspann/building-a-travel-advisory-app-with-apache-nifi-in-k8-969b44c84958

Videos

https://youtube.com/shorts/JN6EjoAXNl0?feature=share

https://youtube.com/shorts/FCNNfgFs5v4?feature=share

https://youtube.com/shorts/YRqWO4MltC8?feature=share

https://youtube.com/shorts/RpOVE-8DSFk?feature=share

https://www.youtube.com/watch?v=oqaT7FDd0Fc&ab_channel=Cloudera%2CInc.

https://www.youtube.com/watch?v=YfWdW6KauZs&ab_channel=Cloudera%2CInc.

Articles

https://community.cloudera.com/t5/Support-Questions/NiFi-Site-to-Site-example/td-p/375188

https://www.infoq.com/presentations/bicycle-ai-gpt-4-tools

https://www.datanami.com/this-just-in/ibm-to-make-llama-2-available-within-its-watsonx-ai-and-data-platform/

https://huggingface.co/bigscience/bloom

https://bigscience.huggingface.co/

https://iceberg.apache.org/hive-quickstart/

https://thenewstack.io/comparing-different-vector-embeddings/

Events

https://attend.cloudera.com/ameropendatalakehousewithcdpon?lid=7vxyhds3tlv7

August 23, 2023: NYC. AI. https://www.aicamp.ai/event/eventdetails/W2023082314

September 13, 2023: Cloudera Now 23: The Open Data Lakehouse for Trusted AI. https://www.cloudera.com/about/events/cloudera-now-cdp.html

September 21, 2023: Cloudera Evolve. Sao Paulo, Brazil. https://br.cloudera.com/about/events/evolve/sao-paulo.html#register

October 7-10, 2023: Halifax, Canada. Community over Code. https://communityovercode.org/

October 8, 2023: Streaming Track, Room 102 https://communityovercode.org/schedule/#Oct8 https://communityovercode.org/schedule-list/#SG007 https://communityovercode.org/schedule-list/#SG011

October 10, 2023: Internet of Things Track, Room 109 https://communityovercode.org/schedule/#Oct10 https://communityovercode.org/schedule-list/#IOT001

October 18, 2023: 2-Hours to Data Innovation: Data Flow https://www.cloudera.com/about/events/hands-on-lab-series-2-hours-to-data-innovation.html

November 2, 2023: Evolve. NYC https://www.cloudera.com/about/events/evolve/new-york.html#register

November 8, 2023: Flink Forward, Seattle. https://www.flink-forward.org/seattle-2023

November 22, 2023: Big Data Conference. Hybrid
https://bigdataconference.eu/

Cloudera Events https://www.cloudera.com/about/events.html

More Events: https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

Code

https://github.com/tspannhw/FLaNK-HuggingFace-BLOOM-LLM

Tools

© 2020-2023 Tim Spann

FLaNK Stack Weekly for 14 August 2023

14-August-2023

FLiPN-FLaNK Stack Weekly

Tim Spann @PaaSDev

https://www.threads.net/@tspannhw

https://medium.com/@tspann/subscribe

A lot is going on as the fast rush toward Fall begins, with Flink, Kafka, Apache and other conferences throughout North America.

Get your new Apache NiFi for Dummies!

https://www.cloudera.com/campaign/apache-nifi-for-dummies.html

https://ossinsight.io/analyze/tspannhw

CODE + COMMUNITY

Please join my meetup group NJ/NYC/Philly/Virtual.

http://www.meetup.com/futureofdata-princeton/

https://www.meetup.com/futureofdata-newyork/

https://www.meetup.com/futureofdata-philadelphia/

*This is Issue #98*

https://github.com/tspannhw/FLiPStackWeekly

https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

https://www.cloudera.com/solutions/dim-developer.html

Releases

EFM 1.6.0
https://docs.cloudera.com/cem/1.6.0/getting-started/topics/cem-component-support.html

CEM MiNiFi C++ Agent — 1.23.06
https://docs.cloudera.com/cem/1.6.0/release-notes-minifi-cpp/topics/cem-minifi-cpp-agent-updates.html

CEM MiNiFi Java Agent — 1.23.04
https://docs.cloudera.com/cem/1.6.0/release-notes-minifi-java/topics/cem-minifi-java-agent-updates.html

Docs

https://docs.cloudera.com/cem/1.6.0/rest-api-reference/index.html

https://leanpub.com/streamprocessingwithapacheflink/c/ucQ5dLcZYAo2?utm_source=substack&utm_medium=email

https://docs.cloudera.com/cem/1.6.0/using-cem/topics/cem-agent-deployer-securing-agents.html

https://docs.cloudera.com/cem/latest/installation/topics/cem-set-encryption-password.html

Videos

https://www.youtube.com/watch?v=zEGffUz1jKo

https://www.youtube.com/watch?v=rQo3Pk5smz8

https://www.youtube.com/watch?v=0G98z_fs_SQ&t=605s&ab_channel=DataScienceFestival

https://www.youtube.com/watch?v=JdsY5p1GZ38&t=29s&ab_channel=DatainMotion

https://www.youtube.com/watch?v=nuS3X5DxFWM&ab_channel=DatainMotion

Articles

https://medium.com/@tspann/using-apache-nifi-to-backup-and-restore-minifi-flows-from-cloudera-efm-87f303b56ebd

https://medium.com/@tspann/no-code-sentiment-analysis-with-hugging-face-and-apache-nifi-for-article-summaries-cf06d1df1283

https://medium.com/@tspann/hbase-to-hbase-via-apache-nifi-d3d1d674eab2

https://www.playtika-blog.com/playtika-ai/how-playtika-achieved-ai-automation-customer-service-with-apache-nifi-part-2/

https://docs.cloudera.com/cem/1.6.0/using-minifi-as-log-collector-pod-in-kubernetes/topics/cem-using-minifi-as-log-collector-pod-in-kubernetes.html

https://docs.cloudera.com/cem/1.6.0/using-scripting/topics/cem-script-initial-setup.html#cem-using-script-to-integrate-custom-code

https://medium.com/geekculture/decision-making-with-linked-data-event-streams-and-powerbi-5cd8379d32

https://medium.com/@samuel.vanackere/linked-data-event-streams-explained-in-8-minutes-e1c76d077bb9

https://hilla.dev/blog/ai-chatbot-in-java/

https://www.linkedin.com/posts/nicholasrenotte_watsonx-llms-mlops-activity-7093359957890240512-f8RZ/

https://cloudinfrastructure.substack.com/p/introducing-the-redpoint-open-source

https://www.loicmathieu.fr/wordpress/en/informatique/java-21-quoi-de-neuf/

https://litellm.ai/

https://semiconductor.samsung.com/news-events/tech-blog/samsung-announces-innovations-to-enhance-memory-customer-experience-in-data-centric-era-at-fms-2023/

https://kevinbtalbert.github.io/iceberg/nifi/nifi-iceberg/

Free Stuff

For anyone who needs to upgrade Java or escape from potential liabilities, this is the guide. It also provides some helpful insights for any Java developer or anyone developing on top of current or future JVMs.
https://www.azul.com/openjdk-migration-for-dummies/

https://www.cloudera.com/campaign/apache-nifi-for-dummies.html

Throw Back Articles

https://github.com/apache/kudu/blob/master/examples/quickstart/impala/README.adoc
https://medium.com/@nifi.notes/building-an-effective-nifi-flow-replacetext-60a6016d378c
https://community.cloudera.com/t5/Community-Articles/Running-DNS-and-Domain-Scanning-Tools-From-Apache-NiFi/ta-p/248484
https://community.cloudera.com/t5/Community-Articles/Using-Cloudera-Data-Science-Workbench-with-Apache-NiFi-and/ta-p/249469
https://community.cloudera.com/t5/Community-Articles/Scanning-Documents-into-Data-Lakes-via-Tesseract-MQTT-Python/ta-p/248492
https://community.cloudera.com/t5/Community-Articles/Adding-Stanford-CoreNLP-To-Big-Data-Pipelines-Apache-NiFi-1/ta-p/249378
https://community.cloudera.com/t5/Community-Articles/Using-Apache-NiFi-for-Speech-Processing-Speech-to-Text-with/ta-p/249242
https://community.cloudera.com/t5/Community-Articles/Ingesting-Flight-Data-ADS-B-USB-Receiver-with-Apache-NiFi-1/ta-p/247940
https://community.cloudera.com/t5/Community-Articles/Integrating-lucene-geo-gazetteer-For-Geo-Parsing-with-Apache/ta-p/247993
https://community.cloudera.com/t5/Community-Articles/Creating-WordClouds-From-DataFlows-with-Apache-NiFi-and/ta-p/246605
https://community.cloudera.com/t5/Community-Articles/NIFI-1-x-For-Automatic-Music-Playing-Pipelines/ta-p/247994
https://community.cloudera.com/t5/Community-Articles/Using-Apache-NiFi-with-Apache-MXNet-GluonCV-for-YOLO-3-Deep/ta-p/248979
https://community.cloudera.com/t5/Community-Articles/Tracking-Air-Quality-with-HDP-and-HDF-Part-1-Apache-NiFi/ta-p/248265
https://community.cloudera.com/t5/Community-Articles/Monitoring-Energy-Usage-Utilizing-Apache-NiFi-Python-Apache/ta-p/247525
https://community.cloudera.com/t5/Community-Articles/Using-Command-Line-Security-Tools-from-Apache-NiFi/ta-p/248158
https://community.cloudera.com/t5/Community-Articles/Apache-NiFi-Processor-for-Apache-MXNet-SSD-Single-Shot/ta-p/249240
https://community.cloudera.com/t5/Community-Articles/Ingesting-Apache-MXNet-Gluon-Deep-Learning-Results-Via-MQTT/ta-p/248544
https://community.cloudera.com/t5/Community-Articles/Updating-The-Apache-OpenNLP-Community-Apache-NiFi-Processor/ta-p/248398
https://community.cloudera.com/t5/Community-Articles/Integration-Apache-OpenNLP-1-8-4-into-Apache-NiFi-1-5-For/ta-p/248010
https://community.cloudera.com/t5/Community-Articles/Tracking-Phone-Location-for-Android-and-IoT-with-OwnTracks/ta-p/244875
https://community.cloudera.com/t5/Community-Articles/Ingesting-Drone-Data-From-Ryze-Tello-Part-1-Setup-and/ta-p/249422
https://community.cloudera.com/t5/Community-Articles/Ingesting-RDBMS-Data-As-New-Tables-Arrive-Automagically-into/ta-p/246214
https://community.cloudera.com/t5/Community-Articles/Incrementally-Streaming-RDBMS-Data-to-Your-Hadoop-DataLake/ta-p/247927
https://community.cloudera.com/t5/Community-Articles/Ingesting-and-Analyzing-Street-Camera-Data-from-Major-US/ta-p/249194
https://community.cloudera.com/t5/Community-Articles/Basic-Image-Processing-and-Linux-Utilities-As-Part-of-a-Big/ta-p/249121
https://community.cloudera.com/t5/Community-Articles/Hosting-and-Ingesting-Data-From-Web-Pages-Desktop-and-Mobile/ta-p/244575
https://community.cloudera.com/t5/Community-Articles/QADCDC-Our-how-to-ingest-some-database-tables-to-Hadoop-Very/ta-p/245229
https://community.cloudera.com/t5/Community-Articles/Tracking-Air-Quality-with-HDP-and-HDF-Part-2-Indoor-Air/ta-p/249471
https://community.cloudera.com/t5/Community-Articles/Streaming-Ingest-of-Google-Sheets-with-HDF-2-0/ta-p/247764
https://community.cloudera.com/t5/Community-Articles/Ingesting-Golden-Gate-Records-From-Apache-Kafka-and/ta-p/247557
https://community.cloudera.com/t5/Community-Articles/Data-Processing-Pipeline-Parsing-PDFs-and-Identifying-Names/ta-p/249105
https://community.cloudera.com/t5/Community-Articles/Using-A-TensorFlow-quot-Person-Blocker-quot-With-Apache-NiFi/ta-p/248141
https://community.cloudera.com/t5/Community-Articles/Su-Su-Sussudio-Sudoers-Log-Parsing-with-Apache-NiFi/ta-p/249461
https://community.cloudera.com/t5/Community-Articles/Integrating-IBM-Watson-Machine-Learning-APIs-with-Apache/ta-p/247545
https://community.cloudera.com/t5/Community-Articles/Simple-Change-Data-Capture-CDC-with-SQL-Selects-via-Apache/ta-p/308376
https://community.cloudera.com/t5/Community-Articles/Deep-Learning-IoT-Workflows-with-Raspberry-Pi-MQTT-MXNet/ta-p/249456
https://community.cloudera.com/t5/Community-Articles/Parsing-Web-Pages-for-Images-with-Apache-NiFi/ta-p/248415
https://community.cloudera.com/t5/Community-Articles/Trigger-SonicPi-Music-Via-Apache-NiFi/ta-p/248587
https://community.cloudera.com/t5/Community-Articles/Using-Parsey-McParseFace-Google-TensorFlow-Syntaxnet-From/ta-p/246337
https://community.cloudera.com/t5/Community-Articles/Ingesting-osquery-Into-Apache-Phoenix-using-Apache-NiFi/ta-p/249308
https://community.cloudera.com/t5/Community-Articles/Converting-PowerPoint-Presentations-into-French-from-English/ta-p/248974
https://community.cloudera.com/t5/Community-Articles/Posting-Images-with-Apache-NiFi-1-7-and-a-Custom-Processor/ta-p/249017
https://community.cloudera.com/t5/Community-Articles/Parsing-Any-Document-with-Apache-NiFi-1-5-with-Apache-Tika/ta-p/247672

Events

https://attend.cloudera.com/ameropendatalakehousewithcdpon?lid=7vxyhds3tlv7

August 23, 2023: NYC. AI.
https://www.aicamp.ai/event/eventdetails/W2023082314

September 26–27, 2023: Current Event. San Jose, California.
https://www.confluent.io/events/current/

October 7–10, 2023: Halifax, Canada. Community over Code.
https://communityovercode.org/

October 8, 2023: Streaming Track, Room 102
https://communityovercode.org/schedule/#Oct8
https://communityovercode.org/schedule-list/#SG007
https://communityovercode.org/schedule-list/#SG011

October 10, 2023: Internet of Things Track, Room 109
https://communityovercode.org/schedule/#Oct10
https://communityovercode.org/schedule-list/#IOT001

October 18, 2023: 2-Hours to Data Innovation: Data Flow
https://www.cloudera.com/about/events/hands-on-lab-series-2-hours-to-data-innovation.html

November 2, 2023: Evolve. NYC
https://www.cloudera.com/about/events/evolve/new-york.html#register

November 8, 2023: Flink Forward, Seattle.
https://www.flink-forward.org/seattle-2023

November 22, 2023: Big Data Conference. Hybrid
https://bigdataconference.eu/

Cloudera Events
https://www.cloudera.com/about/events.html

More Events:
https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

Code

Tools

© 2020–2023 Tim Spann