Posts

Showing posts with the label apache atlas

Using Cloudera Data Platform with Flow Management and Streams on Azure

Today I am going to walk you through using Cloudera Data Platform (CDP) with Flow Management and Streams on Azure Cloud. To see a streaming demo video, please join my webinar (or watch it on demand) at Streaming Data Pipelines with CDF in Azure. I'll share some additional how-to videos on using Apache NiFi and Apache Kafka in Azure very soon.

Apache NiFi on Azure CDP Data Hub Sensors to ADLS/HDFS and Kafka

In the above process group we use QueryRecord to segment JSON records and keep only those where the temperature in Fahrenheit is over 80 degrees. We then pick out a few attributes from each record and send them to a Slack channel. To become a Kafka producer, you set a Record Reader for the incoming type (JSON in my case) and a Record Writer for the type to send to the sensors topic. In this case we kept it as JSON, but we could convert to Avro. I usually do that
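The QueryRecord step described in this excerpt (keep records over 80 degrees Fahrenheit, then project a few attributes) can be sketched in plain Python. This is only an illustration of the filtering logic, not the NiFi flow from the post: the field names `sensor_id`, `temperature_f`, and `humidity`, the sample data, and the function name are all assumptions. In NiFi itself the same idea would be written as a SQL statement against the `FLOWFILE` table in a QueryRecord property.

```python
import json

# Hypothetical sensor readings; the field names are assumptions for
# illustration and are not taken from the original post's schema.
SAMPLE_RECORDS = [
    {"sensor_id": "a1", "temperature_f": 84.5, "humidity": 41},
    {"sensor_id": "b2", "temperature_f": 72.0, "humidity": 55},
    {"sensor_id": "c3", "temperature_f": 80.0, "humidity": 60},
]


def select_hot_records(records, threshold_f=80.0,
                       keep=("sensor_id", "temperature_f")):
    """Keep records strictly above the threshold and project a few fields."""
    return [
        {name: record[name] for name in keep}
        for record in records
        if float(record["temperature_f"]) > threshold_f
    ]


if __name__ == "__main__":
    # Each surviving record is printed as one JSON line, similar in spirit
    # to what a JSON Record Writer would emit downstream.
    for row in select_hot_records(SAMPLE_RECORDS):
        print(json.dumps(row))
```

The comparison is strict, so a reading of exactly 80.0 is dropped, which matches "over 80 degrees" in the text.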

Commonly Used TCP/IP Ports in Streaming

Cloudera CDF and HDF Ports

NiFi and Friends

FLaNK Extended Stack

Note: All of these ports can be changed by administrators or in version updates. Also, if you are running Apache Knox, as in Cloudera Data Platform Public Cloud, these ports may be changed or hidden. This list is based only on the version of CDF I am running and its defaults. It does not include standard Cloudera ports for Cloudera Manager, Hadoop, Atlas, Ranger and other necessary and fun services.

Cloudera Flow Management (CFM Powered by Apache NiFi)
Cloudera NiFi HTTP: 8080 or 9090
Cloudera NiFi HTTPS: 8443 or 9443
Cloudera NiFi RIP Socket: 10443 or 50999
Cloudera NiFi Node Protocol: 11443
Cloudera NiFi Load Balancing: 6342
Cloudera NiFi Registry: 18080
Cloudera NiFi Registry SSL: 18433
Cloudera NiFi Certificate Authority: 10443

Cloudera Edge Flow Management (CEM Powered by Apache NiFi - MiNiFi)
Cloudera EFM HTTP: 10080
Cloudera EFM CoAP: 8989

Cloudera Stream Processing (CSP Powered by Apache Kafka)
Cloudera K

Harnessing the Data Lifecycle for Customer Experience Optimization: Streaming Classifications On Twitter Streams

For a deeper dive, see this past webinar: available here. In the use case solved for this webinar, I am a Streaming Engineer at an airline, CloudAir. I need to find, filter and clean Twitter streams, then perform sentiment analysis.

Score Models in the Stream to Act

As the Streaming Engineer at CloudAir, I am responsible for ingesting data from thousands of sources, operationalizing machine learning models as part of our streams, running real-time ELT/ETL processes, and building event processing systems that run from devices, servers and edge nodes. For today's use case, one of our ML engineers gave me a model that was deployed into one of our production Cloudera Machine Learning (CML) environments. I logged into Cloudera Data Platform (CDP), found the model, tested it, and then extracted the information I needed to add this model to our streaming ingest flow f