

Populating Your Secure Cloud Data Estates

Hydrating Your Clean Cloud Data Lake. I am hard pressed to keep up with the Data Store + Query terminology du jour. Was it Data Lake House? All these giant bodies of water, mostly stored in buckets (S3)? I agree there are lots of nuances and many different query engines on top of those various means of storing that data. I don't think every time we add a twist we need to add increasingly silly terms on top. Is it to confuse users? Developers? Data engineers? Companies? Executives? Perhaps if we change our data warehouse name again we can get them to buy the same thing again. Clearly it can't be one size fits all for all these different things. I know a lot of companies of various types and sizes, and most don't approach the size of the data that companies like Netflix and LinkedIn have. I really like their innovation, but often those projects get released and then wither in obscurity. A few projects look really good: A

Cloudera SQL Stream Builder (SSB) - Update Your FLaNK Stack

Cloudera SQL Stream Builder (SSB) Released! CSA 1.3.0 is now available with Apache Flink 1.12 and SQL Stream Builder! Check out this white paper for some details. You can get full details on the Stream Processing and Analytics available from Cloudera here. This is an awesome way to query Kafka topics with continuous SQL that is deployed to scalable Flink nodes in YARN or K8s. We can also easily define functions in JavaScript to enhance, enrich and augment our data streams. No Java to write, no heavy deploys or build scripts; we can build, test and deploy these advanced streaming applications all from a secure browser interface. References: https://do