Simple Apache NiFi Operations Dashboard

Simple Apache NiFi Operations Dashboard
This is an evolving work in progress, please get involved everything is open source. @milind pandit and I are working on a project to build something useful for teams to analyze their flows, current cluster state, start and stop flows and have a rich one look dashboard.
There's a lot of data provided by Apache NiFi and related tools to aggregate, sort, categorize, search and eventually do machine learning analytics on.
There are a lot of tools that come out of the box that solve parts of these problems. Ambari Metrics, Grafana and Log Search provide a ton of data and analysis abilities. You can find all your errors easily in Log Search and see nice graphs of what is going on in Ambari Metrics and Grafana.
What is cool with Apache NiFi is that is has SitetoSite tasks for sending all the provenance, analytics, metrics and operational data you need to wherever you want it. That includes to Apache NiFi! This is Monitoring Driven Development (MDD).

Monitoring Driven Development (MDD)
In this little proof of concept work, we grab some of these flows process them in Apache NiFi and then store them in Apache Hive 3 tables for analytics. We should probably push the data to HBase for aggregates and Druid for time series. We will see as this expands.
There are also other data access options including the NiFi REST API and the NiFi Python APIs.
Boostrap Notifier
Reporting Tasks
  • AmbariReportingTask (global, per process group)
  • MonitorDiskUsage(Flowfile, content, provenance repositories)
  • MonitorMemory
Monitor Disk Usage
MonitorActivity
See:
These are especially useful for doing things like purging connections.
Purge it!
  • nipyapi.canvas.purge_connection(con_id)
  • nipyapi.canvas.purge_process_group(process_group, stop=False)
  • nipyapi.canvas.delete_process_group(process_group, force=True, refresh=True)


Use Cases
Example Metrics Data
  1. [ {
  2. "appid" : "nifi",
  3. "instanceid" : "7c84501d-d10c-407c-b9f3-1d80e38fe36a",
  4. "hostname" : "princeton1.field.hortonworks.com",
  5. "timestamp" : 1539411679652,
  6. "loadAverage1min" : 0.93,
  7. "availableCores" : 16,
  8. "FlowFilesReceivedLast5Minutes" : 14,
  9. "BytesReceivedLast5Minutes" : 343779,
  10. "FlowFilesSentLast5Minutes" : 0,
  11. "BytesSentLast5Minutes" : 0,
  12. "FlowFilesQueued" : 59952,
  13. "BytesQueued" : 294693938,
  14. "BytesReadLast5Minutes" : 241681,
  15. "BytesWrittenLast5Minutes" : 398753,
  16. "ActiveThreads" : 2,
  17. "TotalTaskDurationSeconds" : 273,
  18. "TotalTaskDurationNanoSeconds" : 273242860763,
  19. "jvmuptime" : 224997,
  20. "jvmheap_used" : 5.15272616E8,
  21. "jvmheap_usage" : 0.9597700387239456,
  22. "jvmnon_heap_usage" : -5.1572632E8,
  23. "jvmthread_statesrunnable" : 11,
  24. "jvmthread_statesblocked" : 2,
  25. "jvmthread_statestimed_waiting" : 26,
  26. "jvmthread_statesterminated" : 0,
  27. "jvmthread_count" : 242,
  28. "jvmdaemon_thread_count" : 125,
  29. "jvmfile_descriptor_usage" : 0.0709,
  30. "jvmgcruns" : null,
  31. "jvmgctime" : null
  32. } ]
Example Status Data
  1. {
  2. "statusId" : "a63818fe-dbd2-44b8-af53-eaa27fd9ef05",
  3. "timestampMillis" : "2018-10-18T20:54:38.218Z",
  4. "timestamp" : "2018-10-18T20:54:38.218Z",
  5. "actorHostname" : "princeton1.field.hortonworks.com",
  6. "componentType" : "RootProcessGroup",
  7. "componentName" : "NiFi Flow",
  8. "parentId" : null,
  9. "platform" : "nifi",
  10. "application" : "NiFi Flow",
  11. "componentId" : "7c84501d-d10c-407c-b9f3-1d80e38fe36a",
  12. "activeThreadCount" : 1,
  13. "flowFilesReceived" : 1,
  14. "flowFilesSent" : 0,
  15. "bytesReceived" : 1661,
  16. "bytesSent" : 0,
  17. "queuedCount" : 18,
  18. "bytesRead" : 0,
  19. "bytesWritten" : 1661,
  20. "bytesTransferred" : 16610,
  21. "flowFilesTransferred" : 10,
  22. "inputContentSize" : 0,
  23. "outputContentSize" : 0,
  24. "queuedContentSize" : 623564,
  25. "activeRemotePortCount" : null,
  26. "inactiveRemotePortCount" : null,
  27. "receivedContentSize" : null,
  28. "receivedCount" : null,
  29. "sentContentSize" : null,
  30. "sentCount" : null,
  31. "averageLineageDuration" : null,
  32. "inputBytes" : null,
  33. "inputCount" : 0,
  34. "outputBytes" : null,
  35. "outputCount" : 0,
  36. "sourceId" : null,
  37. "sourceName" : null,
  38. "destinationId" : null,
  39. "destinationName" : null,
  40. "maxQueuedBytes" : null,
  41. "maxQueuedCount" : null,
  42. "queuedBytes" : null,
  43. "backPressureBytesThreshold" : null,
  44. "backPressureObjectThreshold" : null,
  45. "isBackPressureEnabled" : null,
  46. "processorType" : null,
  47. "averageLineageDurationMS" : null,
  48. "flowFilesRemoved" : null,
  49. "invocations" : null,
  50. "processingNanos" : null
  51. }

Example Failure Data
  1. [ {
  2. "objectId" : "34c3249c-4a42-41ce-b94e-3563409ad55b",
  3. "platform" : "nifi",
  4. "project" : null,
  5. "bulletinId" : 28321,
  6. "bulletinCategory" : "Log Message",
  7. "bulletinGroupId" : "0b69ea51-7afb-32dd-a7f4-d82b936b37f9",
  8. "bulletinGroupName" : "Monitoring",
  9. "bulletinLevel" : "ERROR",
  10. "bulletinMessage" : "QueryRecord[id=d0258284-69ae-34f6-97df-fa5c82402ef3] Unable to query StandardFlowFileRecord[uuid=cd305393-f55a-40f7-8839-876d35a2ace1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1539633295746-10, container=default, section=10], offset=95914, length=322846],offset=0,name=783936865185030,size=322846] due to Failed to read next record in stream for StandardFlowFileRecord[uuid=cd305393-f55a-40f7-8839-876d35a2ace1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1539633295746-10, container=default, section=10], offset=95914, length=322846],offset=0,name=783936865185030,size=322846] due to -40: org.apache.nifi.processor.exception.ProcessException: Failed to read next record in stream for StandardFlowFileRecord[uuid=cd305393-f55a-40f7-8839-876d35a2ace1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1539633295746-10, container=default, section=10], offset=95914, length=322846],offset=0,name=783936865185030,size=322846] due to -40",
  11. "bulletinNodeAddress" : null,
  12. "bulletinNodeId" : "91ab706b-5d92-454e-bc7a-6911d155fdca",
  13. "bulletinSourceId" : "d0258284-69ae-34f6-97df-fa5c82402ef3",
  14. "bulletinSourceName" : "QueryRecord",
  15. "bulletinSourceType" : "PROCESSOR",
  16. "bulletinTimestamp" : "2018-10-18T20:54:39.179Z"
  17. } ]

Apache Hive 3 Tables
  1. CREATE EXTERNAL TABLE IF NOT EXISTS failure (statusId STRING, timestampMillis BIGINT, `timestamp` STRING, actorHostname STRING, componentType STRING, componentName STRING, parentId STRING, platform STRING, `application` STRING, componentId STRING, activeThreadCount BIGINT, flowFilesReceived BIGINT, flowFilesSent BIGINT, bytesReceived BIGINT, bytesSent BIGINT, queuedCount BIGINT, bytesRead BIGINT, bytesWritten BIGINT, bytesTransferred BIGINT, flowFilesTransferred BIGINT, inputContentSize BIGINT, outputContentSize BIGINT, queuedContentSize BIGINT, activeRemotePortCount BIGINT, inactiveRemotePortCount BIGINT, receivedContentSize BIGINT, receivedCount BIGINT, sentContentSize BIGINT, sentCount BIGINT, averageLineageDuration BIGINT, inputBytes BIGINT, inputCount BIGINT, outputBytes BIGINT, outputCount BIGINT, sourceId STRING, sourceName STRING, destinationId STRING, destinationName STRING, maxQueuedBytes BIGINT, maxQueuedCount BIGINT, queuedBytes BIGINT, backPressureBytesThreshold BIGINT, backPressureObjectThreshold BIGINT, isBackPressureEnabled STRING, processorType STRING, averageLineageDurationMS BIGINT, flowFilesRemoved BIGINT, invocations BIGINT, processingNanos BIGINT) STORED AS ORC
  2. LOCATION '/failure';
  3.  
  4. CREATE EXTERNAL TABLE IF NOT EXISTS bulletin (objectId STRING, platform STRING, project STRING, bulletinId BIGINT, bulletinCategory STRING, bulletinGroupId STRING, bulletinGroupName STRING, bulletinLevel STRING, bulletinMessage STRING, bulletinNodeAddress STRING, bulletinNodeId STRING, bulletinSourceId STRING, bulletinSourceName STRING, bulletinSourceType STRING, bulletinTimestamp STRING) STORED AS ORC
  5. LOCATION '/error';
  6.  
  7.  
  8. CREATE EXTERNAL TABLE IF NOT EXISTS memory (objectId STRING, platform STRING, project STRING, bulletinId BIGINT, bulletinCategory STRING, bulletinGroupId STRING, bulletinGroupName STRING, bulletinLevel STRING, bulletinMessage STRING, bulletinNodeAddress STRING, bulletinNodeId STRING, bulletinSourceId STRING, bulletinSourceName STRING, bulletinSourceType STRING, bulletinTimestamp STRING) STORED AS ORC
  9. LOCATION '/memory'
  10. ;
  11.  
  12.  
  13. // backpressure
  14. CREATE EXTERNAL TABLE IF NOT EXISTS status (statusId STRING, timestampMillis BIGINT, `timestamp` STRING, actorHostname STRING, componentType STRING, componentName STRING, parentId STRING, platform STRING, `application` STRING, componentId STRING, activeThreadCount BIGINT, flowFilesReceived BIGINT, flowFilesSent BIGINT, bytesReceived BIGINT, bytesSent BIGINT, queuedCount BIGINT, bytesRead BIGINT, bytesWritten BIGINT, bytesTransferred BIGINT, flowFilesTransferred BIGINT, inputContentSize BIGINT, outputContentSize BIGINT, queuedContentSize BIGINT, activeRemotePortCount BIGINT, inactiveRemotePortCount BIGINT, receivedContentSize BIGINT, receivedCount BIGINT, sentContentSize BIGINT, sentCount BIGINT, averageLineageDuration BIGINT, inputBytes BIGINT, inputCount BIGINT, outputBytes BIGINT, outputCount BIGINT, sourceId STRING, sourceName STRING, destinationId STRING, destinationName STRING, maxQueuedBytes BIGINT, maxQueuedCount BIGINT, queuedBytes BIGINT, backPressureBytesThreshold BIGINT, backPressureObjectThreshold BIGINT, isBackPressureEnabled STRING, processorType STRING, averageLineageDurationMS BIGINT, flowFilesRemoved BIGINT, invocations BIGINT, processingNanos BIGINT) STORED AS ORC
  15. LOCATION '/status';
  16.  
  17.  
  18.  
  19.  
  20.  
  21.  
  22.  
  23.