Monitoring Mac Laptops With Apache NiFi and osquery

The other way is to pass a SQL query to the osquery interpreter (for example, osqueryi --json "SELECT * FROM $1") and get the query results back as JSON.
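As a rough illustration of this second approach, the Python sketch below shells out to osqueryi with the --json flag and parses what comes back. The function name run_osquery and its runner parameter are hypothetical helpers made up for this example, and the sketch assumes osqueryi is installed and on the PATH.

```python
import json
import subprocess


def run_osquery(sql, runner=subprocess.run):
    """Run one SQL statement through osqueryi and return the rows as a list of dicts.

    `runner` is a hook invented for this sketch so the subprocess call
    can be swapped out (for example in a test); it is not part of osquery.
    """
    completed = runner(
        ["osqueryi", "--json", sql],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode stdout as str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    # With --json, osqueryi prints a JSON array holding one object per row.
    return json.loads(completed.stdout)
```

Calling run_osquery("SELECT hostname, cpu_brand FROM system_info;") should return a list with one dictionary per row, ready to hand to whatever ships the data onward.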

We can tail the main results file (/var/log/osquery/osqueryd.results.log) and send each JSON line onward as an event for use at scale. We can also grab any and all of the other osquery logs, such as INFO, WARN and ERROR, via osquery.+.
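To show what tailing the results log amounts to, here is a minimal Python sketch. It assumes the log holds one JSON object per line; the helper names follow and parse_result_lines are hypothetical, and log rotation is deliberately not handled. In the real flow this job belongs to NiFi, so treat this only as a picture of the idea.

```python
import json
import time


def parse_result_lines(lines):
    """Yield one dict per well-formed JSON line, skipping blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            # Not valid JSON (for example a truncated line): skip it.
            continue


def follow(path, poll_seconds=1.0):
    """Yield complete lines appended to `path`, roughly like `tail -f`.

    Sketch only: it starts at the current end of the file and does not
    notice when the file is rotated or truncated.
    """
    with open(path, "r") as handle:
        handle.seek(0, 2)  # jump to the end of the file
        buffer = ""
        while True:
            chunk = handle.readline()
            if not chunk:
                time.sleep(poll_seconds)  # nothing new yet
                continue
            buffer += chunk
            if buffer.endswith("\n"):  # only emit fully written lines
                yield buffer
                buffer = ""
```

The two pieces chain together as parse_result_lines(follow("/var/log/osquery/osqueryd.results.log")), which yields one event dictionary each time osqueryd writes a new result.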

To install osquery, either download the installer or use brew cask install.

I set up a simple configuration here:


{
  "options": {
    "config_plugin": "filesystem",
    "logger_plugin": "filesystem",
    "logger_path": "/var/log/osquery",
    "disable_logging": "false",
    "disable_events": "false",
    "database_path": "/var/osquery/osquery.db",
    "utc": "true"
  },
  "schedule": {
    "system_info": {
      "query": "SELECT hostname, cpu_brand, physical_memory FROM system_info;",
      "interval": 3600
    }
  },
  "decorators": {
    "load": [
      "SELECT uuid AS host_uuid FROM system_info;",
      "SELECT user AS username FROM logged_in_users ORDER BY time DESC LIMIT 1;"
    ]
  },
  "packs": {
    "osquery-monitoring": "/var/osquery/packs/osquery-monitoring.conf",
    "incident-response": "/var/osquery/packs/incident-response.conf",
    "it-compliance": "/var/osquery/packs/it-compliance.conf",
    "osx-attacks": "/var/osquery/packs/osx-attacks.conf",
    "vuln-management": "/var/osquery/packs/vuln-management.conf",
    "hardware-monitoring": "/var/osquery/packs/hardware-monitoring.conf",
    "ossec-rootkit": "/var/osquery/packs/ossec-rootkit.conf"
  }
}

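Before handing a configuration like this to osqueryd, it can help to check that it parses and to see what it will schedule. The Python sketch below assumes the configuration is one complete, valid JSON document with the same top-level keys as above; summarize_config is a hypothetical helper name, not an osquery tool.

```python
import json


def summarize_config(config_text):
    """Return (scheduled, packs) from an osquery configuration given as JSON text.

    scheduled maps each scheduled query name to its interval in seconds;
    packs maps each pack name to the value configured for it (a file path here).
    Raises json.JSONDecodeError if the text is not valid JSON.
    """
    config = json.loads(config_text)
    scheduled = {
        name: entry["interval"]
        for name, entry in config.get("schedule", {}).items()
    }
    packs = dict(config.get("packs", {}))
    return scheduled, packs
```

For the configuration above, the scheduled part would show system_info running every 3600 seconds, and the packs part would list the seven pack files, which makes a missing brace or a mistyped path easy to spot.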
We then turn the JSON osquery records into records that can be used for routing, queries and aggregates, and ultimately push them to Impala/Kudu for rich Cloudera Visual Apps and to Kafka as schema-aware Avro for use in Kafka Connect, as well as for a live continuous query feed to Flink SQL streaming analytic applications.
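To make the record conversion a little more concrete, the Python sketch below flattens one osquery result event into a single flat record and derives a simple Avro schema for it. The top-level field names (name, hostIdentifier, unixTime, action, columns, decorations) are my understanding of osquery's result log format and should be checked against your own log; flatten_event and avro_schema_for are hypothetical helpers, and in the actual flow NiFi's record handling does this work.

```python
def flatten_event(event):
    """Merge one osquery result event into a flat dict of string values.

    Decorations are copied first and columns second, so a column wins
    if both happen to use the same key.
    """
    flat = {
        "name": str(event.get("name", "")),
        "hostIdentifier": str(event.get("hostIdentifier", "")),
        "unixTime": str(event.get("unixTime", "")),
        "action": str(event.get("action", "")),
    }
    for key, value in event.get("decorations", {}).items():
        flat[key] = str(value)
    for key, value in event.get("columns", {}).items():
        flat[key] = str(value)
    return flat


def avro_schema_for(flat, record_name="osquery_event"):
    """Build an Avro record schema (as a dict) with one nullable string per key.

    Assumes every key is already a valid Avro field name.
    """
    return {
        "type": "record",
        "name": record_name,
        "fields": [
            {"name": key, "type": ["null", "string"], "default": None}
            for key in sorted(flat)
        ],
    }
```

Treating every value as a nullable string is the simplest choice for a first pass; numeric fields such as physical_memory could be given proper types once the set of queries settles down.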

We could also have osquery push directly to Kafka, but since I am often disconnected from a Kafka server, in offline mode, or just want a local buffer for these events, let's use Apache NiFi, which can run as a single 2GB node on my machine. I can also do local processing of the data and some local alerting if needed.

Once you have the data from one or a million machines, you can do log aggregation, anomaly detection, predictive maintenance or whatever else you might need. Sending this data to Cloudera Data Platform in AWS or Azure, with CML and Visual Apps available to store, analyze, report, query, build apps, build pipelines and ultimately build production machine learning flows, makes this a simple example of how to take any data and bring it into a full data platform.