Integration Testing for Apache NiFi Development

There are many ways to generate realistic test data for your NiFi flows.

One good option is to pull test data from REST APIs that return mock records.

https://mockaroo.com/

https://randomuser.me/api
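As a rough illustration of the REST option, here is a minimal Python sketch that pulls a few records from the randomuser.me endpoint listed above and flattens them. The `results` query parameter and the response fields (`results`, `name.first`, `name.last`, `email`) are assumptions about that API's response shape, and the helper names `fetch_users` and `flatten_user` are hypothetical. Inside NiFi the same request would typically be made with a processor such as InvokeHTTP.

```python
import json
import urllib.request

API_URL = "https://randomuser.me/api/"  # endpoint listed above


def fetch_users(count=5, timeout=10):
    """Fetch `count` random users. Needs network access.

    Assumption: the API accepts ?results=N and wraps records in "results".
    """
    url = f"{API_URL}?results={count}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)["results"]


def flatten_user(user):
    """Reduce one nested user record to a flat dict that is easy to route on.

    Assumption: records carry a nested "name" object and an "email" string.
    Missing fields become None instead of raising.
    """
    name = user.get("name", {})
    return {
        "first": name.get("first"),
        "last": name.get("last"),
        "email": user.get("email"),
    }
```

Calling `[flatten_user(u) for u in fetch_users(3)]` should give three small records that can be posted to, or dropped in front of, the flow under test.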

Another option is to use a data generator that produces CSV or JSON files.

https://www.generatedata.com/
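If you would rather script the files yourself, the same idea can be sketched in a few lines of Python. This is only an illustration: the schema (`id`, `host`, `temperature`, `memory`), the value ranges, the file names, and the helper names are all hypothetical.

```python
import csv
import json
import random
from pathlib import Path

FIELDS = ["id", "host", "temperature", "memory"]  # hypothetical schema


def make_rows(count, seed=None):
    """Build `count` random rows; pass a seed to get repeatable test data."""
    rng = random.Random(seed)
    return [
        {
            "id": i,
            "host": f"host-{rng.randint(1, 5)}",
            "temperature": rng.randint(60, 119),
            "memory": rng.randint(10, 104),
        }
        for i in range(count)
    ]


def write_files(rows, out_dir):
    """Write the rows as sample.csv and sample.json under out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    csv_path = out / "sample.csv"
    json_path = out / "sample.json"
    with csv_path.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    json_path.write_text(json.dumps(rows, indent=2))
    return csv_path, json_path
```

Pointing a file-ingest processor such as GetFile at the output directory is one way to feed these files into a flow. Using a fixed seed keeps a test run reproducible.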

There are also standalone data generators that run outside NiFi.

https://www.tomaszezula.com/2016/12/04/proxy-log-generator-to-load-test-nifi/

https://github.com/FINRAOS/DataGenerator#quick-start

There are also NiFi processors for generating data.

https://github.com/hashmapinc/nifi-simulator-bundle

The most common way to test your flows is with the GenerateFlowFile processor, which lets you send valid flow files into your flow on a schedule or in rapid-fire succession.

https://www.xenonstack.com/blog/test-driven-development-big-data/

For my example, I am using NiFi Expression Language to generate some data.

Example Expression Language in GenerateFlowFile

{"id": "${now():format("yyyyMMddHHmmss")}_${UUID()}_${thread()}",
"te": "0.${random():mod(100000):plus(1)}",
"diskusage": "${math("random")}.3 MB",
"memory": ${random():mod(95):plus(10)},
"cpu": ${nextInt()}.${random():mod(99):plus(1)},
"host": "${ip()}/${hostname(true)}",
"temperature": "${random():mod(60):plus(60)}",
"macaddress": "${UUID()}",
"end": "${random():mod(100000000000000):plus(1)}",
"systemtime": "${now():format("MM/dd/yyyy HH:mm:ss", "EST")}"}

Example JSON Produced

{"id": "20190425131936_f061de76-edaf-4d9e-a144-2aeff2b1576a_Timer-Driven Process Thread-3",
"te": "0.28235",
"diskusage": "0.05997607531046045.3 MB",
"memory": 58,
"cpu": 0.52,
"host": "192.168.1.193/192.168.1.193",
"temperature": "136",
"macaddress": "db00aef2-b242-4483-a552-223d74133aa5",
"end": "18296140941736",
"systemtime": "04/25/2019 12:19:36"}
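For an integration test, it helps to check each generated record automatically before trusting the rest of the flow. The following is a minimal Python sketch, not part of NiFi. It assumes the record layout shown above, and the function name `validate_record` is hypothetical. It checks structure and types only, not value ranges. The sample's temperature of 136 falls outside the 60 to 119 range that the expression shown would produce, so the sample presumably came from a slightly different expression.

```python
import json
import re
from datetime import datetime

# id is expected to look like <yyyyMMddHHmmss>_<uuid>_<thread name>
ID_PATTERN = re.compile(
    r"^\d{14}_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}_.+$"
)


def is_number(value, integral=False):
    """True for ints (and floats unless integral=True); bools are rejected."""
    if isinstance(value, bool):
        return False
    return isinstance(value, int if integral else (int, float))


def validate_record(text):
    """Return a list of problems found in one generated JSON record."""
    try:
        rec = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(rec, dict):
        return ["record is not a JSON object"]

    problems = []
    if not ID_PATTERN.match(str(rec.get("id", ""))):
        problems.append("id does not look like <timestamp>_<uuid>_<thread>")
    if not is_number(rec.get("memory"), integral=True):
        problems.append("memory is not an integer")
    if not is_number(rec.get("cpu")):
        problems.append("cpu is not a number")
    try:
        # Matches the MM/dd/yyyy HH:mm:ss format used for systemtime
        datetime.strptime(str(rec.get("systemtime", "")), "%m/%d/%Y %H:%M:%S")
    except ValueError:
        problems.append("systemtime is not MM/dd/yyyy HH:mm:ss")
    return problems
```

An empty list means the record passed every check. Any entries in the list describe what to fix in the expression.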


References

https://www.nifi.rocks/developing-a-custom-apache-nifi-processor-unit-tests-partI/

https://community.hortonworks.com/questions/151190/generate-sequence-number-in-apache-nifi.html

https://datamelt.weebly.com/blog/nifi-processor-generatedata

https://github.com/uwegeercken/nifi_processors

https://medium.com/hashmapinc/its-here-an-apache-nifi-simulator-for-generating-random-and-realistic-time-series-data-d7e463aa5e78

https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.6.0/org.apache.nifi.processors.standard.GenerateFlowFile/

https://github.com/tspannhw?tab=overview&from=2019-03-01&to=2019-03-31