Ingesting All The Weather Data With Apache NiFi





Step By Step NiFi Flow

  1. GenerateFlowFile - trigger the flow on a schedule that matches when NOAA updates the weather data.
  2. InvokeHTTP - download the ZIP file containing all current weather observations.
  3. CompressContent - decompress the ZIP.
  4. UnpackContent - extract the individual files from the ZIP.
  5. RouteOnAttribute (optional) - keep only the stations that are airports, using ${filename:startsWith('K')}.
  6. QueryRecord (optional) - convert XML to JSON with an XMLReader and a JsonRecordSetWriter. Query: SELECT * FROM FLOWFILE WHERE NOT location LIKE '%Unknown%'. This removes locations that are not identified.
  7. Send the records somewhere for storage. You could use PutKudu, PutORC, PutHDFS, PutHiveStreaming, PutHBaseRecord, PutDatabaseRecord, PublishKafkaRecord_2_* or others.
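
For readers who want to see what steps 2 through 6 do to the data, here is a minimal Python sketch of the same logic outside NiFi. It is not how NiFi implements these processors. The function names (download_all_observations, element_to_value, extract_airport_records) are hypothetical, and the sketch assumes each file in the ZIP is a small XML document whose root element holds fields such as location and station_id, as the JSON example in this post suggests. Unlike NiFi's record readers, it leaves every value as a string.

```python
import io
import json
import urllib.request
import zipfile
import xml.etree.ElementTree as ET

# URL from this article; all function names below are illustrative assumptions.
ALL_OBS_URL = "https://w1.weather.gov/xml/current_obs/all_xml.zip"


def download_all_observations(url=ALL_OBS_URL, timeout=60):
    """Step 2: fetch the ZIP of all current observations as raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def element_to_value(elem):
    """Turn an XML element into nested dicts; leaf elements become strings.

    Repeated child tags would overwrite each other, which is acceptable
    for this sketch but not for general XML.
    """
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    return {child.tag: element_to_value(child) for child in children}


def extract_airport_records(zip_bytes):
    """Steps 4-6: unpack the ZIP, keep K* files, drop 'Unknown' locations."""
    records = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            base = name.rsplit("/", 1)[-1]
            # Same idea as ${filename:startsWith('K')} in RouteOnAttribute.
            if not base.startswith("K") or not base.endswith(".xml"):
                continue
            record = element_to_value(ET.fromstring(archive.read(name)))
            if not isinstance(record, dict):
                continue  # root element had no child fields
            # Rough counterpart of: WHERE NOT location LIKE '%Unknown%'
            if "Unknown" in record.get("location", ""):
                continue
            records.append(record)
    return records


# Example usage (performs a real HTTP request, so it is left commented out):
# records = extract_airport_records(download_all_observations())
# print(json.dumps(records[:1], indent=2))
```

The filtering and parsing work on bytes already in memory, so they can be tried against a small hand-built ZIP without calling the NOAA endpoint.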








URL For All US Data

invokehttp.request.url
https://w1.weather.gov/xml/current_obs/all_xml.zip



Example Record As Converted JSON

[ {
  "credit" : "NOAA's National Weather Service",
  "credit_URL" : "http://weather.gov/",
  "image" : {
    "url" : "http://weather.gov/images/xml_logo.gif",
    "title" : "NOAA's National Weather Service",
    "link" : "http://weather.gov"
  },
  "suggested_pickup" : "15 minutes after the hour",
  "suggested_pickup_period" : 60,
  "location" : "Stanley Municipal Airport, ND",
  "station_id" : "K08D",
  "latitude" : 48.3008,
  "longitude" : -102.4064,
  "observation_time" : "Last Updated on Jul 10 2020, 9:55 am CDT",
  "observation_time_rfc822" : "Fri, 10 Jul 2020 09:55:00 -0500",
  "weather" : "Fair",
  "temperature_string" : "66.0 F (19.0 C)",
  "temp_f" : 66.0,
  "temp_c" : 19.0,
  "relative_humidity" : 83,
  "wind_string" : "South at 6.9 MPH (6 KT)",
  "wind_dir" : "South",
  "wind_degrees" : 180,
  "wind_mph" : 6.9,
  "wind_kt" : 6,
  "pressure_in" : 30.03,
  "dewpoint_string" : "60.8 F (16.0 C)",
  "dewpoint_f" : 60.8,
  "dewpoint_c" : 16.0,
  "visibility_mi" : 10.0,
  "icon_url_base" : "http://forecast.weather.gov/images/wtf/small/",
  "two_day_history_url" : "http://www.weather.gov/data/obhistory/K08D.html",
  "icon_url_name" : "skc.png",
  "ob_url" : "http://www.weather.gov/data/METAR/K08D.1.txt",
  "disclaimer_url" : "http://weather.gov/disclaimer.html",
  "copyright_url" : "http://weather.gov/disclaimer.html",
  "privacy_policy_url" : "http://weather.gov/notice.html"
} ]
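
The record carries its own scheduling hints: observation_time_rfc822 gives the observation time with a UTC offset, and suggested_pickup says to fetch 15 minutes after the hour. As a rough sketch of how a downstream consumer might use those two fields, the Python below normalizes the timestamp to UTC and computes the next pickup time. The helper names parse_observation_time and next_pickup are hypothetical, and the hard-coded 15-minute mark is an assumption taken from the sample record above.

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime


def parse_observation_time(record):
    """Parse the RFC 822 timestamp and normalize it to UTC."""
    observed = parsedate_to_datetime(record["observation_time_rfc822"])
    return observed.astimezone(timezone.utc)


def next_pickup(after, minute=15):
    """Return the first hh:<minute> that falls strictly after `after`."""
    candidate = after.replace(minute=minute, second=0, microsecond=0)
    if candidate <= after:
        candidate += timedelta(hours=1)
    return candidate


# Field value copied from the sample record above.
sample = {"observation_time_rfc822": "Fri, 10 Jul 2020 09:55:00 -0500"}
observed_utc = parse_observation_time(sample)
print(observed_utc.isoformat())               # 2020-07-10T14:55:00+00:00
print(next_pickup(observed_utc).isoformat())  # 2020-07-10T15:15:00+00:00
```

Keeping the timestamps in UTC avoids mixing the per-station local offsets when records from many stations land in the same store.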

