
Apache NiFi on Hortonworks HDF Versus … Microsoft Flow?

Attended a technical discussion last night on Apache NiFi and Hortonworks HDF, a Meetup @ Honeywell, a Hortonworks client.

Excellent presentations from the Hortonworks team on “NiFi on HDF” solutions architecture and best practices. It is a powerful solution for processing and distributing any data in real-time, in large quantities, and with resiliency. It’s no wonder the US NSA originally developed this ability to consume data in real-time, manipulate it, and then send it on its way. However, recognizing the commercial applications (benevolent wisdom?), the NSA released the product as open-source software via its technology transfer program.

As a tangent, among other things, I’m currently exploring the capabilities of “Microsoft Flow”, which was recently promoted to GA from its ‘Preview Release’. One question kept coming to mind during the presentations last night:

At its peak maturity (not yet), can Microsoft Flow successfully compete with Apache NiFi on Hortonworks HDF?

Discussion Points:

  • The NiFi / HDF solution manages data flows in real-time.  The Microsoft Flow architecture seems to fall short in this capacity. Is it on the product road map for Flow?  Is it a capability Microsoft wants to have?
  • There is a bit of architecture / infrastructure on the Hortonworks HDF side, which enables the solution as a whole to ingest, process, and push the data in real-time. Not sure Microsoft Flow is currently engineered on the back end to handle the throughput.
  • The current Microsoft Flow UI may need to be updated to handle this ‘slightly altered’ paradigm of real-time content consumption and distribution.

Comparing Microsoft Flow with NiFi on HDF may be a huge stretch.