Excellent presentations from the Hortonworks team on “NiFi on HDF” solutions architecture and best practices. It is a powerful solution for processing and distributing any data in real-time, in large quantities, and with resiliency. It’s no wonder the US NSA originally developed the ability to consume data in real-time, manipulate it, and then send it on its way. However, recognizing the commercial applications (benevolent wisdom?), the NSA released the product as open-source software via its technology transfer program.
As a tangent, among other things, I’m currently exploring the capabilities of “Microsoft Flow”, which has recently been promoted to GA from its ‘Preview Release’. One resonating question came to mind during the presentations last night:
At its peak maturity (not yet), can Microsoft Flow successfully compete with Apache NiFi on Hortonworks HDF?
Discussion Points:
The NiFi / HDF solution manages data flows in real-time. The Microsoft Flow architecture seems to fall short in this capacity. Is it on the product road map for Flow? Is it a capability Microsoft wants to have?
There is a bit of architecture / infrastructure on the Hortonworks HDF side, which enables the solution as a whole to ingest, process, and push the data in real-time. I am not sure Microsoft Flow is currently engineered on the back end to handle the throughput.
The current Microsoft Flow UI may need to be updated to handle this ‘slightly altered’ paradigm of real-time content consumption and distribution.
Comparing Microsoft Flow with NiFi on HDF may be a huge stretch.
Serverless computing is a cloud computing code execution model in which the cloud provider fully manages starting and stopping virtual machines as necessary to serve requests, and requests are billed by an abstract measure of the resources required to satisfy the request, rather than per virtual machine, per hour.[97] Despite the name, it does not actually involve running code without servers.[97] Serverless computing is so named because the business or person that owns the system does not have to purchase, rent or provision servers or virtual machines for the back-end code to run.
Based on your application Use Case(s), Cloud Serverless Computing architecture may reduce ongoing costs for application usage, and provide scalability on demand without the Cloud Server Instance management overhead, i.e. costs and effort.
Note: Cloud Serverless Computing is used interchangeably with Functions as a Service (FaaS), which makes sense from a developer’s standpoint, as they are coding Functions (or Methods), and that’s the level of abstraction.
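To make the billing difference concrete, here is a minimal sketch that compares an always-on virtual machine with pay-per-request pricing. Every price in it is a made-up placeholder chosen for illustration, not a quote from any provider, and the function names are hypothetical.

```python
# Hypothetical prices, chosen only to illustrate the two billing models.
VM_PRICE_PER_HOUR = 0.10            # always-on virtual machine, USD per hour
PRICE_PER_MILLION_REQUESTS = 0.20   # per-request charge, USD
PRICE_PER_GB_SECOND = 0.0000167     # charge for memory held while code runs, USD
HOURS_PER_MONTH = 730


def vm_monthly_cost(instances: int = 1) -> float:
    """Cost of keeping `instances` VMs running all month, busy or idle."""
    return instances * VM_PRICE_PER_HOUR * HOURS_PER_MONTH


def serverless_monthly_cost(requests: int, seconds_per_request: float,
                            memory_gb: float) -> float:
    """Cost when billed only for requests served and resources consumed."""
    request_charge = requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_charge = (requests * seconds_per_request
                      * memory_gb * PRICE_PER_GB_SECOND)
    return request_charge + compute_charge


if __name__ == "__main__":
    # Compare the two models across a range of monthly request volumes.
    for requests in (100_000, 10_000_000, 200_000_000):
        serverless = serverless_monthly_cost(requests, 0.2, 0.5)
        print(f"{requests:>12,} requests/month: "
              f"serverless ${serverless:,.2f} vs one VM ${vm_monthly_cost():,.2f}")
```

With these placeholder numbers, low and moderate request volumes come out cheaper on the per-request model, while a very busy application eventually costs more than a single VM. That is why the usage profile of the application matters.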
Create automated workflows between apps and services to get notifications, synchronize files, collect data, and more. Although not the traditional Serverless Computing implementation, it’s the quickest way to perform application services without having to procure the application servers. Depending on your microservices (connectors + templates) definitions, you may not need to write a single line of code, and it could all be done through the Flow console.
Connectors are “enablers” to connect to [data] sources in order to extract or insert data, typically one Connector per service, such as Twitter.
Templates utilize Connectors, and enable workflow designers to build business process workflows. The resulting workflows run either when an Event trigger fires, or ad hoc through manual execution in the portal or the Microsoft Flow mobile apps.
154 Service Connectors exist. Several “Premium” connectors require a nominal monthly fee (5 USD). For example, the Oracle Database Connector empowers the workflow designer to insert, update, select, and delete rows in a table.
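The relationship between Connectors and Templates can be pictured with a small sketch. This is not Microsoft Flow's actual API. The `Connector` and `Template` classes, their methods, and the in-memory `store` are all hypothetical stand-ins used only to show one connector per service and a template that wires two connectors together.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def identity(record: dict) -> dict:
    """Default transform: pass the record through unchanged."""
    return record


@dataclass
class Connector:
    """One connector per service; it can extract data from it or insert data into it."""
    service: str
    store: List[dict] = field(default_factory=list)  # in-memory stand-in for the remote service

    def insert(self, record: dict) -> None:
        self.store.append(record)

    def extract(self) -> List[dict]:
        return list(self.store)


@dataclass
class Template:
    """A reusable workflow: read from a source connector, write to a target connector."""
    name: str
    source: Connector
    target: Connector
    transform: Callable[[dict], dict] = identity

    def run(self) -> int:
        """Run once, whether fired by an event trigger or started manually."""
        records = self.source.extract()
        for record in records:
            self.target.insert(self.transform(record))
        return len(records)


if __name__ == "__main__":
    twitter = Connector("Twitter")
    twitter.insert({"text": "hello"})
    sheet = Connector("Excel")
    flow = Template("Save tweets to a spreadsheet", twitter, sheet,
                    lambda record: {"row": record["text"]})
    print(flow.run(), sheet.extract())
```

The workflow designer only picks the connectors and the mapping between them; the same `run` method serves both the event-triggered and the manual case.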
Automating business processes by designing workflows to turn repetitive tasks into multi-step workflows
Microsoft Flow Pricing
As listed below, there are three tiers, including a free tier for personal use or for exploring the platform for your business. The paid Flow plans seem ridiculously inexpensive based on what business workflow designers receive for 5 USD or 15 USD per month. Microsoft Flow has abstracted building workflows so that almost anyone can build application workflows or automate manual business workflows, leveraging almost any of the popular applications on the market.
It doesn’t seem like 3rd party [data] Connector and Template creators receive any direct monetary value from the Microsoft Flow platform, although workflow designers and business owners may be swayed to purchase 3rd party product licenses in order to use their core technology.
Process events with a serverless code architecture. An event-based serverless compute experience to accelerate development. Scale based on demand and pay only for the resources you consume.
Properly designed microservices have a single responsibility and can independently scale. With traditional applications being broken up into 100s of microservices, traditional platform technologies can lead to a significant increase in management and infrastructure costs. Google Cloud Platform’s serverless products mitigate these challenges and help you create cost-effective microservices.
AWS provides a set of fully managed services that you can use to build and run serverless applications. You use these services to build serverless applications that don’t require provisioning, maintaining, and administering servers for backend components such as compute, databases, storage, stream processing, message queueing, and more. You also no longer need to worry about ensuring application fault tolerance and availability. Instead, AWS handles all of these capabilities for you, allowing you to focus on product innovation and get faster time-to-market. It’s important to note that Amazon was the first contender in this space with a 2014 product launch.
Execute code on demand in a highly scalable serverless environment. Create and run event-driven apps that scale on demand.
Focus on essential event-driven logic, not on maintaining servers
Integrate with a catalog of services
Pay for actual usage rather than projected peaks
The OpenWhisk serverless architecture accelerates development as a set of small, distinct, and independent actions. By abstracting away infrastructure, OpenWhisk frees members of small teams to rapidly work on different pieces of code simultaneously, keeping the overall focus on creating user experiences customers want.
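As a rough illustration of what one “small, distinct, and independent action” looks like, here is a minimal sketch in the style of an OpenWhisk Python action, which exposes a `main` function that receives a dictionary of parameters and returns a dictionary. The `filename` parameter and the shape of the response are assumptions made up for this example.

```python
def main(args: dict) -> dict:
    """Entry point in the style of an OpenWhisk Python action: dict in, dict out.

    The `filename` key is a hypothetical event parameter used for illustration.
    """
    filename = args.get("filename", "")
    if not filename:
        return {"error": "no filename supplied"}
    return {
        "message": f"New file received: {filename}",
        "is_text": filename.endswith(".txt"),
    }


if __name__ == "__main__":
    # Local smoke test; on a serverless platform the runtime calls main() per event.
    print(main({"filename": "notes.txt"}))
```

The function holds no server or lifecycle code at all, which is the level of abstraction the FaaS model is aiming for.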
What’s Next?
Serverless Computing is a decision that needs to be made based on the usage profile of your application. For the right use case, serverless computing is an excellent choice that is ready for prime time and can provide significant cost savings.
There’s an excellent article by Moshe Kranc, published July 16th, 2017, called “Serverless Computing: Ready for Prime Time”, which at a high level can help you determine if your application is a candidate for Serverless Computing.
It looks like Microsoft created a generic workflow platform, product independent.
Microsoft has software solutions like MS Outlook, which has an [email] rules engine built in. SharePoint has a workflow solution within the SharePoint Platform, typically governing the content flowing through its system.
Microsoft Flow is a different animal. It seems like Microsoft has built a ‘generic’ rules engine for processing almost any event. The Flow product:
Start using the product from one of two areas: a) “My Flows” where I may view existing and create new [work]flows. b) “Activity”, that shows “Notifications” and “Failures”
Select “My Flows”, and the user may “Create [a workflow] from Blank”, or “Browse Templates”. The existing set of templates was created by Microsoft and also by 3rd parties, implying a marketplace.
Select “Create from Blank” and the user has a single drop down list of events, a compilation of events across Internet products. The implication is that any product and event could be “made compatible” with MSFT Flow.
The drop down list of events has a format of “Product – Event”. As the list of products and events grows, we should see at least two separate drop down lists, one for products, and a sub list for the product specific events.
Several Example Events Include:
“Dropbox – When a file is created”
“Facebook – When there is a new post to my timeline”
“Project Online – When a new task is created”
“RSS – When a feed item is published”
“Salesforce – When an object is created”
The list of products as well as their events may need a business analyst to rationalize the use cases.
Once an Event is selected, event specific details may be required, e.g. Twitter account details, or OneDrive “watch” folder
Next, a Condition may be added to this [work]flow, and may be specific to the Event type, e.g. OneDrive File Type properties [contains] XYZ value. There is also an “advanced mode” using a conditional scripting language.
There is “IF YES” and “IF NO” logic, which then allows the user to select one [or more] actions to perform
Several Action Examples Include:
“Excel – Insert Rows”
“FTP – Create File”
“Google Drive – List files in folder”
“Mail – Send email”
“Push Notification – Send a push notification”
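The Event, Condition, and “IF YES” / “IF NO” steps described above can be pictured as a small rules engine. The sketch below is not how Microsoft Flow is implemented. The `Flow` class, the event dictionary with a `name` key, and the two action functions are hypothetical, and only the trigger and action labels are borrowed from the examples in this post.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Event = Dict[str, str]
Action = Callable[[Event], str]


@dataclass
class Flow:
    trigger: str                          # a "Product - Event" label
    condition: Callable[[Event], bool]    # evaluated once the trigger fires
    if_yes: List[Action] = field(default_factory=list)
    if_no: List[Action] = field(default_factory=list)

    def handle(self, trigger: str, event: Event) -> List[str]:
        """Run the matching branch of actions, or nothing for other triggers."""
        if trigger != self.trigger:
            return []
        branch = self.if_yes if self.condition(event) else self.if_no
        return [action(event) for action in branch]


def send_email(event: Event) -> str:
    return f"Mail - Send email about {event['name']}"


def push_notification(event: Event) -> str:
    return f"Push Notification - skipped {event['name']}"


flow = Flow(
    trigger="Dropbox - When a file is created",
    condition=lambda event: event["name"].endswith(".txt"),
    if_yes=[send_email],
    if_no=[push_notification],
)

if __name__ == "__main__":
    print(flow.handle("Dropbox - When a file is created", {"name": "notes.txt"}))
    print(flow.handle("Dropbox - When a file is created", {"name": "photo.png"}))
```

Each branch holds a list, which mirrors the ability to select one or more actions per outcome.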
Again, it seems like an eclectic bunch of Products, Actions, and Events strung together to have a system to POC.
The Templates list is a predefined set of workflows that may be of interest to anyone who does not want to start from scratch. The UI provides several ways to filter, list, and search through templates.
Applicable to everyday life, from an individual home user, small business, to the enterprise. At this stage the product seems in Beta at best, or more accurately, just after clickable prototype. I ran into several errors trying to go through basic use cases, i.e. adding rules.
Despite the “Preview” launch, Microsoft has shown us the power in [work]flow processing regardless of the service platform provider, e.g. Box, DropBox, Facebook, GitHub, Instagram, Salesforce, Twitter, Google, MailChimp, …
Microsoft may be the glue to combine service providers who expose their services to MSFT Flow functionality.
e.g. Language:Translation; E.g.2. Visual Recognition;
WordPress – Create a Post
New text file dropped in specific folder on Box, DropBox, etc. being ‘monitored’ by MSFT flow [?] Additional code required by user for ‘polling’ capabilities
OR new text file attached, and emailed to specific email account folder ‘watched’ by MSFT Flow.
Event triggers – Automatic read of new text file
stylizing may occur if HTML coding used
Action – Post to a Blog
‘ANY’ Event occurs, a custom message is sent using Skype for a single or group of Skype accounts;
On several ‘eligible’ events, such as “File Creation” into Box, the file (or file shared URL) may be sent to the Skype account.
‘ANY’ Event occurs, a custom mobile text message is sent to a single or group of phone numbers.
Event occurs for “File Creation” e.g. into Box; after passing a “Condition”, actions occur:
IBM Watson Cognitive API, Text to Speech, occurs, and the product of the action is placed in the same Box folder.
Action: Using Microsoft Edge (powered by MSN), in the “My news feed” tab, enable action to publish “Cards”, such as app notifications
Challenges \ Opportunities \ Unknowns
3rd party companies’ existing, published [cloud; web service] APIs may not even need any modification to integrate with Microsoft Flow; however, business approval may be required to use the API in this manner.
It is unclear whether Flow Templates need to be created by the product owner, e.g. Telestream, or by a knowledgeable third party, following the Android, iOS, and/or MSFT Mobile Apps model.
It is unclear if the MSFT Flow app may be licensed individually in the cloud, within the 365 cloud suite, or offered for Home and/or Business.
Smart Solutions
Definition: Product Owner (PO)
The Product Owner (PO) is a member of the Agile Team responsible for defining Stories and prioritizing the Team Backlog to streamline the execution of program priorities while maintaining the conceptual and technical integrity of the Features or components for the team.