Occasionally, when a thought bubbles up in my brain, I pop open Twitter and tweet it. In some cases, the fleeting idea seems larger than a tweet, so I open up WordPress and start a post. I may save it and come back to add content later. Sometimes I come back to the post, ask myself what I was thinking, and don’t pursue publishing it. Here’s the list of blog posts that I drafted this year but decided, for one reason or another, not to publish.
When people think of Data Loss Prevention (DLP), they usually think of endpoint protection, such as the Symantec Endpoint Security solution, preventing the upload of data to web sites or the download of data to a USB device. The data being “illegally” transferred typically conforms to a particular pattern, such as Personally Identifiable Information (PII), e.g. Social Security numbers.
Using a client for local monitoring of the endpoint, the agent detects the transfer of information as a last line of defense against external distribution. Endpoint solutions could monitor suspicious activity and/or proactively cancel the data transfer in progress.
Moving closer to the source of the data loss, monitoring databases filled with Personally Identifiable Information (PII) has its advantages and disadvantages. One may argue there is no data loss until the employee attempts to export the data outside the corporate network, and the data is in-flight. In addition, extracted PII data may be “properly utilized” within the corporate network for analysis.
There is a database solution that provides similar “endpoint” monitoring and protection: Teleran Technologies. It identifies PII data extraction and cancels queries in real time upon detection, leveraging “out of the box” data patterns. Teleran supports relational databases such as Oracle and Microsoft SQL Server, both on-prem and in the cloud.
Updates in Data Management Policies
Identifying where data loss originates provides opportunities to close gaps in data management policy and to implement additional controls over data. Data classification is done dynamically based on common data mask structures. Users may build additional rules to cover custom structures. For example, if a business analyst executes a query against a database that appears to fit predefined data masks, such as SSN, the query may be canceled before it’s even executed, and/or this “suspicious” activity can be flagged for the Chief Information Officer (CIO) and/or Chief Security Officer (CSO).
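To make the data mask idea a little more concrete, here is a minimal sketch of how a rule check like this might look. It is not Teleran’s implementation or API. The mask table, the simplified SSN pattern, and the function names `classify_row` and `review_query` are all my own assumptions, and the sketch inspects sample values rather than the query text itself.

```python
import re

# Hypothetical data masks. The SSN pattern is a simplified illustration only;
# users could add their own entries to cover custom structures.
DATA_MASKS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify_row(row):
    """Return the set of mask names matched by any value in the row."""
    hits = set()
    for value in row:
        text = str(value)
        for name, pattern in DATA_MASKS.items():
            if pattern.search(text):
                hits.add(name)
    return hits


def review_query(user, sample_rows, cancel_on=("SSN",)):
    """Decide whether to allow or cancel a query based on a sample of its rows."""
    matched = set()
    for row in sample_rows:
        matched |= classify_row(row)
    action = "cancel" if matched & set(cancel_on) else "allow"
    # The returned record is the kind of event that could be flagged for review.
    return {"user": user, "matched": sorted(matched), "action": action}


if __name__ == "__main__":
    print(review_query("analyst1", [("Jane Doe", "123-45-6789")]))
```

The returned dictionary stands in for the “suspicious activity” event that would be routed to whoever reviews these requests.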
Bar none, I’ve seen only one firm that defends a company’s data assets closer to the probable leak of information, the database: Teleran Technologies. See what they have to offer your organization for data protection and compliance.
Prevalent Remote Work Changes Endpoint Strategy
With remote work now prevalent in our corporate environments, relying on endpoints may be too late to enforce data protection. We may need to bring potential data loss detection into the inner sanctum of the corporate network, with prevention closer to the source of the data being extracted. How are “semi-trusted” third parties, such as offshore staff augmentation, dealt with?
Endpoint DLP – Available Breach Tactics
Endpoint DLP may capture and contain attempts to extract PII data, for example, by parsing text files for SSNs or other data masks. However, there are ways around transfer detection that make extraction difficult to identify, such as screen captures of data, which convert text into images. Some endpoint providers boast about their Optical Character Recognition (OCR); however, turning on this feature may produce many false positives, too many to sift through in monitoring and unmanageable to control. The best DLP defense is to monitor and control closer to the data source and, perhaps, flag data requests from employees. For example, after a SELECT statement is entered, the UI pops up a “Reason for Request?” prompt if PII extraction is identified in real time, with auditable events that can flow into Splunk.
The problem is more widespread than highlighted in the article. It’s not just these high-profile companies using “public domain” images to annotate with facial recognition notes and train machine learning (ML) models. Anyone can scan the Internet for images of people and build a vast library of faces. These faces can then be used to train ML models. In fact, using public domain images from “the Internet” will cut across multiple data sources, not just Flickr, which increases the sample size and may improve the model.
The rules around the use of “Public Domain” image licensing may need to be updated. One possible simple solution is to add a watermark to any images that do not have permission to be used for facial recognition model training. All image processors may then be required to include a preprocessor that detects the watermark and, if found, excludes the image from model training.
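The preprocessor step could be as simple as a filter in front of the training pipeline. The sketch below is only an illustration. No such watermark standard exists today, so the marker `NO_FR_TRAINING_MARKER` and both function names are made up, and a real watermark would likely live in the pixels or metadata rather than in a plain byte sequence.

```python
# Hypothetical opt-out marker; an assumption for illustration, not a real standard.
NO_FR_TRAINING_MARKER = b"NO-FR-TRAINING"


def has_opt_out_watermark(image_bytes):
    """Return True if the raw image bytes carry the hypothetical opt-out marker."""
    return NO_FR_TRAINING_MARKER in image_bytes


def filter_training_images(images):
    """Given a dict of name -> raw bytes, return the names allowed for training."""
    return [name for name, data in images.items()
            if not has_opt_out_watermark(data)]


if __name__ == "__main__":
    candidates = {
        "a.jpg": b"\xff\xd8plain image data",
        "b.jpg": b"\xff\xd8NO-FR-TRAININGimage data",
    }
    print(filter_training_images(candidates))
```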
Voice mail is so LAST century. It’s a static communications interface for handling your incoming phone calls, a dinosaur in terms of communications protocol. Yes, a digital assistant or chatbot should “field” your incoming calls, providing your callers a higher level of service.
Business or Personal?
Why not both? There are use cases which highlight the value of a Digital Assistant answering your phone calls when you’re unavailable.
Trusted Friends and Business Pins
Level of available services may change based upon the level of trusted access, such as:
- Friends Seeking Your Availability for a Hockey Game Next Week
- Business Partners Sharing Information access such as invoices
Untrusted Caller Access
- The Vetting of Unsolicited Calls, such as robocalls
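One way to picture the trusted and untrusted tiers is a lookup from caller to trust level, and from trust level to the services offered. This is just a sketch under my own assumptions: the tier names, service names, and the PIN check are hypothetical and not tied to any particular assistant platform.

```python
# Hypothetical trust tiers and the services each tier may reach.
SERVICES_BY_TRUST = {
    "friend": ["check_availability"],
    "business": ["check_availability", "share_invoices"],
    "untrusted": ["vetting_questions"],
}


def trust_level(caller_id, pin, contacts):
    """Resolve a caller to a tier.

    contacts maps caller_id -> (tier, pin). Unknown callers, or callers
    supplying the wrong PIN, are treated as untrusted.
    """
    entry = contacts.get(caller_id)
    if entry is None:
        return "untrusted"
    tier, expected_pin = entry
    return tier if pin == expected_pin else "untrusted"


def available_services(caller_id, pin, contacts):
    """Return the list of services the assistant may offer this caller."""
    return SERVICES_BY_TRUST[trust_level(caller_id, pin, contacts)]


if __name__ == "__main__":
    contacts = {"555-0100": ("friend", "1234"), "555-0199": ("business", "9876")}
    print(available_services("555-0199", "9876", contacts))
    print(available_services("555-0000", "", contacts))
```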
Defining and Default Dialogs
Users can define dialogs through drag and drop workflow diagram tools, making it easy to “build” conversation / dialog flows. In addition, out of the box flows can give administrators opportunities to discover the ways in which an AI digital assistant may be leveraged.
Canned / default dialog templates that handle the most common dialogs / workflows will empower users to implement rapidly.
Any Acquisitions in the Pipeline?
Are the big names in the Digital Assistant space looking to partner with or acquire tools that can easily transform workflows to be leveraged by digital assistants?
- IBM’s Conversations – chatbot dialog definition tool
- Interactive Voice Response (IVR) solutions
APIs available on Mobile OS SDKs?
Are the components available for third party product companies to extend the Mobile OS capabilities as of now? Or are the mobile OS companies the only ones capable of performing these upgrades?
Relational Database Solutions “In a Box”
Several of the relational database software vendors, such as IBM, Oracle, and Teradata have developed proprietary data warehouse software to be tightly coupled with server hardware to maximize performance. These solutions have been developed and refined as “on-prem” solutions for many years.
We’ve seen the rise of “Database (DW) as a Service” from companies like Amazon, who sell Redshift services.
Amazon Redshift is a fast, fully managed data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing Business Intelligence (BI) tools. It allows you to run complex analytic queries against petabytes of structured data, using sophisticated query optimization, columnar storage on high-performance local disks, and massively parallel query execution. Most results come back in seconds.
RDB Complex Software/Hardware Maintenance
In recent times, the traditional relational database software vendors shifted gears to become service providers offering maximum performance from a solution hosted by them, the vendor, in the Cloud. On the positive side, the added complexity of configuring and tuning a blended software/hardware data warehouse has been shifted from the client’s team resources such as Database Administrators (DBAs), Network Administrators, Unix/Windows Server Admins,… to the database software service provider. The complexity of tuning for scalability, and other maintenance challenges shifts to the software vendor’s expertise, if that’s the abstraction you select. There is some ambiguity in the delineation of responsibilities with the RDBMS vendor’s cloud offerings.
Total Cost of Ownership
Quantifying the total cost of ownership of a solution may be a bit tricky, especially if you’re trying to quantify the RDBMS hybrid software/hardware “on-prem” solution versus the same or similar capabilities brought to the client via “Database (DW) as a Service”.
“On-Prem”, RDB Client Hosted Solution
Several factors need to be considered when selecting ANY software and/or Hardware to be hosted at the client site.
- Infrastructure “when in Rome”
- Organizations have a quantifiable cost related to hosting physical or virtual servers in the client’s data center and may be boiled down to a number that may include things like HVAC, or new rack space.
- Resources used to maintain/monitor DC usage, there may be an abstracted/blended figure.
- Database Administrators maintain and monitor RDB solutions.
- Activities may range from RDB patches/upgrades to resizing/scaling the DB storage “containers”.
- Application Database Admins/Developers may be required to maintain the data warehouse architecture, such as new requirements, e.g. creating aggregate tables for BI analysis.
- Network Administrators
- Firewalls, VPN
- Port Scanning
- Windows/Unix Server Administrators
- OS Patches
Trying to correlate these costs in some type of “Apples to Apples” comparison to the “Data Warehouse as a Service” may require accountants and technical folks to do extensive financial modeling. Vendors such as Oracle offer everything from fully managed services to, at the opposite end of the spectrum, “Bare Metal”, essentially “Infrastructure as a Service.” The Oracle Exadata solution can be a significant investment depending on how much redundancy and scalability is purchased, leveraging Oracle Real Application Clusters (RAC).
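At its simplest, the comparison is a matter of summing the cost categories listed above for each model and comparing the totals. The sketch below shows only the arithmetic. The function names are my own, every number in the usage example is a placeholder rather than real pricing, and a real financial model would account for far more (depreciation, growth, contract terms).

```python
def on_prem_annual_cost(infrastructure, software_hardware, staff):
    """Sum annual on-prem costs; staff maps role -> annual cost attributed to the DW."""
    return infrastructure + software_hardware + sum(staff.values())


def service_annual_cost(monthly_fee, retained_staff):
    """Annual cost of the hosted service plus any staff the client still needs."""
    return 12 * monthly_fee + sum(retained_staff.values())


def compare(on_prem, service):
    """Describe which option is cheaper per year, and by how much."""
    diff = on_prem - service
    if diff > 0:
        return f"service is cheaper by {diff} per year"
    if diff < 0:
        return f"on-prem is cheaper by {-diff} per year"
    return "costs are equal"


if __name__ == "__main__":
    # Placeholder figures only, chosen to exercise the arithmetic.
    staff = {"dba": 50, "app_dba": 40, "network_admin": 20, "server_admin": 20}
    on_prem = on_prem_annual_cost(infrastructure=30, software_hardware=200, staff=staff)
    service = service_annual_cost(monthly_fee=20, retained_staff={"app_dba": 40})
    print(on_prem, service, compare(on_prem, service))
```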
Support and Staffing Models for DW Cloud Vendors
In order for the traditional RDB software vendors to accommodate a “Data Warehouse as a Service” model, they may need to significantly increase staff across a variety of technical disciplines, as outlined above for the client “On-Prem” model. A significant ramp-up of staff, along with the organizational challenges of developing and implementing a support model, may have relational database vendors ask: should they leverage a top-tier consulting agency such as Accenture or Deloitte to define, implement, and refine a managed service? It’s certainly a tall order to go from being a software vendor to offering large-scale services. Because these agencies have global corporate footprints and positive track records implementing managed services of all types, it’s an attractive proposition for both the RDB vendor and the consulting agency that wins the bid. The DW service billing models don’t seem to make sense on some level. Any consulting agency that implements a DW managed service would be responsible for ensuring ROI for both the RDB vendor and their clients. The arrangement may be opaque to the end client leveraging the Data Warehouse as a Service, but certainly the quality of service provided should be nothing less than if it were implemented by the RDB vendor itself. If the end game for the RDB vendor is for the consulting agency to implement and mature the service, and then at some point to bring the service in-house, it could help to keep costs down while maturing the managed service.
Here are URLs for reference to understand the capabilities that are realized through Oracle’s managed services.
Note: The opinions shared here are my own.
What is Cloud Serverless Computing?
Create automated workflows between apps and services to get notifications, synchronize files, collect data, and more. Although not the traditional Serverless Computing implementation, it’s the quickest way to perform application services without having to procure the application servers. Depending on your microservices (connectors + templates) definitions, you may not need to write a single line of code, and it could all be done through the Flow console.
- Connectors are “enablers” to connect to [data] sources in order to extract or insert data, typically one Connector per service, such as Twitter.
- Templates utilize Connectors and enable workflow designers to build business process workflows. The resulting workflows execute either when an event trigger fires or ad hoc / manually through the portal or the Microsoft Flow mobile apps.
- 154 Service Connectors exist. Several “Premium” connectors require a nominal monthly fee (5 USD). For example, the Oracle Database Connector empowers the workflow designer to insert, update, select, and delete rows in a table.
- Automating business processes by designing workflows to turn repetitive tasks into multi-step workflows
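The Connector and Template relationship described above can be pictured with a small conceptual model. This is not the Microsoft Flow API, and no code is needed to use Flow itself. The `Connector` and `Template` classes, their methods, and the event name are hypothetical stand-ins to show how an event trigger drives a sequence of connector actions.

```python
class Connector:
    """Hypothetical stand-in for a service connector; it only records what it receives."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def insert(self, record):
        """Pretend to insert a record into the connected service."""
        self.received.append(record)


class Template:
    """A workflow: one trigger event type followed by ordered steps."""

    def __init__(self, trigger, steps):
        self.trigger = trigger
        self.steps = steps  # list of callables that take the event payload

    def run(self, event_type, payload):
        """Run every step if the event matches the trigger; report whether it ran."""
        if event_type != self.trigger:
            return False
        for step in self.steps:
            step(payload)
        return True


if __name__ == "__main__":
    table = Connector("database")
    flow = Template("new_tweet", [table.insert])
    flow.run("new_tweet", {"text": "hello"})
    print(table.received)
```

A manual, ad hoc run would amount to calling `run` directly with the trigger's event type instead of waiting for the event to arrive.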
Microsoft Flow Pricing
As listed below, there are three tiers, including a free tier for personal use or for exploring the platform for your business. The paid Flow plans seem ridiculously inexpensive based on what business workflow designers receive for the 5 USD or 15 USD per month. Microsoft Flow has abstracted building workflows so almost anyone can build application workflows or automate manual business workflows leveraging almost any of the popular applications on the market.
It doesn’t seem like 3rd party [data] Connector and Template creators receive any direct monetary value from the Microsoft Flow platform, although workflow designers and business owners may be swayed to purchase 3rd party product licenses for the use of their core technology.
Process events with a serverless code architecture. An event-based serverless compute experience to accelerate development. Scale based on demand and pay only for the resources you consume.
Properly designed microservices have a single responsibility and can independently scale. With traditional applications being broken up into 100s of microservices, traditional platform technologies can lead to a significant increase in management and infrastructure costs. Google Cloud Platform’s serverless products mitigate these challenges and help you create cost-effective microservices.
AWS provides a set of fully managed services that you can use to build and run serverless applications. You use these services to build serverless applications that don’t require provisioning, maintaining, and administering servers for backend components such as compute, databases, storage, stream processing, message queueing, and more. You also no longer need to worry about ensuring application fault tolerance and availability. Instead, AWS handles all of these capabilities for you, allowing you to focus on product innovation and get faster time-to-market. It’s important to note that Amazon was the first contender in this space with a 2014 product launch.
Execute code on demand in a highly scalable serverless environment. Create and run event-driven apps that scale on demand.
- Focus on essential event-driven logic, not on maintaining servers
- Integrate with a catalog of services
- Pay for actual usage rather than projected peaks
The OpenWhisk serverless architecture accelerates development as a set of small, distinct, and independent actions. By abstracting away infrastructure, OpenWhisk frees members of small teams to rapidly work on different pieces of code simultaneously, keeping the overall focus on creating user experiences customers want.
Serverless Computing is a decision that needs to be made based on the usage profile of your application. For the right use case, serverless computing is an excellent choice that is ready for prime time and can provide significant cost savings.
There’s an excellent article, published July 16th, 2017 by Moshe Kranc, called “Serverless Computing: Ready for Prime Time”, which at a high level can help you determine if your application is a candidate for Serverless Computing.
Excellent article by .
Amazon’s Echo and Google’s Home are the two most compelling products in the new smart-speaker market. It’s a fascinating space to watch, for it is of substantial strategic importance to both companies as well as several more that will enter the fray soon. Why is this? Whatever device you outfit your home with will influence many downstream purchasing decisions, from automation hardware to digital media and even to where you order dog food. Because of this strategic importance, the leading players are investing vast amounts of money to make their product the market leader.
These devices have a broad range of functionality, most of which is not discussed in this article. As such, it is a review not of the devices overall, but rather simply their function as answer engines. You can, on a whim, ask them almost any question and they will try to answer it. I have both devices on my desk, and almost immediately I noticed something very puzzling: They often give different answers to the same questions. Not opinion questions, you understand, but factual questions, the kinds of things you would expect them to be in full agreement on, such as the number of seconds in a year.
How can this be? Assuming they correctly understand the words in the question, how can they give different answers to the same straightforward questions? Upon inspection, it turns out there are ten reasons, each of which reveals an inherent limitation of artificial intelligence as we currently know it…
Addendum to the Article:
As someone who has worked with Artificial Intelligence in some shape or form for the last 20 years, I’d like to throw in my commentary on the article.
- Human Utterances and their Correlation to Goal / Intent Recognition. There are innumerable ways to ask for something you want. The ‘ask’ is a ‘human utterance’ which should trigger the ‘goal / intent’ of what knowledge the person is requesting. AI chatbots, or digital agents, have a table of these utterances which all roll up to a single goal. Hundreds of utterances may be supplied per goal. In fact, Amazon has a service, Mechanical Turk, the Artificial Artificial Intelligence, through which you may “Ask workers to complete HITs – Human Intelligence Tasks – and get results using Mechanical Turk”. They boast access to a global, on-demand, 24 x 7 workforce to get thousands of HITs completed in minutes. There are also ways in which the AI digital agent may ‘rephrase’ what the AI considers closely related utterances. Companies like IBM benchmark against human-level recognition, i.e. accurately comprehending about 95% of the words in a given conversation. On March 7, IBM announced it had become the first to home in on that benchmark, having achieved a 5.5% error rate.
- Algorithmic ‘weighted’ Selection versus Curated Content. It makes sense, based on how these two companies ‘grew up’, that Amazon relies on its curated content acquisitions such as Evi, a technology company which specialises in knowledge base and semantic search engine software. Its first product was an answer engine that aimed to directly answer questions on any subject posed in plain English text, which is accomplished using a database of discrete facts. “Google, on the other hand, pulls many of its answers straight from the web. In fact, you know how sometimes you do a search in Google and the answer comes up in snippet form at the top of the results? Well, often Google Assistant simply reads those answers.” Truncated answers equate to incorrect answers.
- Instead of a direct Q&A style approach, where a human utterance, a question, triggers an intent/goal [answer], the AI digital agent may follow a process of asking ‘clarifying questions’. A dialog workflow may disambiguate the goal by narrowing down what the user is looking for. This disambiguation process is a common technique in human interaction, and is represented in a workflow diagram with logic decision paths. It seems this technique may require human guidance, and may be prone to bias, error, and additional overhead for content curation.
- Who are the content curators for knowledge, providing ‘factual’ answers and/or opinions? Are curators ‘self-proclaimed’ Subject Matter Experts (SMEs), people with degrees in History, or IT / business analysts making the content decisions?
- Questions requesting opinionated information may produce answers that vary greatly between AI platforms, and between questions within the same AI knowledge base. Opinions may offend, be intentionally biased, or sour the AI / human experience.
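The utterance table and the clarifying question idea from the points above can be sketched together. This is a toy illustration, not how Alexa or Google Assistant actually work: the intents, the example utterances, the word overlap score, and the threshold are all my own assumptions. Many utterances roll up to one intent, and when two intents score equally well the agent asks rather than guesses.

```python
# Hypothetical utterance table: several example phrasings roll up to one intent.
INTENT_UTTERANCES = {
    "get_weather": ["what is the weather", "will it rain today"],
    "play_music": ["play a song", "put on some music"],
    "play_game": ["play a game", "start a game"],
}


def score(utterance, example):
    """Word overlap (Jaccard similarity) between the utterance and one example."""
    a = set(utterance.lower().split())
    b = set(example.lower().split())
    return len(a & b) / len(a | b)


def resolve(utterance, threshold=0.3):
    """Return ("intent", [name]), ("clarify", [names...]) or ("unknown", [])."""
    best = {
        intent: max(score(utterance, ex) for ex in examples)
        for intent, examples in INTENT_UTTERANCES.items()
    }
    top = max(best.values())
    if top < threshold:
        return ("unknown", [])
    candidates = sorted(intent for intent, s in best.items() if s == top)
    if len(candidates) > 1:
        # Ambiguous: the agent should ask a clarifying question listing these.
        return ("clarify", candidates)
    return ("intent", candidates)


if __name__ == "__main__":
    print(resolve("play a song"))
    print(resolve("play"))
```

In a real dialog workflow, the "clarify" result would branch to a follow-up prompt, which is the decision path a workflow diagram would show.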
The AI personal assistant with the “most usage”, spanning connectivity across all smart devices, will be the anchor toward which users will gravitate to control their ‘automated’ lives. An Amazon commercial just aired which depicted a dad with his daughter, and the daughter was crying about her boyfriend, who happened to be in the front yard yelling for her. The dad says to Amazon’s Alexa, sprinklers on, and yes, the boyfriend got soaked.
What is so special about the top spot for the AI Personal Assistant? Controlling the ‘funnel’ through which all information is accessed and actions are taken means the intelligent ability to:
- Serve up content / information, which could then be mixed in with advertisements, or ‘intelligent suggestions’ based on historical data, i.e. machine learning.
- Proactive, suggestive actions may lead to sales of goods and services. e.g. AI Personal Assistant flags potential ‘buys’ from eBay based on user profiles.
Three main sources of AI Personal Assistant value add:
- A portal to the “outside” world; e.g. if I need information, I wouldn’t “surf the web”, I would ask Cortana to go “Research” XYZ. In the Business Intelligence / data warehousing space, a business analyst may need to run a few queries in order to get the information they want. By the same token, Microsoft Cortana may come back to you several times to ask “for your guidance”.
- An abstraction layer between the user and their apps; The user need not ‘lift a finger’ to any app outside the Personal Assistant with noted exceptions like playing a game for you.
- User Profiles derived from the first two points; i.e. data collection on everything from spending habits to other day to day rituals.
Proactive and chatty assistants may win the “Assistant of Choice” on all platforms. Being proactive means collecting data more often than when it’s just you asking questions ad hoc. Proactive AI Personal Assistants that are Geo Aware may make “timely appropriate interruptions” (notifications) that may be based on time and location. E.g. “Don’t forget milk” says Siri, as you’re passing the grocery store. Around the time I leave work, Google Maps tells me if I have traffic and my ETA.
It’s possible for the [non-native] AI Personal Assistant to become the ‘abstract’ layer on top of ANY mobile OS (iOS, Android), and to be the funnel by which all actions / requests are triggered.
Microsoft Cortana has an iOS app and widget, which is wrapped around the OS. Tighter integration may be possible but not allowed by the iOS, the iPhone, and the Apple Co. Note: Google’s Allo does not provide an iOS widget at the time of this writing.
Antitrust violation by mobile smartphone maker Apple: iOS must allow for the ‘substitution’ of a competitive AI Personal Assistant to be triggered in the same manner as the native Siri, “press and hold home button” capability that launches the default packaged iOS assistant Siri.
Reminiscent of the Microsoft IE Browser / OS antitrust violations in the past.
Holding the iPhone Home button brings up Siri. There should be an OS setting to swap out which Assistant is to be used with the mobile OS as the default. Today, the iPhone / iPad iOS only supports “Siri” under the Settings menu.
ANY AI Personal Assistant should be allowed to replace the default OS Personal Assistant, from Amazon’s Alexa and Microsoft’s Cortana to any startup company with the expertise and resources needed to build and deploy a Personal Assistant solution. Has Apple taken steps to tightly couple Siri with its iOS?
AI Personal Assistant ‘Wish’ list:
- Interactive, Voice Menu Driven Dialog; The AI Personal Assistant should know what installed [mobile] apps exist, as well as their actionable, hierarchical taxonomy of features / functions. The Assistant should, for example, ask which application the user wants to use, and if not known by the user, the assistant should verbally / visually list the apps. After the user selects the app, the Assistant should then provide a list of function choices for that application; e.g. “Press 1 for ‘Play Song’”.
- The interactive voice menu should also provide a level of abstraction when available, e.g. User need not select the app, and just say “Create Reminder”. There may be several applications on the Smartphone that do the same thing, such as Note Taking and Reminders. In the OS Settings, under the soon to be NEW menu ‘ AI Personal Assistant’, a list of installed system applications compatible with this “AI Personal Assistant” service layer should be listed, and should be grouped by sets of categories defined by the Mobile OS.
- Capability to interact with IoT using user defined workflows. Hardware and software may exist in the Cloud.
- Ever tighter integration with native as well as 3rd party apps, e.g. Google Allo and Google Keep.
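The voice menu and abstraction items on this wish list can be sketched as a registry of installed apps and their functions. This is purely hypothetical: no mobile OS exposes such a registry as far as I know, and the app names, categories, and function names below are invented for illustration.

```python
# Hypothetical registry: installed apps, their OS-defined category, and their functions.
APP_REGISTRY = {
    "Reminders": {"category": "Reminders",
                  "functions": ["Create Reminder", "List Reminders"]},
    "Notes": {"category": "Note Taking",
              "functions": ["Create Note", "Create Reminder"]},
    "Music": {"category": "Media", "functions": ["Play Song"]},
}


def apps_for(function_name):
    """Return the apps that can perform a function, so the user need not name one."""
    return sorted(app for app, info in APP_REGISTRY.items()
                  if function_name in info["functions"])


def voice_menu(app):
    """Build the spoken menu of function choices for a selected app."""
    functions = APP_REGISTRY[app]["functions"]
    return [f'Press {i} for "{fn}"' for i, fn in enumerate(functions, start=1)]


def handle(request):
    """Route a spoken request, asking the user to choose when it is ambiguous."""
    apps = apps_for(request)
    if not apps:
        return ("Which application would you like to use? "
                + ", ".join(sorted(APP_REGISTRY)))
    if len(apps) == 1:
        return f"Running '{request}' in {apps[0]}"
    return f"Which app should handle '{request}'? " + ", ".join(apps)


if __name__ == "__main__":
    print(handle("Create Reminder"))
    print(voice_menu("Music"))
```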
Apple could already be making the changes as a natural course of their product evolution. Even if the ‘big boys’ don’t want to stir up a hornet’s nest, all you need is VC and a few good programmers to pick a fight with Apple.