The problem is more widespread then highlighted in the article. It’s not just these high profile companies using “public domain” images to annotate with facial recognition notes and training machine learning (ML) models. Anyone can scan the Internet for images of people, and build a vast library of faces. These faces can then be used to train ML models. In fact, using public domain images from “the Internet” will cut across multiple data sources, not just Flickr, which increases the sample size, and may improve the model.
The rules around the uses of “Public Domain” image licensing may need to be updated, and possibly a simple solution, add a watermark to any images that do not have permission to be used for facial recognition model training. All image processors may be required to include a preprocessor to detect the watermark in the image, and if found, skip the image from being included in the training of models.
Animals could provide a new source of training data for AI systems.
To train AI to think like a dog, the researchers first needed data. They collected this in the form of videos and motion information captured from a single dog, a Malamute named Kelp. A total of 380 short videos were taken from a GoPro camera mounted to the dog’s head, along with movement data from sensors on its legs and body.
They captured a dog going about its daily life — walking, playing fetch, and going to the park.
Researchers analyzed Kelp’s behavior using deep learning, an AI technique that can be used to sift patterns from data, matching the motion data of Kelp’s limbs and the visual data from the GoPro with various doggy activities.
The resulting neural network trained on this information could predict what a dog would do in certain situations. If it saw someone throwing a ball, for example, it would know that the reaction of a dog would be to turn and chase it.
The predictive capacity of their AI system was very accurate, but only in short bursts. In other words, if the video shows a set of stairs, then you can guess the dog is going to climb them. But beyond that, life is simply too varied to predict.
Dogs “clearly demonstrate visual intelligence, recognizing food, obstacles, other humans, and animals,” so does a neural network trained to act like a dog show the same cleverness?
It turns out yes.
Researchers applied two tests to the neural network, asking it to identify different scenes (e.g., indoors, outdoors, on stairs, on a balcony) and “walkable surfaces” (which are exactly what they sound like: places can walk). In both cases, the neural network was able to complete these tasks with decent accuracy using just the basic data it had of a dog’s movements and whereabouts.
Many applications that enable users to create their own content from word processing to graphics/image creation have typically relied upon 3rd party Content Management Solutions (CMS) / Digital Asset Management (DAM) platforms to collect metadata describing the assets upon ingestion into their platforms. Many of these platforms have been “stood up” to support projects/teams either for collaboration on an existing project, or reuse of assets for “other” projects. As a person constantly creating content, where do you “park” your digital resources for archiving and reuse? Your local drive, cloud storage, or not archived?
Average “Jane” / “Joe” Digital Authors
If I were asked for all the content I’ve created around a particular topic or group of topics from all my collected/ingested digital assets, it may be a herculean search effort spanning multiple platforms. As an independent creator of content, I may have digital assets ranging from Microsoft Word documents, Google Sheets spreadsheets, Twitter tweets, Paint.Net (.pdn) Graphics, Blog Posts, etc.
Capturing Content from Microsoft Office Suite Products
Many of the MS Office content creation products such as Microsoft Word have minimal capacity to capture metadata, and if the ability exists, it’s subdued in the application. MS Word, for example, if a user selects “Save As”, they will be able to add/insert “Authors”, and Tags. In Microsoft Excel, latest version, the author of the Workbook has the ability to add Properties, such as Tags, and Categories. It’s not clear how this data is utilized outside the application, such as the tag data being searchable after uploaded/ingested by OneDrive?
Blog Posts: High Visibility into Categorization and Tagging
A “blogging platform”, such as WordPress, places the Category and Tagging selection fields right justified to the content being posted. In this UI/UX, it forces a specific mentality to the creation, categorization, and tagging of content. This blogging structure constantly reminds the author to identify the content so others may identify and consume the content. Blog post content is created to be consumed by a wide audience of interested viewers based on those tags and categories selected.
Proactive Categorization and Tagging
Perpetuate content classification through drill-down navigation of a derived Information Architecture Taxonomy. As a “light weight” example, in WordPress, the Tags field when editing a Post, a user starts typing in a few characters, an auto-complete dropdown list appears to the user to select one or more of these previously used tags. Excellent starting point for other Content Creation Apps.
Users creating Blog Posts can define a Parent/Child hierarchy of categories, and the author may select one or more of relevant categories to be associated with the Post.
Artificial Intelligence (AI) Derived Tags
It wouldn’t be a post without mentioning AI. Integrated into applications that enable user content creation could be a tool, at a minimum, automatically derives an “Index” of words, or tags. The way in which this “intelligent index” is derived may be based upon:
# of times word occurrence
mention of words in a particular context
reference of the same word(s) or phrases in other content
defined by the same author, and/or across the platform.
This intelligently derived index of data should be made available to any platforms that ingest content from OneDrive, SharePoint, Google Docs, etc. These DAMs ( or Intelligent Cloud Storage) can leverage this information for any searches across the platforms.
Easy to Retrieve the Desired Content, and Repurpose It
Many Content Creation applications heavily rely on “Recent Accessed Files” within the app. If the Information Architecture/Taxonomy hierarchy were presented in the “File Open” section, and a user can drill down on select Categories/Subcategories (and/or tags), it might be easier to find the most desired content.
All Eyes on Content Curation: Creation to Archive
Content creation products should all focus on the collection of metadata at the time of their creation.
Using the Blog Posting methodology, the creation of content should be alongside the metadata tagging
Taxonomy (categories, and tags with hierarchy) searches from within the Content Creation applications, and from the Operating System level, the “Original” Digital Asset Management solution (DAM), e.g. MS Windows, Mac
about deconstructing existing functionality of entire Photo Archive and Sharing platforms.
to bring an awareness to the masses about corporate decisions to omit the advanced capabilities of cataloguing photos, object recognition, and advanced metadata tagging.
Backstory: The Asks / Needs
Every day my family takes tons of pictures, and the pictures are bulk loaded up to The Cloud using Cloud Storage Services, such as DropBox, OneDrive, Google Photos, or iCloud. A selected set of photos are uploaded to our favourite Social Networking platform (e.g. Facebook, Instagram, Snapchat, and/or Twitter).
Every so often, I will take pause, and create either a Photobook or print out pictures from the last several months. The kids may have a project for school to print out e.g. Family Portrait or just a picture of Mom and the kids. In order to find these photos, I have to manually go through our collection of photographs from our Cloud Storage Services, or identify the photos from our Social Network libraries.
Social Networking Platform Facebook
As far as I can remember the Social Networking platform Facebook has had the ability to tag faces in photos uploaded to the platform. There are restrictions, such as whom you can tag from the privacy side, but the capability still exists. The Facebook platform also automatically identifies faces within photos, i.e. places a box around faces in a photo to make the person tagging capability easier. So, in essence, there is an “intelligent capability” to identify faces in a photo. It seems like the Facebook platform allows you to see “Photos of You”, but what seems to be missing is to search for all photos of Fred Smith, a friend of yours, even if all his photos are public. By design, it sounds fit for the purpose of the networking platform.
Automatically upload new images in bulk or one at a time to a Cloud Storage Service ( with or without Online Printing Capabilities, e.g. Photobooks) and an automated curation process begins.
The Auto Curation process scans photos for:
“Commonly Identifiable Objects”, such as #Car, #Clock, #Fireworks, and #People
Auto Curation of new photos, based on previously tagged objects and faces in newly uploaded photos will be automatically tagged.
Once auto curation runs several times, and people are manually #taged, the auto curation process will “Learn” faces. Any new auto curation process executed should be able to recognize tagged people in new pictures.
Auto Curation process emails / notifies the library owners of the ingestion process results, e.g. Jane Doe and John Smith photographed at Disney World on Date / Time stamp. i.e. Report of executed ingestion, and auto curation process.
After upload, and auto curation process, optionally, it’s time to manually tag people’s faces, and any ‘objects’ which you would like to track, e.g. Car aficionado, #tag vehicle make/model with additional descriptive tags. Using the photo curator function on the Cloud Storage Service can tag any “objects” in the photo using Rectangle or Lasso Select.
Curation to Take Action
Once photo libraries are curated, the library owner(s) can:
Automatically build albums based one or more #tags
Smart Albums automatically update, e.g. after ingestion and Auto Curation. Albums are tag sensitive and update with new pics that contain certain people or objects. The user/ librarian may dictate logic for tags.
Where is this Functionality??
Why are may major companies not implementing facial (and object) recognition? Google and Microsoft seem to have the capability/size of the company to be able to produce the technology.
Is it possible Google and Microsoft are subject to more scrutiny than a Shutterfly? Do privacy concerns at the moment, leave others to become trailblazers in this area?
Protecting the Data Warehouse with Artificial Intelligence
Teleran is a middleware company who’s software monitors and governs OLAP activity between the Data Warehouse and Business Intelligence tools, like Business Objects and Cognos. Teleran’s suite of tools encompass a comprehensive analytical and monitoring solution called iSight. In addition, Teleran has a product that leverages artificial intelligence and machine learning to impose real-time query and data access controls. Architecture also allows for Teleran’s agent not to be on the same host as the database, for additional security and prevention of utilizing resources from the database host.
Key Features of iGuard:
Policy engine prevents “bad” queries before reaching database
Patented rule engine resides in-memory to evaluate queries at database protocol layer on TCP/IP network
Patented rule engine prevents inappropriate or long-running queries from reaching the data
70 Customizable Policy Templates
SQL Query Policies
Create policies using policy templates based on SQL Syntax:
Require JOIN to Security Table
Column Combination Restriction – Ex. Prevents combining customer name and social security #
Table JOIN restriction – Ex. Prevents joining two different tables in same query
Equi-literal Compare requirement – Tightly Constrains Query Ex. Prevents hunting for sensitive data by requiring ‘=‘ condition
By user or user groups and time of day (shift) (e.g. ETL)
Blocks connections to the database
White list or black list by
DB User Logins
OS User Logins
Applications (BI, Query Apps)
Rule Templates Contain Customizable Messages
Each of the “Policy Templates” has the ability to send the user querying the database a customized message based on the defined policy. The message back to the user from Teleran should be seamless to the application user’s experience.
Machine Learning: Curbing Inappropriate, or Long Running Queries
iGuard has the ability to analyze all of the historical SQL passed through to the Data Warehouse, and suggest new, customized policies to cancel queries with certain SQL characteristics. The Teleran administrator sets parameters such as rows or bytes returned, and then runs the induction process. New rules will be suggested which exceed these defined parameters. The induction engine is “smart” enough to look at the repository of queries holistically and not make determinations based on a single query.
The ultimate goal, in my mind, is to have the capability within a Search Engine to be able to upload an image, then the search engine analyzes the image, and finds comparable images within some degree of variation, as dictated in the search properties. The search engine may also derive metadata from the uploaded image such as attributes specific to the image object(s) types. For example, determine if a person [object] is “Joyful” or “Angry”.
As of the writing of this article, search engines Yahoo and Microsoft Bing do not have the capability to upload an image and perform image/pattern recognition, and return results. Behold, Google’s search engine has the ability to use some type of pattern matching, and find instances of your image across the world wide web. From the Google Search “home page”, select “Images”, or after a text search, select the “Images” menu item. From there, an additional icon appears, a camera with the hint text “Search by Image”. Select the Camera icon, and you are presented with options on how Google can acquire your image, e.g. upload, or an image URL.
Select the “Upload an Image” tab, choose a file, and upload. I used a fictional character, Max Headroom. The search results were very good (see below). I also attempted an uncommon shape, and it did not meet my expectations. The poor performance of matching this possibly “unique” shape is mostly likely due to how the Google Image Classifier Model was defined, and correlating training data that tested the classifier model. If the shape is “Unique” the Google Search Image Engine did it’s job.
Google Image Search Results – Max Headroom
Google Image Search Results – Odd Shaped Metal Object
The Google Search Image Engine was able to “Classify” the image as “metal”, so that’s good. However I would have liked to see better matches under the “Visually Similar Image” section. Again, this is probably due to the image classification process, and potentially the diversity of image samples.
A Few Questions for Google
How often is the Classifier Modeling process executed (i.e. training the classifier), and the model tested? How are new images incorporated into the Classifier model? Are the user uploaded images now included in the Model (after model training is run again)? Is Google Search Image incorporating ALL Internet images into Classifier Model(s)? Is an alternate AI Image Recognition process used beyond Classifier Models?
I’m not sure if the Cloud Vision API uses the same technology as Google’s Search Image Engine, but it’s worth noting. After reaching the Cloud Vision API starting page, go to the “Try the API” section, and upload your image. I tried a number of samples, including my odd shaped metal, and I uploaded the image. I think it performed fairly well on the “labels” (i.e. image attributes)
Using the Google Cloud Vision API, to determine if there were any WEB matches with my odd shaped metal object, the search came up with no results. In contrast, using Google’s Search Image Engine produced some “similar” web results.
Finally, I tested the Google Cloud Vision API with a self portrait image. THIS was so cool.
The API brought back several image attributes specific to “Faces”. It attempts to identify certain complex facial attributes, things like emotions, e.g. Joy, and Sorrow.
The API brought back the “Standard” set of Labels which show how the Classifier identified this image as a “Person”, such as Forehead and Chin.
Finally, the Google Cloud Vision API brought back the Web references, things like it identified me as a Project Manager, and an obscure reference to Zurg in my Twitter Bio.
The Google Cloud Vision API, and their own baked in Google Search Image Engine are extremely enticing, but yet have a ways to go in terms of accuracy %. Of course, I tried using my face in the Google Search Image Engine, and looking at the “Visually Similar Images” didn’t retrieve any images of me, or even a distant cousin (maybe?)
Interesting approach to an AI Chatbot implementation. The business process owner creates one or more Google Forms containing questions and answers, and converts/deploys to a chatbot using fobi.io. All the questions for [potential] customers/users are captured in a multitude of forms. Without any code, and within minutes, an interactive chatbot can be produced and deployed for client use.
The trade off for rapid deployment and without coding is a rigid approach of triggering user desired “Goal/Intents”. It seems a single goal/intent is mapped to a single Google Form. As opposed to a digital agent, which leverages utterances to trigger the user’s intended goal/intent. Before starting the chat, the user must select the appropriate Google Form, with the guidance of the content curator.
Another trade off is, it seems, no integration on the backend to execute a business process, essential to many chatbot workflows. For example, given an Invoice ID, the chatbot may search in a transactional database, then retrieve and display the full invoice. Actually, I may be incorrect. On the Google Forms side, there is a Script Editor. Seems powerful and scary all at the same time.
Another trade off that seems to exist, more on the Google Forms side, is building not just a Form with a list of Questions, but a Consumer Process Workflow, that allows the business to provide an interactive dialog based on answers users provide. For example, a Yes/No or multichoice answer may lead to alternate sets of questions [and actions]. It doesn’t appear there is any workflow tool provided to structure the Google Forms / fobi.io chatbot Q&A.
However, there are still many business cases for the product, especially for small to mid size organizations.
* Business Estimates – although there is no logic workflow to guide the Q&A sessions with [prospective] customers, the business still may derive the initial information they require to make an initial assessment. It seems a Web form, and this fobi.io / Google Forms solution seems very comparable in capability, its just a change in the median in which the user interacts to collect the information.
One additional note, Google Forms is not a free product. Looks like it’s a part of the G Suite. Free two week trial, then the basic plan is $5 per month, which comes with other products as well. Click here for pricing details.
Although this “chatbot” tries to quickly provide a mechanism to turn a form to a chatbot, it seems it’s still just a form at the end of the day. I’m interested to see more products from Zoi.ai soon
Going through the Amazon Lex build chat process, and configuration of the Digital Assistant was a breeze. AWS employs a ‘wizard’ style interface to help the user build the Chatbot / Digital Agent. The wizard guides you through defining Intents, Utterances, Slots, and Fulfillment.
Intents – A particular goal that the user wants to achieve (e.g. book an airline reservation)
Utterances – Spoken or typed phrases that invoke your intent
Slots – Data the user must provide to fulfill the intent
Prompts – Questions that ask the user to input data
Fulfillment – The business logic required to fulfill the user’s intent (i.e. backend call to another system, e.g. SAP)
The Amazon Lex Chatbot editor is also extremely easy to use, and to update / republish any changes.
The challenge with Amazon Lex appears to be a very limiting ability for chatbot distribution / deployment. Your Amazon Lex Chatbot is required to use one of three methods to deploy: Facebook, Slack, or Twilio SMS. Facebook is limiting in a sense if you do not want to engage your customers on this platform. Slack is a ‘closed’ framework, whereby the user of the chat bot must belong to a Slack team in order to communicate. Finally, Twilio SMS implies use of your chat bot though a mobile phone SMS.
I’ve reached out to AWS Support regarding any other options for Amazon Lex chatbot deployment. Just in case I missed something.
There is a “Test Bot” in the lower right corner of the Amazon Lex, Intents menu. The author of the business process can, in real-time, make changes to the bot, and test them all on the same page.
Is there a way to leverage the “Test Bot” as a “no frills” Chatbot UI, and embed it in an existing web page? Question to AWS Support.
One concern is for large volumes of utterances / Intents and slots. An ideal suggestion would allow the user a bulk upload through an Excel spreadsheet, for example.
I’ve not been able to utilize the Amazon Lambda to trigger server side processing.
Note: there seem to be several ‘quirky’ bugs in the Amazon Lex UI, so it may take one or two tries to workaround the UI issue.
IBM Watson Conversation also contends for this Digital Agent / Assistant space, and have a very interesting offering including dialog / workflow definition.
Both Amazon Lex and IBM Watson Conversation are FREE to try, and in minutes, you could have your bots created and deployed. Please see sites for pricing details.
Google may attempt to leapfrog their Digital Assistant competition by taking advantage of their ability to search against all Google products. The more personal data a Digital Assistant may access, the greater the potential for increased value per conversation.
As a first step, Google’s “Personal” Search tab in their Search UI has access to Google Calendar, Photos, and your Gmail data. No doubt other Google products are coming soon.
Big benefits are not just for the consumer to search through their Personal Goggle data, but provide that consolidated view to the AI Assistant. Does the Google [Digital] Assistant already have access to Google Keep data, for example. Is providing Google’s “Personal” search results a dependency to broadening the Digital Assistant’s access and usage? If so, these…
interactions are most likely based on a reactive model, rather than proactive dialogs, i.e. the Assistant initiating the conversation with the human.
“What you need, before you ask. Stay a step ahead with Now cards about traffic for your commute, news, birthdays, scores and more.”
I’m not sure how proactive the Google AI is built to provide, but most likely, it’s barely scratching the service of what’s possible.
Modeling Personal, AI + Human Interactions
Starting from N number of accessible data sources, searching for actionable data points, correlating these data points to others, and then escalating to the human as a dynamic or predefined Assistant Consumer Workflow (ACW). Proactive, AI Digital Assistant initiates human contact to engage in commerce without otherwise being triggered by the consumer.
Actionable data point correlations can trigger multiple goals in parallel. However, the execution of goal based rules would need to be managed. The consumer doesn’t want to be bombarded with AI Assistant suggestions, but at the same time, “choice” opportunities may be appropriate, as the Google [mobile] App has implemented ‘Cards’ of bite size data, consumable from the UI, at the user’s discretion.
As an ongoing ‘background’ AI / ML process, Digital Assistant ‘server side’ agent may derive correlations between one or more data source records to get a deeper perspective of the person’s life, and potentially be proactive about providing input to the consumer decision making process.
The proactive Google Assistant may suggest to book your annual fishing trip soon. Elevated Interaction to Consumer / User.
The Assistant may search Gmail records referring to an annual fishing trip ‘last year’ in August. AI background server side parameter / profile search. Predefined Assistant Consumer Workflow (ACW) – “Annual Events” Category. Building workflows that are ‘predefined’ for a core set of goals/rules.
AI Assistant may search user’s photo archive on the server side. Any photo metadata could be garnished from search, including date time stamps, abstracted to include ‘Season’ of Year, and other synonym tags.
Photos from around ‘August’ may be earmarked for Assistant use
Photos may be geo tagged, e.g. Lake Champlain, which is known for its fishing.
All objects in the image may be stored as image metadata. Using image object recognition against all photos in the consumer’s repository, goal / rule execution may occur against pictures from last August, the Assistant may identify the “fishing buddies” posing with a huge “Bass fish”.
In addition to the Assistant making the suggestion re: booking the trip, Google’s Assistant may bring up ‘highlighted’ photos from last fishing trip to ‘encourage’ the person to take the trip.
This type of interaction, the Assistant has the ability to proactively ‘coerce’ and influence the human decision making process. Building these interactive models of communication, and the ‘management’ process to govern the AI Assistant is within reach.
Predefined Assistant Consumer / User Workflows (ACW) may be created by third parties, such as Travel Agencies, or by industry groups, such as foods, “low hanging fruit” easy to implement the “time to get more milk” . Or, food may not be the best place to start, i.e. Amazon Dash
Do AI Rules Engines “deliberate” any differently between rules with moral weight over none at all. Rhetorical..?
The ethics that will explicitly and implicitly be built into implementations of autonomous vehicles involves a full stack of technology, and “business” input. In addition, implementations may vary between manufacturers and countries.
In the world of Kosher Certification, there are several authorities that provide oversight into the process of food preparation and delivery. These authorities have their own seal of approval. In lieu of Kosher authorities, who will be playing the morality, seal of approval, role? Vehicle Insurance companies? Car insurance will be rewritten when it comes to autonomous cars. Some cars may have a higher deductible or the cost of the policy may rise based upon the autonomous implementation.
Conditions Under Consideration:
1. If the autonomous vehicle is in a position of saving a single life in the vehicle, and killing one or more people outside the vehicle, what will the autonomous vehicle do?
1.1 What happens if the passenger in the autonomous vehicle is a child/minor. Does the rule execution change?
1.2 what if the outside party is a procession, a condensed population of people. Will the decision change?
The more sensors, the more input to the decision process.