Tag Archives: Digital Assistant

Riddle of the Sphinx: Improving Machine Learning

Data Correlations Require Perspective

As I was going to St. Ives,

I met a man with seven wives,

Each wife had seven sacks,

Each sack had seven cats,

Each cat had seven kits:

Kits, cats, sacks, and wives,

How many were there going to St. Ives?

One.

This short example may confound both man and machine. How does a rules engine work, and how does it make correlations to derive an answer to this and other riddles?  If an AI rules engine gets this riddle wrong, how does it use machine learning to adjust and tune its “model” to draw an alternate conclusion?
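The gap between the literal arithmetic and the intended answer can be made concrete. Below is a minimal sketch, assuming a naive engine that simply multiplies everything it reads; the function names are hypothetical and only illustrate the two readings of the riddle.

```python
def naive_count(branching: int = 7, levels: int = 4) -> int:
    """Sum wives, sacks, cats, and kits: 7 + 7**2 + 7**3 + 7**4."""
    return sum(branching ** level for level in range(1, levels + 1))


def contextual_count(narrator_is_going: bool = True) -> int:
    """Only the narrator is stated to be going to St. Ives; the man and
    his entourage were merely 'met' along the way."""
    return 1 if narrator_is_going else 0


print(naive_count())       # 2800
print(contextual_count())  # 1
```

The literal reading produces 2800, while the contextual reading produces the riddle's answer. The difference lies in which relationship ("met" versus "going to") the model treats as relevant.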

Training rules engines using machine learning and complex riddles may require AI to define relationships not previously considered, analogous to how a boy or a man approaches solving riddles.  A man has more experiences than a boy, which widens his model and increases the possible answer sets. But how does he conclude which answer is best?  The interpretation of a question’s sentence fragments may differ over a lifetime, so the man may have more context as to the number of ways a given fragment may be interpreted.

Adding Context: Historical and Pop Culture

There are some riddles thousands of years old.  They may have spawned from another culture in another time and survived and evolved to take on a whole new meaning.  Understanding the context of the riddle may be the clue to solving it.

Layers of historical culture provide context to the riddle, and the significance of a word or phrase in one period of history may differ wildly from its significance in another.  When you think of “periods of history”, you might think of the pinnacle of the Roman empire, or you may compare the 1960s, the 70s, 80s, etc.

Asking a question of an AI rules engine, such as a chatbot, may require contextual elements such as geographic location and “period in history” as additional dimensions of the data model.

Many chatbots have no need for additional context or referential subtext; they are simply “Expert Systems in a box”.  Digital assistants, however, may need additional dimensions of context, since they act as general knowledge digital agents spanning expertise without bounds.

Sophocles: The Sphinx’s riddle

Written in the fifth century B.C., Oedipus the King is one of the most famous pieces of literature of all time, so it makes sense that it gave us one of the most famous riddles of all time.

What goes on four legs in the morning, on two legs at noon, and on three legs in the evening?

A human.

Humans crawl on hands and knees (“four legs”) as babies, walk on two legs in mid-life (representing “noon”) and use a walking stick or cane (“three legs”) in old age.

A modern interpretation of the riddle may not allow for the correlation needed to solve it.  “Three legs”, i.e. a cane, may be elusive, as we now tend to think of the elderly on four wheels in a wheelchair.

In all sincerity, this article is not about an AI rules engine “firing rules” using a time dimension, such as:

  • Not letting a person gain entry to a building after a certain period of time, or…
  • Providing a time dimension to “Parental Controls” on a Firewall / Router, so that the Internet is “cut off” after 11 PM.

Adding a date/time dimension to the question may produce an alternate question. The context of the time changes the “nature” of the question, and therefore the answer as well.
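One way to picture a “period in history” dimension is as an extra key on the lookup an engine performs. The sketch below is illustrative only: the era labels, the table contents, and the function names are assumptions built from the cane / wheelchair example above, not a description of any real assistant.

```python
from typing import Dict, Optional, Tuple

# Hypothetical knowledge table: (concept, era) -> interpretation.
INTERPRETATIONS: Dict[Tuple[str, str], str] = {
    ("mobility aid in old age", "classical"): "walking stick",
    ("mobility aid in old age", "modern"): "wheelchair",
}

# Hypothetical count of ground contact points each aid adds to two legs.
EXTRA_SUPPORTS: Dict[str, int] = {"walking stick": 1, "wheelchair": 4}


def interpret(concept: str, era: str) -> Optional[str]:
    """Return the era-specific reading of a concept, or None if unknown."""
    return INTERPRETATIONS.get((concept, era))


def matches_three_legs(era: str) -> bool:
    """Does 'three legs in the evening' fit an elderly human in this era?"""
    aid = interpret("mobility aid in old age", era)
    if aid is None:
        return False
    return 2 + EXTRA_SUPPORTS[aid] == 3


print(matches_three_legs("classical"))  # True
print(matches_three_legs("modern"))     # False
```

With the era supplied, the same question fragment resolves to a different interpretation, and therefore to a different answer.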

Hostess with the Mostest – Apple Siri, Amazon Alexa, Microsoft Cortana, Google Assistant

Application Integration Opportunities:

  • Microsoft Office, Google G Suite, Apple iWork
    • Advice is integrated within the application, proactive and reactive: When searching in Microsoft Edge, a blinking circle representing Cortana is illuminated.  Cortana says “I’ve collected similar articles on this topic.”  If selected, Cortana presents 10 similar results in a right panel to help you find what you need.
  • Personal Data Access and Management
    • The user can vocally access their personal data, and make modifications to that data; E.g. Add entries to their Calendar, and retrieve the current day’s agenda.

Platform Capabilities: Mobile Phone Advantage

Strengthen core telephonic capabilities where the competition, Amazon and Microsoft, is relatively weak.

  • Ability to record conversations, and push/store content in the Cloud, e.g. iCloud.  A Cloud Serverless recording mechanism dynamically tags a conversation with “Keywords”, creating an Index to the conversation.  Users may search recordings, and play back audio clips from 10 seconds before to 10 seconds after each tagged occurrence.
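The keyword index and the +/- 10 second playback window can be sketched as follows. This is a minimal sketch under stated assumptions: the transcript format (timestamped words), the function names, and the sample data are all hypothetical, and a real recording service would need speech-to-text in front of it.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical transcript format: (seconds from call start, spoken word).
Transcript = List[Tuple[float, str]]


def build_keyword_index(transcript: Transcript, keywords: Set[str]) -> Dict[str, List[float]]:
    """Map each keyword to the timestamps at which it was spoken."""
    index: Dict[str, List[float]] = defaultdict(list)
    for timestamp, word in transcript:
        token = word.lower().strip(".,!?")
        if token in keywords:
            index[token].append(timestamp)
    return dict(index)


def clip_window(timestamp: float, duration: float, padding: float = 10.0) -> Tuple[float, float]:
    """Return (start, end) of a playback clip, clamped to the recording length."""
    return (max(0.0, timestamp - padding), min(duration, timestamp + padding))


transcript = [(3.0, "Hello"), (42.5, "invoice"), (95.0, "Invoice."), (118.0, "thanks")]
index = build_keyword_index(transcript, {"invoice"})
clips = [clip_window(t, duration=120.0) for t in index["invoice"]]
print(clips)  # [(32.5, 52.5), (85.0, 105.0)]
```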
Calls into the User’s Smartphones May Interact Directly with the Digital Assistant
  • Call Screening – The digital assistant asks for the name of the caller, purpose of the call, and if the matter is “Urgent”
    • A generic “purpose” response, or a list of caller purpose items can be supplied to the caller, e.g. 1) Schedule an Appointment
    • The smartphone’s user would receive the caller’s name and the purpose of the call as a message in the UI while the call is in a ‘hold’ state.
    • The smartphone user may decide to accept the call, or reject the call and send the caller to voice mail.
  • A caller may ask to schedule a meeting with the user, and the digital assistant may access the user’s calendar to determine availability.  The digital assistant may schedule a ‘tentative’ appointment within the user’s calendar.
    • If calendar indicates availability, a ‘tentative’ meeting will be entered. The smartphone user would have a list of tasks from the assistant, and one of the tasks is to ‘affirm’ availability of the meetings scheduled.
  • If a caller would like to know the address of the smartphone user’s office, the Digital Assistant may access a database of “generally available” information, and provide it. The Smartphone user may use applications like Google Keep, and any note tagged with a label “Open Access” may be accessible to any caller.
  • Custom business workflows may be triggered through the smartphone, such as “Pay by Phone”.  When a caller is calling a business user’s smartphone, the call goes to “voice mail” or “digital assistant” based on the smartphone user’s configuration.  If the caller reaches the “Digital Assistant”, there may be a list of options the caller may perform, such as a “Request for Service” appointment.  The caller would navigate through a voice recognition menu, one of many defined by the smartphone user’s workflows.
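The call screening steps above can be sketched as a small data structure plus a routing decision. This is a minimal sketch, not a real telephony API: the class, the function names, and the purpose menu are hypothetical, with the menu items taken from the examples in the list.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical purpose menu offered to the caller.
PURPOSES: List[str] = ["Schedule an Appointment", "Request for Service", "Pay by Phone"]


@dataclass
class ScreenedCall:
    caller_name: str
    purpose: str
    urgent: bool


def summarize_for_user(call: ScreenedCall) -> str:
    """Build the message shown in the smartphone UI while the call is on hold."""
    flag = "URGENT: " if call.urgent else ""
    return f"{flag}{call.caller_name} is on hold. Purpose: {call.purpose}"


def route_call(user_accepts: bool) -> str:
    """Connect the caller if the user accepts, otherwise send to voice mail."""
    return "connect" if user_accepts else "voicemail"


call = ScreenedCall("Kate Smith", PURPOSES[0], urgent=True)
print(summarize_for_user(call))  # URGENT: Kate Smith is on hold. Purpose: Schedule an Appointment
print(route_call(user_accepts=False))  # voicemail
```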

Platform Capabilities: Mobile Multimedia

Either through your mobile Smartphone, or through a portable speaker with voice recognition (VR).

  • Streaming media / music to portable device based on interactions with Digital Assistant.
  • Menu to navigate relevant (to you) news,  and Digital Assistant to read articles through your portable media device (without UI)

Third Party Partnerships: Adding User Base, and Expanding Capabilities

In the form of platform apps (abstraction), or 3rd party APIs which integrate into the Digital Assistant, allowing users to directly execute application commands, e.g. Play Spotify song, My Way by Frank Sinatra.

  • Any “Skill Set” with specialized knowledge: direct Q&A or instructional guidance – e.g. Home Improvement, Cooking
  • eCommerce Personalized Experience – Amazon
  • Home Automation – doors, thermostats
  • Music – Spotify
  • Navigate Set Top Box (STB) – e.g. find a program to watch
  • Video on Demand (VOD) – e.g. set to record entertainment

 

Smartphone AI Digital Assistant Encroaching on the Virtual Receptionist

Businesses already exist that develop and sell Virtual Receptionists, which handle many caller needs (e.g. call routing).

However, AI Digital Assistants such as Alexa, Cortana, Google Now, and Siri have an opportunity to stretch their capabilities even further.  Leveraging technologies such as Natural language processing (NLP) and Speech recognition (SR), as well as APIs into the Smartphone’s OS answer/calling capabilities, functionality can be expanded to include:

  • Call Screening – The digital executive assistant asks for the name of the caller, purpose of the call, and if the matter is “Urgent”
    • A generic “purpose” response or a list of caller purpose items can be supplied to the caller, e.g. 1) Schedule an Appointment
    • The smartphone’s user would receive the caller’s name and the purpose of the call as a message in the UI while the call is in a ‘hold’ state.
    • The smartphone user may decide to accept the call, or reject the call and send the caller to voicemail.
  • Call / Digital Assistant Capabilities
    • The digital executive assistant may schedule a ‘tentative’ appointment within the user’s calendar.  The caller may ask to schedule a meeting, the digital executive assistant would access the user’s calendar to determine availability.  If calendar indicates availability, a ‘tentative’ meeting will be entered.  The smartphone user would have a list of tasks from the assistant, and one of the tasks is to ‘affirm’ availability of the meetings scheduled.
    • Allow recall of ‘generally available’ information.  If a caller would like to know the address of the smartphone user’s office, the Digital Assistant may access a database of generally available information, and provide it.  The Smartphone user may use applications like Google Keep, and any notes tagged with a label “Open Access” may be accessible to any caller.
    • Join the smartphone user’s social network, such as LinkedIn. If the caller knows the phone number of the person but is unable to find the user through the social network directory, an invite may be requested by the caller.
    • Custom business workflows may also be triggered by the smartphone, such as “Pay by Phone”.
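The ‘tentative’ appointment flow described above can be sketched as an availability check followed by a pending-confirmation list. This is a minimal sketch with hypothetical class and method names; it does not reflect any real calendar API.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Meeting:
    start: datetime
    end: datetime
    title: str
    tentative: bool = False


@dataclass
class Calendar:
    meetings: List[Meeting] = field(default_factory=list)

    def is_free(self, start: datetime, end: datetime) -> bool:
        """True when the requested slot overlaps no existing meeting."""
        return all(end <= m.start or start >= m.end for m in self.meetings)

    def request_meeting(self, start: datetime, end: datetime, caller: str) -> bool:
        """Enter a tentative meeting if the slot is free; report success."""
        if not self.is_free(start, end):
            return False
        title = f"Call-in request from {caller}"
        self.meetings.append(Meeting(start, end, title, tentative=True))
        return True

    def pending_confirmations(self) -> List[Meeting]:
        """Tasks for the smartphone user: meetings still to be affirmed."""
        return [m for m in self.meetings if m.tentative]


cal = Calendar([Meeting(datetime(2017, 6, 1, 10), datetime(2017, 6, 1, 11), "Staff meeting")])
print(cal.request_meeting(datetime(2017, 6, 1, 10, 30), datetime(2017, 6, 1, 11, 30), "Kate"))  # False
print(cal.request_meeting(datetime(2017, 6, 1, 11), datetime(2017, 6, 1, 12), "Kate"))  # True
```

Keeping the meeting tentative until the user affirms it leaves the final decision with the human, as the workflow above describes.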

Takeaways

The Digital Executive Assistant capabilities:

  • Able to gain control of your Smartphone’s incoming phone calls
  • Able to interact with the third-party, dial-in caller through a set of business dialog workflows defined by you, the executive.

Amazon’s Alexa vs. Google’s Assistant: Same Questions, Different Answers

Excellent article by  .

Amazon’s Echo and Google’s Home are the two most compelling products in the new smart-speaker market. It’s a fascinating space to watch, for it is of substantial strategic importance to both companies as well as several more that will enter the fray soon. Why is this? Whatever device you outfit your home with will influence many downstream purchasing decisions, from automation hardware to digital media and even to where you order dog food. Because of this strategic importance, the leading players are investing vast amounts of money to make their product the market leader.

These devices have a broad range of functionality, most of which is not discussed in this article. As such, it is a review not of the devices overall, but rather simply their function as answer engines. You can, on a whim, ask them almost any question and they will try to answer it. I have both devices on my desk, and almost immediately I noticed something very puzzling: They often give different answers to the same questions. Not opinion questions, you understand, but factual questions, the kinds of things you would expect them to be in full agreement on, such as the number of seconds in a year.

How can this be? Assuming they correctly understand the words in the question, how can they give different answers to the same straightforward questions? Upon inspection, it turns out there are ten reasons, each of which reveals an inherent limitation of artificial intelligence as we currently know it…


Addendum to the Article:

As someone who has worked with Artificial Intelligence in some shape or form for the last 20 years, I’d like to throw in my commentary on the article.

  1. Human Utterances and their Correlation to Goal / Intent Recognition.  There are innumerable ways to ask for something you want.  The ‘ask’ is a ‘human utterance’ which should trigger the ‘goal / intent’ of what knowledge the person is requesting.  AI Chat Bots, digital agents, have a table of these utterances which all roll up to a single goal.  Hundreds of utterances may be supplied per goal.  In fact, Amazon has a service, Mechanical Turk, the Artificial Artificial Intelligence, through which you may “Ask workers to complete HITs – Human Intelligence Tasks – and get results using Mechanical Turk”.   They boast access to a global, on-demand, 24 x 7 workforce to get thousands of HITs completed in minutes.  There are also ways in which the AI Digital Agent may ‘rephrase’ what the AI considers utterances that are closely related.  Companies like IBM treat human-level recognition as accurately comprehending about 95% of the words in a given conversation.  On March 7, IBM announced it had become the first to home in on that benchmark, having achieved a 5.5% error rate.
  2. Algorithmic ‘weighted’ Selection versus Curated Content.   It makes sense, based on how these two companies ‘grew up’, that Amazon relies on its curated content acquisitions such as Evi, a technology company which specialises in knowledge base and semantic search engine software. Its first product was an answer engine that aimed to directly answer questions on any subject posed in plain English text, which is accomplished using a database of discrete facts.   “Google, on the other hand, pulls many of its answers straight from the web. In fact, you know how sometimes you do a search in Google and the answer comes up in snippet form at the top of the results? Well, often Google Assistant simply reads those answers.”  Truncated answers equate to incorrect answers.
  3. Instead of a direct Q&A style approach, where a human utterance (a question) triggers an intent/goal [answer], the AI digital agent may follow a process in which it asks ‘clarifying questions’.  A dialog workflow may disambiguate the goal by narrowing down what the user is looking for.  This disambiguation process is a common technique in human interaction, and is represented in a workflow diagram with logic decision paths. It seems this technique may require human guidance, and may be prone to bias, error and additional overhead for content curation.
  4. Who are the content curators for knowledge, providing ‘factual’ answers, and/or opinions?  Are curators ‘self proclaimed’ Subject Matter Experts (SMEs), people holding degrees in History, or IT / business analysts making the content decisions?
  5. Questions requesting opinionated information may vary greatly between AI platforms, and between questions within the same AI knowledge base.  Opinions may offend, be intentionally biased, or sour the AI / human experience.
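Points 1 and 3 above, an utterance table rolling up to goals and a clarifying question when the goal is ambiguous, can be sketched together. This is a minimal sketch under stated assumptions: the utterance table, the goal names, and the two-shared-words fallback rule are hypothetical, and production agents use trained language models rather than word overlap.

```python
from typing import Dict, List

# Hypothetical utterance table: many phrasings roll up to one goal.
UTTERANCES: Dict[str, str] = {
    "what are your hours": "store_hours",
    "when do you open": "store_hours",
    "when do you close": "store_hours",
    "where is my order": "order_status",
    "track my order": "order_status",
}


def normalize(text: str) -> str:
    """Lowercase, trim punctuation at the ends, and collapse whitespace."""
    return " ".join(text.lower().strip(" ?!.").split())


def candidate_goals(utterance: str) -> List[str]:
    """Exact match first; otherwise goals whose phrasings share two or more words."""
    text = normalize(utterance)
    if text in UTTERANCES:
        return [UTTERANCES[text]]
    words = set(text.split())
    goals = {
        goal
        for phrase, goal in UTTERANCES.items()
        if len(words & set(phrase.split())) >= 2
    }
    return sorted(goals)


def respond(utterance: str) -> str:
    """Trigger a single goal, or ask a clarifying question when unsure."""
    goals = candidate_goals(utterance)
    if len(goals) == 1:
        return f"goal:{goals[0]}"
    if goals:
        return "clarify:" + " or ".join(goals)
    return "clarify:Could you rephrase that?"


print(respond("When do you open?"))     # goal:store_hours
print(respond("do you know my order"))  # clarify:order_status or store_hours
```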

Evaluating fobi.io Chatbot Powered By Google Forms: AI Digital Agent?

Interesting approach to an AI Chatbot implementation.  The business process owner creates one or more Google Forms containing questions and answers, and converts/deploys to a chatbot using fobi.io.  All the questions for [potential] customers/users are captured in a multitude of forms.  Without any code, and within minutes, an interactive chatbot can be produced and deployed for client use.

The trade off for rapid deployment without coding is a rigid approach to triggering user desired “Goal/Intents”.  It seems a single goal/intent is mapped to a single Google Form, as opposed to a digital agent, which leverages utterances to trigger the user’s intended goal/intent.  Before starting the chat, the user must select the appropriate Google Form, with the guidance of the content curator.

Another trade off is, it seems, no integration on the backend to execute a business process, essential to many chatbot workflows. For example, given an Invoice ID, the chatbot may search in a transactional database, then retrieve and display the full invoice.  Actually, I may be incorrect. On the Google Forms side, there is a Script Editor. Seems powerful and scary all at the same time.

Another trade off that seems to exist, more on the Google Forms side, is building not just a Form with a list of Questions, but a Consumer Process Workflow, that allows the business to provide an interactive dialog based on answers users provide.  For example, a Yes/No or multichoice answer may lead to alternate sets of questions [and actions].  It doesn’t appear there is any workflow tool provided to structure the Google Forms / fobi.io chatbot Q&A.
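The kind of answer-driven branching described above, which appears to be missing from the product, can be sketched as a small graph of questions. This is a minimal sketch: the workflow contents and the function names are hypothetical and are not part of Google Forms or fobi.io.

```python
from typing import Dict, List, Optional

# Hypothetical workflow: each node holds a question and an answer -> next-node map.
WORKFLOW: Dict[str, dict] = {
    "start": {
        "question": "Is this about an existing invoice?",
        "next": {"yes": "invoice_id", "no": "estimate"},
    },
    "invoice_id": {"question": "What is the Invoice ID?", "next": {}},
    "estimate": {"question": "What service would you like an estimate for?", "next": {}},
}


def next_node(current: str, answer: str) -> Optional[str]:
    """Follow the branch for this answer, or None when the path ends."""
    return WORKFLOW[current]["next"].get(answer.strip().lower())


def questions_asked(answers: List[str], start: str = "start") -> List[str]:
    """Replay a list of answers and return the questions the user would see."""
    node: Optional[str] = start
    asked = [WORKFLOW[start]["question"]]
    for answer in answers:
        node = next_node(node, answer)
        if node is None:
            break
        asked.append(WORKFLOW[node]["question"])
    return asked


print(questions_asked(["Yes"]))
# ['Is this about an existing invoice?', 'What is the Invoice ID?']
```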

However, there are still many business cases for the product, especially for small to mid size organizations.

* Business Estimates – although there is no logic workflow to guide the Q&A sessions with [prospective] customers, the business may still derive the information it requires to make an initial assessment.  A Web form and this fobi.io / Google Forms solution seem very comparable in capability; it’s just a change in the medium through which the user interacts to provide the information.

One additional note, Google Forms is not a free product.  Looks like it’s a part of the G Suite. Free two week trial, then the basic plan is $5 per month, which comes with other products as well.

Although this “chatbot” tries to quickly provide a mechanism to turn a form into a chatbot, it seems it’s still just a form at the end of the day.  I’m interested to see more products from Zoi.ai soon.

Beyond Google Search of Personal Data – Proactive, AI Digital Assistant 

As per previous Post, Google Searches Your Personal Data (Calendar, Gmail, Photos), and Produces Consolidated Results, why can’t the Google Assistant take advantage of the same data sources?

Google may attempt to leapfrog their Digital Assistant competition by taking advantage of their ability to search against all Google products.  The more personal data a Digital Assistant may access, the greater the potential for increased value per conversation.

As a first step,  Google’s “Personal”  Search tab in their Search UI has access to Google Calendar, Photos, and your Gmail data.  No doubt other Google products are coming soon.

Big benefits come not just from the consumer searching through their Personal Google data, but from providing that consolidated view to the AI Assistant.  Does the Google [Digital] Assistant already have access to Google Keep data, for example?  Is providing Google’s “Personal” search results a dependency to broadening the Digital Assistant’s access and usage?  If so, these interactions are most likely based on a reactive model, rather than proactive dialogs, i.e. the Assistant initiating the conversation with the human.

Note: The “Google App” for mobile platforms does:

“What you need, before you ask. Stay a step ahead with Now cards about traffic for your commute, news, birthdays, scores and more.”

I’m not sure how proactive the Google AI is built to be, but most likely, it’s barely scratching the surface of what’s possible.

Modeling Personal, AI + Human Interactions

The model starts from N accessible data sources, searches for actionable data points, correlates these data points to others, and then escalates to the human as a dynamic or predefined Assistant Consumer Workflow (ACW).  A proactive AI Digital Assistant initiates human contact to engage in commerce without otherwise being triggered by the consumer.

Actionable data point correlations can trigger multiple goals in parallel.  However, the execution of goal based rules would need to be managed.  The consumer doesn’t want to be bombarded with AI Assistant suggestions, but at the same time, “choice” opportunities may be appropriate, as the Google [mobile] App has implemented ‘Cards’ of bite size data, consumable from the UI, at the user’s discretion.

As an ongoing ‘background’ AI / ML process, a Digital Assistant ‘server side’ agent may derive correlations between one or more data source records to get a deeper perspective of the person’s life, and potentially be proactive about providing input to the consumer decision making process.

Bass Fishing Trip

For example,

  • The proactive Google Assistant may suggest to book your annual fishing trip soon.  Elevated Interaction to Consumer / User.
  • The Assistant may search Gmail records referring to an annual fishing trip ‘last year’ in August. AI background server side parameter / profile search.   Predefined Assistant Consumer Workflow (ACW) – “Annual Events” Category.  Building workflows that are ‘predefined’ for a core set of goals/rules.
  • AI Assistant may search the user’s photo archive on the server side.   Any photo metadata could be gleaned from the search, including date time stamps, abstracted to include ‘Season’ of Year, and other synonym tags.
  • Photos from around ‘August’ may be earmarked for Assistant use
  • Photos may be geo tagged,  e.g. Lake Champlain, which is known for its fishing.
  • All objects in the image may be stored as image metadata. Using image object recognition against all photos in the consumer’s repository, goal / rule execution may occur against pictures from last August, and the Assistant may identify the “fishing buddies” posing with a huge “Bass fish”.
  • In addition to the Assistant making the suggestion re: booking the trip, Google’s Assistant may bring up ‘highlighted’ photos from last fishing trip to ‘encourage’ the person to take the trip.
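The correlation steps in this example can be sketched as a background check across two data sources. This is a minimal sketch under stated assumptions: the `Email` and `Photo` records, the matching rule (same month last year in both sources), and the two-month lead time are all hypothetical, and it does not use any real Gmail or Google Photos API.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Email:
    sent: date
    subject: str


@dataclass
class Photo:
    taken: date
    tags: List[str]


def suggest_annual_event(emails: List[Email], photos: List[Photo], keyword: str,
                         today: date, lead_months: int = 2) -> Optional[str]:
    """Suggest rebooking when last year's emails and photos agree on a month
    and that month is coming up within `lead_months`."""
    last_year = today.year - 1
    email_months = {e.sent.month for e in emails
                    if e.sent.year == last_year and keyword in e.subject.lower()}
    photo_months = {p.taken.month for p in photos
                    if p.taken.year == last_year
                    and any(keyword == tag.lower() for tag in p.tags)}
    shared = email_months & photo_months
    if not shared:
        return None
    event_month = min(shared)
    if 0 < event_month - today.month <= lead_months:
        name = calendar.month_name[event_month]
        return f"Last year's {keyword} trip was in {name}. Book this year's soon?"
    return None


emails = [Email(date(2016, 8, 5), "Annual fishing trip plans")]
photos = [Photo(date(2016, 8, 12), ["fishing", "bass", "Lake Champlain"])]
print(suggest_annual_event(emails, photos, "fishing", today=date(2017, 6, 15)))
# Last year's fishing trip was in August. Book this year's soon?
```

Requiring agreement between two sources before escalating to the user is one simple way to keep the Assistant from bombarding the consumer with weak suggestions.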

With this type of interaction, the Assistant has the ability to proactively ‘coerce’ and influence the human decision making process.  Building these interactive models of communication, and the ‘management’ process to govern the AI Assistant, is within reach.

Predefined Assistant Consumer / User Workflows (ACW) may be created by third parties, such as Travel Agencies, or by industry groups, such as foods, where a “time to get more milk” reminder is “low hanging fruit” that is easy to implement.  Or, food may not be the best place to start, i.e. Amazon Dash.

 

Microsoft to Release AI Digital Agent SDK Integration with Visio and Deploy to Bing Search

Build and deploy a business AI Digital Assistant with the ease of building Visio diagrams, or ‘Business Process Workflows’.  In addition, advanced Visio workflows offer external integration, enabling the workflow to retrieve information from external data sources; e.g. SAP CRM; Salesforce.

For a business that subscribes to the Digital Agent, Microsoft Bing search results will contain the business’ AI Digital Assistant created using Visio.  The ‘Chat’ link will invoke the business’ custom Digital Agent.  The Agent has the ability to answer business questions, or lead the user through “complex” workflows.  For example, the user may ask if a particular store has an item in stock, and then place the order from the search results, with a ‘small’ transaction fee to the business. The Digital Assistant may be hosted with MSFT / Bing or an external server.  Applying the Digital Assistant to search results pushes the transaction to the surface of the stack.

Bing Chat
Bing Digital Chat Agent

Leveraging their existing technologies, Microsoft will leap into the custom AI digital assistant business using Visio to design business process workflows, and Bing for promotion placement, and visibility.  Microsoft can charge the business for the Digital Agent implementation and/or usage licensing.

  • The SDK for Visio that empowers the business user to build business process workflows with ease may have a low to no cost monthly licensing as a part of MSFT’s cloud pricing model.
  • Microsoft may charge the business a “per chat interaction”  fee model, either per chat, or bundles with discounts based on volume.
  • In addition, any revenue generated from the AI Digital Assistant, may be subject to transactional fees by Microsoft.

Why not use Microsoft’s Cortana, or Google’s AI Assistant?  Using a ‘white label’ version of an AI Assistant enables the user to interact with an agent of the search listed business, and that agent has business specific knowledge.  The ‘white label’ AI digital agent is also empowered to perform any automation processes integrated into the user defined, business workflows. Examples include:

  • basic knowledge such as store hours of operation
  • more complex assistance, such as walking a [prospective] client through a process such as “How to Sweat Copper Pipes”.  Many “how to” articles and videos do exist on the Internet already through blogs or YouTube.    The AI digital assistant “curator of knowledge” may ‘recommend’ existing content, or provide their own content.
  • Proprietary information can be disclosed in a narrative using the AI digital agent, e.g.  My order number is 123456B.  What is the status of my order?
  • Actions, such as employee referrals, e.g. I spoke with Kate Smith in the store, and she was a huge help finding what I needed.  I would like to recommend her.  E.g.2. I would like to re-order my ‘favorite’ shampoo with my details on file.  Frequent patrons may reorder a ‘named’ shopping cart.

Escalation to a human agent is also a feature.  When the business process workflow dictates, the user may escalate to a human in ‘real-time’, e.g. to a person’s smartphone.

Note: As of yet, Microsoft representatives have made no comment relating to this article.