
As your Digital Assistant, Siri Will Answer Incoming Calls

Voicemail is so LAST century. It’s a static communications interface for addressing your incoming phone calls, a dinosaur in terms of communications protocol. Yes, a digital assistant or chatbot should “field” your incoming calls, providing your callers with a higher level of service.

Business or Personal?

Why not both? There are use cases on both sides that highlight the value of a digital assistant answering your phone calls when you’re unavailable.

Trusted Friends and Business PINs

The level of available services may change based upon the caller’s level of trusted access, such as:

  • Friends Seeking Your Availability for a Hockey Game Next Week
  • Business Partners requesting access to shared information, such as invoices

Untrusted Caller Access

  • Vetting unsolicited calls, such as robocalls

Defining Dialogs and Default Templates

Users can define dialogs through drag-and-drop workflow diagram tools, making it easy to “build” conversation / dialog flows. In addition, out-of-the-box flows can give administrators opportunities to discover the ways in which an AI digital assistant may be leveraged.

Canned / default dialog templates that handle the most common dialogs / workflows will empower users to implement rapidly.
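
As a rough illustration, here is a minimal sketch of how such a canned dialog template might be represented as data and executed. The node names, flow structure, and runner are hypothetical, not any vendor’s format.

    # A minimal, hypothetical representation of a canned dialog flow.
    # Terminal nodes carry only a prompt; other nodes branch on the reply.
    APPOINTMENT_FLOW = {
        "start":   {"prompt": "Would you like to schedule an appointment?",
                    "yes": "ask_day", "no": "goodbye"},
        "ask_day": {"prompt": "Which day works best for you?",
                    "any": "confirm"},
        "confirm": {"prompt": "Thanks, I will pencil that in as tentative."},
        "goodbye": {"prompt": "OK, goodbye."},
    }

    def run_flow(flow, say, listen):
        """Walk the flow, prompting the caller and branching on replies."""
        node = "start"
        while node is not None:
            step = flow[node]
            say(step["prompt"])
            if set(step) == {"prompt"}:       # terminal node, nothing to ask
                break
            reply = listen().strip().lower()
            # Unmatched replies fall through to the catch-all "any" branch.
            node = step.get(reply, step.get("any"))

    # Example: run_flow(APPOINTMENT_FLOW, say=print, listen=input)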

Any Acquisitions in the Pipeline?

Are the big names in the digital assistant space looking to partner with, or acquire, tools that can easily transform workflows to be leveraged by a digital assistant?

  • IBM’s Conversations – chatbot dialog definition tool
  • Interactive Voice Response (IVR) solutions

APIs available on Mobile OS SDKs?

Are the components available now for third-party product companies to extend the mobile OS capabilities? Or are the mobile OS companies the only ones in a position to perform these upgrades?

Smartphone AI Digital Assistant Encroaching on the Virtual Receptionist

Businesses already exist that have developed and sell Virtual Receptionist products, which handle many caller needs (e.g. call routing).

However, AI digital assistants such as Alexa, Cortana, Google Now, and Siri have an opportunity to stretch their capabilities even further. By leveraging technologies such as natural language processing (NLP) and speech recognition (SR), as well as APIs into the smartphone OS’s answering / calling capabilities, functionality can be expanded to include:

  • Call Screening – the digital executive assistant asks for the name of the caller, the purpose of the call, and whether the matter is “urgent” (a rough sketch of this screening flow appears after this list).
    • A generic “purpose” response or a list of caller-purpose items can be supplied to the caller, e.g. 1) Schedule an Appointment.
    • The smartphone user would receive the caller’s name and purpose as a message in the UI while the call is in a ‘hold’ state.
    • The smartphone user may decide to accept the call, or reject it and send the caller to voicemail.
  • Call / Digital Assistant Capabilities
    • The digital executive assistant may schedule a ‘tentative’ appointment within the user’s calendar. If the caller asks to schedule a meeting, the digital executive assistant would access the user’s calendar to determine availability. If the calendar indicates availability, a ‘tentative’ meeting will be entered. The smartphone user would have a list of tasks from the assistant, one of which is to ‘affirm’ the availability of the scheduled meetings.
    • Allow recall of ‘generally available’ information. If a caller would like to know the address of the smartphone user’s office, the digital assistant may access a database of generally available information and provide it. The smartphone user may use applications like Google Keep, and any notes tagged with the label “Open Access” may be accessible to any caller.
    • Join the smartphone user’s social network, such as LinkedIn. If the caller knows the person’s phone number but is unable to find the user through the social network directory, the caller may request an invite.
    • Custom business workflows may also be triggered by the smartphone, such as “Pay by Phone”.
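
A minimal sketch of the call-screening flow above. The function hooks ask_caller and notify_user are hypothetical placeholders for speech prompts and the smartphone UI message; they are not a real mobile OS API.

    # Hypothetical call-screening sketch.
    from dataclasses import dataclass

    @dataclass
    class ScreenedCall:
        name: str
        purpose: str
        urgent: bool

    def screen_call(ask_caller, notify_user) -> str:
        """Gather caller details while the call is on hold, then let the
        smartphone user accept the call or send it to voicemail."""
        name = ask_caller("May I ask who is calling?")
        purpose = ask_caller("What is the purpose of your call? "
                             "For example: 1) schedule an appointment.")
        urgent = ask_caller("Is the matter urgent?").strip().lower() == "yes"
        decision = notify_user(ScreenedCall(name, purpose, urgent))
        return "accept" if decision == "accept" else "voicemail"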

Takeaways

The Digital Executive Assistant capabilities:

  • Able to gain control of your Smartphone’s incoming phone calls
  • Able to interact with the third-party, dial-in caller on a set of business dialog workflows defined by you, the executive.

Amazon and Microsoft Drinking their own AI Chatbot Champagne?

A relatively new medium of support for businesses, from small shops to global conglomerates, is becoming available, based on the exciting yet embryonic chatbot / digital agent services. Amazon and Microsoft, among others, are diving into this transforming space. The coat of paint is still wet on Amazon Lex and Microsoft Cortana Skills. The MSFT Cortana Skills Kit is not yet available to all developers, but has been opened to a select set of partners, enabling them to expand Cortana’s core knowledge set. Microsoft’s Bot Framework is in a “Preview” phase. However, the possibilities are extensive, such as another tier of support for both of these companies if they turn on their own knowledge repositories using their respective digital agent [chatbot] platforms.

Approach from Inception to Deployment

  • The curation and creation of knowledge content may occur with the definition of ‘goals/intents’ and the correlated human utterances that trigger the goal question-and-answer (Q&A) dialog format – the classic use case. The answer to a question may include text, images, and video.
  • Taking goals/intents and utterances to ‘the next level’ involves creating / implementing process workflows (PW); a minimal sketch follows this list. A workflow may contain many possible paths for the user to reach their goal from a single triggering utterance. Workflows look very similar to what you might see in a Visio diagram, with multiple logical paths. Instead of presenting the user with the answer based upon the single human utterance, the question, the workflow navigates the user through a narrative to:
    • disambiguate the initial human utterance and get a better understanding of the specific user goal/intention. The user’s question to the digital agent may have a degree of ambiguity, and workflows enable the AI digital agent to determine the goal through an interactive dialog / inspection. The larger the volume of knowledge, and the closer together the goals/intentions, the more the implementation will require disambiguation.
    • hold an interactive conversation / dialog with the AI digital agent to walk through a process step by step, including text, images, and video inline with the conversation. The AI chat agent may pause the ‘directions’, waiting for the human counterpart to proceed.
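
A minimal sketch of goals/intents, trigger utterances, and a disambiguation step. The intent names and the naive keyword-overlap matching are illustrative only; platforms such as Lex or the Bot Framework use trained language understanding, not this:

    # Hypothetical intents, each with sample trigger utterances.
    INTENTS = {
        "reset_password": ["reset my password", "forgot my password"],
        "reset_router":   ["reset my router", "router not working"],
    }

    def match_intents(utterance: str):
        """Return every intent whose samples share words with the utterance."""
        words = set(utterance.lower().split())
        return [intent for intent, samples in INTENTS.items()
                if any(words & set(s.split()) for s in samples)]

    def resolve(utterance: str, ask):
        """Resolve the user's goal, disambiguating interactively if needed."""
        candidates = match_intents(utterance)
        if not candidates:
            return None                      # no goal recognized
        if len(candidates) == 1:
            return candidates[0]
        # Ambiguous (e.g. "reset my ...") -- narrow it down with a question.
        return ask("Did you mean: " + " or ".join(candidates) + "?")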

Future Opportunities:

  • Amazon to provide billing and implementation / technical support for AWS services through a customized version of their own Amazon Lex service? All the code used to provide this digital agent / chatbot may be ‘open source’ for those looking to implement similar [enterprise] services.
  • The digital agent may allow the user to share their screen, OCR the current section of code from an IDE, and perform a code review on the functions / methods.
  • Microsoft has an ‘Online Chat’ capability for MSDN. I am not sure how extensive the capability is, or whether it is a true 1:1 chat, which they claim is a 24/7 service. Microsoft has libraries of content from Microsoft Docs, MSDN, and TechNet. If the MSFT Bot Framework has the capability to ingest their own articles, users may be able to trigger these goals/intents from utterances, similar to searching for knowledge base articles today.
  • Abstraction, abstraction, abstraction. These AI chatbots / digital agents must float toward wizards for building and deploying, and stay away from coding, elevating this technology to be configurable by a business user. Solutions have significant possibilities for small companies, and this technology needs to reach their hands. It seems that Amazon Lex is well on its way to achieving wizard-driven creation / distribution, but it has a ways to go. I’m not sure if the back-end process execution, e.g. AWS Lambda, will be abstracted any time soon.

AI Digital Assistants versus Search Engines

Aren’t AI digital assistants just like search engines? They both try to recognize your question or human utterance as best as possible to serve up your requested content, e.g. the classic FAQ. The difference in the FAQ use case is that the proprietary information from the company hosting the digital assistant may not be available on the internet.

Another difference between the Digital Assistant and a Search Engine is the ability of the Digital Assistant to ‘guide’ a person through a series of questions, enabling elaboration, to provide the user a more precise answer.

The digital assistant may use an interactive dialog to guide the user through a process, not just supply the ‘most correct’ responses. Many people have flocked to YouTube for exactly this kind of instructional, interactive medium. When multiple workflow paths can be followed, the digital assistant has the upper hand.

The digital assistant also has the capability of interfacing with third parties (e.g. data stores with API access). For example, a digital assistant hosted by a medical insurance company might have the ability not only to check the status of a claim, but also to send correspondence to a medical practitioner on your behalf. It is a huge pain to call the insurance company, then the doctor’s office, then the insurance company again. Even the HIPAA release could be authenticated in real time, inline during the chat. A digital assistant may be able to create a chat session with multiple participants.
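
A hedged sketch of that bridge from a chat dialog to a third-party API. The insurer endpoint, field names, and functions below are hypothetical placeholders, not a real service:

    # Hypothetical claims-status lookup driven from a chat dialog.
    import json
    from urllib.request import Request, urlopen

    CLAIMS_API = "https://api.example-insurer.com/claims/"   # placeholder

    def check_claim_status(claim_id: str, auth_token: str) -> str:
        """Fetch the status of a claim on the user's behalf."""
        req = Request(CLAIMS_API + claim_id,
                      headers={"Authorization": "Bearer " + auth_token})
        with urlopen(req) as resp:
            return json.load(resp)["status"]

    def claims_dialog(say, listen, auth_token):
        """Guide the user to a precise answer instead of a results page."""
        say("I can check a claim for you. What is the claim number?")
        claim_id = listen().strip()
        status = check_claim_status(claim_id, auth_token)
        say(f"Claim {claim_id} is currently: {status}")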

One capability where digital assistants overrule search engines is the ability to ‘escalate’ at any time during the interaction: people are then queued for the next available human agent.

There have been attempts in the past. Ask.com (originally known as Ask Jeeves) is a question-answering-focused e-business. Google Questions and Answers (Google Otvety, Google Ответы) was a free knowledge market offered by Google that allowed users to collaboratively find good answers to their questions through the web (also referred to as Google Knowledge Search).

My opinions are my own, and do not reflect my employer’s viewpoint.

AI Personal Assistants Need Remedial Guidance for Their Users

Providing Intelligent ‘Code’ Completion

At this stage in the growth and maturity of the AI personal assistant platform, there are many commands and options that common users cannot formulate due to a lack of knowledge and experience. Using natural language to formulate questions has gotten better over the years, but assistance / guidance in formulating the requests would maximize intent / goal accuracy.

A key usability feature of many integrated development environments (IDEs) is their capability to use “intelligent code completion” to guide programmers toward correct, functional syntax. This feature also unburdens the programmer from looking up the syntax for each command reference, saving significant time. As usage of the AI personal assistant grows, and its capabilities along with it, the number of commands and parameters required to use the AI personal assistant will also increase.

AI Leveraging Intelligent Command Completion

For each command parameter [level / tree], a drop-down list may appear, giving users a set of options to select for the next parameter. A delimiter such as a period (.) indicates to the AI parser that another set of command options must be presented to the person entering the command. These options are typically drop-down lists concatenated to the right of the formulated command. Vocally, parent / child commands and parameters may be supplied in a similar fashion.
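
A minimal sketch of period-delimited command completion over a hypothetical command tree (the entries mirror the examples elsewhere in this post; the tree contents and logic are illustrative only):

    # Hypothetical command tree: each key's children are the legal
    # options for the next period-delimited parameter.
    COMMAND_TREE = {
        "Order": {"Food": {"Focacceria": {"List123": {}},
                           "FavoriteItalianRestaurant":
                               {"FavoriteLunchSpecial": {}}}},
        "Spotify": {"Song": {}},
    }

    def complete(partial: str):
        """Return the drop-down options for the next parameter.
        e.g. complete("Order.Food.")
             -> ['FavoriteItalianRestaurant', 'Focacceria']"""
        node = COMMAND_TREE
        for part in [p for p in partial.split(".") if p]:
            if part not in node:
                return []          # unknown command path so far
            node = node[part]
        return sorted(node)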

AI Personal Assistant Language Syntax

Adding another AI parser on top of the existing syntax parser may allow commands like these to be executed:

  • Abstraction (e.g. no application specified)
    • Order.Food.Focacceria.List123
    • Order.Food.FavoriteItalianRestaurant.FavoriteLunchSpecial
  • Application Parser
    • Seamless.Order.Food.Focacceria.Large Pizza

These AI command examples use a hierarchy of commands and parameters to perform the function. One of the above commands leverages one of my contacts and a ‘List123’ object. The ‘List123’ parameter may be a ‘note’ on my smartphone that contains a list of food we would like to order. The command may place the order through my contact’s email address or fax number, or by calling the business’s main number and using AI text-to-speech functionality.

All personal data, such as Favorite Italian Restaurant and Favorite Lunch Special, could be placed in the AI personal assistant’s ‘Settings’. A group of settings may be listed as key-value pairs that serve as shorthand in conversations involving the AI assistant.
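
A sketch of those key-value settings expanding the aliased command from the earlier example; the alias names and stored values are hypothetical:

    # Hypothetical 'Settings' acting as shorthand for command segments.
    SETTINGS = {
        "FavoriteItalianRestaurant": "Focacceria",
        "FavoriteLunchSpecial": "Large Pizza",
    }

    def expand(command: str) -> str:
        """Replace any alias segment with its stored value."""
        return ".".join(SETTINGS.get(p, p) for p in command.split("."))

    # expand("Order.Food.FavoriteItalianRestaurant.FavoriteLunchSpecial")
    # -> "Order.Food.Focacceria.Large Pizza"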

A majority of users are most likely unaware of many of the options available within the AI personal assistant command structure. Intelligent command [code] completion empowers users with visibility into the available commands and parameters.

For those without a programming background, intelligent “command” completion is somewhat similar to the autocomplete in Google’s search text box, which predicts possible choices as the user types. With the guidance of an AI personal assistant, the user is led to their desired command; Google’s autocomplete, by contrast, requires some sense of the end-result command. Intelligent code completion typically displays all possible commands in a drop-down list next to the constructor period (.), so the user needs no prior knowledge of the next parameter. An additional feature lets the user hover over one of the commands / parameters to show a brief ‘help text’ popup.

Note that Microsoft’s Cortana AI assistant provides a text box in addition to speech input, so another syntax parser could be enabled through the existing user interface. However, Siri seems to accept only voice input, with no text input.

Is Siri handling the iOS ‘Global Search’ requests ‘behind the scenes’? If so, the textual parsing, i.e. the period (.) separator, would work. Siri does provide some cursory guidance on what information the AI may be able to provide: “Some things you can ask me:”.

With only voice recognition input, use the Voice Driven Menu Navigation & Selection approach as described below.

Voice Driven, Menu Navigation and Selection

The current AI personal assistant abstraction layer may be too abstract for some users. Consider the difference between these two commands:

  • Play The Rolling Stones song Sympathy for the Devil.
    • Has the benefit of natural language, and can handle simple tasks, like “Call Mom”
    • However, there may be many commands that can be performed by a multitude of installed platform applications.

Versus

  • Spotify.Song.Sympathy for the Devil
    • Enables the user to select the specific application they would like a task to be performed by.
  • Spotify Help
    • A voice-driven menu will enable users to understand the capabilities of the AI assistant. Through the use of a voice-interactive menu, users may ‘drill down’ to the action they desire to perform, e.g. “Press # or say XYZ” (a rough sketch follows this list).
    • Optionally, the voice menu, depending upon the application, may have a customer service feature and forward the interaction to the proper [calling or chat] queue.
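
A rough sketch of that voice-driven menu navigation, drilling down from an app name to an action. The menu contents, including the Spotify entry, are hypothetical, not the actual app’s interface:

    # Hypothetical voice menu tree; leaves prompt for free-form input.
    MENUS = {
        "spotify": {"prompt": "Say 'song', 'playlist', or 'help'.",
                    "song": {"prompt": "Which song shall I play?"},
                    "playlist": {"prompt": "Which playlist?"},
                    "help": {"prompt": "You can play songs or playlists."}},
    }

    def navigate(app, say, listen):
        """Walk the menu tree until the user reaches a leaf action."""
        node, path = MENUS.get(app.lower()), [app]
        while node:
            say(node["prompt"])
            choice = listen().strip().lower()
            if choice not in node:
                break                # free-form answer or unknown choice
            path.append(choice)
            node = node[choice]
        return path                  # e.g. ['Spotify', 'song']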

Update – 9/11/16

  • I just installed Microsoft Cortana for iOS, and at a glance, the application has a leg up on the competition.
    • The Help menu gives a fair number of examples by category – much better guidance than iOS / Siri.
    • The ability to type or speak commands provides the needed flexibility for user input.
      • Some people are uncomfortable ‘talking’ to their smartphones; it feels awkward talking to a machine.
      • The ability to type in commands may alleviate voice command entry errors, i.e. speech-to-text translation.
      • There is an opportunity to expand the AI syntax parser to include ‘programmatic’ commands, allowing the user a more granular command set, e.g. “intelligent command completion”. As the capabilities of the platform grow, it will be a challenge to interface with and maximize the AI personal assistant’s capabilities.

AI Personal Assistants are “Life Partners”

Artificially intelligent (AI) “assistants”, or “bots”, are taken to the ‘next level’ when the assistant becomes a proactive entity, seeded with input from human intelligence experts and growing through machine learning.

Even the distinction between an ‘assistant’ and a ‘life partner’ implies a greater degree of dynamic, proactive interaction. The crossover to becoming a ‘life partner’ happens when we go ‘above and beyond’ to help our partners succeed, or even survive the day-to-day.

Once we experience our current [digital, mobile] ‘assistants’ positively influencing our lives in a more intelligent, proactive manner, an emotional bond ‘grows’, and the investment in this technology will also expand.

Practical Applications:

  • Alcoholics Anonymous coach / mentor – enabling the human partner to overcome temporary weakness. Knowledge and “triggers” need to be incorporated into the AI ‘partner’: e.g. a “location / proximity” reminder if the person enters a shopping area that has a liquor store, with the [AI] “partner” helping to “talk them down”.
  • Understanding ‘data points’ from multiple sources, such as alarms and calendar events, to derive ‘knowledge’ and create an actionable trigger (a rough sketch follows this list).
    • e.g., unprompted: “Did you remember to take your medicine?”; or “There is a new article in periodical N that pertains to your medicine. Would you like to read it?”
    • e.g. 2, unprompted: “The weather calls for N inches of snow. Did you remember to service your snow blower this season?”
  • FinTech – while in department store XYZ looking to purchase item Y over a certain amount, unprompted: “Your credit score indicates you are ‘most likely’ eligible to sign up for a store credit card and get N percent off your first purchase.” Multiple input sources are used to surface a potential sales opportunity.
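
A hedged sketch of such proactive trigger rules evaluated over fused data points; the source field names and the rules themselves are hypothetical:

    # Hypothetical proactive trigger rules over fused data sources
    # (weather feed, maintenance log, location).
    from dataclasses import dataclass
    from typing import Callable, Dict

    @dataclass
    class Rule:
        name: str
        condition: Callable[[Dict], bool]   # evaluated over fused data
        message: str

    RULES = [
        Rule("snow_blower",
             lambda d: (d.get("forecast_snow_inches", 0) >= 3
                        and not d.get("snow_blower_serviced", False)),
             "Weather calls for snow. Did you remember to service "
             "your snow blower this season?"),
        Rule("liquor_store_proximity",
             lambda d: d.get("near_liquor_store", False),
             "You are near a liquor store. Want to talk?"),
    ]

    def evaluate(data: Dict, notify):
        """Fire every rule whose condition holds, without being prompted."""
        for rule in RULES:
            if rule.condition(data):
                notify(rule.message)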

IBM has a cognitive cloud of AI solutions leveraging IBM’s Watson. Most or all of the 18 web applications they have hosted (with source) are driven by human interactive triggers, as with the “Natural Language Classifier”, which helps build a question-and-answer repository.

There are four things that need to occur to accelerate adoption of the ‘AI Life Partner’:

  1. Knowledge experts, or subject matter experts (SMEs), need to be able to “pass on” their knowledge to build repositories. The IBM Watson Natural Language Classifier may be used.
  2. The integration of this knowledge into an AI medium, such as a ‘digital assistant’, needs to occur, with corresponding ‘triggers’.
  3. Our current AI ‘assistants’ need to become [more] proactive as they integrate into our ‘digital’ lives, going beyond setting an alarm clock, hands-free calling, or checking the sports score. Our [AI] “life partner” needs to ‘act’ like a buddy and a fan of ‘our’ sports team: without prompting, proactively serve up knowledge [based on correlated, multiple sources], and/or take [acceptable] actions.
    1. E.g. FinTech – “Our schedule is open tonight, and there are great seats available, Section N, Seat A, for ABC dollars on StubHub. Shall I make the purchase?”
      1. Partner with vendors to drive FinTech business rules.
  4. Take ‘advantage’ of more knowledge sources, such as the applications we use that collect our data. Using multiple knowledge sources in concert enables the AI to correlate data and propose ‘complex’ rules of interaction.

Our AI ‘life partners’ may grow in knowledge and mature the relationship between man and machine. Incorporating derived rules leveraging machine learning, without the input of a human expert, will come with both risk and reward.

Alzheimer’s Afflicted: Technology to Help Remember Habitual Activities

Has anyone ever walked into a room and forgotten why on Earth they were there? Were you about to get a cup of coffee, or your car keys? Wonderful! It’s frustrating at my level of distraction; now magnify that to the Nth degree: Alzheimer’s. Apply a rules and induction engine, and poof! A step further away from a managed care facility.

Teaching the AI induction and rules engine may require the help of your 10-year-old grandson. It’s relatively easy, though you might need your grandson to sleep over for a day or two.

It’s all about variations on the same theme: tag a location, such as a room in an apartment, along with an action tag, such as getting a cup of coffee from the kitchen. The repetitive nature of activities with a location tag lets the engine draw conclusions based on historical behavior. The more variations of action and coinciding location tags it sees, the ‘smarter’ it becomes about your habitual activities. In addition, the calculations create a bell curve, a way to prioritize the most probable location / action tags for the suggested behavior. The ‘outliers’ on the bell curve have the lowest probability of occurrence.

In addition, RFID tags installed in your apartment will increase the effectiveness of the ‘advice’ engine by adding more granular location tags.
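
A minimal sketch of that habitual-activity ‘advice’ engine: count (location, action) pairs observed in a ‘train’ mode, then rank the most probable actions for the current location. The tag names are hypothetical:

    # Count observed (location, action) pairs and rank by frequency.
    from collections import Counter

    observations = Counter()        # (location, action) -> count

    def train(location: str, action: str):
        """Record one observed activity, e.g. ('kitchen', 'coffee')."""
        observations[(location, action)] += 1

    def suggest(location: str):
        """Rank actions seen at this location by historical probability."""
        seen = {a: c for (loc, a), c in observations.items()
                if loc == location}
        total = sum(seen.values())
        return sorted(((a, c / total) for a, c in seen.items()),
                      key=lambda pair: -pair[1])

    # train("kitchen", "coffee"); train("kitchen", "coffee")
    # train("kitchen", "car keys")
    # suggest("kitchen") -> [('coffee', 0.66...), ('car keys', 0.33...)]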

[Photo: a microchip RFID tag compared to the size of a grain of rice.]
Beyond this ‘black box’: take a small, lightweight computer (a smartphone), integrate Bluetooth, NFC, and WiFi antennas, add a mobile application, and you’re set. A small, high-quality Bluetooth microphone lets you interact with the app. There’s also potential for exploring beyond the home.

Kidding, you don’t need that grandson to help. Speak into the mic, say “Train”, go into the room, and say your activity, “coffee”. The app will correlate your location and action. Everyone loves to be included in the Internet of Things, so app features like alerts for deviation from the location ‘map’ are possible.

In earnest, I am mostly certain that this type of solution already exists. A barrier to adoption could be the computer / smartphone generational gap. Otherwise, someone is already producing the solution, and I just wasted a bus ride home.

Additionally, this software may be integrated with Apple’s Siri, Google Now, Yahoo Index, or Microsoft Cortana, as an extension of the personal assistant.