
Hostess with the Mostest – Apple Siri, Amazon Alexa, Microsoft Cortana, Google Assistant

Application Integration Opportunities:

  • Microsoft Office, Google G Suite, Apple iWork
    • Advice is integrated within the application, both proactively and reactively. For example, when searching in Microsoft Edge, a blinking circle representing Cortana is illuminated, and Cortana says “I’ve collected similar articles on this topic.”  If selected, Cortana presents 10 similar results in a right panel to help you find what you need.
  • Personal Data Access and Management
    • The user can vocally access their personal data and make modifications to that data, e.g. add entries to their Calendar and retrieve the current day’s agenda.

Platform Capabilities: Mobile Phone Advantage

Strengthen core telephonic capabilities where the competition, Amazon and Microsoft, is relatively weak.

  • Ability to record conversations, and push/store content in the Cloud, e.g. iCloud.  A Cloud Serverless recording mechanism dynamically tags a conversation with “Keywords”, creating an Index to the conversation.  Users may search the recording, and play back audio clips from 10 seconds before to 10 seconds after each tagged occurrence.
Calls into the User’s Smartphone May Interact Directly with the Digital Assistant
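The keyword index and the 10-second playback window described above can be sketched as follows. This is only an illustration: the transcript format (a list of timestamped text segments) and the function names `build_keyword_index` and `clip_window` are assumptions, not part of any existing cloud recording service.

```python
from collections import defaultdict

# Hypothetical transcript format: (start_second, spoken_text) pairs.
def build_keyword_index(segments, keywords):
    """Map each keyword to the timestamps (in seconds) where it was spoken."""
    index = defaultdict(list)
    for start, text in segments:
        lowered = text.lower()
        for keyword in keywords:
            if keyword.lower() in lowered:
                index[keyword].append(start)
    return dict(index)

def clip_window(timestamp, padding=10, duration=None):
    """Return the (start, end) playback window around a tagged occurrence."""
    start = max(0, timestamp - padding)
    end = timestamp + padding
    if duration is not None:
        end = min(end, duration)  # do not run past the end of the recording
    return start, end

segments = [
    (0, "Hi, thanks for calling"),
    (42, "Let's review the invoice for March"),
    (95, "I will resend the invoice tomorrow"),
]
index = build_keyword_index(segments, ["invoice", "contract"])
clips = [clip_window(t, duration=100) for t in index["invoice"]]
```

Searching for “invoice” would then return two clips, each clamped to the bounds of the recording.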
  • Call Screening – The digital assistant asks for the name of the caller, purpose of the call, and if the matter is “Urgent”
    • A generic “purpose” response, or a list of caller purpose items can be supplied to the caller, e.g. 1) Schedule an Appointment
    • The smartphone’s user would receive the caller’s name and the purpose of the call as a message in the UI while the call is held in a ‘hold’ state.
    • The smartphone user may decide to accept the call, or reject the call and send the caller to voice mail.
  • A caller may ask to schedule a meeting with the user, and the digital assistant may access the user’s calendar to determine availability.  The digital assistant may schedule a ‘tentative’ appointment within the user’s calendar.
    • If the calendar indicates availability, a ‘tentative’ meeting will be entered. The smartphone user would have a list of tasks from the assistant, one of which is to ‘affirm’ availability for the meetings scheduled.
  • If a caller would like to know the address of the smartphone user’s office, the Digital Assistant may access a database of “generally available” information, and provide it. The Smartphone user may use applications like Google Keep, and any note tagged with a label “Open Access” may be accessible to any caller.
  • Custom business workflows may be triggered through the smartphone, such as “Pay by Phone”.  When a caller is calling a business user’s smartphone, the call goes to “voice mail” or the “digital assistant” based on the smartphone user’s configuration.  If the caller reaches the “Digital Assistant”, there may be a list of options the caller may perform, such as a “Request for Service” appointment.  The caller would navigate through a voice recognition menu, one of many workflows defined by the smartphone user.
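The ‘tentative’ appointment flow described above can be sketched as follows. The `Calendar` class and its methods are hypothetical and stand in for whatever calendar API the smartphone platform exposes; the sketch only shows the availability check and the list of meetings the user still needs to ‘affirm’.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass
class Calendar:
    # Each entry is (start, end, status); status is "confirmed" or "tentative".
    entries: list = field(default_factory=list)

    def is_free(self, start, end):
        """True when the requested slot overlaps no existing entry."""
        return all(end <= s or start >= e for s, e, _ in self.entries)

    def add_tentative(self, start, end):
        """Book a tentative slot only if nothing overlaps it."""
        if not self.is_free(start, end):
            return False
        self.entries.append((start, end, "tentative"))
        return True

    def pending_confirmations(self):
        """Meetings the smartphone user still has to affirm."""
        return [(s, e) for s, e, status in self.entries if status == "tentative"]

cal = Calendar()
nine = datetime(2016, 9, 12, 9, 0)
cal.entries.append((nine, nine + timedelta(hours=1), "confirmed"))

# Caller asks for 11:00 to 12:00: free, so a tentative meeting is entered.
booked = cal.add_tentative(nine + timedelta(hours=2), nine + timedelta(hours=3))
# Caller asks for 9:30 to 10:30: overlaps the confirmed meeting, so it is refused.
clash = cal.add_tentative(nine + timedelta(minutes=30),
                          nine + timedelta(hours=1, minutes=30))
```

Keeping the status on each entry lets the assistant build the user’s task list directly from the calendar, without a separate store.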

Platform Capabilities: Mobile Multimedia

Delivered either through your Smartphone, or through a portable speaker with voice recognition (VR).

  • Streaming media / music to portable device based on interactions with Digital Assistant.
  • Menu to navigate news relevant to you, with the Digital Assistant reading articles aloud through your portable media device (without a UI)

Third Party Partnerships: Adding User Base, and Expanding Capabilities

Partnerships may take the form of platform apps (abstraction), or 3rd party APIs which integrate into the Digital Assistant, allowing users to directly execute application commands, e.g. “Play Spotify song, My Way by Frank Sinatra.”

  • Any “Skill Set” with specialized knowledge: direct Q&A or instructional guidance – e.g. Home Improvement, Cooking
  • eCommerce Personalized Experience – Amazon
  • Home Automation – doors, thermostats
  • Music – Spotify
  • Navigate Set Top Box (STB) – e.g. find a program to watch
  • Video on Demand (VOD) – e.g. set to record entertainment

 

Smartphone AI Digital Assistant Encroaching on the Virtual Receptionist

Businesses already exist that develop and sell Virtual Receptionist products, which handle many caller needs (e.g. call routing).

However, AI Digital Assistants such as Alexa, Cortana, Google Now, and Siri have an opportunity to stretch their capabilities even further.  Leveraging technologies such as Natural language processing (NLP) and Speech recognition (SR), as well as APIs into the Smartphone’s OS answer/calling capabilities, functionality can be expanded to include:

  • Call Screening – The digital executive assistant asks for the name of the caller, the purpose of the call, and if the matter is “Urgent”
    • A generic “purpose” response or a list of caller purpose items can be supplied to the caller, e.g. 1) Schedule an Appointment
    • The smartphone’s user would receive the caller’s name and the purpose of the call as a message in the UI while the call is held in a ‘hold’ state.
    • The smartphone user may decide to accept the call, or reject the call and send the caller to voicemail.
  • Call / Digital Assistant Capabilities
    • The digital executive assistant may schedule a ‘tentative’ appointment within the user’s calendar.  When the caller asks to schedule a meeting, the digital executive assistant would access the user’s calendar to determine availability.  If the calendar indicates availability, a ‘tentative’ meeting will be entered.  The smartphone user would have a list of tasks from the assistant, one of which is to ‘affirm’ availability for the meetings scheduled.
    • Allow recall of ‘generally available’ information.  If a caller would like to know the address of the smartphone user’s office, the Digital Assistant may access a database of generally available information, and provide it.  The Smartphone user may use applications like Google Keep, and any notes tagged with a label “Open Access” may be accessible to any caller.
    • Join the smartphone user’s social network, such as LinkedIn. If the caller knows the phone number of the person but is unable to find the user through the social network directory, an invite may be requested by the caller.
    • Custom business workflows may also be triggered by the smartphone, such as “Pay by Phone”.
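The call screening steps above can be sketched as follows. Everything here is hypothetical: the `ScreenedCall` fields, the `PURPOSES` list, and the two functions illustrate the data that would pass between the assistant and the smartphone’s UI, not an actual OS calling API.

```python
from dataclasses import dataclass

# Hypothetical list of purposes read out to the caller.
PURPOSES = ["Schedule an Appointment", "Request for Service", "Other"]

@dataclass
class ScreenedCall:
    caller_name: str
    purpose: str
    urgent: bool

def screening_message(call):
    """Text shown to the smartphone user while the caller is on hold."""
    flag = "URGENT: " if call.urgent else ""
    return f"{flag}{call.caller_name} is on hold ({call.purpose})"

def route(user_accepts):
    """Connect the call if the user accepts, otherwise send it to voicemail."""
    return "connect" if user_accepts else "voicemail"

call = ScreenedCall("Pat Lee", PURPOSES[0], urgent=True)
message = screening_message(call)
decision = route(user_accepts=False)
```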

Takeaways

The Digital Executive Assistant capabilities:

  • Able to gain control of your Smartphone’s incoming phone calls
  • Able to interact with the 3rd party, the dial-in caller, through a set of business dialog workflows defined by you, the executive.

AI Personal Assistants Need Remedial Guidance for Their Users

Providing Intelligent ‘Code’ Completion

At this stage in the application platform growth and maturity of the AI Personal Assistant, there are many commands and options that common users cannot formulate due to a lack of knowledge and experience.  Using Natural Language to formulate questions has gotten better over the years, but assistance / guidance formulating the requests would maximize intent / goal accuracy.

A key usability feature of many integrated development environments (IDEs) is “Intelligent Code Completion”, which guides programmers to produce correct, functional syntax. This feature also unburdens the programmer from looking up syntax for each command reference, saving significant time.  As the usage of the AI Personal Assistant grows, and its capabilities along with it, the number of commands and parameters required to use the AI Personal Assistant will also increase.

AI Leveraging Intelligent Command Completion

For each command parameter [level\tree], a drop down list may appear giving users a set of options to select for the next parameter. A delimiter such as a period(.) indicates to the AI Parser another set of command options must be presented to the person entering the command. These options are typically in the form of drop down lists concatenated to the right of the formulated commands.  Vocally, parent / child commands and parameters may be supplied in a similar fashion.
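A minimal sketch of this period-delimited completion follows, assuming a small hypothetical command tree. The tree contents and the `complete` function are invented for illustration; a real assistant would build the tree from its installed applications.

```python
# Hypothetical command tree: each key is a command or parameter,
# each value holds the options available at the next level.
COMMAND_TREE = {
    "Order": {"Food": {"Focacceria": {}, "FavoriteItalianRestaurant": {}}},
    "Spotify": {"Song": {}, "Help": {}},
}

def complete(partial):
    """Return the options for the segment currently being typed.

    A trailing period means "show every option at the next level".
    """
    *path, prefix = partial.split(".")
    node = COMMAND_TREE
    for segment in path:
        node = node.get(segment)
        if node is None:
            return []  # unknown command: nothing to suggest
    return sorted(name for name in node if name.lower().startswith(prefix.lower()))

next_level = complete("Order.")          # options after the period
narrowed = complete("Order.Food.Fo")     # options filtered by typed prefix
```

The same lookup could drive a spoken prompt instead of a drop down list, reading the returned options aloud.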

AI Personal Assistant Language Syntax

Adding another AI parser on top of the existing syntax parser may allow commands like these to be executed:

  • Abstraction (e.g. no application specified)
    • Order.Food.Focacceria.List123
    • Order.Food.FavoriteItalianRestaurant.FavoriteLunchSpecial
  • Application Parser
    • Seamless.Order.Food.Focacceria.Large Pizza

These AI command examples use a hierarchy of commands and parameters to perform the function. One of the above commands leverages one of my contacts, and a ‘List123’ object.  The ‘List123’ parameter may be a ‘note’ on my Smartphone that contains a list of food we would like to order. The command may place the order through my contact’s email address or fax number, or by calling the business’s main number and using AI Text to Speech functionality.

All personal data, such as Favorite Italian Restaurant and Favorite Lunch Special, could be placed in the AI Personal Assistant ‘Settings’.  A group of settings may be listed as Key-Value pairs, which may be considered shorthand for conversations involving the AI Assistant.
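One way to picture the Key-Value shorthand is a parser that splits a dotted command and expands any segment found in ‘Settings’. This is a sketch under assumptions: the `SETTINGS` values and the `parse_command` function are hypothetical, chosen to match the example commands above.

```python
# Hypothetical 'Settings' store: shorthand keys mapped to concrete values.
SETTINGS = {
    "FavoriteItalianRestaurant": "Focacceria",
    "FavoriteLunchSpecial": "Large Pizza",
}

def parse_command(command, settings=SETTINGS):
    """Split a dotted command and expand any segment found in settings."""
    segments = [s.strip() for s in command.split(".") if s.strip()]
    return [settings.get(segment, segment) for segment in segments]

resolved = parse_command("Order.Food.FavoriteItalianRestaurant.FavoriteLunchSpecial")
```

With these settings, the abstract command resolves to the same restaurant and item a user would otherwise have to spell out.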

A majority of users are most likely unsure of many of the options available within the AI Personal assistant command structure. Intelligent command [code] completion empowers users with visibility into the available commands, and parameters.

For those without a programming background, Intelligent “Command” Completion is somewhat similar to the autocomplete in Google’s Search text box, which predicts possible choices as the user types. With the guidance provided by an AI Personal Assistant, the user is led to their desired command; Google autocomplete, by contrast, requires the user to have some sense of the end result. Intelligent code completion typically displays all possible commands in a drop down list next to the period (.) delimiter, so a user with no knowledge of the next parameter can still pick it from the list.  An additional feature enables the user to hover over one of the commands\parameters to show a brief ‘help text’ popup.

Note, Microsoft’s Cortana AI assistant provides a text box in addition to speech input.  Adding another syntax parser could be allowed and enabled through the existing User Interface.  However, Siri seems to only have voice recognition input, and no text input.

Is Siri handling the iOS ‘Global Search’ requests ‘behind the scenes’?  If so, the textual parsing, i.e. the period(.) separator would work. Siri does provide some cursory guidance on what information the AI may be able to provide,  “Some things you can ask me:”

With only voice recognition input, use the Voice Driven Menu Navigation & Selection approach as described below.

Voice Driven, Menu Navigation and Selection

The current AI personal assistant abstraction layer may be too abstract for some users.  Consider the difference between these two commands:

  • Play The Rolling Stones song Sympathy for the Devil.
    • Has the benefit of natural language, and can handle simple tasks, like “Call Mom”
    • However, there may be many commands that can be performed by a multitude of installed platform applications.

Versus

  • Spotify.Song.Sympathy for the Devil
    • Enables the user to select the specific application they would like a task to be performed by.
  • Spotify Help
    • A voice driven menu will enable users to understand the capabilities of the AI Assistant.    Through the use of a voice interactive menu, users may ‘drill down’ to the action they desire to be performed. e.g. “Press # or say XYZ”
    • Optionally, the voice menu, depending upon the application, may have a customer service feature, and forward the interaction to the proper [calling or chat] queue.
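The ‘drill down’ behavior of a voice driven menu can be sketched as a walk through a nested menu. The `MENU` contents, the action names, and the `drill_down` function are hypothetical; speech recognition is assumed to have already turned each utterance into text.

```python
# Hypothetical menu for one application; each string leaf is an action name.
MENU = {
    "spotify": {
        "song": "play_song",
        "playlist": "play_playlist",
        "help": {"billing": "forward_to_billing_queue"},
    }
}

def drill_down(menu, spoken_choices):
    """Follow spoken choices through the menu.

    Returns ("action", name) at a leaf, or ("prompt", options) when the
    user must choose again, including after an unrecognized choice.
    """
    node = menu
    for choice in spoken_choices:
        key = choice.strip().lower()
        if key not in node:
            return ("prompt", sorted(node))  # re-read the current options
        node = node[key]
        if isinstance(node, str):
            return ("action", node)
    return ("prompt", sorted(node))

options = drill_down(MENU, ["Spotify"])
action = drill_down(MENU, ["Spotify", "Help", "Billing"])
```

An unrecognized choice returns the options at the current level, so the assistant can repeat the prompt rather than fail.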

Update – 9/11/16

  • I just installed Microsoft Cortana for iOS, and at a glance, the application has a leg up on the competition
    • The Help menu gives a fair number of examples by category.  Much better guidance than iOS / Siri.
    • The ability to enter\type or speak commands provides the needed flexibility for user input.
      • Some people are uncomfortable ‘talking’ to their Smartphones; it can feel awkward talking to a machine.
      • The ability to type in commands may alleviate voice command entry errors caused by speech-to-text translation.
      • Opportunity to expand the AI Syntax Parser to include ‘programmatic’ type commands, allowing the user a more granular command set, e.g. “Intelligent Command Completion”.  As the capabilities of the platform grow, it will be a challenge to interface with and maximize AI Personal Assistant capabilities.