Category Archives: Music

Hey Siri, Ready for an Antitrust Lawsuit Against Apple? Guess Who’s Suing.

The AI personal assistant with the “most usage,” spanning connectivity across all smart devices, will be the anchor toward which users gravitate to control their ‘automated’ lives.  An Amazon commercial just aired depicting a dad with his daughter; the daughter was crying about her boyfriend, who happened to be in the front yard yelling for her.  The dad says to Amazon’s Alexa, “Sprinklers on,” and yes, the boyfriend got soaked.

What is so special about the top spot for the AI Personal Assistant? Controlling the ‘funnel’ through which all information is accessed and all actions are taken means the intelligent ability to:

  • Serve up content / information, which could then be mixed with advertisements or ‘intelligent suggestions’ based on historical data, i.e. machine learning.
  • Take proactive, suggestive actions that may lead to sales of goods and services, e.g. the AI Personal Assistant flags potential ‘buys’ from eBay based on user profiles.

Three main sources of AI Personal Assistant value add:

  • A portal to the “outside” world. E.g. if I need information, I wouldn’t “surf the web”; I would ask Cortana to go “research” XYZ. In the Business Intelligence / data warehousing space, a business analyst may need to run a few queries to get the information they want; by the same token, Microsoft Cortana may come back to you several times to ask “for your guidance.”
  • An abstraction layer between the user and their apps. The user need not ‘lift a finger’ in any app outside the Personal Assistant, with noted exceptions like playing a game for you.
  • User profiles derived from the first two points, i.e. data collection on everything from spending habits to other day-to-day rituals.

Proactive and chatty assistants may win “Assistant of Choice” on all platforms.  Being proactive means collecting data more often than when it’s just you asking questions ad hoc.  Proactive AI Personal Assistants that are geo-aware may make “timely, appropriate interruptions” (notifications) based on time and location.  E.g. “Don’t forget milk,” says Siri, as you’re passing the grocery store.  Around the time I leave work, Google Maps tells me whether I have traffic and my ETA.

It’s possible for a [non-native] AI Personal Assistant to become the ‘abstraction’ layer on top of ANY mobile OS (iOS, Android), the funnel through which all actions / requests are triggered.

Microsoft Cortana has an iOS app and widget, which is wrapped around the OS.  Tighter integration may be possible but is not allowed by iOS, the iPhone, and Apple. Note: Google’s Allo does not provide an iOS widget at the time of this writing.

Antitrust violation by mobile smartphone maker Apple: iOS must allow for the ‘substitution’ of a competitive AI Personal Assistant, triggered in the same manner as the native Siri, i.e. the “press and hold home button” gesture that launches the default packaged iOS assistant.
This is reminiscent of the Microsoft IE browser / OS antitrust violations of the past.

Holding the iPhone Home button brings up Siri. There should be an OS setting to swap out which assistant is used as the mobile OS default.  Today, iOS on the iPhone / iPad only supports “Siri” under the Settings menu.

ANY AI Personal Assistant should be allowed to replace the default OS assistant, from Amazon’s Alexa and Microsoft’s Cortana to one from any startup with the expertise and resources needed to build and deploy a Personal Assistant solution.  Has Apple taken steps to tightly couple Siri with its iOS?

AI Personal Assistant ‘Wish’ list:

  • Interactive, voice-menu-driven dialog. The AI Personal Assistant should know which [mobile] apps are installed, as well as their actionable, hierarchical taxonomy of features / functions.  The Assistant should, for example, ask which application the user wants to use and, if the user doesn’t know, verbally / visually list the apps.  After the user selects the app, the Assistant should then provide a list of function choices for that application, e.g. “Press 1 for Play Song” (a minimal sketch follows this list).
    • The interactive voice menu should also provide a level of abstraction when available; e.g. the user need not select the app, and may just say “Create Reminder.”  There may be several applications on the smartphone that do the same thing, such as note taking and reminders.  In the OS Settings, under a new ‘AI Personal Assistant’ menu, installed applications compatible with this service layer should be listed, grouped by categories defined by the mobile OS.
  • Capability to interact with IoT using user defined workflows.  Hardware and software may exist in the Cloud.
  • Ever tighter integration with native as well as 3rd party apps, e.g. Google Allo and Google Keep.
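
A minimal sketch of the interactive menu idea above, assuming a hypothetical registry of installed apps and their functions (all app and function names below are illustrative, not any platform’s real API):

    # Sketch: two-level voice/visual menu over a hypothetical app taxonomy.
    APP_TAXONOMY = {
        "Music Player": ["Play Song", "Pause", "Next Track"],
        "Reminders":    ["Create Reminder", "List Reminders"],
        "Messages":     ["Send Message", "Read Unread"],
    }

    def present_menu(options):
        """Read the numbered choices aloud (or render them on screen)."""
        for number, option in enumerate(options, start=1):
            print(f'Press {number} for "{option}"')
        return options[int(input("Choice: ")) - 1]

    app = present_menu(list(APP_TAXONOMY))      # level 1: pick the application
    function = present_menu(APP_TAXONOMY[app])  # level 2: pick the function
    print(f"Launching: {app} -> {function}")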

Apple could already be making these changes as a natural course of its product evolution.  Even if the ‘big boys’ don’t want to stir up a hornet’s nest, all you need is VC money and a few good programmers to pick a fight with Apple.

AI Personal Assistants Need Remedial Guidance for Their Users

Providing Intelligent ‘Code’ Completion

At this stage in the growth and maturity of the AI Personal Assistant as an application platform, there are many commands and options that common users cannot formulate due to a lack of knowledge and experience.  Using natural language to formulate questions has gotten better over the years, but assistance / guidance in formulating requests would maximize intent / goal accuracy.

A key usability feature of many integrated development environments (IDEs) is “Intelligent Code Completion,” which guides programmers to produce correct, functional syntax. This feature also unburdens the programmer from looking up syntax for each command reference, saving significant time.  As usage of the AI Personal Assistant grows, and its capabilities along with it, the number of commands and parameters required to use the Assistant will also increase.

AI Leveraging Intelligent Command Completion

For each command parameter [level / tree], a drop-down list may appear, giving users a set of options for the next parameter. A delimiter such as a period (.) indicates to the AI parser that another set of command options must be presented to the person entering the command. These options typically appear as drop-down lists concatenated to the right of the formulated command.  Vocally, parent / child commands and parameters may be supplied in a similar fashion.
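
A minimal sketch of this period-delimited completion, assuming a small illustrative command tree (the commands are the examples from this post, not a real assistant’s grammar):

    # Sketch: each "." asks the parser for the valid options at the next level.
    COMMAND_TREE = {
        "Order": {"Food": {"Focacceria": {"List123": {}},
                           "FavoriteItalianRestaurant": {"FavoriteLunchSpecial": {}}}},
        "Play": {"Song": {}},
    }

    def next_options(partial_command):
        """Return the drop-down choices for the next parameter level."""
        node = COMMAND_TREE
        for token in filter(None, partial_command.split(".")):
            node = node.get(token, {})
        return sorted(node)

    print(next_options("Order."))       # ['Food']
    print(next_options("Order.Food."))  # ['FavoriteItalianRestaurant', 'Focacceria']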

AI Personal Assistant Language Syntax

Adding another AI parser on top of the existing syntax parser may allow commands like these to be executed:

  • Abstraction (e.g. no application specified)
    • Order.Food.Focacceria.List123
    • Order.Food.FavoriteItalianRestaurant.FavoriteLunchSpecial
  • Application Parser
    • Seamless.Order.Food.Focacceria.Large Pizza

These AI command examples use a hierarchy of commands and parameters to perform the function. One of the above commands leverages one of my contacts and a ‘List123’ object.  The ‘List123’ parameter may be a ‘note’ on my smartphone that contains a list of food we would like to order. The command may place the order through my contact’s email address or fax number, or by calling the business’s main number and using AI text-to-speech functionality.

All personal data, such as Favorite Italian Restaurant and Favorite Lunch Special, could be placed in the AI Personal Assistant ‘Settings’.  A group of settings may be listed as key-value pairs that serve as shorthand in conversations with the AI Assistant.
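
A minimal sketch of that shorthand resolution, assuming the key-value Settings above (the keys and values are illustrative):

    # Sketch: expand personal Settings keys found in a dotted command.
    SETTINGS = {
        "FavoriteItalianRestaurant": "Focacceria",
        "FavoriteLunchSpecial":      "List123",
    }

    def resolve(command):
        """Substitute any Settings key with its stored value."""
        return ".".join(SETTINGS.get(token, token) for token in command.split("."))

    print(resolve("Order.Food.FavoriteItalianRestaurant.FavoriteLunchSpecial"))
    # -> Order.Food.Focacceria.List123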

A majority of users are most likely unsure of many of the options available within the AI Personal Assistant command structure. Intelligent command [code] completion empowers users with visibility into the available commands and parameters.

For those without a programming background, Intelligent “Command” Completion is similar to the autocomplete in Google’s Search text box, predicting possible choices as the user types. With the guidance provided by an AI Personal Assistant, the user is led to their desired command, whereas Google’s autocomplete requires some sense of the end-result command up front. Intelligent code completion typically displays all possible commands in a drop-down list next to the constructor period (.); the user may have no knowledge of the next parameter without the drop-down choice list.  An additional feature lets the user hover over one of the commands / parameters to show a brief ‘help text’ popup.

Note: Microsoft’s Cortana AI assistant provides a text box in addition to speech input, so another syntax parser could be enabled through the existing user interface.  Siri, however, seems to accept only voice input, with no text entry.

Is Siri handling the iOS ‘Global Search’ requests ‘behind the scenes’?  If so, the textual parsing, i.e. the period (.) separator, would work. Siri does provide some cursory guidance on what information it may be able to provide: “Some things you can ask me:”

With only voice recognition input, use the voice-driven menu navigation and selection approach described below.

Voice Driven, Menu Navigation and Selection

The current AI personal assistant abstraction layer may be too abstract for some users.  Consider the difference between these two commands:

  • Play The Rolling Stones song Sympathy for the Devil.
    • Has the benefit of natural language, and can handle simple tasks, like “Call Mom”
    • However, there may be many commands that can be performed by a multitude of installed platform applications.

Versus

  • Spotify.Song.Sympathy for the Devil
    • Enables the user to select the specific application they would like a task to be performed by.
  • Spotify Help
    • A voice-driven menu will enable users to understand the capabilities of the AI Assistant.  Through an interactive voice menu, users may ‘drill down’ to the action they want performed, e.g. “Press # or say XYZ.”
    • Optionally, the voice menu, depending upon the application, may have a customer service feature, and forward the interaction to the proper [calling or chat] queue.

Update – 9/11/16

  • I just installed Microsoft Cortana for iOS, and at a glance, the application has a leg up on the competition
    • The Help menu gives a fair number of examples by category.  Much better guidance than iOS / Siri.
    • The ability to enter\type or speak commands provides the needed flexibility for user input.
      • Some people are uncomfortable ‘talking’ to their smartphones; it feels awkward talking to a machine.
      • The ability to type commands may alleviate voice entry errors in speech-to-text translation.
      • The opportunity to expand the AI syntax parser to include ‘programmatic’ commands gives the user a more granular command set, e.g. “Intelligent Command Completion.”  As the capabilities of the platform grow, it will be a challenge to surface and maximize the AI Personal Assistant’s capabilities.

Microsoft Flow – Platform Review

It looks like Microsoft has created a generic, product-independent workflow platform.

Microsoft already has workflow-like solutions: MS Outlook has an [email] rules engine built in, and SharePoint has a workflow solution within the SharePoint platform, typically governing the content flowing through its system.

Microsoft Flow is a different animal.  It seems Microsoft has built a ‘generic’ rules engine for processing almost any event.  The Flow product:

  1. Start using the product from one of two areas: a) “My Flows,” where I may view existing and create new [work]flows; b) “Activity,” which shows “Notifications” and “Failures.”
  2. Select “My Flows,” and the user may “Create [a workflow] from Blank” or “Browse Templates.”  The existing templates were created by Microsoft as well as by third parties, implying a marketplace.
  3. Select “Create from Blank,” and the user gets a single drop-down list of events, a culmination of events across Internet products. The implication is that any product and event could be “made compatible” with MSFT Flow.
    1. The drop-down list of events has the format “Product – Event.”  As the list of products and events grows, we should see at least two separate drop-down lists: one for products, and a sub-list for the product-specific events.
    2. Several Example Events Include:
      1. “Dropbox – When a file is created”
      2. “Facebook – When there is a new post to my timeline”
      3. “Project Online – When a new task is created”
      4. “RSS – When a feed item is published”
      5. “Salesforce – When an object is created”
    3. The list of products, as well as their events, may need a business analyst to rationalize the use cases.
  4. Once an event is selected, event-specific details may be required, e.g. Twitter account details, or a OneDrive “watch” folder.
  5. Next, a condition may be added to the [work]flow, which may be specific to the event type, e.g. OneDrive file type properties [contain] XYZ value.  There is also an “advanced mode” using a conditional scripting language.
  6. There is “IF YES” and “IF NO” logic, which then allows the user to select one [or more] actions to perform (a minimal sketch of this pattern follows the list).
    1. Several Action Examples Include:
      1. “Excel – Insert Rows”
      2. “FTP – Create File”
      3. “Google Drive – List files in folder”
      4. “Mail – Send email”
      5. “Push Notification – Send a push notification”
    2. Again, it seems like an eclectic bunch of products, actions, and events strung together to have a system to POC.
  7. The Templates list is a predefined set of workflows for anyone who does not want to start from scratch.   The UI provides several ways to filter, list, and search through templates.
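
A minimal sketch of the event / condition / IF YES–IF NO shape described above (the product and event names are illustrative strings, not the Flow API):

    # Sketch: a generic event -> condition -> action rule.
    class Rule:
        def __init__(self, event, condition, if_yes, if_no=None):
            self.event, self.condition = event, condition
            self.if_yes, self.if_no = if_yes, if_no or (lambda e: None)

        def fire(self, event):
            if event["type"] != self.event:
                return                      # not the event this rule watches
            (self.if_yes if self.condition(event) else self.if_no)(event)

    rule = Rule(
        event="OneDrive - When a file is created",
        condition=lambda e: e["file"].endswith(".pdf"),
        if_yes=lambda e: print("Mail - Send email about", e["file"]),
        if_no=lambda e: print("Ignoring", e["file"]),
    )
    rule.fire({"type": "OneDrive - When a file is created", "file": "contract.pdf"})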

The product is applicable to everyday life, from the individual home user to small businesses to the enterprise.  At this stage the product seems Beta at best or, more accurately, just past a clickable prototype.  I ran into several errors trying to go through basic use cases, i.e. adding rules.

Despite the “Preview” launch, Microsoft has shown us the power of [work]flow processing regardless of the service platform provider, e.g.  Box, DropBox, Facebook, GitHub, Instagram, Salesforce, Twitter, Google, MailChimp, …

Microsoft may be the glue that combines service providers who expose their services to MSFT Flow functionality.

[Screenshot: Create from Blank – Select Condition]

[Screenshot: Create Rule from Template]

[Screenshot: Create from Blank Rule Building UI]

Update June 28th, 2016:

Opportunities for Event, Condition, Action Rules

  • Transcoding [cloud] Services
  • [IBM Watson] Cognitive APIs
    • e.g. Language Translation, or Visual Recognition
  • WordPress – Create a Post
    • A new text file dropped into a specific folder on Box, DropBox, etc. that is ‘monitored’ by MSFT Flow [?]; additional code may be required by the user for ‘polling’ capabilities (see the sketch after this list)
    • OR new text file attached, and emailed to specific email account folder ‘watched’ by MSFT Flow.
    • Event triggers – Automatic read of new text file
      • stylizing may occur if HTML coding is used
    • Action – Post to a Blog
  • ‘ANY’ event occurs, and a custom message is sent using Skype to a single account or a group of Skype accounts;
    • On several ‘eligible’ events, such as “File Creation” in Box, the file (or a shared file URL) may be sent to the Skype account.
  • ‘ANY’ event occurs, and a custom mobile text message is sent to a single phone number or a group of phone numbers.
  • Event occurs for “File Creation” e.g. into Box; after passing a “Condition”, actions occur:
    • IBM Watson Cognitive API, Text to Speech, occurs, and the product of the action is placed in the same Box folder.
  • Action: Using Microsoft Edge (powered by MSN), in the “My news feed” tab, enable action to publish “Cards”, such as app notifications
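
A minimal sketch of the ‘polling’ idea flagged above: watch a [cloud-synced] folder for new text files and hand each one to a hypothetical post_to_blog() action (the folder path and the action stub are assumptions):

    # Sketch: poll a synced folder and trigger an action for new .txt files.
    import os
    import time

    WATCH_FOLDER = "/path/to/synced/folder"   # illustrative path

    def post_to_blog(path):                   # hypothetical action stub
        print("Posting", path)

    seen = set(os.listdir(WATCH_FOLDER))
    while True:
        current = set(os.listdir(WATCH_FOLDER))
        for name in sorted(current - seen):
            if name.endswith(".txt"):
                post_to_blog(os.path.join(WATCH_FOLDER, name))
        seen = current
        time.sleep(60)                        # poll once a minute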

Challenges \ Opportunities \ Unknowns

  • 3rd-party companies’ existing, published [cloud; web service] APIs may not need any modification to integrate with Microsoft Flow; however, business approval may be required to use an API in this manner.
  • It is unclear whether Flow templates need to be created by the product owner, e.g. Telestream, or may come from a knowledgeable third party, following the Android, iOS, and/or MSFT mobile apps model.
  • It is unclear whether the MSFT Flow app will be licensed individually in the cloud, within the 365 cloud suite, or offered for home and/or business.

Cloud Storage: Ingestion, Management, and Sharing

Cloud storage solutions need differentiation that matters, a tipping point to select one platform over another.

Common Platforms Used:

Differentiation may come in the form of:

  • Collaborative content creation software, such as DropBox Paper, enables individuals or teams to produce content while leveraging the storage platform for, e.g., version control.
  • Embedded integration in a suite of content creation applications, such as Microsoft Office, and OneDrive.
  • Making the storage solution available to developers, such as with AWS S3, and Box.  Developers may create apps powered by the Box Platform or custom integrations with Box
  • iCloud enables users to back up their smartphones, as well as tightly integrating with the capture and sharing of content, e.g. Photos.

Cloud Content Lifecycle Categories:

  • Content Creation
    • 3rd Party (e.g. Camera) or Integrated Platform Products
  • Content Ingestion
    • Capture Content and Associated Metadata
  • Content Collaboration
    • Share, Update and Distribution
  • Content Discovery
    • Surface Content; Searching and Drill Down
  • Retention Rules
    • Auto expire pointer to content, or underlying content

Cloud Content Ingestion Services:

[Diagram: Cloud Ingestion Services]

Applying Gmail Labels Across All Google Assets: Docs, Photos, Contacts + Dashboard, Portal View

Google applications contain [types of] assets, either created within the application or imported into it.  In Gmail, the objects are emails, and Gmail enables users to add metadata to an email in the form of tags, or “Labels.”  Labeling emails is a very easy way to organize these assets.  If you’re a bit more organized, you may even devise a logical taxonomy to classify your emails.

An email can also be put into a folder, which is completely different from labels.  An email placed into a folder sits in a parent-child folder hierarchy; only the name of the folder and its position in the hierarchy provide this relational metadata.

For personal use, or for small to medium-sized businesses, users may want to categorize all of the Google “objects” from each Google app, so why isn’t there the capability to apply labels across all Google App assets?  If you work at a law firm, for example, with documents in Google Docs and Google for email, it would be ideal to leverage a company-wide taxonomy and, upon any internal search, discover all objects logically grouped in a container by labels.

For each Google asset, such as an email in Gmail, users may apply N labels.

A [Google] dashboard, or portal view, may be used to display and access Google assets across Google applications, grouped by Labels.  A Google Apps “Portal Search” may consist of queries that contain asset labels.  A relational Google object repository containing assets across all object types (e.g. Google Docs) may be leveraged to store metadata about each Google asset and its relationships.
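
A minimal sketch of such a cross-application label repository (the app names and asset IDs are illustrative; no such Google API is implied):

    # Sketch: one (label -> assets) table spanning Google applications.
    from collections import defaultdict

    repository = defaultdict(set)

    def apply_label(label, asset):
        repository[label].add(asset)

    apply_label("Case-4711", ("Gmail",         "message-id-123"))
    apply_label("Case-4711", ("Google Docs",   "deposition-draft"))
    apply_label("Case-4711", ("Google Photos", "evidence-042.jpg"))

    # A "Portal Search" by label returns assets across all applications.
    for app, asset_id in sorted(repository["Case-4711"]):
        print(app, "->", asset_id)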

A [Google] dashboard, or portal view, may be organized around individuals (e.g. personal), teams, or an organization.  In a law firm, for example, a case number label could be applied to Google Docs, Google Photos (i.e. photos and videos), and, of course, Gmail.

A relatively simple feature to implement, with a lot of value for Google’s clients, i.e. us.  So why isn’t it implemented?

Better yet, once facial recognition is implemented in Photos (and videos), applying Google labels to media assets may allow correlation of emails to photos with a rule-based engine.

The Google Search has expanded into the mobile Google app.

Leveraging Google “Cards,” developers may create Cards for a single Google asset or a group of them.  Grouping of Google assets may be applied using “Labels.”  As Google assets go through a business or personal user workflow, additional metadata may be added to the asset, such as additional “Labels.”

Expanding upon this solution, scripts may be created to “push” assets through a workflow, perhaps using Google Cloud Functions.  Google “Cards” may be leveraged as “the bit” that informs users when they have new items to process in a workflow.

Metadata, or Labels, may be used such as “Document Ready for Legal Review” or “Legal Document Review Completed”.
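
A minimal sketch of such a label-driven workflow step, assuming the two review labels above (the stage names come from this post; everything else is illustrative):

    # Sketch: advance an asset's workflow by swapping its stage label.
    WORKFLOW = ["Document Ready for Legal Review",
                "Legal Document Review Completed"]

    def advance(asset_labels):
        """Move an asset's workflow label to the next stage."""
        for i, stage in enumerate(WORKFLOW[:-1]):
            if stage in asset_labels:
                asset_labels.discard(stage)
                asset_labels.add(WORKFLOW[i + 1])
                return asset_labels
        asset_labels.add(WORKFLOW[0])   # not in the workflow yet: start it
        return asset_labels

    print(advance({"Case-4711"}))
    print(advance({"Case-4711", "Document Ready for Legal Review"}))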

So Much Streaming Music, Just Not in One Place

In the old days, you never knew which CDs the record store would have in stock.  That limitation of physical media was supposed to be solved by digital. Back in the 1990s, technology evangelists and music fans alike began to talk about a “celestial jukebox” — a utopian ideal in which every song ever recorded would be available at a click.  In reality, even a celestial jukebox has gaps. Or more precisely, numerous jukeboxes have come along – iTunes, Pandora, Spotify, SoundCloud, YouTube – and each service has had gaps in its repertoire. And those gaps have been growing bigger and more complicated as artists have wielded more power in withholding their music from one outlet or another.

Source: So Much Streaming Music, Just Not in One Place – The New York Times

Additional Editorial:

Published music libraries are numerous, with scattered artist coverage for one reason or another.  Music repositories may overlap, or lack completeness of coverage.

As expressed in “As a Data Deluge Grows, Companies Rethink Storage“, creating a system similar to the Internet Domain Name System for “Information Asset Libraries” would help in numerous ways.  Front end UIs may query these “Information Asset (object) libraries” to understand the availability of content across the Internet.

The Domain Name System (DNS) is a hierarchical decentralized naming system for computers, services, or any resource connected to the Internet or a private network.

Another opportunity would be to leverage the existing DNS platform for managing these “Information Asset Repositories”

In a relatively cost-constrained implementation, a DNS-type effort could be taken up by the music industry.  From artists to distribution channels, existing music repositories can be leveraged, and within months a music aficionado could go to any participating platform and search for an artist, title, album, or any other indexed metadata; results across ‘Information Asset Repositories’ would be displayed to the user with a jump link to the registered information asset in its library.

Small, independent artists need only populate a spreadsheet with a row for each asset and all of its ‘advertised’ metadata.  Their information asset library may be a single flat file, e.g. XML, that conforms to a basic record/row structure.  The independent artist places this file on their web site, e.g. in the root folder, and informs their ISP of the record type and its location.  A new DNS record specification may need to be created, analogous to the MX record.
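
A minimal sketch of what such a flat-file asset library could look like, parsed with the standard library (the tag and field names are illustrative assumptions, not a published standard):

    # Sketch: an independent artist's asset library as a single XML file.
    import xml.etree.ElementTree as ET

    LIBRARY = """
    <assetLibrary artist="Jane Doe">
      <asset title="First Light" album="Daybreak" year="2015"
             url="https://example.com/tracks/first-light"/>
      <asset title="Night Drive" album="Daybreak" year="2015"
             url="https://example.com/tracks/night-drive"/>
    </assetLibrary>
    """

    for asset in ET.fromstring(LIBRARY):
        print(asset.get("title"), "->", asset.get("url"))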

Cloud Storage and DAM Solutions: Don’t Rein in the Beast

Are you applying metadata to individual files, or en masse, attempting to make the vast growth of cloud storage usage into manageable, meaningful storage?

Best practices leverage a consistent hierarchy, an Information Architecture in which to store and retrieve information.  Excellent.

Beyond that, there are capabilities computer science has documented and used time and time again: checksum algorithms, used frequently after a file transfer to verify that the file you requested is the file you received.  Most, if not all, enterprise DAM solutions use some such technology to ‘allow’ the enforcement of unique assets [upon upload].  In cloud storage and photo solutions targeted toward the individual consumer, the feature does not appear to be up ‘close and personal’ in the user experience, thus building a huge expanse of duplicate data (documents, photos, music, etc.).  Another such feature, the database [primary] key, has been used for decades to guarantee that a record of data is unique.

Our family sharing alone has thousands and thousands of photos and songs. The file names may differ for many of the same digital assets; sometimes the file names are the same, but the metadata between the duplicate files is not identical and still provides value.  Tools for ‘merging’ metadata, DAM tools, have value in helping manage digital assets.

Cloud storage usage is growing exponentially, and metadata alone won’t rope in the beast. Maybe ad hoc or periodic indexing of files [e.g. by #checksum algorithm] could take on the task of identifying duplicate assets?  Duplicate assets could then be viewed by the user in an exception report.  Less boring: upon upload, let the user know ‘on the fly’ that the asset is already in storage, and show a two-column diff of the metadata.
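
A minimal sketch of that checksum-based duplicate detection (the folder path is an illustrative assumption):

    # Sketch: hash file contents (not names) and report collisions.
    import hashlib
    import os

    def checksums(folder):
        index = {}
        for name in os.listdir(folder):
            path = os.path.join(folder, name)
            if os.path.isfile(path):
                with open(path, "rb") as f:
                    digest = hashlib.sha256(f.read()).hexdigest()
                index.setdefault(digest, []).append(name)
        return index

    # Exception report: any digest with more than one file is a duplicate set.
    for digest, names in checksums("/path/to/photos").items():
        if len(names) > 1:
            print("Duplicates:", names)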

It’s a pain for me, and quite possibly for many cloud storage users.  As more people jump on cloud storage, this feature should be front and center to help users grow into their new virtual warehouse.

The cloud storage industry most likely believes that, for the common consumer, storage is ‘cheap’: just provide more.  At some stage, cloud providers may look to DAM tools as the cost of managing a user’s storage rises.  Tools like:

  • Duplicate digital asset [file] detection. Use exception reporting to identify the duplicates and enable [bulk] corrective action, and/or a duplicate ‘error/warning’ message upon upload.
  • Dynamic metadata tagging upon [bulk] upload using object recognition: correlating and cataloging one or more [types of] objects in a picture using a defined Information Architecture.  In addition, leveraging facial recognition to update metadata tagging.
    • e.g. “beach” objects: sand, ocean; [Ian Roseman] surfing;
  • Brief questionnaires that enable the user to ‘smartly’ ingest the digital assets, e.g. the ‘themes’ of the current upload, or a family / relationship tree to extend facial recognition correlations.
    • e.g. themes – summer; party; New Year’s Eve
    • e.g. relationship tree – office / work
  • A pan-Information Architecture (IA) spanning multiple cloud storage [silos], e.g. for Photos, spanning [shared] ‘albums’.
  • Publicly published / shared components of an IA, e.g. legal documents; standards and reuse.

Entertainment Portals: Streaming VOD and Live Broadcasts, Games, News

Netflix is a subscription-based film and television program rental service that offers media to subscribers via Internet streaming.

Amazon Instant Video is an Internet video-on-demand service. It offers television shows and films for rental or purchase; selected titles are offered free to customers with an Amazon Prime subscription.

Bland definitions of what are shaping up to be entertainment portals, encompassing multiple media types:

  • Games
  • Movies
  • Music
  • Photos
  • News
  • Social [Platform Integration]
  • Television
  • YouTube

Entertainment Portals:

All or some of the above media types, licensed for distribution,  are served through one or more portals.

Licensing content to be offered across several platforms requires a robust DAM.  Digital asset management (DAM) consists of the management tasks and decisions surrounding the ingestion, annotation, cataloguing, storage, retrieval, and distribution of digital assets.  DAM products and processes look like they will continue to bloom as distribution models are experimented with by the providers:

  • Amazon [Instant]
  • Apple ecosphere
  • AOL
  • Cablevision – Optimum
  • Facebook [social]
  • G+ [social]
  • MSN
  • Netflix
  • ReMake – a fictitious Entertainment portal
    • the project team iterates through user design input, and remakes the UI, [and Workflow]  bi-weekly based on consumer feedback
  • Twitter [social]
  • Verizon FiOS
  • Yahoo
Segmented portals, containing one or two media types:

  • Music and music games, e.g. name that tune

Industry standards for interfaces to/from entertainment portals:

  • Search catalog [by …]
    • API returns a ‘streamable’ / playable URL for a VOD or broadcast feed (a minimal sketch follows).
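
A minimal sketch of such a catalog-search interface returning a playable URL (the record fields and data are illustrative assumptions, not a published spec; “ReMake” is the fictitious portal named above):

    # Sketch: search a catalog and return 'streamable' / playable URLs.
    from dataclasses import dataclass

    @dataclass
    class CatalogHit:
        title: str
        portal: str
        stream_url: str   # playable URL for a VOD title or broadcast feed

    CATALOG = [
        CatalogHit("Sympathy for the Devil", "ReMake",
                   "https://example.com/stream/sympathy"),
    ]

    def search_catalog(query):
        return [hit for hit in CATALOG if query.lower() in hit.title.lower()]

    for hit in search_catalog("sympathy"):
        print(hit.portal, "->", hit.stream_url)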