Category Archives: Technology

Help Wanted: Civil War Reenactment Soldiers to Improve AI Models

I just read an article on Digital PC Magazine, “Human Help Wanted: Why AI Is Terrible at Content Moderation,” which got my neurons firing.

Problem Statement

Every day, Facebook’s artificial intelligence algorithms tackle the enormous task of finding and removing millions of posts containing spam, hate speech, nudity, violence, and terrorist propaganda. And though the company has access to some of the world’s most coveted talent and technology, it’s struggling to find and remove toxic content fast enough.

Ben Dickson
July 10, 2019 1:36PM EST

I’ve worked at several software companies that leveraged Artificial Intelligence and Machine Learning to recognize patterns and correlations. In general, the larger the data sets, the higher the accuracy of the predictions, because the outliers in the data, the noise, “fall out” of the data set. Without large, high-quality training data, Artificial Intelligence makes more mistakes.

In speech recognition, image classification, and natural language processing (NLP), programs like chatbots and digital assistants are generally becoming more accurate because their training data sets are large, and there is no shortage of these data types. For example, there are many ways I can ask my digital assistant for something, like “Get the movie times”. At a high level, training a digital assistant means cataloging the many ways I can ask for “something” to achieve my goal. I could create that list myself and write a few dozen questions, but my sample data set would still be too small. Amazon has a crowdsourcing platform, Amazon Mechanical Turk, through which I can request data sets of thousands of questions and their correlated goals.
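To make the idea of “questions and correlated goals” concrete, here is a minimal sketch in Python. The goal names, the sample utterances, and the word-overlap matcher are all assumptions made for illustration; a real assistant would train a model on thousands of utterances rather than compare words.

```python
# Hypothetical training data: each goal maps to sample utterances.
TRAINING_DATA = {
    "get_movie_times": [
        "get the movie times",
        "when is the movie playing",
        "what time does the film start",
    ],
    "get_weather": [
        "what is the weather today",
        "will it rain tomorrow",
    ],
}


def tokens(text):
    """Lowercase the text, drop question marks, and split into a set of words."""
    return set(text.lower().replace("?", "").split())


def guess_goal(utterance):
    """Return the goal whose sample utterance overlaps most (Jaccard similarity).

    This is a toy stand-in for a trained model, not a real NLP technique
    used by any particular assistant. Returns None when nothing overlaps.
    """
    words = tokens(utterance)
    best_goal, best_score = None, 0.0
    for goal, samples in TRAINING_DATA.items():
        for sample in samples:
            sample_words = tokens(sample)
            score = len(words & sample_words) / len(words | sample_words)
            if score > best_score:
                best_goal, best_score = goal, score
    return best_goal


print(guess_goal("What time is the movie playing?"))
print(guess_goal("Play some jazz"))
```

With only a handful of utterances per goal, any phrasing that shares no words with the samples goes unmatched, which is the small-sample problem described above.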

MTurk enables companies to harness the collective intelligence, skills, and insights from a global workforce to streamline business processes, augment data collection and analysis, and accelerate machine learning development.

Amazon Mechanical Turk: Access a global, on-demand, 24×7 workforce

Video “Scene” Recognition – Annotated Data Sets for a Wide Variety of Scene Themes

In silent films, the plot was conveyed by title cards: written indications of the plot and key dialogue lines. Unfortunately, silent films are not making a comeback. To achieve a high rate of successful identification of activities within a given video clip, libraries of video metadata need to be created that capture:

  • Media / Video Asset, Unique Identifier
  • Scene Clip IN and OUT timecodes
  • Scene Theme(s), similar to Natural language processing (NLP), Goals = Utterances / Sentences
    • E.g. Man drinking water; Woman playing Tennis
  • Image recognition, in the context of machine vision, is the ability of software to identify objects, places, people, writing and actions in images. Image recognition is used to perform a large number of machine-based visual tasks, such as labeling the content of images with meta-tags
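The metadata fields listed above could be captured in a record like the following. This is only a sketch: the class name, field names, the “HH:MM:SS:FF” timecode layout, and the 24 frames-per-second default are assumptions, not part of any existing annotation standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SceneAnnotation:
    """One annotated scene clip (all names here are illustrative)."""
    asset_id: str   # unique identifier of the media / video asset
    clip_in: str    # scene IN timecode, assumed "HH:MM:SS:FF"
    clip_out: str   # scene OUT timecode, assumed "HH:MM:SS:FF"
    themes: List[str] = field(default_factory=list)  # e.g. "Man drinking water"


def timecode_to_frames(timecode: str, fps: int = 24) -> int:
    """Convert a non-drop-frame HH:MM:SS:FF timecode to a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def clip_length_frames(scene: SceneAnnotation, fps: int = 24) -> int:
    """Number of frames between the IN and OUT timecodes of a scene."""
    return timecode_to_frames(scene.clip_out, fps) - timecode_to_frames(scene.clip_in, fps)


scene = SceneAnnotation(
    asset_id="asset-0001",
    clip_in="00:01:10:00",
    clip_out="00:01:25:12",
    themes=["Man drinking water"],
)
print(clip_length_frames(scene))
```

Keeping the themes as free-text phrases mirrors the utterance / goal pairing used for NLP training data.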

Not Enough Data

Here is an example of how Social Media, such as Facebook, attempts to deal with video deemed inappropriate for their platform:

In March, a shooter in New Zealand live-streamed the brutal killing of 51 people in two mosques on Facebook. But the social-media giant’s algorithms failed to detect the gruesome video. It took Facebook an hour to take the video down, and even then, the company was hard-pressed to deal with users who reposted the video.

Ben Dickson
July 10, 2019 1:36PM EST

…in many cases, such as violent content, there aren’t enough examples to train a reliable AI model. “Thankfully, we don’t have a lot of examples of real people shooting other people,” Yann LeCun, Facebook’s chief artificial-intelligence scientist, told Bloomberg.

Ben Dickson
July 10, 2019 1:36PM EST

Opportunities for Actors and Curators of Video Content: Dramatizations

Thousands of people already perform on camera, creating videos that run the gamut from playing video games to “unboxing” collectible items. Actors who perform dramatizations could add tags to their videos as described above, documenting the themes of a given skit. If actors post their videos on YouTube or proprietary crowdsourcing platforms, they would be entitled to some revenue for the use of their licensed video.

Disclosure Regarding Flag Controversy

I now realize there are politics around Nike “tipping their hat” toward the Betsy Ross flag. However, when I referenced the flag in this blog post, I was thinking of the American Revolution, and the 13 colonies flag. I didn’t think the title “Help Wanted: American Revolutionary War Reenactment Soldiers to Improve AI Models” would resonate with readers, so I took some creative liberty.

Social Media: News Feed versus App InMail

Better Demographic Penetration and Transparency to More Accurately Determine Creative Media Asset Worth

News Media Assets

News Media Assets are created by writers of non-fiction work: coverage of various topics targeted towards the periodical’s demographic.

Selling Advertising Space

The news media product is layered: it consists of News Media Assets and sold advertisement space. Ad positioning throughout the news media product may reflect commonality between the product or service being advertised and the news media asset. A goal is a smooth transition for the reader between asset and advertisement.

Revenue Models For News Media Assets

  • Deriving revenue from sponsors of news Media Assets
  • Subscription Base of News Media Assets, regular frequency of news media product to subscriber base.

Social Media – News Feeds

The news agencies post to public news feeds a “teaser” headline, a sentence or two describing the news media asset, and a teaser image, all to lure prospective readers to click a link to the news media publisher’s platform. At that point, the publisher sets the “ground rules” for the potential subscriber, e.g. 10 free articles a month, then their digital subscription price of NN goes into effect.

Social Media – InMail (i.e. email within the platform)

InMail through the social media platform can come from a variety of sources, for example:

  • Former colleague looking to reconnect
  • Recruiter looking to pitch a potential role
  • Sales / Marketing InMail targeting you as a potential customer of their product or service

The Tools to get the Job Done

As a prior client of LinkedIn Advertising for both ad placement and Sponsored InMail, I found the tools provided and the granularity upon which to refine the demographics impressive, and not lacking in any way.

Personable, Targeted Marketing of News Media Assets, sponsored by a 3rd party promoting their product or service.

News Media Assets are delivered to your digital doorstep, with advertising partners speckled into the asset. Because of the granularity of the InMail advertising controls, demographics can be targeted with a level of precision beyond what a magazine or newspaper, digital or print, can offer.

It’s all about the targeted audience and the granularity of the data collected and then leveraged to reach the desired audience. This is much more personal than a link back to the publisher’s platform.

Just as there are expenses to do business in print or traditional digital, the price of doing business with a platform like LinkedIn Sponsored InMail would be absorbed by the news media agency, net of revenue from advertisement placement.

Although the LinkedIn social platform was used for reference, other platforms may be leveraged, depending upon the product or services being marketed, such as a People Magazine article delivered through Facebook to its relevant demographic under a partnership / sponsorship.

Fake News – Not a Problem

Since News Media Agencies will now pair with “sponsors”, commonly known as advertisers, both parties, the news agency and the sponsor, have “skin in the game”, so an article is less likely to be fictitious.

Free Nights and Weekends Makes a Comeback

Remember when you could make free mobile calls after 9:30 PM weeknights, and all weekend? For a while the mobile carriers competed on the time when “off-peak” started, from 10 PM to 8:30 PM. A whole hour and a half! These days we have unlimited domestic calling all the time.

So, now we have varying degrees of data plans, such as AT&T Wireless 3 GB, 9 GB, or unlimited per month, but there are caps where after 22 GB data transfer speeds are slowed down.  22 gigs seem like a lot until you have kids using Snapchat and TikTok.

When you think about it, peak data usage occurs when you may not be in a hot spot. At night, you’re at home using your own WiFi, or at an establishment with their complimentary WiFi. Weekends and weekdays are a bit scattered. Your work may have WiFi, but weekdays “on peak” are mostly commuting times, the “rush hour(s)”.

Can wireless carriers bring back on and off-peak for data? The simplest approach: “turn off the meter” during off-peak data periods. Maybe on-peak the consumer can elect 5G, when available, and off-peak 4G LTE? Our smartphones can identify low-bandwidth opportunities, e.g. when the phone is locked; text messages without graphics and email are semi-passive states. Maybe users could prioritize their apps’ data usage? What about those “chatty” apps that you rarely use? Smartphone settings may show you those apps’ bandwidth consumption as opportunities to prioritize them lower than your priority apps.
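As a rough sketch of “turning off the meter”, the snippet below counts only on-peak usage against a data plan. The off-peak window (9:30 PM to 6:00 AM on weekdays, all day on weekends) and the function names are assumptions chosen for illustration; a carrier would pick its own window from its analytics.

```python
from datetime import datetime, time

# Assumed off-peak window; the 6:00 AM end time is a guess for illustration.
OFF_PEAK_START = time(21, 30)
OFF_PEAK_END = time(6, 0)


def is_off_peak(moment: datetime) -> bool:
    """True on weekends, and on weekdays between 9:30 PM and 6:00 AM."""
    if moment.weekday() >= 5:  # Saturday = 5, Sunday = 6
        return True
    clock = moment.time()
    return clock >= OFF_PEAK_START or clock < OFF_PEAK_END


def metered_megabytes(usage):
    """Sum only on-peak usage; `usage` is an iterable of (datetime, megabytes)."""
    return sum(megabytes for moment, megabytes in usage if not is_off_peak(moment))


usage = [
    (datetime(2019, 7, 10, 13, 0), 500),   # Wednesday afternoon: metered
    (datetime(2019, 7, 10, 22, 0), 800),   # Wednesday night: off-peak
    (datetime(2019, 7, 13, 13, 0), 1200),  # Saturday: off-peak
]
print(metered_megabytes(usage))
```

Only the 500 MB used on Wednesday afternoon would count toward the monthly cap in this example.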

Skeptical, and think there are no Peak or Off-Peak periods for data? Check the business analytics. I’m sure wireless carriers have a deep understanding of their own business intelligence (BI).

7 Failures I Needed to Succeed

Here is a list of seven failures from my professional career, how I met those challenges, and, in some cases, how I turned them into opportunities.

Underestimate

Eager to please throughout my career, I was burned many times, and in some cases continue to be burned, by underestimating the effort required for an activity or task that rolls up to the delivery of features or to meeting a milestone. In my earlier years, I “shot from the hip” to senior management, and they held me to those commitments. Over the years, I’ve been fortunate enough to learn to document and mitigate risks. I also picked up additional tools, both process and communication / people skills:

* “Interesting point, let me consider, and get back to you.” You don’t have to provide an answer right away. Consider the scope and impact of the questions you are presented. Unless you are almost certain of the answer, try to defer.

* Planning Poker (Agile) collaborative (blind) estimates make better estimations. Through collaboration, you reach joint commitment. You eliminate the “boss knows best” factor.

Hearing but not Listening

Throughout my personal and professional life, I’ve struggled with this aspect of communication, more so earlier on in my life. Two people have a meeting, and discuss their points of view regarding the same topic. They both leave the room, and have two polar opposite perspectives of what was communicated.

Even in the same language, things get “lost in translation.” You hear what you want to hear. You don’t probe deep enough into another person’s perspective. There are many process tools to better your communication style.

Overestimate

Adding too much margin to an estimate, or being overly conservative in your effort estimate, may not always be the best course of action. “Right Sizing” the estimate is typically the desired approach unless otherwise guided by the appropriate stakeholders. There are lots of tools for effort estimation; Planning Poker and Fist of Five are just two examples.

Army of One – Embrace Opportunity

I was brought into a development team as a Software Quality Assurance manager for a well known Financial Services organization. I was to build a team of QA staff as well as mature their process workflow, e.g. implement software change management.

The department’s QA resources per team dwindled: staff were let go, and the teams did not grow as first advertised during the interviews. I found myself constantly working with the team putting out fires. In the best case, I worked “after” hours just to make progress on strategic work like process improvements and automation. I stuck with the job for the opportunity to learn as much as possible, and I built the knowledge and relationships that would later propel my career to building and managing a 50 person, global team.

Build it and they will Come…Bull!

I chose to try my own startup at some point in my professional career. I had worked for a startup firm out of college, but that was not the same as my own startup. There were lots of balls to juggle, and decisions to make and prioritize. After a year and a half, I shut down the company: more money was going out than coming in, and I was also “relatively” self funded.

One of the several poor choices I made was “Build it and They will Come.” It was 2009, and the mobile frenzy was just starting to heat up. In Feb 2009, Apple was at 30 USD per share! 30! I built a client/server mobile application for expertise transactions, way ahead of my time. I was almost entirely focused on the development of the solution and clearly lost sight of the requirement to build market share. I did post Press Releases, but I didn’t embrace digital marketing as a core spend and activity for my business.

Needless to say I was “The Best Kept Secret”.

Chasing the Sun

As a software product startup firm, you need to segment your product to align to a target audience. However, homing in on the target market may be problematic if the “fish aren’t biting”.

You find yourself reassessing the strategic and tactical goals of your product, pivoting often to eventually find your “pay dirt”. There may be fundamental influences to your ecosystem, such as a shift in a 3rd party product previously seen as complementary now seen as “overlapping”. Sales pitch and marketing approach may need to change along with your product.

Although pivoting often may be the name of the game, you still should recognize the cost in adapting to change. Process flows like being “agile” and Scrum help to smooth the pivot, as these processes revolve around constant development iterations and reflections every few weeks.

Time to Pull the Parachute Cord

I still have trouble knowing when it’s time to say when. I enjoy troubleshooting problems: business, people, process, and technical. So, how long do you work on a problem before you pull the ripcord?

Riddle of the Sphinx: Improving Machine Learning

Data Correlations Require Perspective

As I was going to St. Ives,

I met a man with seven wives,

Each wife had seven sacks,

Each sack had seven cats,

Each cat had seven kits:

Kits, cats, sacks, and wives,

How many were there going to St. Ives?

One.

This short example may confound man and machine. How does a rules engine work, and how does it make correlations to derive an answer to this and other riddles? If an AI rules engine gets this riddle wrong, how does it use machine learning to adjust and tune its “model” to draw an alternate conclusion?

Training rules engines using machine learning and complex riddles may require AI to define relationships not previously considered, analogous to how a boy or a man approaches solving riddles. A man has more experiences than a boy, widening his model to increase the possible answer sets. But how to conclude the best answer? The ways a question sentence fragment is used may differ over a lifetime, hence the man may have more context as to the number of ways the fragment may be interpreted.
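The gap between the two readings of the riddle can be shown with a little arithmetic. In this sketch, a naive engine multiplies everything out, while a context-aware reading counts only the parties actually travelling to St. Ives. The data layout and names are assumptions, and treating the man’s party as heading the other way is the traditional reading that yields the answer “One”.

```python
def naive_count(branching=7, levels=4):
    """Add up wives, sacks, cats and kits: 7 + 7**2 + 7**3 + 7**4."""
    return sum(branching ** level for level in range(1, levels + 1))


# Context the naive count ignores: who is actually going to St. Ives?
# Assumption: everyone the narrator "met" was travelling the other way.
parties = [
    {"who": "narrator", "count": 1, "going_to_st_ives": True},
    {"who": "man", "count": 1, "going_to_st_ives": False},
    {"who": "wives, sacks, cats, kits", "count": naive_count(), "going_to_st_ives": False},
]

answer = sum(party["count"] for party in parties if party["going_to_st_ives"])

print(naive_count())  # total of wives, sacks, cats and kits
print(answer)         # travellers heading to St. Ives
```

The multiplication is easy for a machine; the direction of travel is the piece of context that a model has to learn or be told.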

Adding Context: Historical and Pop Culture

There are some riddles thousands of years old.  They may have spawned from another culture in another time and survived and evolved to take on a whole new meaning.  Understanding the context of the riddle may be the clue to solving it.

Layers of historical culture provide context to the riddle, and the significance of a word or phrase in one period of history may wildly differ.  When you think of “periods of history”, you might think of the pinnacle of the Roman empire, or you may compare the 1960s, the 70s, 80s, etc.

Asking a question of an AI rules engine, such as a chatbot, may require contextual elements, such as geographic location and “period in history”, as additional dimensions to a data model.

Many chatbots have no need for additional context or referential subtext; they are simply “Expert Systems in a box”. Digital assistants, however, may face the need for additional dimensions of context, as general knowledge digital agents spanning expertise without bounds.

Sophocles: The Sphinx’s riddle

Written in the fifth century B.C., Oedipus the King is one of the most famous pieces of literature of all time, so it makes sense that it gave us one of the most famous riddles of all time.

What goes on four legs in the morning, on two legs at noon, and on three legs in the evening?

A human.

Humans crawl on hands and knees (“four legs”) as a baby, walk on two legs in mid-life (representing “noon”) and use a walking stick or cane (“three legs”) in old age.

A modern interpretation of the riddle may not allow for the correlation needed to solve it. The “three legs”, i.e. a cane, may be elusive, as we now think of the elderly on four wheels in a wheelchair.

In all sincerity, this article is not about an AI rules engine “firing rules” using a time dimension, such as:

  • Not letting a person gain entry to a building after a certain period of time, or…
  • Providing a time dimension to “Parental Controls” on a Firewall / Router, the Internet is “cut off” after 11 PM.

Adding a date/time dimension to the question may produce an alternate question. The context of the time changes the “nature” of the question, and therefore the answer as well.

IBM didn’t inform people when it used their Flickr photos for facial recognition training – The Verge

The problem is more widespread than highlighted in the article. It’s not just these high profile companies using “public domain” images to annotate with facial recognition notes and train machine learning (ML) models. Anyone can scan the Internet for images of people, and build a vast library of faces. These faces can then be used to train ML models. In fact, using public domain images from “the Internet” will cut across multiple data sources, not just Flickr, which increases the sample size, and may improve the model.

The rules around the uses of “Public Domain” image licensing may need to be updated. One possible, simple solution: add a watermark to any image that does not have permission to be used for facial recognition model training. All image processors may be required to include a preprocessor that detects the watermark in the image and, if found, skips the image so it is not included in the training of models.
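A preprocessor of this kind might look like the sketch below. No real watermark detection is implemented here: `has_opt_out_watermark` is a hypothetical placeholder that simply looks up file names in a set, standing in for whatever detection method a future standard would define.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-in data: images whose owners have opted out of training.
OPTED_OUT = {"family_photo.jpg"}


def has_opt_out_watermark(image_path: str) -> bool:
    """Placeholder detector; a real one would inspect the image contents."""
    return image_path in OPTED_OUT


def select_training_images(
    image_paths: Iterable[str],
    detector: Callable[[str], bool] = has_opt_out_watermark,
) -> List[str]:
    """Keep only images that do not carry the opt-out watermark."""
    return [path for path in image_paths if not detector(path)]


candidates = ["street_scene.jpg", "family_photo.jpg", "portrait.jpg"]
print(select_training_images(candidates))
```

Passing the detector in as a parameter keeps the filtering step independent of how the watermark is eventually defined.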

Source: IBM didn’t inform people when it used their Flickr photos for facial recognition training – The Verge

When and How to Create Journey Maps

Journey Maps are excellent as a tool for deriving requirements, as well as better understanding the customer.  Similar to a paper-based, use case process to understand an “Actor” on their business workflow, journey maps visualize the customer/user experiences.  The article below is a primer to the creation and usage of a Journey Map.

Summary: Journey maps combine two powerful instruments—storytelling and visualization—in order to help teams understand and address customer needs. While maps take a wide variety of forms depending on context and business goals, certain elements are generally included, and there are underlying guidelines to follow that help them be the most successful.

What Is a Customer Journey Map?
In its most basic form, journey mapping starts by compiling a series of user goals and actions into a timeline skeleton. Next, the skeleton is fleshed out with user thoughts and emotions in order to create a narrative. Finally, that narrative is condensed into a visualization used to communicate insights that will inform design processes.

Source: When and How to Create Customer Journey Maps

Microsoft’s Azure DevOps – Planning Poker Estimation Tool

Although I’ve been a huge fan of PlanningPoker.com since 2011, my Scrum Product team consisted of more than five members, and their Free Membership allows up to 5 users. The team I was working with had just started their agile transformation and was trying out aspects of Agile / Scrum they wanted to adopt. They weren’t about to make the investment in Planning Poker for estimations quite yet, so I stumbled across an estimation tool as a free add-on to Azure DevOps.

Microsoft’s Azure DevOps solution is both a code and requirements repository in one. Requirements are managed from an Agile perspective, through a Product Backlog of user stories. The user story backlog item type contains a field called “Story Points”, or sometimes configured as “Effort”.

Ground Rules – 50k Overview

All team members select from a predetermined relative effort scale, such as Tee Shirt Sizes (XS, S, M, L, XL) or a Fibonacci-style sequence (0, 1/2, 1, 2, 3, 5, 8, 13, 21, 34…). All selections of team members are hidden until the facilitator decides to expose/flip all team selections at once. Flipping at once should help to remove natural biases, such as selecting the same value as the team tech lead’s selection. After that, there’s a team discussion to normalize the value into an agreed selection, such as the average value.
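The ground rules can be sketched in a few lines of Python. This is not how the Azure DevOps add-on is implemented; the class and method names are made up for illustration, and snapping the average to the nearest card on the scale is one assumed way to “normalize” the result.

```python
from statistics import mean

FIBONACCI_SCALE = [0, 0.5, 1, 2, 3, 5, 8, 13, 21, 34]


class PokerRound:
    """One blind estimation round for a single user story (illustrative only)."""

    def __init__(self):
        self._votes = {}
        self.revealed = False

    def vote(self, member, points):
        """Record a member's selection; only cards on the scale are allowed."""
        if self.revealed:
            raise RuntimeError("votes are closed once the cards are flipped")
        if points not in FIBONACCI_SCALE:
            raise ValueError(f"{points} is not on the scale")
        self._votes[member] = points

    def visible_votes(self):
        """Before the flip, show who has voted but hide every value."""
        if not self.revealed:
            return {member: None for member in self._votes}
        return dict(self._votes)

    def flip(self):
        """The facilitator exposes all selections at once."""
        self.revealed = True

    def suggested_value(self):
        """Average of the votes, snapped to the nearest card on the scale."""
        if not self.revealed:
            raise RuntimeError("flip the cards before normalizing")
        average = mean(self._votes.values())
        return min(FIBONACCI_SCALE, key=lambda card: abs(card - average))


poker_round = PokerRound()
poker_round.vote("Ana", 3)
poker_round.vote("Ben", 5)
poker_round.vote("Chen", 8)
hidden = poker_round.visible_votes()   # every value is None until the flip
poker_round.flip()
print(poker_round.suggested_value())
```

The suggested value is only a starting point for the team discussion; the team still agrees on the final number.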

Estimate New Session

Integration with Azure DevOps

The interesting thing about this estimation tool is you can explicitly select stories to perform the effort estimation process right from the backlog, and in turn, once the team agrees upon a value, it can be committed to the User Story in the Backlog. No jumping between user stories, updating and saving field values. All performed from the effort estimation tool.

As your Digital Assistant, Siri Will Answer Incoming Calls

Voice mail is so LAST Century. It’s a static communications interface to address your incoming phone calls. It’s a dinosaur in terms of communications protocol. Yes, a digital assistant or chatbot should “field” your incoming calls, providing your callers a higher level of service.

Business or Personal?

Why not both? There are use cases which highlight the value of a Digital Assistant answering your phone calls when you’re unavailable.

Trusted Friends and Business Pins

Level of available services may change based upon the level of trusted access, such as:

  • Friends Seeking Your Availability for a Hockey Game Next Week
  • Business Partners Sharing Information access such as invoices

Untrusted Caller Access

  • The Vetting of Unsolicited Calls, such as robocalls
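One way to picture these access levels is a lookup that decides which dialogs a caller may reach. Everything in this sketch is hypothetical: the phone number, the PIN, and the service names are invented, and no mobile OS exposes such an API today as far as this post is aware.

```python
# Hypothetical data: trusted friends by number, business partners by PIN.
FRIENDS = {"+15550100"}
BUSINESS_PINS = {"4821": "Acme Corp"}


def services_for(caller_number, pin=None):
    """Return the dialogs a caller may use, based on their level of trust."""
    if pin in BUSINESS_PINS:
        return ["share_invoices"]          # business partner with a valid PIN
    if caller_number in FRIENDS:
        return ["check_availability"]      # trusted friend
    return ["screen_call"]                 # untrusted: vet for robocalls


print(services_for("+15550100"))
print(services_for("+15550199", pin="4821"))
print(services_for("+15550123"))
```

Unknown callers fall through to screening by default, so trust has to be granted explicitly.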

Defining and Default Dialogs

Users can define dialogs through drag-and-drop workflow diagram tools, making it easy to “build” conversation / dialog flows. In addition, out of the box flows can give administrators opportunities to discover the ways in which an AI digital assistant may be leveraged.

Canned / default dialog templates that handle the most common dialogs / workflows will empower users to implement rapidly.

Any Acquisitions in the Pipeline?

Are the big names in the Digital Assistant space looking to partner with or acquire tools that can easily transform workflows to be leveraged by a digital assistant?

  • IBM’s Conversations – chatbot dialog definition tool
  • Interactive Voice Response (IVR) solutions

APIs available on Mobile OS SDKs?

Are the components available for third party product companies to extend the Mobile OS capabilities as of now? Or are the mobile OS companies the only ones in a position to perform these upgrades?

Cryptocurrency + Quantum Computing := Encryption Fail, The Next Y2K

Over the last several months I’ve been researching Quantum Computing (QC) and trying to determine how far we’ve come from the theoretical to the practical implementation.  It seems we are in the early commercial prototypical phase.

Practical Application of QC

The most discussed application of Quantum Computing has been to crack encryption.  Encrypted data that may take months or years to decipher given our current supercomputing capabilities, may take hours or minutes when the full potential of Quantum Computing has been realized.

Bitcoin and Ethereum Go Boom

One source, paraphrased: Once quantum computing is actualized, encryption will progress in lockstep, and a new cryptology paradigm will be implemented to secure our data. This kind of optimism has no place in the “Real World”, and most certainly not in the world financial markets. Are there hedge funds which rightfully hedge against the cryptocurrency / QC risk paradigm?

Where is the Skepticism?

Is there anyone researching next steps in the evolution of cryptography/encryption, hedging the risk that marketplace encryption will not be ready? The lack of fervor in the development of “Quantum Computing Ready” encryption has me speechless. Government organizations like DARPA / SBIR should already be at a conceptual level, if not at the prototypical phase, with next-generation cryptology.

Too Many Secrets

“Sneakers” is a classic fictional action movie with a fantastic cast. Its plot: a mathematician secretly develops the ultimate code-breaking device, and everyone is out to possess the device. An excellent movie soon to be non-fictional..?

References: