Category Archives: Artificial Intelligence

Privacy: Exposing AI Chat Plugin Access to User’s Conversation History.

There are many benefits to allowing third-party plugins access to a user’s chat history. For example, an OpenAI ChatGPT plugin could periodically trawl through a user’s chat history and proactively follow up on a conversation thread that appears to still be open-ended. Or, periodically, the chat plugin could aggregate the chat history into subjects by “smart tagging” conversations and then ask the user if they want to talk about last night’s Manchester United football game. Note that OpenAI’s ChatGPT has “Limited knowledge of world and events after 2021.” Also note that, at present, neither the OpenAI ChatGPT API nor ChatGPT plugins expose the user’s chat history.

3rd Party, Security Permissions for OpenAI, ChatGPT API, and Plugins

Just as a 3rd party app that signs in with your Google credentials must ask for access to your Google data, this level of authorization should be presented to the user, i.e., “Would you like to allow XYZ ChatGPT Plugin access to your Chat History?” I’m sure there are many other security questions that could be presented to the user BEFORE they authenticate the ChatGPT plugin, such as access to personal data. For example, if the AI Chat application has access to the user’s Google Calendar and “recognizes” the user is taking a business trip next week, the Chat app can proactively ping the user a reminder to pack for warm weather, in contrast to the user’s local weather.
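The consent prompt described above could be modeled much like OAuth scopes: the plugin declares what it wants, and the user approves a subset before the plugin is authenticated. Here is a minimal sketch of that idea. The scope names (`chat.history.read`, `calendar.read`) and the `PluginGrant` class are hypothetical, made up for illustration, and are not part of any OpenAI API.

```python
from dataclasses import dataclass, field

# Hypothetical scope names, for illustration only; not defined by OpenAI.
CHAT_HISTORY_READ = "chat.history.read"
CALENDAR_READ = "calendar.read"


@dataclass
class PluginGrant:
    """Records which scopes a user has approved for one plugin."""
    plugin_name: str
    requested: set
    granted: set = field(default_factory=set)

    def consent_prompts(self):
        # One question per requested scope, shown BEFORE the plugin is authenticated.
        return [
            f"Would you like to allow {self.plugin_name} access to {scope}?"
            for scope in sorted(self.requested)
        ]

    def approve(self, scope):
        # Only scopes the plugin actually asked for can be granted.
        if scope not in self.requested:
            raise ValueError(f"{scope} was never requested")
        self.granted.add(scope)

    def allows(self, scope):
        return scope in self.granted


grant = PluginGrant("XYZ ChatGPT Plugin", {CHAT_HISTORY_READ, CALENDAR_READ})
grant.approve(CALENDAR_READ)  # the user says yes to the calendar, no to chat history
```

With this shape, the chat application would check `grant.allows(CHAT_HISTORY_READ)` before handing any conversation history to the plugin.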

Grass Roots, Industry Standards Body: Defining All Aspects of AI Chat Implementations

We don’t need another big tech mogul marching up to Washington to try and scare a committee of lawmakers into the benefits of defining and enforcing legal standardization, whatever that might be for some and not for others. One of the items that was suggested is capping the sizes of AI models with oversight for exceptions. This could cripple the AI Chat evolution.

Just as an industry standards body defined OAuth for implementers, another cross-industry standards body can be formed to help define all aspects of an AI Chat implementation, technology agnostic, to help put aside the proprietary nature.

In terms of industry standards for artificial intelligence, Chat standards, permissions for the chat app, and 3rd party plugins should be high on the list of items to invoke standards.

Extensions to AI Chat – Tools in Their Hands

Far more important than the size of the AI Chat Model may be the tools or integrations to the AI Chat that should be regulated/reviewed for implementation. The knowledge base of the Chat Model may be far less impactful than what you can do with that knowledge. Many software products have an ecosystem of plugins that can be integrated into the main product, such as the JIRA or Azure DevOps marketplaces. With a relatively simple implementation, access to some plugins can be restricted. Many AI Chat applications require manual coding to integrate APIs/tools; however, assigned API keys can serve the same purpose and limit the distribution of some AI Chat tools.

AI Chat “Plugins/Extensions” can range from access to repositories to tools like Salesforce, Dropbox, and many, many more. That’s on the private sector side. On the government sector side, AI Chat plugins can also vary, and some may require classified access, but all stem from a marketplace of extensibility for the AI Chatbots. That’s the real power of these chatbots. It’s not necessarily the ability to cheat on a university term paper. Educators are already adapting to OpenAI’s ChatGPT. A recent article in the MIT Technology Review explains how some teachers think generative AI could actually make learning better.

Grassroots, Industry Standards Bodies should be driving the technology standards, and not lawmakers, at least until these standards bodies could expose all facets of AI Chat. Standards may also spawn from other areas of AI such as image/object recognition, and not all items brought about during the discovery phase should necessarily be restrictive. Some standards may positively grow the capabilities of AI solutions.

Chat Reactive versus Proactive Dialogs

We are still predominantly in a phase of reactive chat, answering our questions regarding the infinite. Proactive dialogs will help us by interjecting at the right moments and assisting us in our time of need, whether we recognize it or not. I believe this is the scary bit for many folks who are engaging with this technology. Mix proactive dialog capabilities with Chat Plugins/Extensions that have N capabilities/tools, and you have a recipe for challenges that could move beyond our control.

Who’s at the Front Door…Again?

Busy Time of Year, Happy Holidays

The holiday season brings lots of people to your front door. If you have a front door camera, you may be getting many alerts from your front door that let you know there is motion at the door. It would be great if the front doorbell cameras could take the next step and incorporate #AI facial/image recognition and notify you through #iOS notifications WHO is at the front door and, in some cases, which “uniformed” person is at the door, e.g. FedEx/UPS delivery person.

Ring iOS Notification

This facial recognition technology is already baked into Microsoft #OneDrive Photos and Apple #iCloud Photos. It wouldn’t be a huge leap to apply facial and object recognition to catalog the people who come to your front door as well as image recognition for uniforms that they are wearing, e.g., UPS delivery person.

iCloud/OneDrive Photos identify faces in your images and group them by likeness, so the owner of the photo gallery can identify a group of faces as Grandma, for example. It may take one extra step for the camera owner to log in to the image/video storage service and classify a group of videos, converted to stills, containing the face of Grandma. Facebook (Meta) can also tag the faces within pictures you upload and share. The Facebook app can also “guess” faces based on previously uploaded images.

No need to launch the Ring app to see who’s at the front door. Facial recognition can remove the step required to find out what caused the motion at the front door and just post the iOS notification with the “who’s there”.
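As a rough sketch of how the “who’s there” text might be chosen, the snippet below compares a face embedding from the doorbell against labeled face groups and falls back to the generic motion alert when nothing matches. Everything here is an assumption for illustration: the embeddings are made-up three-number vectors (a real face-recognition model would produce much longer ones), and `KNOWN_FACES`, `notification_text`, and the 0.8 threshold are hypothetical names and values, not anything Ring, Apple, or Microsoft exposes.

```python
import math

# Hypothetical labeled "face groups": label -> embedding vector.
# Real embeddings would come from a face-recognition model; these are made up.
KNOWN_FACES = {
    "Grandma": [0.9, 0.1, 0.0],
    "UPS delivery person": [0.1, 0.9, 0.2],
}


def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def notification_text(embedding, threshold=0.8):
    """Return the notification body for a face seen at the door."""
    best_label, best_score = None, 0.0
    for label, known in KNOWN_FACES.items():
        score = cosine_similarity(embedding, known)
        if score > best_score:
            best_label, best_score = label, score
    # Fall back to the generic motion alert when nobody matches well enough.
    if best_label is None or best_score < threshold:
        return "There is motion at your front door"
    return f"{best_label} is at your front door"
```

The threshold is the knob that trades false “Grandma” alerts against falling back to the plain motion alert too often.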

One less step: no launching the Ring app to see who is at the front door.

Who’s Managing & Securing Your Information Assets?

What is meant by Information Architecture (IA)?

Information architecture (IA) focuses on organizing, structuring, and labeling content in an effective and sustainable way. The goal is to help users find information and complete tasks.

There must be a common consensus, an understanding of each data point collected, and the appropriate labeling and cataloging of the Information Asset. Information assets may have a score attributed to the asset and leveraged in a multitude of ways, such as guidelines for the purging of archives, sensitivity of the information, and the levels of trust.

For each data point collected, correlations/relationships can be added either manually or through an Induction Engine (AI) leveraging a history of relationships. Defining hierarchical relationships between data points, along with link types (e.g. predecessor, successor, child, or generally related), further bolsters a larger lexicon.
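To make the labeling, scoring, and linking ideas above a little more concrete, here is a minimal sketch of a catalog entry. The `InformationAsset` class, the 0 to 100 score range, and the `purge_candidates` helper are hypothetical, invented for this example rather than taken from any particular data catalog product.

```python
from dataclasses import dataclass, field

# Link types along the lines of those mentioned in the text.
LINK_TYPES = {"predecessor", "successor", "child", "related"}


@dataclass
class InformationAsset:
    name: str
    labels: set = field(default_factory=set)
    score: int = 0  # hypothetical 0-100 sensitivity/trust score
    links: list = field(default_factory=list)  # (link_type, other_asset_name)

    def link(self, link_type, other):
        # Reject link types the catalog does not know about.
        if link_type not in LINK_TYPES:
            raise ValueError(f"unknown link type: {link_type}")
        self.links.append((link_type, other.name))


def purge_candidates(assets, max_score):
    """Assets scored at or below max_score are candidates for purging from archives."""
    return [a.name for a in assets if a.score <= max_score]


phone = InformationAsset("phone number", {"contact", "PII"}, score=80)
voicemail = InformationAsset("voicemail archive", {"audio"}, score=20)
voicemail.link("child", phone)
```

An induction engine could call `link` on the user’s behalf once it has seen the same relationship enough times.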

What are Information Assets?

For example, your phone number is an information asset. Your phone number is provided to everyone you know and is a primary point of reference to contact you. Traditionally, the “phone companies” manage that resource for you. However, in this “new” day and age, we see companies like Google providing a phone number, and as a result providing features not generally available, such as Google Voice, with Call Forwarding, and obfuscation.

Common, Consumer, Information Assets Include:

  • Documents of ALL Types, e.g. text, spreadsheets, presentations, etc.
  • Domain Names and Email Addresses are Information Assets.
  • Twitter, Facebook, Instagram, and Other Social Media Platforms Assets, such as User Names, Post Text, Images, Video, and Profile details.
  • Skype, WhatsApp, and other VoIP Info Assets such as Phone Number, User Profile information
  • Microsoft Teams, Slack, and other Team Collaboration Information Assets, such as the historical, ongoing posted information in the Team Chat, including the integration of 3rd party apps, such as Whiteboard collaborative drawings.
  • Passwords, Passwords, Passwords

Common, Corporate, Information Assets Include:

  • All of the Consumer, Information Assets PLUS
  • Documents of ALL Types, e.g. Solution Architecture docs, Database Models, HR Policies, Org Charts, Corp. Network Topography, etc.

Disaster Recovery for Information Assets

What happens when the technology managing information assets becomes “unavailable”? What is your impact assessment? Is there a centralized data/information catalog or repository that contains a partial or complete set of Information Assets?

Information Assets are also passwords, and we have a plethora of “secure” password managers; Norton Antivirus, for example, provides a mechanism to hold passwords in a virtual “safe”.

Insurance Policies for [digital] Information Assets

What is the cost of securing these Information Assets versus the cost of recuperating them, if that is even possible?

What about Hackers that “hold your data/information” hostage?

How do you price out “insurance” for your information, just as personal articles insurance policies safeguard other belongings today? Are there personal articles insurance policies that allow you to add a rider for information assets to your existing policy? We need to price out “Information Assets” and their recuperation values.

Norton LifeLock [Personal / Business]

Norton LifeLock reimburses funds stolen due to identity theft up to the limit of the plan total not exceeding $1 Million USD.

Notes Repositories

Notepads like Notepad++, Microsoft OneNote, and Google Keep are tools that allow their authors to quickly take notes and organize them. A wide array of Information Assets are contained within these applications, such as text, and photos with some data describing the information captured (i.e. metadata). Gathering and exporting this information to reference Information Assets could be a lengthy and laborious process without automation, rules for sorting, and tagging info.

AI Induction and Rules Engines

AI Induction and Rules Engines can dynamically label Information Assets as they are “discovered”, an auto-curation process. For example, the Microsoft Outlook rules engine has a robust library of canned rules for sorting, forwarding, and formatting emails as they arrive in your inbox, as well as a host of other rule “triggers”. An Induction engine is a predictive instrument that “observes” behavior over time, and then creates/suggests new rules on the basis of the history of user behavior. For example, if MS Outlook had an AI Induction engine, and observed a user ‘almost’ always moving an email with the same subject to folder N, the AI Induction engine could create the rule to anticipate the user’s behavior.
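The “observe, then suggest a rule” behavior can be sketched in a few lines. This is only an illustration of the idea, not how Outlook works: the `suggest_rules` function, the sample history, and the thresholds (at least 5 observations, same folder at least 90% of the time) are all assumptions chosen for the example.

```python
from collections import Counter, defaultdict


def suggest_rules(move_history, min_observations=5, min_ratio=0.9):
    """Suggest subject -> folder rules from observed user behavior.

    move_history is a list of (subject, folder) pairs, one per email the
    user filed by hand. A rule is suggested when the user 'almost' always
    moves a subject to the same folder.
    """
    folders_by_subject = defaultdict(Counter)
    for subject, folder in move_history:
        folders_by_subject[subject][folder] += 1

    rules = {}
    for subject, counts in folders_by_subject.items():
        folder, hits = counts.most_common(1)[0]
        total = sum(counts.values())
        # Require enough history and a strong enough habit before suggesting.
        if total >= min_observations and hits / total >= min_ratio:
            rules[subject] = folder
    return rules


# Hypothetical history: nine of ten status reports were filed under "Reports".
history = [("Weekly Status Report", "Reports")] * 9 + [
    ("Weekly Status Report", "Inbox"),
    ("Lunch?", "Personal"),
]
```

Calling `suggest_rules(history)` proposes a rule for the status report only; the one-off “Lunch?” email does not have enough history behind it.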

Data Lakes or Sea of Information Assets

  • Structured, Semi-Structured, and Unstructured data.
  • Labeling/tagging Information Assets in a consistent fashion.
  • Retrieval of data, and cross-referenced data types

19 Best Data Catalog Tools and Software for 2020

Extract –

Tool: Alation Data Catalog

Description: Alation is a complete repository for enterprise data, providing a single point of reference for business glossaries, data dictionaries, and Wiki articles. The product profiles data and monitors usage to ensure that users have accurate insight into data accuracy. Alation also provides insight into how users are creating and sharing information from raw data. Customers tout the product for its expansive partner ecosystem, and Alation has focused on increasing data literacy when metadata is distributed across business and IT.


Roblox, Massive Tween Gaming Platform, Goes Public

Popular tween gaming platform Roblox filed to go public on Thursday. The company declined an interview, citing a quiet period.

Source: Roblox, massive tween gaming platform, goes public – CNN

My son and I, OneWildRide, are hooked on the Roblox game Theme Park Tycoon 2. I’m fixated on building out my park. For beginners, there are the “out of the box” rides you can buy, and the number of items you can use to accessorize your park is staggering. Not only can you add “canned” rides, such as the Gravitron, but the theme park builder can add all different types of roller coasters, water rides, park transportation, etc.

Users of the Theme Park Tycoon 2 are Graded by:

  • number of active users in your park
  • the amount of money you make based on park admission, pay per ride, and concession stands
  • People can “like” your park, and provide feedback at the entrance

Commoditizing Roblox Games

I will shamefully admit that I purchased Roblox Bucks, with real dollars, that can be used on a plethora of items to build my Theme Park.  For example, the Theme Park has a height limit for how high you can build your roller coasters, so naturally, the builder/user has the ability to purchase to lift the height requirements.  You can also purchase additional “packs” that provide the builder enhancements to their rides, such as running the ride in reverse or looping the ride three times instead of the default single loop.  There’s also the conversion of USD to Roblox $$ because builders need to buy the components to build water rides or roller coasters.  You can even purchase concession stands (e.g. Popcorn Vendors).  The builder of the amusement park must also buy/build restrooms and spread out trash cans throughout the park.  There is also the concept of day and night, so make sure to buy/place lamps across the park.

Pay to Play – AI Bots = Theme Park $$

These “auto” bots/characters paying to play in your park may leave if they are dissatisfied, such as no bathrooms.  Also, without trash cans, there will be visible trash on the ground that must be painfully cleaned up, pile by pile, or left there to pile up.  On the flip side, these AI amusement goers will pay:

  • Park Entrance Fees
  • Pay Per Ride
  • Pay to use the loo
  • Pay for Concession Stands, such as Soft Drinks, Popcorn, and Pizza
  • Pay for Theme Park Memorabilia, such as Santa Hats, Tis the Season!

The Theme Park Builder sets the prices for EVERYTHING.  The AI Bots have “thoughts”, such as “This ride is really cheap.” to help you gauge your ride pricing, or “I’m Hungry”, to imply you should buy/place concession stands throughout your park.

Minecraft Anyone?!

I should say someone should have seen this coming, several someones.  You build this Theme Park at the “block” level, very similar to Minecraft, however, it seems, as far as I can tell, the graphics of Roblox seem somewhat superior to Minecraft, although this is a very debatable topic.  Minecraft has lots of 3rd party “mods” or customizations/modifications to the game.  Minecraft has had a lot of time to cultivate its userbase as well as a marketplace for users to buy these modifications.  Roblox as an application/gaming platform seems intriguing in light of the IPO.  I wonder what the highest-grossing games are on the Roblox platform.

Availability

Roblox Theme Park Tycoon 2 is available on Xbox, iPad / iPhone, and Windows to name the environments we use, jumping from device to device wherever is convenient.

Multiplayer Environment

My son constantly wants me to go over to his Theme Park and go on rides he has just built. It’s really a lot of fun to go to other builders’ parks. There is a basic transit system to move between amusement parks. You can get LOTS of ideas by looking at other builders’ parks; some of these parks put the “real world” amusement parks to shame. So far, I’ve seen six (6) people playing concurrently, where you can see who has the most Roblox Bucks, and whose park has the most visitors currently. Naturally, if you’re not the big kahuna, you’ll want to stroll by the other builders’ parks. If you are in close proximity and you time it right, you can log in to the same server and play with friends. It doesn’t always seem to work quite right when people jump on and off the game. There is probably a feature I’m not using to guarantee the same server with friends, maybe the “Premium” version of Roblox?

Build Your Own Roblox Games?  Monetary Incentives?

Wow, I really didn’t contemplate it that much.  I didn’t even think about the possible monetary returns from building one’s own Roblox game.  Not sure what the requirements would be to be a developer, how easy or hard it would be to build Roblox games, i.e. is there a coding language to use, a proprietary language, or just a simple graphical tool to build games.  No clue if there is a “developer/partner” annual cost, which is what I paid when developing applications for the iPhone / iPad.  Also, playing on the iPad / iPhone Roblox platform hosting the Theme Park game, would Apple get a percentage of “In-App” purchases for Roblox dollars?  We purchased Roblox bucks from the PC, and XBOX, so it didn’t occur to me there would be margin paid to the platform on which it runs.

Disclosure – I am not a “Premium” Roblox member or a “game” builder.

Data Loss Prevention (DLP) for Structured Data Sources

When people think of Data Loss Prevention, they usually think of endpoint protection, such as the Symantec Endpoint Security solution, which prevents data from being uploaded to web sites or downloaded to a USB device. The data being “illegally” transferred typically conforms to a particular pattern such as Personally Identifiable Information (PII), e.g. Social Security numbers.

Using a client for local monitoring of the endpoint, the agent detects the transfer of information as a last line of defense for external distribution. EndPoint solutions could monitor suspicious activity and/or proactively cancel the data transfer in progress.

Moving closer to the source of the data loss, monitoring databases filled with Personally Identifiable Information (PII) has its advantages and disadvantages. One may argue there is no data loss until the employee attempts to export the data outside the corporate network, and the data is in-flight. In addition, extracted PII data may be “properly utilized” within the corporate network for analysis.

There is a database solution, from Teleran Technologies, that provides similar “endpoint” monitoring and protection, e.g. identifying PII data extraction and canceling the query in real time upon detection, leveraging “out of the box” data patterns. Teleran supports relational databases such as Oracle and Microsoft SQL Server, both on-prem and in the cloud.

Updates in Data Management Policies

Identifying the points where data loss originates provides opportunities to close gaps in data management policy and to implement additional controls over data. Data classification is done dynamically based on common data mask structures. Users may build additional rules to cover custom structures. So, for example, if a business analyst executes a query against a database and the data appears to fit a predefined data mask, such as SSN, the query may be canceled before it’s even executed, and/or this “suspicious” activity can be flagged for the Chief Information Officer (CIO) and/or Chief Security Officer (CSO).
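The mask-based classification described above can be illustrated with a small sketch. To be clear, this is not Teleran’s implementation: the `DATA_MASKS` table, the SSN pattern, and the `review_result_set` function are assumptions for the example, and it inspects rows that have already been returned, which is a simplification compared with canceling a query before it executes.

```python
import re

# Hypothetical "out of the box" data masks; real products ship their own patterns.
DATA_MASKS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify_value(value):
    """Return the name of the first data mask the value fits, or None."""
    for name, pattern in DATA_MASKS.items():
        if pattern.search(str(value)):
            return name
    return None


def review_result_set(rows, max_hits=0):
    """Decide whether a query's result set should be cancelled and flagged.

    rows is an iterable of tuples, as a database cursor would return them.
    """
    hits = [
        (row_number, classify_value(value))
        for row_number, row in enumerate(rows)
        for value in row
        if classify_value(value)
    ]
    # More mask hits than allowed means: cancel, and keep the hits for auditing.
    return {"cancel": len(hits) > max_hits, "hits": hits}


rows = [("Ann", "123-45-6789"), ("Bob", "n/a")]
result = review_result_set(rows)
```

Custom structures would be added as further entries in `DATA_MASKS`, and the `hits` list is the sort of auditable event that could be forwarded to a security officer.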

Bar none, I’ve seen only one firm that defends a company’s data assets close to the probable source of a leak, the database: Teleran Technologies. See what they have to offer your organization for data protection and compliance.

Prevalent Remote Work Changes Endpoint Strategy

With remote work now prevalent in corporate environments, relying on endpoints may be too late to enforce data protection. We may need to bring potential data loss detection into the inner sanctum of the corporate network, with prevention closer to the source of the data being extracted. How are “semi-trusted” third parties, such as offshore staff augmentation, dealt with?

Endpoint DLP – Available Breach Tactics

Endpoint DLP may capture and contain attempts to extract PII data, for example, by parsing text files for SSNs or other data masks. However, there are ways around the transfer detection that make extraction difficult to identify, such as screen captures of data, which convert text into images. Some Endpoint providers boast about their Optical Character Recognition (OCR); however, turning on this feature may produce many false positives, too many to sift through in monitoring, and unmanageable to control. The best DLP defense is to monitor and control closer to the data source and, perhaps, flag data requests from employees, e.g. after a SELECT statement is entered, the UI pops up a “Reason for Request?” prompt if PII extraction is identified in real time, with auditable events that can flow into Splunk.

AR Sudoku Solver Uses Machine Learning To Solve Puzzles Instantly

Very novel concept, applying Augmented Reality and Artificial Intelligence (i.e. Machine Learning) to solving puzzles, such as Sudoku. Maybe not so novel considering AR uses in manufacturing.

Next, we’ll be using similar technology for human-to-human negotiations: reading body language, understanding logical arguments, reading human emotion, and rebutting remarks in a debate.

Litigators watch out… Or, co-counsel?   Maybe a hand of Poker?

Source: AR Sudoku Solver Uses Machine Learning To Solve Puzzles Instantly

Going Solo – Gig to Gig

Having the Stamina to Last…

Going the consulting path, on your own, is no small feat. Do you have what it takes to persist, survive, and thrive?

  • Army of One – Not only do you need to perform your CONSULTANCY role, but you also have to be the bookkeeper and the sales and marketing team, always looking for new opportunities.
  • The Gap Between Gigs – To all recruiters and hiring managers – it’s not a bad thing to have gaps in a candidate’s resume. It’s the way of life in our gig economy. We are constantly hunting for just the right opportunity in a sea of hundreds or thousands of candidates per role.
  • Keeping Up With Market Trends – Online learning platforms such as Pluralsight keep their content fresh, relevant, and in line with your career path.
  • Networking, Networking, Networking – at every opportunity, build your network of contacts and keep them in the know

Follow the Breadcrumbs: Identify and Transform

Trends – High Occurrence, Word Associations

Over the last two decades, I’ve been involved in several solutions that incorporated artificial intelligence and, in some cases, machine learning. I’ve understood them at the architectural level and, in some cases, taken a deeper dive.

I’ve had the urge to perform a data trending exercise, where not only do we identify existing trends, similar to “out of the box” Twitter capabilities, we can also augment “the message” as trends unfold. Also, probably AI 101. However, I wanted to submerge myself in understanding this Data Science project. My Solution Statement: Given a list of my interests, we can derive sentence fragments from Twitter, traverse the tweet, parsing each word off as a possible “breadcrumb”. Then remove the Stop Words, and voila, words that can identify trends, and can be used to create/modify trends.

Finally, to give the breadcrumbs, and those “words of interest” greater depth, using the Oxford Dictionaries API we can enrich the data with things like their Thesaurus and Synonyms.

Gotta Have a Hobby

It’s been a while now that I’ve been hooked on Microsoft Power Automate, formerly known as Microsoft Flow. It’s relatively inexpensive and has the capabilities to be a tremendous resource for almost ANY project. There is a FREE version, and then the paid version is $15 per month. No brainer to pick the $15 tier with bonus data connectors.

I’ve had the opportunity to explore the platform and create workflows. Some fun examples, initially, using MS Flow, I parsed RSS feeds, and if a criterion was met, I’d get an email. I did the same with a Twitter feed. I then kicked it up a notch and inserted these records of interest into a database. The library of Templates and Connectors is staggering, and I suggest you take a look if you’re in a position where you need to collect and transform data, followed by a Load and a notification process.

What Problem are we Trying to Solve?

How are trends formed, how are they influenced, and what factors influence them? Who are the most influential people providing input to a trend? Are they influential based on location? Does language play a role in how trends are developed? End Goal: driving trends, and not just observing them.

Witches Brew – Experiment Ingredients:

Obtaining and Scrubbing Data

Articles I’ve read regarding Data Science projects revolved around 5 steps:

  1. Obtain Data
  2. Scrub Data
  3. Explore Data
  4. Model Data
  5. Interpreting Data

The rest of this post will mostly revolve around steps 1 and 2. Here is a great article that goes through each of the steps in more detail: 5 Steps of a Data Science Project Lifecycle

Capturing and Preparing the Data

The data set is arguably the most important aspect of Machine Learning. A data set that does not conform to the bell curve and consists entirely of outliers will produce an inaccurate reflection of the present and a poor prediction of the future.

First, I created a table of search criteria based on topics that interest me.

Search Criteria List

Then I created a Microsoft Flow for each of the search criteria to capture tweets with the search text, and insert the results into a database table.

MS Flow - Twitter : Ingestion of Learning Tweets

Out of the total 7450 tweets collected from all the search criteria, 548 tweets were from the Search Criteria “Learning” (22).

Data Ingestion - Twitter

After you’ve obtained the data, you will need to parse the Tweet text into “breadcrumbs”, which “lead a path” to the Search Criteria.
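As a rough illustration of the breadcrumb step, the sketch below splits tweet text into words, drops stop words, and counts what is left. It is a stand-in written in Python rather than the Flow and SQL setup used in this post; the stop-word list is deliberately tiny, and the `breadcrumbs` and `trending_words` functions and the sample tweets are made up for the example.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "is", "in", "for", "on", "with", "i", "my"}


def breadcrumbs(tweet_text):
    """Split a tweet into lowercase words and drop stop words."""
    text = re.sub(r"https?://\S+", " ", tweet_text.lower())  # strip links first
    words = re.findall(r"[a-z0-9']+", text)
    return [w for w in words if w not in STOP_WORDS]


def trending_words(tweets, top_n=3):
    """Count breadcrumbs across tweets; high-occurrence words hint at trends."""
    counts = Counter()
    for tweet in tweets:
        counts.update(breadcrumbs(tweet))
    return counts.most_common(top_n)


# Made-up sample tweets for the "Learning" search criteria.
tweets = [
    "Machine learning is the future of learning https://example.com/x",
    "I love learning with my kids",
]
```

Each surviving word is a breadcrumb that can be stored alongside the Search Criteria that captured the tweet, which is what makes the later word-association counts possible.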

Machine Learning and Structured Query Language (SQL)

This entire predictive trend analysis could be much easier with a more restrictive syntax like SQL instead of English Tweets. SQL statements would be easier to parse and correlate. For example, the SQL structure can be represented as: SELECT Col1, Col2 FROM TableA WHERE Col2 = 'ABC'. Based on the data set size, we may be able to extrapolate and correlate rows returned to provide valuable insights, e.g. the projected performance impact of the query on the data warehouse.
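To show why a restrictive syntax is easier to work with, here is a small sketch that breaks the example statement into its parts with one regular expression. It only handles the simple SELECT ... FROM ... WHERE shape shown above; it is not a general SQL parser, and the `parse_select` function is a hypothetical helper written for this example.

```python
import re

# Matches only the simple SELECT ... FROM ... [WHERE ...] shape used above.
SIMPLE_SELECT = re.compile(
    r"^\s*SELECT\s+(?P<columns>.+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<where>.+?))?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def parse_select(sql):
    """Break a simple SELECT statement into columns, table, and filter."""
    match = SIMPLE_SELECT.match(sql)
    if match is None:
        raise ValueError("not a simple SELECT statement")
    return {
        "columns": [c.strip() for c in match.group("columns").split(",")],
        "table": match.group("table"),
        "where": match.group("where"),  # None when there is no WHERE clause
    }


parsed = parse_select("SELECT Col1, Col2 FROM TableA where Col2 = 'ABC'")
```

Once statements are reduced to table, columns, and filter, they can be grouped and counted in the same way as the tweet breadcrumbs, with far less ambiguity.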

R language and R Studio

Preparing Data Sets Using Tools Designed to Perform Data Science.

The R language and RStudio seem to be very powerful when dealing with large data sets, and the syntax makes it easy to “clean” the data set. However, I still prefer SQL Server and a decent query tool. Maybe my opinion will change over time. The most helpful things I’ve seen in RStudio are the ability to create new data frames and the ability to roll back to a point in time, i.e. the previous version of the data set.

Changing a column’s data type on the fly in RStudio is also immensely valuable. For example, the data in the column are integers, but the data table/column definition is a string or varchar. In a SQL database, the user would have to alter the column definition or, in the worst case, drop the table, recreate it with the new data type, and then reload the data. Not so with R.