Tag Archives: Cloud Storage

Microsoft Productivity Suite – Content Creation, Ingestion, Curation, Search, and Repurpose

Auto Curation: AI Rules Engine Processing

There are, of course, 3rd party platforms that perform very well, are feature rich, and are agnostic to file type.  For example, in a very short period of time, at low cost, and possibly with a few plugins, a WordPress site can be configured and deployed to suit your Digital Asset Management (DAM) needs.  The long-term goal is to apply techniques such as Auto Curation to any and all files, leveraging an ever-growing intelligent taxonomy built on user-defined labels/tags, as well as an AI rules engine with ML techniques.  OneDrive, as a cloud storage platform, may bridge the gap between plain cloud storage and a DAM.

Ingestion and Curation Workflow

Content Creation Apps and Auto Curation

  • The ability for Content Creation applications, such as Microsoft Word, to capture not only the user-defined tags but also the context of the tags relating to the content.
    • When ingesting a Microsoft PowerPoint presentation, after consuming the file, an Auto Curation process can extract “reusable components” of the file, such as slide header/name, and the correlated content such as a table, chart, or graphics.
    • Ingesting Microsoft Excel and Auto Curation of Workbooks may yield “reusable components” stored as metadata tags, and their correlated content, such as chart and table names.
    • Ingesting and Auto Curation of Microsoft Word documents may build a classic Index for all the most frequently occurring words, and augment the manually user-defined tags in the file.
    • Ingestion of Photos [and Videos] into an Intelligent Cloud Storage Platform may, during the Auto Curation process, identify commonly recognizable objects, such as trees or people.  These objects would be automatically tagged through the Auto Curation process after Ingestion.
  • Ability to extract the content file metadata, objects and text tags, to be stored in a standard format to be extracted by DAMs, or Intelligent Cloud Storage Platforms with file and metadata search capabilities.  Could OneDrive be that intelligent platform?
  • A user can search by file title or across the Manual and Auto Curated metadata associated with the file.  The DAM or Intelligent Cloud Storage Platform provides both sets of search results.   “Reusable components” of files are also searchable.
    • For “Reusable Components” to be parsed out of the files as separate entities, a process needs to occur after Ingestion Auto Curation.
  • Content Creation application, user-entry tag/text fields should have “drop-down” access to the search index populated with auto/manual created tags.
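The word-index idea for Word documents can be sketched roughly as follows. This is only an illustration: the stop-word list, the JSON record layout standing in for the "standard format", and the function names are assumptions, not part of any Microsoft or OneDrive API. Text extraction from the file itself is assumed to have happened already.

```python
import json
import re
from collections import Counter

# Hypothetical stop-word list; a real curation service would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on"}


def build_word_index(text, top_n=5):
    """Return the top_n most frequent non-stop words as (word, count) pairs."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


def curate(filename, text, user_tags, top_n=5):
    """Build a metadata record that keeps manual and auto tags separate."""
    auto_tags = [word for word, _ in build_word_index(text, top_n)]
    return {
        "file": filename,
        "user_tags": sorted(set(user_tags)),  # manual tags, never overwritten
        "auto_tags": auto_tags,               # derived by Auto Curation
    }


record = curate(
    "budget.docx",
    "The budget forecast covers the marketing budget and the sales forecast. "
    "Budget owners review the forecast.",
    ["finance"],
    top_n=2,
)
print(json.dumps(record, indent=2))
```

Keeping the two tag lists in separate fields is one way to let a search index cover both while leaving the user's own tags alone.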

Auto Curation and Intelligent Cloud Storage

  • The intelligence of Auto Curation should be built into the Cloud Storage Platform, e.g. potentially OneDrive.
  • At a minimum, auto curation should update the cloud storage platform indexing engine to correlate files and metadata.
  • Auto Curation is the ‘secret sauce’ that automatically “digests” the content to build the search engine index, which contains identified objects (e.g. tag and text or coordinates).
    • Auto Curation may leverage a rules engine (AI) and apply user configurable rules such as “keyword density” thresholds
    • Artificial Intelligence, Machine Learning rules may be applied to the content to derive additional labels/tags.
  • If leveraging version control of the intelligent cloud storage platform, each iteration should “re-index” the content, and update the Auto Curation metadata tags.  User-created tags are untouched.
  • If no user-defined labels/tags exist, upon ingestion, the user may be prompted for tags
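A user-configurable “keyword density” rule, together with the re-index-on-new-version behavior, might look something like the sketch below. The rule format (tag mapped to a keyword and a minimum density) and all function names are assumptions made for illustration; no real rules engine is implied.

```python
import re


def keyword_density(text, keyword):
    """Fraction of words in text equal to keyword (case-insensitive)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


def apply_rules(text, rules):
    """rules maps tag -> (keyword, minimum density). Returns matching tags."""
    return sorted(
        tag
        for tag, (keyword, threshold) in rules.items()
        if keyword_density(text, keyword) >= threshold
    )


def reindex(metadata, new_text, rules):
    """Recompute auto tags for a new version; user tags are left untouched."""
    return {**metadata, "auto_tags": apply_rules(new_text, rules)}


# Hypothetical rules: tag a file "finance" when "budget" is at least 10% of its words.
rules = {"finance": ("budget", 0.10), "legal": ("contract", 0.10)}
meta = {"user_tags": ["q3"], "auto_tags": []}

v1 = reindex(meta, "budget review of the budget plan", rules)
v2 = reindex(v1, "contract terms for the contract renewal", rules)
print(v1)
print(v2)
```

Each new version replaces only the `auto_tags` field, which mirrors the point above that user-created tags stay untouched.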

Auto Curation and “3rd Party” Sources

In the context of sources such as a Twitter feed, feeds are not currently incorporated into an Intelligent Cloud Storage.  OneDrive, as an Intelligent Cloud Storage, may import feeds from 3rd party sources, and each Tweet would be defined as an object which is searchable along with its metadata (e.g. likes; tags).

Operating System, Intelligent Cloud Storage/DAM

The Intelligent Cloud Storage and DAM solutions should have integrated search capabilities, so on the OS (mobile or desktop) level, the discovery of content through the OS search of tagged metadata is possible.

Current State

  1. OneDrive has no ability to search Microsoft Word tags
  2. The UI for all Productivity Tools must have a comprehensive and simple design for leveraging an existing taxonomy for manual tagging, and the ability to add hints for auto curation
    1. Currently, Microsoft Word has two fields to collect metadata about the file.  They are obscurely located in the “Save As” dialog.
      1. The “Save As” dialog box allows a user to add tags and authors, but only when using the MS Word desktop version.  The Online (Cloud) version of Word has no such option when saving to Microsoft OneDrive Cloud Storage.
  3. Auto Curation (Artificial Intelligence, AI) must inspect the MS Productivity suite tools and extract tags automatically, a capability which does not exist today.
  4. No manual tagging or Auto Curation/Facial Recognition exists.

Cloud Storage and DAM Solutions: Don’t Rein in the Beast

Are you trying to apply metadata on individual files or en masse, attempting to make the vast  growth of cloud storage usage manageable, meaningful storage?

Best practices leverage a consistent hierarchy, an Information Architecture in which to store and retrieve information.  Excellent.

Beyond that, there are capabilities computer science has documented and used time and time again, such as checksum algorithms, which are frequently used after a file transfer to verify that the file you requested is the file you received.  Most, if not all, Enterprise DAM solutions use some type of technology to ‘allow’ the enforcement of unique assets [upon upload].  In cloud storage and photo solutions targeted toward the individual consumer, the feature does not appear to be ‘up close and personal’ in the user experience, which builds a huge expanse of duplicate data (documents, photos, music, etc.).  Another feature, the database [primary] key, has been used for decades to identify that a record of data is unique.

Our family sharing alone has thousands and thousands of photos and music files. The file names could be different for many of the same digital assets.  Sometimes the file names are the same, but the metadata on the copies differs, and each version provides value. DAM tools for ‘merging’ metadata have value in helping to manage digital assets.

Cloud storage usage is growing exponentially, and metadata alone won’t help rope in the beast. Maybe ad hoc or periodic indexing of files [e.g. by checksum algorithm] could take on the task of identifying duplicate assets?  Duplicate assets could be viewed by the user in an exception report?  Less boring: upon upload, ‘on the fly’, let the user know the asset is already in storage, and show a two-column diff of the metadata.
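As a rough sketch of the idea, the snippet below groups files by SHA-256 checksum to produce an exception report of duplicates, then lists the metadata fields that differ between two copies. The file names, bytes, and metadata values are made up for illustration, and SHA-256 is just one reasonable choice of checksum.

```python
import hashlib
from collections import defaultdict


def checksum(data: bytes) -> str:
    """Content fingerprint; identical bytes give identical digests."""
    return hashlib.sha256(data).hexdigest()


def find_duplicates(assets):
    """assets maps file name -> bytes. Return groups of names sharing a checksum."""
    groups = defaultdict(list)
    for name, data in assets.items():
        groups[checksum(data)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def metadata_diff(left, right):
    """Return (key, left value, right value) rows where the two records differ."""
    keys = sorted(set(left) | set(right))
    return [(k, left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)]


# Illustrative data only: two differently named copies of the same photo.
assets = {
    "IMG_001.jpg": b"\xff\xd8 beach photo bytes",
    "vacation.jpg": b"\xff\xd8 beach photo bytes",
    "song.mp3": b"ID3 some audio bytes",
}
duplicates = find_duplicates(assets)
print(duplicates)

rows = metadata_diff(
    {"title": "Beach", "tags": "summer"},
    {"title": "Beach", "tags": "summer, family", "album": "2012"},
)
for key, a, b in rows:
    print(f"{key:<8}{str(a):<20}{str(b)}")
```

The same `checksum` lookup could run at upload time against an index of stored digests, so the user is warned before a duplicate lands in storage.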

It’s a pain for me, and quite possibly many cloud storage users.  As more people jump on cloud storage, this feature should be front and center to help users grow into their new virtual warehouse.

The cloud storage industry most likely believes that, for the common consumer, storage is ‘cheap’, so just provide more.  At some stage, the cloud providers may look to DAM tools as the cost of managing a user’s storage rises.  Tools like:

  • Detection of duplicate digital assets (files). Use exception reporting to identify the duplicates and enable [bulk] corrective action, and/or show a duplicate ‘error/warning’ message upon upload.
  • Dynamic metadata tagging upon [bulk] upload using object recognition.  Correlating and cataloging one or more [type] objects in a picture using defined Information Architecture.  In addition, leveraging facial recognition for updates to metadata tagging.
    • e.g. “beach” objects: sand, ocean; [Ian Roseman] surfing;
  • Brief questionnaires may enable the user to ‘smartly’ ingest the digital assets; e.g. ‘themes’ of current upload; e.g. a family, or relationship tree to  extend facial recognition correlations.
    • e.g. themes – summer; party; New Year’s Eve
    • e.g. relationship tree – office / work
  • Pan Information Architecture (IA) spanning multiple cloud storage [silos]. e.g. for Photos, spanning [shared] ‘albums’
  • Publicly published / shared components of an IA;  e.g. Legal documents;  standards and reuse

Google Introduces their Cloud, Digital Asset Management (DAM) solution

Although this is a saturated space, with many products, some highly recommended, I thought this idea might interest those involved in the Digital Asset Management space.  Based on the maturity of existing products, and cost, it’s up to you, build or buy.  The following may provide an opportunity for augmenting existing Google products, and overlaying a custom solution.

Google products can be integrated across their suite of solutions and may produce a cloud based, secure, Digital Asset Management, DAM solution.   In this use case, the digital assets are Media (e.g. videos, still images)

A Google DAM may be created by leveraging existing features of Google Plus, Google Drive, YouTube, and other Google products, as well as building / extending additional functionality, e.g. Google Plus API, to create a DAM solution.   An overarching custom framework weaves these products together to act as the DAM.

Google Digital Asset Management (New)

  1. A dashboard for Digital Asset Management should be created, which articulates, at a glance, where project media assets are in their life cycle, e.g. ingestion, transcoding, editing media, adding meta data, inclusion / editing of closed captions, workflow approvals, etc.
  2. Creation and maintenance of project asset folder structure within storage such as Google Drive for active projects as well as Google Cloud Storage for archived content.  Ingested content to arrive in the project folders.
  3. Ability to use [Google YouTube] default encoding / transcoding functionality, or optionally leverage alternate cloud accessible transcoding solutions.
  4. A basic DAM UI may provide user interaction with the project and asset meta data.
  5. Components of the DAM should allow plug in integration with other components on the  market today, such as an ingestion solution.

Google Drive and Google Cloud Storage.  Cloud storage offers large quantities of storage e.g. for Media (video, audio), economically.

  1. Google Drive ingestion of assets may occur through an automated process, such as a drop folder within an FTP site.  The folder may be polled every N seconds by the Google DAM orchestration, or other 3rd party orchestration product, and ingested into Google Drive.  The ingested files are placed into a project folder designated by the accompanying XML meta file.
  2. Version control of assets is implemented by Google Drive and the DAM to facilitate collaboration and approval.
  3. Distribution and publishing media to designated people and locations, such as to social media channels, may be automatically triggered by DAM orchestration polling Google Drive custom meta data changes.   On demand publishing is also achievable through the DAM.
  4. Archiving project assets to custom locations, such as Google Cloud solution, may be triggered by a project meta data status modification, or on demand through the DAM.
  5. Assets may be spawned into other assets, such as clips.  Derived child assets are correlated with the master, or parent asset, within the DAM asset meta data to trace back to origin.  This eliminates redundant assets and enables users to easily find related files and reuse all or a portion of the asset.
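The drop-folder ingestion step might be sketched as below. Local temporary directories stand in for both the FTP drop folder and Google Drive, and the XML layout (`<asset>` with `<project>` and `<file>` elements) is a hypothetical schema invented for this example; no Google API is called.

```python
import shutil
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def ingest_once(drop_dir: Path, storage_root: Path):
    """Move each media file that has an XML meta file into its project folder."""
    ingested = []
    for sidecar in sorted(drop_dir.glob("*.xml")):
        meta = ET.parse(sidecar).getroot()
        project = meta.findtext("project")
        file_name = meta.findtext("file")
        if not project or not file_name:
            continue  # malformed meta file; skip it
        media = drop_dir / file_name
        if not media.exists():
            continue  # media not delivered yet; retry on the next poll
        target_dir = storage_root / project
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(media), str(target_dir / media.name))
        sidecar.unlink()  # consume the meta file so it is not ingested twice
        ingested.append((project, media.name))
    return ingested


# Demo with throwaway directories and made-up file names.
drop = Path(tempfile.mkdtemp())
store = Path(tempfile.mkdtemp())
(drop / "clip01.mp4").write_bytes(b"fake video bytes")
(drop / "clip01.xml").write_text(
    "<asset><project>spring-campaign</project><file>clip01.mp4</file></asset>"
)
result = ingest_once(drop, store)
print(result)
```

An orchestration layer would call `ingest_once` every N seconds, for example in a loop with `time.sleep(N)`, and swap the local move for an upload to the real storage service.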

Google Docs

  1. Documents required to accompany each media project, such as production guidelines, may go through several iterations before they are complete.  Many of the components of a document may be static.  Google Docs may incorporate ‘Document Assembly’ technology for automation of document construction.

Google’s YouTube

  1. Editing media either using default YouTube functionality, or using third party software, e.g. Adobe suite
  2. Caption creation and editing may use YouTube or third party software.
  3. Meta data may be added or modified according to the corporate taxonomy through [custom] YouTube fields, or directly through the Google DAM Db where the project data resides.

Google’s Google Plus +

  1. G+ project page may be used for project and asset collaboration
  2. Project team members may subscribe to the project page to receive notifications on changes, such as new sub clips
  3. Asset workflow notifications,  human and automated:
    1. Asset modification approvals (i.e. G+ API <- -> DAM Db) through custom fields in G + page
    2. Changes to assets (i.e. collaboration) notifications,
    3. [Automated] e.g. ingestion in progress, or completed updates.
    4. [Automated] Process notifications: e.g. ‘distribution to XYZ’ and ‘transcoding N workflow’.  G + may include links to assets.
  4. Google Plus for in-house, and outside org. team(s) collaboration
  5. G + UI may trigger actions, such as ingestion e.g.  by specifying a specific Google Drive link, and a configured workflow.

Google Custom Search

  1. Allows for the search of assets within a project, within all projects within a silo of business, and across entire organization of assets.
  2. Ability to find and share DAM motion pictures, still images, and text assets with individuals, groups, project teams in or outside the organization.  Google Plus to facilitate sharing.
  3. Asset meta data will describe, for example, how the assets may be used for distribution, i.e. digital distribution rights.   Users and groups are implemented within G+; control of asset distribution may be implemented in Google Plus and/or custom Google Search.

Here is a list of DAM vendors.

Amazon Cloud Services uses their Shipping Logistics to Build and Ship

Amazon leverages its existing shipping and logistics knowledge and applies it to a new cloud resource, a 3D Printer.

Using Amazon’s platform, a user can connect through Amazon’s cloud services, lock a shared cloud resource, a 3D printer, feed the printer one of several formats: 3D blueprint, 3D digital scan, or industry spec file format.  The object is then printed out, and shipped using Amazon’s shipping logistics engine.

Make the object you want and Amazon will ship it to you.  How much does it cost? Cost of materials used to produce the object is quantified and charged, in addition to a cloud [resource] usage fee, and potentially discounted shipping based on Amazon’s current scale.

As the service matures, design tools, basic and advanced, will be provided to produce your designed object.  Only your imagination, and your capability to express it, limit what you and Amazon can deliver.

At some point, a seller can have a storefront, where objects can not only be shipped, but built on demand as well.  For example, circuit boards can be sold now, with the above engine and service, from a seller with the proper schematics.

 

WORM Storage in the Cloud?

WORM Storage, or Write Once Read Many, is ideal for archiving data, such as electronic communications.  I was just wondering what the big, commodity cloud storage vendors, such as Amazon (S3), Windows, or Google, offer in this area, including but not limited to physical media retrieval upon request, such as for legal requests, or when the client requests a backup of the storage.

Google Takes Cloud Computing Services; Adds Android APIs to SDK

Google Takes on Amazon and Microsoft for Cloud Computing Services – NYTimes.com.  This is the first bit of ‘news’ I’ve read all week with regard to technology.  The article holds no surprises, but is a good read for the uninformed.  I am unable to disagree with, or add insight to, this article.  Amazon has a strong lead, and as I mentioned last year, I saw Google getting into this space with its software and APIs available.  It may have needed the manpower and/or infrastructure to build the back end to support the extensibility of the front end.  Google also may offer new business models to complement its existing API offerings, as well as expand those APIs, and provide user friendly tools.  From this article, I’d see Android API extensibility in the SDK as a way to grab market share from Amazon.  The article states that Android applications are using AWS, so if Google adds Android APIs to its SDK, it would give developers an easy, plug in option.

Solving the Corporate and Personal Data Dilemma for Mobile Devices

After reading this article in the New York Times, I.T. Managers Struggle to Contain Corporate Data in the Mobile Age – NYTimes.com, regarding employees using their mobile devices for both corporate and personal use, I pulled apart these challenges one by one.

One of the challenges mentioned is corporate and personal applications, and potentially malware, running on the same mobile OS. This is not a new problem.  Typically, companies provide the hardware and lock down the PCs, so that installing non-corporate software is a violation of corporate policy. At times there are even nightly corporate programs that scan the network, find these non-corporate applications, and remove them.  One solution for mobile devices in the corporate world is a similar approach, whereby the mobile OS vendors allow corporations to apply the same controls, if the corporation is providing the mobile hardware.

If the company is not providing the mobile device, an effective way of partitioning data in the PC world could also be applied to the mobile OS world: multiple boot partitions, just like a VM image. As mobile hardware gets more robust, with more processors and more, faster RAM, this solution should be extremely feasible, and allow for the partitioning of corporate data.

There was also an implied use of corporate bandwidth by personal applications. This is not so much an issue; however, it too can be solved with the traditional PC approach of a proxy and firewall, i.e. known, acceptable, published ports for approved applications.

In short, the answer is a multiple or dual OS boot on the mobile device, just like a virtual machine, whereby when you start up your mobile device, you select the personal or corporate image (or corporate 1, 2, etc.). The image of the mobile OS could even be housed in a cloud architecture, which I mentioned in the article I posted last year, Elastic Computing for Mobile Devices: Mobile OS Hosting Maximizing Computing Capacity

 

WordPress Shortcode API to Cloud Storage to Sell Any Digital Intellectual Property.

So, I was browsing, going through bills, and thinking about my other article on Google Docs and their new API, where you could use them as a data warehouse, when it occurred to me: why can’t we have a public API for all the Cloud Storage systems like Amazon Web Services (AWS) S3 (or Box.com), create a plugin for WordPress, and add E-Commerce? You would then have your own place to sell digital music, a store for any digital intellectual property, or a host for your own OLTP or OLAP database.

And my bro, Fat Panda, might have been thinking the same thing.  He’s one step behind, but he will catch on.  I will try to update for ‘the cheap seats’ in a bit.

For the cheap seats: even with static files stored up in the cloud, you can use a model similar to Google Docs <-> Google Fusion, where you add tabular data to storage, then read, overwrite, or update it using a homemade table locking mechanism, and essentially use the cloud as a data warehouse, or even a database.  Microsoft seems to have a lead on transactional and analytical storage with Microsoft Azure, which is relational in nature in the cloud, but it can be much simpler than that with cloud storage.  If it is not implemented with ‘row’ locking, there is an issue with row level, high volume OLTP (On Line Transaction Processing).  With OLAP (On Line Analytic Processing), where you analyze the way your business does business and profit more from your consumer data, not so much.  There are easy ways to implement row level locking for tabular data stored in cloud storage like AWS or Box.Net, and they will remind you of old school alternatives that supplement the AutoNumber columns in MS Access or Identity columns in SQL Server.

At the end of the day, you can either sell digital intellectual property from a WordPress implementation, or run your entire business on a robust cloud database solution for OLTP or OLAP systems using flat file storage!  Why go through all this when Amazon’s AWS and Microsoft Azure have built, or will yearn to start building, these solutions in parallel?  Cost effective solutions, and the entire database arena, monopolized by Oracle, IBM, Microsoft, and MySQL, just got extended to a whole lot more database vendors.  It may take a while, but we already know the big gorilla in the room, Google, is the first to strike in this game as a non-traditional database vendor and cloud storage provider, with their updated Google Docs API and, optionally, their Fusion application.
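The post does not spell out the locking method, so the following is only one possible sketch. It swaps explicit row locks for per-object version checks (optimistic concurrency) and uses a counter object as the AutoNumber-style id allocator. `FakeObjectStore` is an in-memory stand-in written for this example; its `put_if_version` call is hypothetical and is not the API of AWS, Box, or any other provider.

```python
import json


class FakeObjectStore:
    """In-memory stand-in for a cloud bucket with a conditional write."""

    def __init__(self):
        self._objects = {}  # key -> (version, body)

    def get(self, key):
        """Return (version, body); a missing key reads as version 0, body None."""
        return self._objects.get(key, (0, None))

    def put_if_version(self, key, body, expected_version):
        """Write only if nobody else has written since expected_version was read."""
        current_version, _ = self.get(key)
        if current_version != expected_version:
            return False
        self._objects[key] = (current_version + 1, body)
        return True


def next_id(store, table):
    """Allocate the next row id from a counter object, retrying on conflict."""
    key = f"{table}/_counter"
    while True:
        version, body = store.get(key)
        value = (json.loads(body) if body else 0) + 1
        if store.put_if_version(key, json.dumps(value), version):
            return value


def insert_row(store, table, row):
    """Store a new row as its own object and return its generated id."""
    row_id = next_id(store, table)
    store.put_if_version(f"{table}/{row_id}", json.dumps(row), 0)
    return row_id


def update_row(store, table, row_id, changes, expected_version):
    """Apply changes only if the row is still at expected_version."""
    key = f"{table}/{row_id}"
    version, body = store.get(key)
    if body is None or version != expected_version:
        return False
    merged = {**json.loads(body), **changes}
    return store.put_if_version(key, json.dumps(merged), expected_version)


store = FakeObjectStore()
first = insert_row(store, "orders", {"item": "song.mp3", "qty": 1})
second = insert_row(store, "orders", {"item": "album.zip", "qty": 1})
ok = update_row(store, "orders", first, {"qty": 2}, expected_version=1)
stale = update_row(store, "orders", first, {"qty": 5}, expected_version=1)
print(first, second, ok, stale)
```

Storing one object per row keeps a conflict on one row from blocking writes to another, which is the property row level locking is meant to provide. Whether a given storage service offers a comparable conditional write would need to be checked against its documentation.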