Information Architecture: An Afterthought for Content Creation Solutions

Maximizing Digital Asset Reuse

Many applications that enable users to create their own content, from word processing to graphics/image creation, have typically relied upon 3rd party Content Management Solutions (CMS) / Digital Asset Management (DAM) platforms to collect metadata describing the assets upon ingestion into those platforms.  Many of these platforms have been “stood up” to support projects/teams, either for collaboration on an existing project or for reuse of assets in “other” projects.  As a person constantly creating content, where do you “park” your digital resources for archiving and reuse?  On your local drive, in cloud storage, or are they not archived at all?

Average “Jane” / “Joe” Digital Authors

If I were asked for all the content I’ve created around a particular topic or group of topics from all my collected/ingested digital assets, it could be a herculean search effort spanning multiple platforms.  As an independent creator of content, I may have digital assets ranging from Microsoft Word documents and Google Sheets spreadsheets to Twitter tweets, Paint.Net (.pdn) graphics, blog posts, etc.

Capturing Content from Microsoft Office Suite Products

Many of the MS Office content creation products, such as Microsoft Word, have minimal capacity to capture metadata, and where the ability exists, it’s subdued in the application.  In MS Word, for example, a user who selects “Save As” is able to add/insert “Authors” and “Tags”.  In the latest version of Microsoft Excel, the author of the Workbook has the ability to add Properties, such as Tags and Categories.  It’s not clear how this data is utilized outside the application; for example, is the tag data searchable after the file is uploaded to/ingested by OneDrive?

Blog Posts: High Visibility into Categorization and Tagging

A “blogging platform”, such as WordPress, places the Category and Tagging selection fields to the right of the content being posted.  This UI/UX forces a specific mindset toward the creation, categorization, and tagging of content.  The blogging structure constantly reminds the author to label the content so others may find and consume it.  Blog post content is created to be consumed by a wide audience of interested viewers, who discover it through the tags and categories selected.

Proactive Categorization and Tagging

Perpetuate content classification through drill-down navigation of a derived Information Architecture Taxonomy.  As a “lightweight” example, in the WordPress Tags field, when a user editing a Post starts typing a few characters, an auto-complete dropdown list appears so the user can select one or more previously used tags.  This is an excellent starting point for other Content Creation Apps.
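
The auto-complete behavior described above can be sketched in a few lines.  This is a minimal illustration, not how WordPress implements it; the function name `suggest_tags` and its parameters are hypothetical, and it assumes the previously used tags are available as a simple list.

```python
from bisect import bisect_left


def suggest_tags(prefix: str, used_tags: list[str], limit: int = 5) -> list[str]:
    """Return previously used tags that start with the typed prefix."""
    # Normalise and de-duplicate so matching is case-insensitive.
    tags = sorted({t.strip().lower() for t in used_tags if t.strip()})
    needle = prefix.strip().lower()
    if not needle:
        return []
    # Binary search to the first tag >= prefix, then scan while tags still match.
    start = bisect_left(tags, needle)
    matches = []
    for tag in tags[start:]:
        if not tag.startswith(needle):
            break
        matches.append(tag)
        if len(matches) == limit:
            break
    return matches


if __name__ == "__main__":
    history = ["Microsoft Word", "metadata", "Microsoft Excel", "taxonomy"]
    print(suggest_tags("mi", history))
```

A real Content Creation app would keep the sorted tag list cached rather than rebuilding it on every keystroke; the sketch re-sorts each time only to stay short.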

Users creating Blog Posts can define a Parent/Child hierarchy of categories, and the author may select one or more relevant categories to be associated with the Post.

Artificial Intelligence (AI) Derived Tags

It wouldn’t be a post without mentioning AI.  Integrated into applications that enable user content creation could be a tool that, at a minimum, automatically derives an “Index” of words, or tags.  The way in which this “intelligent index” is derived may be based upon:

  • the number of times a word occurs
  • mention of words in a particular context
  • reference of the same word(s) or phrases in other content
    • defined by the same author, and/or across the platform.
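
As a rough illustration of the first criterion only (word occurrence counts), the sketch below derives a small index from plain text.  It does not attempt the context or cross-content criteria.  The function name `derive_index`, the stop-word list, and the `top_n` / `min_count` parameters are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real tool would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"}


def derive_index(text: str, top_n: int = 5, min_count: int = 2) -> list[tuple[str, int]]:
    """Return up to top_n (word, count) pairs for words seen at least min_count times."""
    # Lower-case the text and keep alphabetic words of two or more characters.
    words = re.findall(r"[a-z][a-z'-]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [(word, n) for word, n in counts.most_common(top_n) if n >= min_count]


if __name__ == "__main__":
    sample = "Taxonomy drives search. A taxonomy of tags helps search and reuse of tags."
    for word, n in derive_index(sample):
        print(word, n)
```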

This intelligently derived index of data should be made available to any platforms that ingest content, such as OneDrive, SharePoint, Google Docs, etc.  These DAMs (or Intelligent Cloud Storage) can leverage this information for any searches across the platforms.

Easy to Retrieve the Desired Content, and Repurpose It

Many Content Creation applications rely heavily on “Recent Accessed Files” within the app.  If the Information Architecture/Taxonomy hierarchy were presented in the “File Open” section, and a user could drill down into selected Categories/Subcategories (and/or tags), it might be easier to find the desired content.
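
To make the drill-down idea concrete, here is a minimal sketch of a Parent/Child category tree that a “File Open” dialog could walk.  The `Category` class, its methods, and the file names are hypothetical, and it assumes each file is attached to exactly one category node.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """One node in a Parent/Child taxonomy; files are attached to nodes."""
    name: str
    children: list["Category"] = field(default_factory=list)
    files: list[str] = field(default_factory=list)

    def add_child(self, name: str) -> "Category":
        child = Category(name)
        self.children.append(child)
        return child

    def find(self, path: list[str]) -> "Category | None":
        # Drill down one level per path element; stop if a level is missing.
        node = self
        for name in path:
            node = next((c for c in node.children if c.name == name), None)
            if node is None:
                return None
        return node

    def all_files(self) -> list[str]:
        # Files on this node plus everything beneath it.
        found = list(self.files)
        for child in self.children:
            found.extend(child.all_files())
        return found


if __name__ == "__main__":
    root = Category("All content")
    office = root.add_child("Microsoft Office")
    office.add_child("Word").files.append("ia-afterthought.docx")
    office.add_child("Excel").files.append("budget.xlsx")
    print(root.find(["Microsoft Office", "Word"]).all_files())
    print(root.find(["Microsoft Office"]).all_files())
```

Selecting a parent category returns everything filed under its children as well, which is the behavior a user would likely expect when narrowing from Category to Subcategory.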

All Eyes on Content Curation: Creation to Archive
  • Content creation products should all focus on the collection of metadata at the time the content is created.
  • Using the Blog Posting methodology, the creation of content should happen alongside the metadata tagging.
  • Taxonomy searches (categories, and tags with hierarchy) should be available from within the Content Creation applications and at the Operating System level, the “Original” Digital Asset Management solution (DAM), e.g. MS Windows, Mac.

Microsoft Productivity Suite – Content Creation, Ingestion, Curation, Search, and Repurpose

Auto Curation: AI Rules Engine Processing

There are, of course, 3rd party platforms that perform very well, are feature rich, and are agnostic to all file types.  For example, within a very short period of time, at low cost, and possibly with a few plugins, a WordPress site can be configured and deployed to suit your Digital Asset Management (DAM) needs.  The long-term goal is to apply techniques such as Auto Curation to any/all files, leveraging an ever-growing intelligent taxonomy built on user-defined labels/tags, as well as an AI rules engine with ML techniques.  OneDrive, as a cloud storage platform, may bridge the gap between JUST cloud storage and a DAM.

Ingestion and Curation Workflow

Content Creation Apps and Auto Curation

  • The ability for Content Creation applications, such as Microsoft Word, to capture not only the user-defined tags but also the context of the tags relating to the content.
    • When ingesting a Microsoft PowerPoint presentation, after consuming the file, an Auto Curation process can extract “reusable components” of the file, such as slide header/name, and the correlated content such as a table, chart, or graphics.
    • Ingesting Microsoft Excel and Auto Curation of Workbooks may yield “reusable components” stored as metadata tags, and their correlated content, such as chart and table names.
    • Ingesting and Auto Curation of Microsoft Word documents may build a classic Index for all the most frequently occurring words, and augment the manually user-defined tags in the file.
    • During the Auto Curation process, ingestion of Photos [and Videos] into an Intelligent Cloud Storage Platform may identify commonly recognizable objects, such as trees or people.  These objects would be automatically tagged through the Auto Curation process after Ingestion.
  • Ability to extract the content file metadata, objects and text tags, to be stored in a standard format to be extracted by DAMs, or Intelligent Cloud Storage Platforms with file and metadata search capabilities.  Could OneDrive be that intelligent platform?
  • A user can search by file title or across the manually defined and Auto Curated metadata associated with the file.  The DAM or Intelligent Cloud Storage Platform provides both sets of search results.  “Reusable components” of files are also searchable.
    • For “Reusable Components” to be parsed out of the files as separate entities, a process needs to occur after Ingestion Auto Curation.
  • Content Creation application user-entry tag/text fields should have “drop-down” access to the search index populated with automatically and manually created tags.
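
As one illustration of extracting user-defined metadata from a file, the sketch below reads the Authors and Tags that Word saves inside a .docx.  A .docx is a ZIP package, and its core properties conventionally live in the `docProps/core.xml` part, with tags stored in the `cp:keywords` element and authors in `dc:creator`.  The function name `read_docx_metadata` is hypothetical, and the sketch assumes multiple values are separated by semicolons or commas.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the Open Packaging Conventions core properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}
CORE_PART = "docProps/core.xml"  # conventional location; assumed, not looked up via _rels


def _split(value: str) -> list[str]:
    # Assumption: multiple values are delimited by ';' or ','.
    return [part.strip() for part in re.split(r"[;,]", value) if part.strip()]


def read_docx_metadata(path_or_file) -> dict[str, list[str]]:
    """Return {'authors': [...], 'tags': [...]} from a .docx path or file object."""
    with zipfile.ZipFile(path_or_file) as package:
        if CORE_PART not in package.namelist():
            return {"authors": [], "tags": []}
        root = ET.fromstring(package.read(CORE_PART))
    creator = root.findtext("dc:creator", default="", namespaces=NS)
    keywords = root.findtext("cp:keywords", default="", namespaces=NS)
    return {"authors": _split(creator), "tags": _split(keywords)}
```

Calling `read_docx_metadata("report.docx")` on a document saved with tags would return them as a list, which an ingesting platform could then add to its search index.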

Auto Curation and Intelligent Cloud Storage

  • The intelligence of Auto Curation should be built into the Cloud Storage Platform, e.g. potentially OneDrive.
  • At a minimum, auto curation should update the cloud storage platform indexing engine to correlate files and metadata.
  • Auto Curation is the ‘secret sauce’ that automatically “digests” the content to build the search engine index, which contains identified objects (e.g. tag and text or coordinates).
    • Auto Curation may leverage a rules engine (AI) and apply user-configurable rules such as “keyword density” thresholds.
    • Artificial Intelligence / Machine Learning rules may be applied to the content to derive additional labels/tags.
  • If leveraging version control of the intelligent cloud storage platform, each iteration should “re-index” the content, and update the Auto Curation metadata tags.  User-created tags are untouched.
  • If no user-defined labels/tags exist upon ingestion, the user may be prompted for tags.
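
Two of the points above, a user-configurable “keyword density” threshold and re-indexing on each new version while leaving user-created tags untouched, can be sketched together.  This is an illustrative model only; `CuratedFile`, `keyword_density_tags`, `reindex`, and the default threshold are hypothetical, and a real rules engine would at least add a stop-word filter.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CuratedFile:
    name: str
    user_tags: set[str] = field(default_factory=set)   # entered by the author
    auto_tags: set[str] = field(default_factory=set)   # derived by Auto Curation


def keyword_density_tags(text: str, threshold: float = 0.05) -> set[str]:
    """Tag every word whose share of all words meets the density threshold."""
    words = re.findall(r"[a-z]{3,}", text.lower())
    if not words:
        return set()
    counts = Counter(words)
    return {word for word, n in counts.items() if n / len(words) >= threshold}


def reindex(item: CuratedFile, new_text: str, threshold: float = 0.05) -> CuratedFile:
    """Recompute auto tags for a new version; user tags are never modified."""
    item.auto_tags = keyword_density_tags(new_text, threshold)
    return item


def needs_user_prompt(item: CuratedFile) -> bool:
    # Mirrors the last bullet: prompt only when no user-defined tags exist.
    return not item.user_tags


if __name__ == "__main__":
    doc = CuratedFile("notes.docx", user_tags={"project-x"})
    reindex(doc, "chart chart table chart data table", threshold=0.3)
    print(sorted(doc.auto_tags), sorted(doc.user_tags))
```

Keeping the two tag sets in separate fields is what makes the “user-created tags are untouched” rule easy to honor: each re-index simply replaces `auto_tags` wholesale.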

Auto Curation and “3rd Party” Sources

In the context of sources such as a Twitter feed, feeds are not currently incorporated into an Intelligent Cloud Storage.  OneDrive, as Intelligent Cloud Storage, could import feeds from 3rd party sources, and each Tweet would be defined as an object that is searchable along with its metadata (e.g. likes; tags).

Operating System, Intelligent Cloud Storage/DAM

The Intelligent Cloud Storage and DAM solutions should have integrated search capabilities, so that at the OS level (mobile or desktop), content can be discovered through an OS search of tagged metadata.

Current State

  1. OneDrive has no ability to search Microsoft Word tags.
  2. The UI for all Productivity Tools must have a comprehensive and simple design for leveraging an existing taxonomy for manual tagging, and the ability to add hints for auto curation.
    1. Currently, Microsoft Word has two fields to collect metadata about the file.  They are obscurely located in the “Save As” dialog.
      1. The “Save As” dialog box allows a user to add tags and authors, but only when using the MS Word desktop version.  The Online (Cloud) version of Word has no such option when saving to Microsoft OneDrive Cloud Storage.
  3. Auto Curation (Artificial Intelligence, AI) must inspect content from the MS Productivity suite tools and extract tags automatically, a capability that does not exist today.
  4. No manual tagging or Auto Curation/Facial Recognition exists.