Are you applying metadata to individual files, or en masse, in an attempt to turn the vast growth of your cloud storage usage into manageable, meaningful storage?
Best practice is to use a consistent hierarchy, an Information Architecture (IA), in which to store and retrieve information. Excellent.
Beyond that, there are capabilities that computer science has documented and used time and time again, such as checksum algorithms. They are frequently used after a file transfer to verify that the file you requested is the file you received. Most, if not all, Enterprise DAM (Digital Asset Management) solutions use some type of technology to ‘allow’ the enforcement of unique assets [upon upload]. In cloud storage and photo solutions targeted toward the individual consumer, the feature does not appear to be ‘up close and personal’ in the user experience, so users build up a huge expanse of duplicate data (documents, photos, music, etc.). Another feature, the database [primary] key, has been used for decades to identify a record of data as unique.
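The two ideas above, a checksum and a primary key, can be combined. The following is only a minimal sketch, not how any particular DAM or cloud product works: it assumes SHA-256 as the checksum, an SQLite table named `assets`, and hypothetical helper names (`file_checksum`, `open_catalog`, `register_asset`) invented for illustration.

```python
import hashlib
import sqlite3


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_catalog(db_path=":memory:"):
    """Open a catalog in which the checksum is the primary key (assumed schema)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS assets ("
        "checksum TEXT PRIMARY KEY, path TEXT NOT NULL)"
    )
    return conn


def register_asset(conn, path):
    """Return True if the asset is new, False if the same content is already stored."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO assets (checksum, path) VALUES (?, ?)",
                (file_checksum(path), path),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key rejected a second row with the same checksum.
        return False
```

Because the key is derived from the content rather than the file name, two copies of the same photo saved under different names would be caught as one asset.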
Our family sharing alone holds thousands and thousands of photos and music files. Many copies of the same digital asset have different file names. Sometimes the file names are the same, but the metadata differs between the copies, and each copy’s metadata provides value. Tools for ‘merging’ metadata, such as DAM tools, have value in helping to manage digital assets.
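As a rough illustration of what ‘merging’ metadata could mean, here is a small sketch. It assumes each copy’s metadata is a flat dictionary of string values, and the function name `merge_metadata` is hypothetical. Where copies disagree, every distinct value is kept rather than one copy silently winning.

```python
def merge_metadata(*records):
    """Merge metadata dicts from copies of the same asset.

    A field with one distinct value stays a plain value; a field whose
    copies disagree becomes a sorted list of all distinct values.
    """
    collected = {}
    for record in records:
        for field, value in record.items():
            collected.setdefault(field, set()).add(value)
    return {
        field: next(iter(values)) if len(values) == 1 else sorted(values)
        for field, values in collected.items()
    }


# Illustrative values only.
copy_a = {"title": "Beach", "theme": "summer"}
copy_b = {"title": "Beach", "theme": "party", "album": "family"}
merged = merge_metadata(copy_a, copy_b)
```

Keeping conflicting values side by side leaves the final decision to the user, which fits the idea that each copy’s metadata may carry value.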
Cloud storage usage is growing exponentially, and metadata alone won’t help rope in the beast. Maybe ad hoc or periodic indexing of files [e.g. by checksum algorithm] could take on the task of identifying duplicate assets? Duplicate assets could then be shown to the user in an exception report. Less boring: upon upload, let the user know ‘on the fly’ that the asset is already in storage, and show a two-column diff of the metadata.
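The indexing, exception report, and upload-time diff described above could look roughly like this. It is a sketch under stated assumptions: files live in a local folder tree rather than behind a real cloud API, SHA-256 is the checksum, metadata is a flat dictionary, and all function names are hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_index(root):
    """Walk a folder tree and group file paths by content checksum."""
    index = defaultdict(list)
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            index[file_checksum(path)].append(path)
    return index


def exception_report(index):
    """Keep only checksums shared by more than one path, i.e. duplicates."""
    return {
        checksum: sorted(paths)
        for checksum, paths in index.items()
        if len(paths) > 1
    }


def metadata_diff(stored, incoming):
    """Return (field, stored value, incoming value) rows for fields that differ."""
    rows = []
    for field in sorted(set(stored) | set(incoming)):
        left, right = stored.get(field, ""), incoming.get(field, "")
        if left != right:
            rows.append((field, left, right))
    return rows


def format_diff(rows, width=28):
    """Render the differing fields in two columns: in storage vs. uploading."""
    lines = ["field".ljust(14) + "in storage".ljust(width) + "uploading"]
    for field, left, right in rows:
        lines.append(str(field).ljust(14) + str(left).ljust(width) + str(right))
    return "\n".join(lines)
```

A periodic job might call `exception_report(build_index(root))`, while an upload handler could look up the incoming file’s checksum in the index and, on a hit, show `format_diff(metadata_diff(stored, incoming))` to the user before storing anything.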
It’s a pain for me, and quite possibly for many other cloud storage users. As more people jump on cloud storage, this feature should be front and center to help users grow into their new virtual warehouse.
The cloud storage industry most likely believes that, for the common consumer, storage is ‘cheap’: just provide more. At some stage, cloud providers may look to DAM tools as the cost of managing a user’s storage rises. Tools like:
- Duplicate detection for digital assets (files). Use exception reporting to identify the duplicates and enable [bulk] corrective action, and/or show a duplicate ‘error/warning’ message upon upload.
- Dynamic metadata tagging upon [bulk] upload using object recognition. Correlating and cataloging one or more [types of] objects in a picture using a defined Information Architecture. In addition, leveraging facial recognition for updates to metadata tagging.
  - e.g. “beach” objects: sand, ocean; [Ian Roseman] surfing
- Brief questionnaires may enable the user to ‘smartly’ ingest the digital assets; e.g. the ‘themes’ of the current upload, or a family or relationship tree to extend facial recognition correlations.
  - e.g. themes – summer; party; New Year’s Eve
  - e.g. relationship tree – office / work
- A pan-Information Architecture (IA) spanning multiple cloud storage [silos]; e.g. for Photos, spanning [shared] ‘albums’
- Publicly published / shared components of an IA; e.g. legal documents; standards and reuse