Tag Archives: EMC

As a Data Deluge Grows, Companies Rethink Storage

At Pure Storage, a device introduced on Monday holds five times as much data as a conventional unit.

  • IBM estimates that by 2020 we will have 44 zettabytes of data (a zettabyte is the thousandfold unit next up from an exabyte) generated by all those devices. It is so much information that Big Blue is staking its future on so-called machine learning and artificial intelligence, two kinds of pattern-finding software built to cope with all that information.
  • Pure Storage’s chief executive, Scott Dietzen: “No one can look at all their data anymore; they need algorithms just to decide what to look at.”

Source: As a Data Deluge Grows, Companies Rethink Storage – The New York Times

Additional Editorial:

Pure Storage is looking to “compress” more data into a flash-memory storage array with its “FlashBlade” product. It is also tuning the solution for higher I/O throughput and optimized, addressable storage.

Several companies with large and growing storage footprints have already begun to customize their storage solutions to fill the void in this space.

Building more storage arrays is only a temporary measure as masses of people and fleets of cars turn on their IoT-enabled devices.

Data is flooding the Internet, and innumerable duplicate ‘objects’ of information, each requiring redundant storage, are prevalent. A registry of public ‘records’ could be maintained. Security measures and the public’s appetite would determine which “information objects” may be centrally located. As intermediaries, registrars may build open source repositories, for example on Google Drive or Microsoft Azure, organized by the data types of the “information objects”.

  • Information object registrars may hold all different types of objects, with entries that indicate where the data resides on the Internet.
    • vaguely similar to the domain name registrar hierarchy
    • the Domain Name System (DNS) is the best example of the registration process I am suggesting to clone and leverage for all types of data, ranging from entertainment to medical records.
  • Medical “Records”, or Medical “Information Objects”
    • X-ray images, everything from dental to medical, correlated with other medical information objects
  • Official ‘Education’ records from K-12 and beyond, e.g. degrees and certifications achieved;
  • Secure, easy access to ‘public’ ‘information objects’ for the owner and the creator. Central portal(s) drive user traffic. This enables the ‘owner’ of records to take ‘ownership’ of their health, for example.
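
To make the registrar idea above a little more concrete, here is a minimal Python sketch of a DNS-like hierarchy that maps dotted names to the locations where an information object resides. Everything in it is an assumption for illustration: the `Registrar` class, the naming scheme (`object.subtype.type`), and the example URL are hypothetical and not part of any existing registry.

```python
from dataclasses import dataclass, field


@dataclass
class Registrar:
    """Toy registrar node (hypothetical): delegates labels to child
    registrars, like DNS zones, and stores object locations at the leaf."""
    children: dict = field(default_factory=dict)  # label -> Registrar
    records: dict = field(default_factory=dict)   # label -> list of locations

    def register(self, name: str, location: str) -> None:
        """Record where an object lives, e.g. 'xray-001.dental.medical'."""
        # Like DNS, the rightmost label is the top of the hierarchy.
        *path, leaf = reversed(name.split("."))
        node = self
        for label in path:
            node = node.children.setdefault(label, Registrar())
        node.records.setdefault(leaf, []).append(location)

    def resolve(self, name: str) -> list:
        """Return every registered location for a name, or an empty list."""
        *path, leaf = reversed(name.split("."))
        node = self
        for label in path:
            node = node.children.get(label)
            if node is None:
                return []
        return list(node.records.get(leaf, []))


root = Registrar()
root.register("xray-001.dental.medical", "https://repo.example/objects/xray-001")
locations = root.resolve("xray-001.dental.medical")
```

A portal would call `resolve` to find where an object lives instead of storing its own copy, which is the point of registering locations rather than duplicating data.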

Note: there are already ‘open’ platforms being developed and used in several industries, including medical, with limited access. However, the change I’m proposing imposes a ‘registrar’ process whereby portals of information are registered and interwoven, linking to one another.

It’s an issue of excess weight upon the “Internet”: not just the ‘weight’ of unnecessary storage, but also the throughput consumed across an interwoven set of networks.

Think of it in terms of opportunity cost. First, quantify what an ‘information object’, or ‘block of data’, equates to in cost. It seems there must already be a measurement in existence, such as a median amount to charge per “information object”. Finally, for each information object type, e.g. song, movie, news story, technical specifications, etc., identify how many times this exact object is duplicated across the Internet.
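
As a rough worked example of that arithmetic, the sketch below treats every copy beyond the first as waste and multiplies it by a per-copy cost. The function name and all of the figures (copy counts and costs) are made up for illustration; no real measurements are implied.

```python
def duplicate_cost(copies: int, cost_per_copy: float) -> float:
    """Cost of every copy beyond the first one (assumed to be the waste)."""
    if copies < 1:
        raise ValueError("an object must exist at least once")
    return (copies - 1) * cost_per_copy


# Hypothetical inventory: object type -> (copies found, cost per copy)
inventory = {
    "song": (1000, 0.5),
    "movie": (200, 4.0),
}

# Total cost of redundant copies across all object types
total_waste = sum(duplicate_cost(n, c) for n, c in inventory.values())
```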

Steps for reducing data waste:

  • Without exception, each ‘information object’ contains an (XML) metadata file.
  • The attributes describing each information object are built out as these assets are used, e.g. by proactive autopopulating search and an AI induction engine.
  • X out of Y metadata types and values are equivalent
    • the more attributes that correlate across two or more objects, the more likely these objects are
      • related on some level, e.g. sibling, cousin
      • or identical objects, and may need a meta relationship update
    • the metadata encapsulates the ‘information object’
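
The “X out of Y” comparison in the steps above can be sketched as follows. This is only an illustration under stated assumptions: metadata is represented as plain Python dictionaries rather than XML files, and the function names and the thresholds for “identical” and “related” are hypothetical choices, not values from any standard.

```python
def attribute_overlap(meta_a: dict, meta_b: dict) -> tuple:
    """Return (X, Y): X attributes with equal values out of the
    Y distinct attributes present in either object."""
    keys = set(meta_a) | set(meta_b)
    matches = sum(
        1 for k in keys
        if k in meta_a and k in meta_b and meta_a[k] == meta_b[k]
    )
    return matches, len(keys)


def classify(meta_a: dict, meta_b: dict,
             identical_at: float = 1.0, related_at: float = 0.5) -> str:
    """Label a pair of objects; the thresholds are assumed, tunable values."""
    x, y = attribute_overlap(meta_a, meta_b)
    ratio = x / y if y else 0.0
    if ratio >= identical_at:
        return "identical"
    if ratio >= related_at:
        return "related"
    return "unrelated"


song_a = {"title": "Song A", "artist": "X", "year": "2016"}
song_b = {"title": "Song A", "artist": "X", "year": "2015"}
label = classify(song_a, song_b)
```

Pairs labelled identical are the candidates for keeping one stored copy, while related pairs are the ones that may need a meta relationship update.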

Another opportunity to organize “Information Asset Objects” would be to leverage the existing DNS platform for managing “Information Asset Repositories”.   This additional Internet DNS structure would enable queries across information asset repositories.   Please see “So Much Streaming Music, Just Not in One Place”  for more details.

Cloud Document Mgt: Box Overhaul Architecture With Paperless Services

Building on the Cloud: Gehry and Box Overhaul Architecture With New Paperless Service | Wired Design | Wired.com.

I read this article and instantly thought: this is where we were heading a few years ago with a prototype project I worked on for a large metropolitan city, electronically expediting business occupancy workflows.

I’ve read about Box.net’s current collaboration platform, and it is similar to an internal electronic document management system I worked on, built upon EMC Documentum. That system, which I worked on in 1993, was an internal enterprise product that included collaboration across multiple teams and approval workflows, i.e. gateways to the next workflow. It also included code that allowed documents to be dynamically edited and appended, reading from corporate databases and integrating that data into the final product documents, which were then distributed automatically on multiple platforms once the last approval was received. The documents were distributed to third parties such as FactSet and other major third-party outlets, as well as to internal client web platforms.

The opportunity for Box is to get those workflows incorporated into the production of any product, and to bring in every stakeholder: clients, subject matter experts, engineers, and validators, e.g. sign-off stakeholders, including the public-sector engineers who sign off on permits.

Other services may be able to be incorporated as well, such as the shipment of products, just as eBay has exposed its integration of both the U.S. Post Office and PayPal, now a subsidiary of eBay.

EMC’s Documentum Competition for Google Docs in the SaaS space?

I was just curious whether we will see EMC’s Documentum positioned as Software as a Service to compete with the likes of Google Docs, or whether we will see them continue to position it for the enterprise-level private cloud model. It would be great to hear your thoughts. The Document Management Suite was an amazing, full-featured workflow document system, so why not bring that to the forefront of the consumer market as a public cloud SaaS targeting small to mid-sized markets, as well as the individual? There are several profitable models where they would achieve significant margins, even after accounting for the price to enter the market.

***

Update:

When I first started commenting on Digital Asset Management (DAM) solutions, I did not include the vast number of existing solutions by vendors in this space. Below are lists of the DAM software vendors which are already in place today:

  • Digital Asset Management Vendors
  • Digital Asset Management Vendors Directory

I’ve had hands-on experience with several DAM products, including Documentum and SharePoint. I have no idea why I did not include SharePoint for DAM evaluation.