At Pure Storage, a device introduced on Monday holds five times as much data as a conventional unit.
- IBM estimates that by 2020 we will have 44 zettabytes — the thousandfold number next up from exabytes — generated by all those devices. It is so much information that Big Blue is staking its future on so-called machine learning and artificial intelligence, two kinds of pattern-finding software built to cope with all that information.
- Pure Storage chief executive Scott Dietzen: “No one can look at all their data anymore; they need algorithms just to decide what to look at.”
Source: As a Data Deluge Grows, Companies Rethink Storage – The New York Times
Pure Storage is looking to “compress” more data into a storage array built on flash memory, “FlashBlade”. They are also tuning the solution for higher I/O throughput and optimized, addressable storage.
Several companies with large and growing storage footprints have already begun to customize their storage solutions to fill the void in this space.
Building more storage arrays is a temporary measure while masses of people, or fleets of cars, turn on their IoT-enabled devices.
Data is flooding the Internet, and innumerable duplicate ‘objects’ of information, each requiring redundant storage, are prevalent. A registry of public ‘records’ may be maintained. Security measures and the public’s appetite would determine which “information objects” may be centrally located. As intermediaries, registrars may build open source repositories, for example using Google Drive or Microsoft Azure, based on the data types of the “information objects”.
- Information object registrars may contain all different types of objects, which indicate where data resides on the Internet.
- The structure is loosely similar to the domain name registrar hierarchy.
- The Domain Name System (DNS) is the best example of the registration process I am suggesting to clone and leverage for all types of data, ranging from entertainment to medical records.
Medical “Records”, or Medical “Information Objects”
- X-ray images, everything from dental to medical, correlated with other medical information object(s);
- Official ‘Education’ records from K-12 and beyond, e.g. degrees and certifications achieved;
- Secure, easy access to ‘public’ ‘information objects’ by the owner and creator. Central portal(s) drive user traffic. This enables the ‘owner’ of records to take ‘ownership’ of their health, for example.
Note: there are already ‘open’ platforms being developed and used in several industries, including medical, with limited access. However, the change I’m proposing imposes a ‘registrar’ process whereby portals of information are registered and interwoven, linking to one another.
It’s an issue of excess weight upon the “Internet”: not just the ‘weight’ of unnecessary storage, but also the throughput across an interwoven set of networks.
Think of it in terms of opportunity cost. First, quantify what an ‘information object’, or ‘block of data’, equates to in cost. It seems there must already be a measurement in existence, such as a median amount to charge, or cost, per “information object”. Finally, for each information object type (e.g. song, movie, news story, technical specifications), identify how many times this exact object is duplicated across the Internet.
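As a rough illustration of that arithmetic, here is a minimal Python sketch. Everything in it is an assumption for the sake of the example: the `ObjectType` class, the `redundant_cost` function, the per-copy cost, and the copy counts are hypothetical and not taken from any real measurement.

```python
from dataclasses import dataclass


@dataclass
class ObjectType:
    """One kind of 'information object', e.g. a song or a news story."""
    name: str
    cost_per_copy: float  # assumed cost of storing one copy (hypothetical units)
    copies: int           # assumed number of identical copies on the Internet


def redundant_cost(obj: ObjectType, copies_needed: int = 1) -> float:
    """Cost of every copy beyond the number actually needed."""
    surplus = max(obj.copies - copies_needed, 0)
    return surplus * obj.cost_per_copy


# Hypothetical figures, purely for illustration.
catalog = [
    ObjectType("song", cost_per_copy=0.5, copies=5),
    ObjectType("news story", cost_per_copy=0.25, copies=9),
]

for obj in catalog:
    print(obj.name, redundant_cost(obj))
```

The `copies_needed` parameter is there because some redundancy is deliberate (backups, regional mirrors); only copies beyond that count as waste in this sketch.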
Steps for reducing data waste:
- Without exception, each ‘information object’ contains an (XML) metadata file.
- The attributes describing each information object are built out as the asset is used, e.g. through proactive auto-populated search and an AI induction engine.
- Objects are compared by their metadata: X out of Y metadata types and values are equivalent.
- The more attributes two or more objects share, the more likely these objects are either related on some level (e.g. sibling, cousin) or identical, in which case their meta relationship may need an update.
- The metadata encapsulates the ‘information object’.
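The “X out of Y” comparison in the steps above can be sketched in Python as follows. This is only one possible reading: the XML layout (a flat `<object>` element with one child per attribute), the function names, and the thresholds for “related” and “identical” are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def parse_metadata(xml_text: str) -> dict[str, str]:
    """Read a flat XML metadata file into {attribute: normalized value}."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip().lower() for child in root}


def match_ratio(a: dict[str, str], b: dict[str, str]) -> float:
    """X out of Y: attributes with equal values over all attributes seen."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    matches = sum(1 for k in keys if k in a and k in b and a[k] == b[k])
    return matches / len(keys)


def classify(a: dict[str, str], b: dict[str, str],
             identical_at: float = 1.0, related_at: float = 0.5) -> str:
    """Label a pair of objects; the thresholds are hypothetical defaults."""
    ratio = match_ratio(a, b)
    if ratio >= identical_at:
        return "identical"
    if ratio >= related_at:
        return "related"
    return "unrelated"


song_a = parse_metadata(
    "<object><type>song</type><title>Blue Train</title>"
    "<artist>John Coltrane</artist><duration>643</duration></object>"
)
song_b = parse_metadata(
    "<object><type>Song</type><title>BLUE TRAIN</title>"
    "<artist>John Coltrane</artist><duration>643</duration></object>"
)
song_c = parse_metadata(
    "<object><type>song</type><title>Lazy Bird</title>"
    "<artist>John Coltrane</artist><duration>427</duration></object>"
)

print(classify(song_a, song_b))
print(classify(song_a, song_c))
```

Pairs flagged as identical are candidates for de-duplication, while related pairs are candidates for a sibling or cousin link in the metadata. A real system would likely weight attributes differently rather than count them equally.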
Another opportunity to organize “Information Asset Objects” would be to leverage the existing DNS platform for managing “Information Asset Repositories”. This additional Internet DNS structure would enable queries across information asset repositories. Please see “So Much Streaming Music, Just Not in One Place” for more details.
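To make the idea of a DNS-like lookup concrete, here is a small in-memory Python sketch. It does not use the real DNS protocol or any DNS library; the `Registrar` class, the dotted names such as `song.entertainment`, and the `.example` repository URLs are all hypothetical and only mimic how DNS resolves labels from right to left.

```python
class Registrar:
    """A node in a DNS-like hierarchy of information asset repositories."""

    def __init__(self) -> None:
        self.children: dict[str, "Registrar"] = {}
        self.repositories: list[str] = []

    def register(self, name: str, repository_url: str) -> None:
        """Record that a repository holds objects of the given dotted name."""
        node = self
        # Walk labels right to left, as DNS does ("song.entertainment"
        # goes to "entertainment" first, then "song").
        for label in reversed(name.split(".")):
            node = node.children.setdefault(label, Registrar())
        node.repositories.append(repository_url)

    def resolve(self, name: str) -> list[str]:
        """Return every repository registered for the name, or an empty list."""
        node = self
        for label in reversed(name.split(".")):
            node = node.children.get(label)
            if node is None:
                return []
        return list(node.repositories)


root = Registrar()
root.register("song.entertainment", "https://repo-a.example/songs")
root.register("song.entertainment", "https://repo-b.example/music")
root.register("xray.medical", "https://clinic.example/xrays")

# One query returns every repository that holds this object type.
print(root.resolve("song.entertainment"))
```

A single query against the registry then stands in for searching each repository separately, which is the cross-repository query the paragraph above describes.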