Category Archives: Data Warehouse

People Turn Toward “Data Banks” to Commoditize Their Purchase and User Behavior Profiles

If you are anti “Big Brother”, this may not be the article for you; in fact, skip it. 🙂

 

The Pendulum Swings Away from GDPR

In the not-so-distant future, “Data Bank” companies staffed by Subject Matter Experts (SMEs) across all verticals may process the data feeds collected from your purchase and user behavior profiles. Consumers will be encouraged to submit their data profiles to a Data Bank, which will offer incentives ranging from reduced insurance premiums to cash-back rewards.

 

Everything from activity trackers and home automation to vehicular automation data may be captured and aggregated. The data collected can then be sliced and diced to provide macro and micro views of the information. At the abstract, macro level, the information may reveal demographic and statistical correlations that contribute to corporate strategy. At the granular level, the data will give “data banks” the opportunity to sift through individual records and perform analysis and correlations that lead to actionable information.
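
To make the macro versus micro distinction concrete, here is a minimal sketch in Python. The readings, field names, and the two helper functions are all invented for illustration; a real data bank would presumably work from far richer feeds and a proper warehouse rather than an in-memory list.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical activity-tracker readings; every field and value is made up.
readings = [
    {"user": "u1", "age_band": "30-39", "daily_steps": 9000},
    {"user": "u2", "age_band": "30-39", "daily_steps": 7000},
    {"user": "u3", "age_band": "50-59", "daily_steps": 4000},
    {"user": "u1", "age_band": "30-39", "daily_steps": 11000},
]


def macro_view(rows):
    """Average daily steps per age band: a demographic roll-up with no user identifiers."""
    by_band = defaultdict(list)
    for row in rows:
        by_band[row["age_band"]].append(row["daily_steps"])
    return {band: mean(steps) for band, steps in by_band.items()}


def micro_view(rows, user):
    """Every reading for a single consumer: the granular view."""
    return [row["daily_steps"] for row in rows if row["user"] == user]


macro = macro_view(readings)
micro = micro_view(readings, "u1")
print(macro)
print(micro)
```

The macro view could feed corporate strategy without exposing any one person, while the micro view is the kind of record a consumer would need to explicitly approve for sharing.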

 

Is it secure? Do you care if a hacker steals your weight loss information? It may not be an issue if the collected Purchase and User Behavior Profiles are aggregated into a Blockchain general ledger. Data Curators and Aggregators work with SMEs to correlate the data into:

  • Canned, ‘intelligent’ reports targeted at a specific subject matter, or spanning silos of data types
  • ‘Universes’ (as in Business Objects) of data that may be ‘mined’ by consumer-approved, ‘trusted’ third-party companies, e.g. your insurance companies
  • Actionable information based on AI subject matter rules engines, with the rules made transparent to the consumer

 

“Data Banks” may be required to show customers who agreed to sell their data examples of the specific rows that were sold on a “Data Market”.

Consumers may have the option of sharing their personal data with specific companies by proxy, through a ‘data bank’, with consent granular down to the individual data point collected. Sharing Purchase and User Behavior Profiles may:

  1. lower [or raise] your insurance premiums
  2. provide discounts on preventive health care products and services, from vitamins to yoga classes
  3. surface targeted, affordable medicine that redirects the doctor’s choice to an alternate; the MD would be contacted to validate the alternate

 

The curated data collected may be harnessed by thousands of affinity groups to offer very discrete products and services. The information correlated from Purchase and User Behavior Profiles stretches beyond any consumer relationship experienced today.

 

At some point, health insurance companies may require you to wear a tracker, and raise or slash your premiums based on what it reports. Auto insurance companies may offer discounts in exchange for access to smart car data, to confirm that suggested maintenance guidelines are met.

 

You may approve your “data bank” to give access to specific government agencies or private firms that solicit data for their studies. You may qualify based on the demographic, abstracted data points collected; the incentives provided may be tax credits or paid studies.

Purchase and User Behavior Profiles:  Adoption and Affordability

If ‘Data Banks’ are allowed to collect Internet of Things (IoT) device profiles and the devices themselves are cost prohibitive, here are a few ways to increase their adoption:

  1. [US] tax coupons that save the buyer money at the time of purchase. For example, a 100 USD discount applied at the time of purchase of an Activity Tracker, with the stipulation that you agree to participate in a study at some point.
  2. Government subsidies: offset the cost of aggregating and archiving Purchase and Behavioral profiles through annual tax deductions. Today, tax incentives may already help with the purchase of an IoT device if its cost qualifies as an itemized medical tax deduction, such as an Activity Tracker that monitors your heart rate when your medical condition requires it.
  3. Auto, Life, Homeowners, and Health policyholders may qualify for additional insurance deductions.
  4. Affinity-branded IoT devices: an organization such as the American Lung Association may sell a logo-branded Activity Tracker. People may sponsor the owner of the tracker to raise funds for the cause.

The World Bank has a repository of data, World DataBank, which seems to store a great depth of information:

“World Bank Open Data: free and open access to data about development in countries around the globe.”

Here is the article that inspired me to write this post:

http://www.marketwatch.com/story/you-might-be-wearing-a-health-tracker-at-work-one-day-2015-03-11

 

Privacy and Data Protection Creates Data Markets

Initiatives such as the General Data Protection Regulation (GDPR) and other privacy efforts seek to restrict access to your data to you, the “owner”. As a byproduct, they create opportunities for you to sell your data.

 

Blockchain: Purchase, and User Behavior Profiles

As your “vault”, “Data Banks” will collect and maintain your two primary datasets:

  1. As a consumer of goods and services, you establish a Purchase Profile that evolves over time. Online purchases are automatically collected, curated, appended with metadata, and stored in a data vault [Blockchain]. “Offline” purchases may at some point become hybrid [on/off] line purchases as traditional monetary exchanges advance, and would then follow the online transaction model.
  2. User Behavior (UB) profiles, both online and offline, will be collected and stored for analytical purposes. A user behavior “session” is a use case of activity where YOU are the prime actor. Each session would create a single UB transaction, which is also stored in a “Data Vault”. UB use cases may not lead to any purchases.

Not all Purchase and User Behavior profiles are created equal. E.g., one person’s profile may show a higher monthly spend than another’s. The consumer who purchases more may be entitled to more benefits.

These datasets, wholly owned by the consumer, are safely stored, propagated, and made immutable with a solution such as a Blockchain general ledger.
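
As a rough illustration of why a ledger of this kind is called immutable, the sketch below chains Purchase and User Behavior records together by hash, so that changing an old record is detectable. This is only a single-machine hash chain, not a real Blockchain: there is no distribution, consensus, or encryption, and the `DataVault` class, record fields, and sample payloads are all assumptions made up for the example.

```python
import hashlib
import json


def record_hash(body):
    """SHA-256 over a canonical JSON encoding of the record body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class DataVault:
    """Append-only list of records, each linked to the previous record's hash."""

    def __init__(self):
        self.records = []

    def append(self, kind, payload):
        prev = self.records[-1]["hash"] if self.records else "0" * 64
        body = {
            "index": len(self.records),
            "kind": kind,            # "purchase" or "user_behavior"
            "payload": payload,
            "prev_hash": prev,
        }
        record = dict(body, hash=record_hash(body))
        self.records.append(record)
        return record

    def verify(self):
        """Recompute every hash and link; False means something was altered."""
        prev = "0" * 64
        for record in self.records:
            body = {k: v for k, v in record.items() if k != "hash"}
            if record["prev_hash"] != prev or record["hash"] != record_hash(body):
                return False
            prev = record["hash"]
        return True


vault = DataVault()
vault.append("purchase", {"item": "activity tracker", "amount_usd": 100})
vault.append("user_behavior", {"session": "browsed yoga classes", "purchased": False})
intact_before = vault.verify()

vault.records[0]["payload"]["amount_usd"] = 1  # simulate tampering with an old record
intact_after = vault.verify()
print(intact_before, intact_after)
```

Note that each UB session becomes exactly one record, matching the “single UB transaction” idea above, and that a session with no purchase is stored just the same.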

Blended Data Warehouse SW/HW Solutions Phased Into the Cloud

Relational Database Solutions “In a Box”

Several relational database software vendors, such as IBM, Oracle, and Teradata, have developed proprietary data warehouse software that is tightly coupled with server hardware to maximize performance. These solutions have been developed and refined as “on-prem” solutions for many years.

We’ve seen the rise of “Data Warehouse (DW) as a Service” from companies like Amazon, which sells Redshift services.

Amazon Redshift is a fast, fully managed data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing Business Intelligence (BI) tools.  It allows you to run complex analytic queries against petabytes of structured data, using sophisticated query optimization, columnar storage on high-performance local disks, and massively parallel query execution. Most results come back in seconds.

RDB Complex Software/Hardware Maintenance

In recent times, the traditional relational database software vendors have shifted gears to become service providers, offering maximum performance from a solution hosted by the vendor in the Cloud. On the positive side, the added complexity of configuring and tuning a blended software/hardware data warehouse shifts from the client’s team resources, such as Database Administrators (DBAs), Network Administrators, and Unix/Windows Server Admins, to the database software service provider. The complexity of tuning for scalability, along with other maintenance challenges, falls to the software vendor’s expertise, if that’s the abstraction you select. There is some ambiguity in the delineation of responsibilities with the RDBMS vendors’ cloud offerings.

Total Cost of Ownership

Quantifying the total cost of ownership of a solution may be a bit tricky, especially if you’re trying to compare the RDBMS hybrid software/hardware “on-prem” solution with the same or similar capabilities brought to the client via “Data Warehouse (DW) as a Service”.

“On-Prem”, RDB Client Hosted Solution

Several factors need to be considered when selecting ANY software and/or hardware to be hosted at the client site.

  • Infrastructure “when in Rome”
    • Organizations have a quantifiable cost related to hosting physical or virtual servers in the client’s data center, which may be boiled down to a number that includes things like HVAC or new rack space.
    • Resources used to maintain/monitor data center (DC) usage; there may be an abstracted/blended figure.
  • Database Administrators maintain and monitor RDB solutions.
    • Activities may range from RDB patches/upgrades to resizing/scaling the DB storage “containers”.
    • Application Database Admins/Developers may be required to maintain the data warehouse architecture as new requirements arrive, e.g. creating aggregate tables for BI analysis.
  • Network Administrators
    • Firewalls, VPN
    • Port Scanning
  • Windows/Unix Server Administrators
    • Antivirus
    • OS Patches

Trying to correlate these costs in some type of “apples to apples” comparison with “Data Warehouse as a Service” may require accountants and technical folks to do extensive financial modeling. Vendors such as Oracle offer everything from fully managed services to, at the opposite end of the spectrum, “Bare Metal”, essentially “Infrastructure as a Service”. The Oracle Exadata solution can be a significant investment, depending on how much redundancy and scalability you buy by leveraging Oracle Real Application Clusters (RAC).
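
A back-of-the-envelope version of that financial modeling might look like the sketch below. Every dollar figure is a placeholder invented for the example, not a vendor quote, and the cost categories are a simplification of the factors listed above; real models would also account for things like depreciation, discounting, and growth in data volume.

```python
from dataclasses import dataclass


@dataclass
class OnPremCosts:
    hardware_and_licenses: float  # one-time purchase (hypothetical)
    datacenter_per_year: float    # HVAC, rack space, power
    staff_per_year: float         # blended DBA, network, and server admin cost

    def total(self, years):
        yearly = self.datacenter_per_year + self.staff_per_year
        return self.hardware_and_licenses + years * yearly


@dataclass
class ServiceCosts:
    migration_one_time: float     # moving data and workloads to the vendor
    subscription_per_year: float  # what the vendor bills for the managed DW
    staff_per_year: float         # application DBAs/developers still needed in-house

    def total(self, years):
        yearly = self.subscription_per_year + self.staff_per_year
        return self.migration_one_time + years * yearly


# Placeholder numbers only, in USD.
on_prem = OnPremCosts(hardware_and_licenses=1_000_000,
                      datacenter_per_year=100_000,
                      staff_per_year=400_000)
service = ServiceCosts(migration_one_time=200_000,
                       subscription_per_year=600_000,
                       staff_per_year=150_000)

for years in (1, 3, 5):
    print(years, on_prem.total(years), service.total(years))
```

With these made-up inputs the service is cheaper over three years and the on-prem solution is cheaper over five, which is the point: the answer depends heavily on the time horizon and on how honestly each line item is estimated.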

Support and Staffing Models for DW Cloud Vendors

In order for the traditional RDB software vendors to accommodate a “Data Warehouse as a Service” model, they may need to significantly increase staff across a variety of technical disciplines, as outlined above for the client “On-Prem” model. A significant ramp-up of staff, and the organizational challenge of developing and implementing a support model, may have relational database vendors ask: should they leverage a top-tier consulting agency such as Accenture or Deloitte to define, implement, and refine a managed service? It’s certainly a tall order to go from being a software vendor to offering large-scale services. With global corporate footprints and positive track records implementing managed services of all types, these agencies make it an attractive proposition for both the RDB vendor and the consulting agency that wins the bid.

On some level, the DW service billing models don’t seem to make sense. Any consulting agency that implements a DW managed service would be responsible for ensuring ROI both for the RDB vendor and their clients. The arrangement may be opaque to the end client leveraging the Data Warehouse as a Service, but the quality of service provided should certainly be nothing less than if it were implemented by the RDB vendor itself. If the end game for the RDB vendor is for the consulting agency to implement and mature the service, then at some point bring the service in-house, it could help to keep costs down while maturing the managed service.

Oracle Exadata

Here are URLs for reference to understand the capabilities that are realized through Oracle’s managed services.

https://cloud.oracle.com/en_US/database

https://cloud.oracle.com/en_US/database/exadata/features

https://www.oracle.com/engineered-systems/exadata/index.html

Teradata

https://www.teradata.com/products-and-services/intellicloud

https://www.teradata.com/products-and-services/cloud-overview


DB2

https://www.ibm.com/cloud/db2-warehouse-on-cloud


Note: The opinions shared here are my own.