Tag Archives: Oracle

Blended Data Warehouse SW/HW Solutions Phased Into the Cloud

Relational Database Solutions “In a Box”

Several relational database software vendors, such as IBM, Oracle, and Teradata, have developed proprietary data warehouse software that is tightly coupled with server hardware to maximize performance. These solutions have been developed and refined as “on-prem” solutions for many years.

We’ve seen the rise of “Database (DW) as a Service” from companies like Amazon, which sells Redshift services.

Amazon Redshift is a fast, fully managed data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing Business Intelligence (BI) tools.  It allows you to run complex analytic queries against petabytes of structured data, using sophisticated query optimization, columnar storage on high-performance local disks, and massively parallel query execution. Most results come back in seconds.

RDB Complex Software/Hardware Maintenance

In recent times, the traditional relational database software vendors have shifted gears to become service providers, offering maximum performance from a solution hosted by them, the vendor, in the Cloud. On the positive side, the added complexity of configuring and tuning a blended software/hardware data warehouse has shifted from the client’s team resources, such as Database Administrators (DBAs), Network Administrators, and Unix/Windows Server Admins, to the database software service provider. The complexity of tuning for scalability, and other maintenance challenges, shifts to the software vendor’s expertise, if that’s the abstraction you select. There is some ambiguity in the delineation of responsibilities with the RDBMS vendor’s cloud offerings.

Total Cost of Ownership

Quantifying the total cost of ownership of a solution may be a bit tricky, especially if you’re trying to quantify the RDBMS hybrid software/hardware “on-prem” solution versus the same or similar capabilities brought to the client via “Database (DW) as a Service”.

“On-Prem”, RDB Client Hosted Solution

Several factors need to be considered when selecting ANY software and/or hardware to be hosted at the client site.

  • Infrastructure “when in Rome”
    • Organizations have a quantifiable cost related to hosting physical or virtual servers in the client’s data center, which may be boiled down to a number that includes things like HVAC or new rack space.
    • Resources used to maintain/monitor data center (DC) usage; there may be an abstracted/blended figure.
  • Database Administrators maintain and monitor RDB solutions.
    • Activities may range from RDB patches/upgrades to resizing/scaling the DB storage “containers”.
    • Application Database Admins/Developers may be required to maintain the data warehouse architecture, such as new requirements, e.g. creating aggregate tables for BI analysis.
  • Network Administrators
    • Firewalls, VPN
    • Port Scanning
  • Windows/Unix Server Administrators
    • Antivirus
    • OS Patches

Trying to correlate these costs in some type of “Apples to Apples” comparison to the “Data Warehouse as a Service” may require accountants and technical folks to do extensive financial modeling. Vendors such as Oracle offer a spectrum from fully managed services to, at the opposite end, “Bare Metal”, essentially “Infrastructure as a Service.” The Oracle Exadata solution can be a significant investment, depending on how much is invested in redundancy and scalability leveraging Oracle Real Application Clusters (RAC).
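
One way to start the “Apples to Apples” comparison is to put both options into the same cost model. The sketch below is only an illustration: the cost categories follow the on-prem list above, while the class names, the `residual_staff` line, and every figure are hypothetical placeholders, not vendor pricing.

```python
from dataclasses import dataclass


@dataclass
class OnPremCosts:
    """Annual cost lines for a client hosted solution (all values hypothetical)."""
    infrastructure: float      # data center share: HVAC, rack space, monitoring
    dba_staff: float           # patches, upgrades, resizing storage
    network_admin: float       # firewalls, VPN, port scanning
    server_admin: float        # antivirus, OS patches
    hardware_software: float   # licences plus amortized hardware

    def annual_total(self) -> float:
        return (self.infrastructure + self.dba_staff + self.network_admin
                + self.server_admin + self.hardware_software)


@dataclass
class ServiceCosts:
    """Annual cost lines for a Data Warehouse as a Service (all values hypothetical)."""
    subscription: float
    residual_staff: float      # in-house staff still needed, e.g. application DBAs

    def annual_total(self) -> float:
        return self.subscription + self.residual_staff


def compare(on_prem: OnPremCosts, service: ServiceCosts, years: int) -> dict:
    """Return the cumulative cost of each option and the difference over `years`."""
    on_prem_total = on_prem.annual_total() * years
    service_total = service.annual_total() * years
    return {
        "on_prem": on_prem_total,
        "service": service_total,
        "difference": on_prem_total - service_total,
    }
```

A real model would also need one-off migration costs, discounting, and growth in data volume, which is where the accountants come in.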

Support and Staffing Models for DW Cloud Vendors

In order for the traditional RDB software vendors to accommodate a “Data Warehouse as a Service” model, they may need to significantly increase staff across a variety of technical disciplines, as outlined above for the client “On-Prem” model. A significant ramp-up of staff, and the organizational challenges of developing and implementing a support model, may have relational database vendors ask: should they leverage a top tier consulting agency, such as Accenture or Deloitte, to define, implement, and refine a managed service? It’s certainly a tall order to go from being a software vendor to offering large scale services. With global corporate footprints and positive track records implementing managed services of all types, these agencies make it an attractive proposition for both the RDB vendor and the consulting agency that wins the bid. On some level, the DW service billing models don’t seem to make sense. Any consulting agency that implements a DW managed service would be responsible for ensuring ROI both for the RDB vendor and for their clients. The arrangement may be opaque to the end client leveraging the Data Warehouse as a Service, but certainly the quality of service provided should be nothing less than if it were implemented by the RDB vendor itself. If the end game for the RDB vendor is for the consulting agency to implement and mature the service, then at some point bring the service in-house, it could help to keep costs down while maturing the managed service.

Oracle Exadata

Here are URLs for reference to understand the capabilities that are realized through Oracle’s managed services.

https://cloud.oracle.com/en_US/database

https://cloud.oracle.com/en_US/database/exadata/features

https://www.oracle.com/engineered-systems/exadata/index.html

Teradata

https://www.teradata.com/products-and-services/intellicloud

https://www.teradata.com/products-and-services/cloud-overview

Teradata

DB2

https://www.ibm.com/cloud/db2-warehouse-on-cloud

IBM Mainframe

Note: The opinions shared here are my own.

Applying Artificial Intelligence & Machine Learning to Data Warehousing

Protecting the Data Warehouse with Artificial Intelligence

Teleran is a middleware company whose software monitors and governs OLAP activity between the Data Warehouse and Business Intelligence tools, like Business Objects and Cognos. Teleran’s suite of tools encompasses a comprehensive analytical and monitoring solution called iSight. In addition, Teleran has a product, iGuard, that leverages artificial intelligence and machine learning to impose real-time query and data access controls. The architecture also allows Teleran’s agent to run on a host other than the database host, for additional security and to avoid consuming resources on the database host.

Key Features of iGuard:
  • Policy engine prevents “bad” queries before reaching database
  • Patented rule engine resides in-memory to evaluate queries at database protocol layer on TCP/IP network
  • Patented rule engine prevents inappropriate or long-running queries from reaching the data
70 Customizable Policy Templates
SQL Query Policies
  • Create policies using policy templates based on SQL Syntax:
    • Require JOIN to Security Table
    • Column Combination Restriction – Ex. Prevents combining customer name and social security #
    • Table JOIN restriction – Ex. Prevents joining two different tables in same query
    • Equi-literal Compare requirement – Tightly constrains query. Ex. Prevents hunting for sensitive data by requiring ‘=’ condition
    • DDL/DCL restrictions (Create, Alter, Drop, Grant)
    • DQL/DML restrictions (Select, Insert, Update, Delete)
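
To make two of the templates above concrete, here is a minimal sketch of a column combination restriction and an equi-literal compare requirement. It is not Teleran’s rule engine: it uses naive regular expressions on the SQL text rather than evaluation at the database protocol layer, and the column names and function names are hypothetical.

```python
import re

# Hypothetical policy: these columns may not appear together in one query.
RESTRICTED_COMBINATION = {"customer_name", "ssn"}


def violates_column_combination(sql: str) -> bool:
    """True if every column in the restricted set is referenced by the query."""
    words = set(re.findall(r"[a-z_][a-z0-9_]*", sql.lower()))
    return RESTRICTED_COMBINATION <= words


def violates_equi_literal(sql: str, column: str) -> bool:
    """True if the query references `column` without an `= literal` condition on it."""
    lowered = sql.lower()
    if re.search(rf"\b{re.escape(column)}\b", lowered) is None:
        return False  # column not referenced, so the policy does not apply
    pattern = rf"\b{re.escape(column)}\s*=\s*('[^']*'|\d+)"
    return re.search(pattern, lowered) is None


def evaluate(sql: str) -> list:
    """Return the reasons a query would be rejected; an empty list means it passes."""
    reasons = []
    if violates_column_combination(sql):
        reasons.append("column combination restricted")
    if violates_equi_literal(sql, "ssn"):
        reasons.append("ssn requires '=' comparison to a literal")
    return reasons
```

A production engine would parse the SQL properly; text matching like this can be fooled by aliases, comments, and string literals.
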
Data Access Policies

Blocks access to sensitive database objects

  • By user or user groups and time of day (shift) (e.g. ETL)
    • Schemas
    • Tables/Views
    • Columns
    • Rows
    • Stored Procs/Functions
    • Packages (Oracle)
Connection Policies

Blocks connections to the database

  • White list or black list by
    • DB User Logins
    • OS User Logins
    • Applications (BI, Query Apps)
    • IP addresses
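
A connection policy of this kind can be pictured as a small allow/deny check run before a session is opened. The sketch below is an assumption about how such a check might look, not iGuard’s implementation; the class, field names, and example values are hypothetical.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class ConnectionPolicy:
    """Hypothetical white list / black list for incoming database connections."""
    denied_db_users: set = field(default_factory=set)      # black list
    allowed_apps: set = field(default_factory=set)         # white list; empty means any app
    allowed_networks: list = field(default_factory=list)   # white list; empty means any address

    def permits(self, db_user: str, app: str, ip: str) -> bool:
        if db_user in self.denied_db_users:
            return False
        if self.allowed_apps and app not in self.allowed_apps:
            return False
        if self.allowed_networks:
            address = ipaddress.ip_address(ip)
            return any(address in ipaddress.ip_network(network)
                       for network in self.allowed_networks)
        return True
```

For example, a policy with `denied_db_users={"guest"}` and `allowed_networks=["10.0.0.0/8"]` would refuse the `guest` login from anywhere and refuse every login from outside that network.
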
Rule Templates Contain Customizable Messages

Each of the “Policy Templates” has the ability to send the user querying the database a customized message based on the defined policy. The message back to the user from Teleran should be seamless to the application user’s experience.

iGuard Rules Messaging

Machine Learning: Curbing Inappropriate, or Long Running Queries

iGuard has the ability to analyze all of the historical SQL passed through to the Data Warehouse, and suggest new, customized policies to cancel queries with certain SQL characteristics. The Teleran administrator sets parameters such as rows or bytes returned, and then runs the induction process. New rules will be suggested for queries which exceed these defined parameters. The induction engine is “smart” enough to look at the repository of queries holistically and not make determinations based on a single query.
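
The induction idea can be sketched in a few lines: group the query history by a shared characteristic and only suggest a rule when a group exceeds the administrator’s threshold repeatedly. This is a simplified illustration, not Teleran’s algorithm; grouping by the set of tables touched, and the `min_occurrences` and `min_share` parameters, are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRecord:
    tables: frozenset    # tables the historical query touched
    rows_returned: int


def suggest_rules(history, max_rows, min_occurrences=3, min_share=0.5):
    """Suggest a rule for each table set whose queries habitually exceed max_rows.

    A single runaway query is not enough: the group must exceed the limit at
    least `min_occurrences` times and in at least `min_share` of its queries.
    """
    groups = defaultdict(list)
    for record in history:
        groups[record.tables].append(record.rows_returned)

    suggestions = []
    for tables, row_counts in groups.items():
        over = sum(1 for rows in row_counts if rows > max_rows)
        if over >= min_occurrences and over / len(row_counts) >= min_share:
            names = ", ".join(sorted(tables))
            suggestions.append(
                f"Cancel queries on {names} returning more than {max_rows} rows"
            )
    return sorted(suggestions)
```

The two extra thresholds are what keep the suggestions “holistic”: one oversized query against a table does not produce a policy on its own.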

Finally, here is a high level overview of the implementation architecture of iGuard.  For sales or pre-sales technical questions, please contact www.teleran.com

Teleran Logical Architecture

Currently Featured Clients
Teleran Featured Clients

FinTech: End to End Framework for Client, Intermediary, and Institutional Services

Is it all about being the most convenient payment processing partner, with an affinity to the payment processing brand? It’s a good place to start: the Amazon Payments partner program.

FinTech noun : an economic industry composed of companies that use technology to make financial systems more efficient

Throughout my career, I’ve worked with several financial services teams to engineer, test, and deploy solutions. Here is a brief list of the FinTech solutions I helped construct, test, and deploy:

  1. 3K Global Investment Bankers – proprietary CRM platform, including Business Analytics, Business Objects Universe.
  2. Equity Research platform, crafted based on business expertise.
    • Custom UI for research analysts, enabled the analysts to create their research, and push into the workflow.
    • Based on a set of rules, a ‘locked down’ part of the report would “Build Discloses”, e.g. analyst holds 10% of co.
    • Custom Documentum workflow would route research to the distribution channels; or direct research to legal review.
  3. (Multiple Financial Org.) Data Warehouse middleware solutions to assist organizations in managing and monitoring usage of their DW.
  4. Global Derivatives firm, migration of mainframe system to C# client / server platform
  5. Investment Bankers and Equity Capital Markets (ECMG): built a trading platform so teams may collaborate on Deals/Trades.
  6. Global Asset Management Firm: Onboarding and Fund management solutions, custom UI and workflows in SharePoint

*****

A “Transaction Management Solution” targets a mixture of FinTech services, primarily “Payments” Processing.

Target State Capabilities of a Transaction Management Solution:

  1. Fraud Detection: The ability to identify and prevent fraud exists within many levels of the transaction, from facilitators of EFT to credit monitoring and scoring agencies. Every touch point of a transaction has its own perspective of possible fraud, and must be evaluated to the extent it can be.
    • Business experts (SMEs) and technologists continue to expand the practical applications of Artificial Intelligence (AI) every day. Extensive AI fraud detection applications exist today, incorporating human populated Rules Engines and AI Machine Learning (independent rule creation).
  2. Consumer “Financial Insurance” Products
    • Observing a business transaction end to end may provide visibility into areas of transaction risk. Process and/or technology may be adopted / augmented to minimize the risk.
      • E.g. the eBay auction process has a risk regarding the changing hands of currency and merchandise. A “delayed payment”, holding funds until the merchandise has been exchanged, minimizes the risk, and was implemented using PayPal.
    • In product lifecycle of Discovery, Development, and Delivery phases, converting concept to product.
  3. Transaction Data Usage for Analytics
    • The client initiating the transaction, intermediary parties, and destination of funds may all tell ‘a story’ about the transaction.
    • Every party within a transaction, beginning to end, may benefit from the use of the transaction data using analytics.
      • e.g. Quicken – personal finance management tool; collects, parses, and augments transaction data to provide client analytics in the form of charts / graphs, and reports.
    • Clear, consistent, and comprehensive data set available at every point in the transaction lifecycle regardless of platform.
      • e.g. funds transferred between financial institutions may have descriptions that are not user friendly, or may not be actionable, e.g. cryptic name, and no contact details.
      • Normalizing data may occur at an abstracted layer
    • Abstracted, and aggregated data used for analytics
      • e.g. average car price given specs XYZ;
      • e.g. 2. avg. credit score in a particular zip code.
    • Continued growth opportunities, and challenges
      • e.g. data privacy v. allowable aggregated data
  4. Affinity Brand Opportunities Transaction Management Solution
    • eWallet affinity brand promotions,
      • e.g. based on transaction items’ rules; no shipping
      • e.g.2. “Cash Back” Rewards, and/or Market Points
      • e.g.3. Optional, “Fundraiser” options at time of purchase.
  5. Credit Umbrella: Monitoring Use Case
    • Transparency into newly activated accounts enables the Transaction Management Solution (TMS) to trigger a rule to email the card holder, if eligible, to add the card to their eWallet
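
The point in item 3 about normalizing cryptic transfer descriptions at an abstracted layer can be illustrated with a short sketch. Everything here is hypothetical: the merchant directory, the descriptor format, and the contact address are made up for the example, and a real layer would draw on a maintained reference data set.

```python
import re

# Hypothetical directory mapping descriptor prefixes to readable merchant details.
KNOWN_MERCHANTS = {
    "ACME PMT": {"name": "Acme Payments", "contact": "support@acme.example"},
}


def normalize(raw_description: str) -> dict:
    """Turn a cryptic transfer description into a consistent, readable record."""
    # Keep only letters and spaces, then collapse runs of whitespace.
    cleaned = re.sub(r"[^A-Z ]+", " ", raw_description.upper())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    for prefix, details in KNOWN_MERCHANTS.items():
        if cleaned.startswith(prefix):
            return {"raw": raw_description, **details, "actionable": True}
    # Unknown counterparty: keep a tidied name but flag it as not actionable.
    return {"raw": raw_description, "name": cleaned.title(),
            "contact": None, "actionable": False}
```

Keeping the raw description alongside the normalized fields means every party in the transaction can still trace the record back to what the originating platform sent.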

Is Intuit an acquisition target because of Quicken’s capabilities to provide users consistent reporting of transactions across all sources?  I just found this note in Wiki while writing this post:

Quicken is a personal finance management tool developed by Intuit, Inc. On March 3, 2016, Intuit announced plans to sell Quicken to H.I.G. Capital. Terms of the sale were not disclosed.[1]

For quite some time companies have attempted to tread in this space with mixed results, either through acquisition or build out of their existing platforms.  There seems to be significant opportunities within the services, software and infrastructure areas.  It will be interesting to see how it all plays out.

Inhibitors to enclosing a transaction within an end to end Transaction Management Solution (TMS):

  • Higher level of risk (e.g. business, regulatory) when expanding service offerings
  • Stretching too thin, beyond the core vision, and losing sight of that vision
  • Transforming a tech company into a hybrid financial services company
  • Automation and streamlining of processes may yield efficiencies that lead to a reduction in staff / workforce
  • Multiple platforms performing functions provides redundant capabilities, reduced risk, and more consumer choices

 Those inhibitors haven’t stopped these firms:

Payments Ecosystem

Human Evolution: Technology Continues to Transform Societies for Generations

In the last 20 years, I’ve observed technology trends and tech achievements rise into and fall from the mainstream. Tech has augmented our lives, and enhanced our human capabilities. Our evolution will continue to be molded by technology, which will shape humanity for years to come.

Digital Asset Management (DAM)

Everything you might find on your computer, from emails to video, is a digital asset. Content from providers, team collaboration, push and/or pull asset distribution, and archiving content are the workflows of DAM.

DAM solutions are rapidly going mainstream as small to medium sized content providers look to take control of their content from ingestion to distribution. Shared digital assets will continue to grow rapidly. Pressure by stockholders to maximize use of digital assets to grow revenue will fuel initiatives to globally share and maintain digital asset taxonomies. For example, object recognition applied to image, sound and video assets will dynamically add tags to assets in an effort to index ever growing content. If standard taxonomies are not globally adopted, and continually applied to assets, stored digital content will become, in essence, unusable.

The Internet of Things (IoT)

All devices across all business verticals will become ‘Smart’ devices with bidirectional data flow.  Outbound ‘Smart’ device data flow is funneled into repositories for analysis to produce dashboards, reporting, and rules suggestions.

Inbound ‘Smart’ device data can trigger actions on the device. Several devices may work in concert defined by ‘grouping’ e.g. Home: Environmental. Remote programming updates may be triggered by the analysis of data.

  • AI Rules Engine runs on the ‘backend’. Rules are defined by induction, through data analysis and human set parameters, and executed in sequence.
  • Device optimization updates: presets on devices may be tuned based on ‘transaction’ history, feedback from the user, and other ‘Smart’ devices.
  • Grouped ‘Smart’ devices: e.g. health monitors’ data is uploaded, analyzed, and correlated across the group. Updated rules and notifications are triggered.
  • Manual user commands, ad hoc or scheduled
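
The backend rules engine described above can be sketched as an ordered list of condition/action pairs evaluated against the latest readings from a device group. The device names, reading keys, and command strings below are hypothetical, chosen to match the “Home: Environmental” grouping example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]   # receives the latest readings for a device group
    action: str                         # command sent back (inbound) to a device


def run_rules(rules, readings):
    """Evaluate rules in sequence and return the actions whose conditions hold."""
    return [rule.action for rule in rules if rule.condition(readings)]


# Hypothetical rules for a "Home: Environmental" device group.
HOME_ENVIRONMENTAL_RULES = [
    Rule("too warm", lambda r: r["thermostat_c"] > 25.0, "thermostat:cool"),
    Rule("nobody home", lambda r: not r["motion_detected"], "lights:off"),
]
```

In this picture, induction would add or retune entries in the rule list, while manual user commands bypass the list and go straight to the device.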

… as a Service

Cloud ‘Services’ enable scalability on demand, relatively lower cost [CapEx] overhead, offsite redundancy, etc. They allow software solutions companies to rapidly deploy to Dev., Test, and Prod. environments. Gaming, storage, and virtual machines are just a few of the ‘…as a service’ offerings. IoT analysis may reveal a new need for another service.

Human Interface

  • Augmented Reality A.R.

Integrates the user with the surrounding environment by overlaying images in front of your eyes to REpresent anything, e.g. identifying surrounding people with a Twitter handle/user name above their heads. Interacts with a smartphone for inbound and outbound data flow. May allow App and OS programmers to enable users to interact with their ‘traditional’ software in new ways, e.g. in Microsoft Windows 8+, the current interaction with ‘tiles’ may shift from a two to a three dimensional manipulation and view of the tiles. Tiles (apps) pop up when, through object recognition, predefined characteristics match, e.g. looking at a bank check sent to you in the mail? Your Bank of America tile / app may ask if you want to deposit the check right now.

  • Virtual Reality, V.R.

As more drones, for example, collect video footage, that footage may be used for people to experience the landscapes, beaches, cities, mountains, and other features of a potential destination, which may lead to tourism. In fact, travel agencies may purchase V.R. headsets and subscribe to a library of V.R. content. A repository platform would need to be created. Specs for the ‘How To’ on collecting V.R. video footage should be accessible. Hathaway real estate offers a V.R. tour of the house, from their office.

Autonomous  Vehicles (Average Consumer or hobbyist)

  • Cars 
  • Drones
  • Satellites 

Social Media Evolution

Driving forces to integrate with society put pressure on individuals to integrate with the collective social consciousness. As digital assets are published, people will lunge at the opportunity to self tag every digital asset, both their own and community shared assets. Tagging on social media platforms is already going ahead. Taxonomies are built, maintained and shared across social media platforms. Systematic tagging of [inanimate] objects occurs using object recognition. Shared and maintained global taxonomies not only store data on people and their associated meta data (e.g. shoe size, education level completed, HS photo, etc.) but also store meta data about groups of people, relationships and their tagged object data.

The taxonomies are analyzed and correlated, providing better, more concise demographic profiles.  These profiles can be used for 

  • Clinical trials data collection
  • Fast identification of potential outbreaks, used by the CDC
  • The creation and management of AI produced Hedge Funds
  • Solicitation of goods and services

Out of Compliance

These are three dreaded words you are guaranteed to see more and more often. As all aspects of our lives become meta data on a taxonomy tree, the analysis of information will make correlations which drive consumers and members of society ‘out of compliance’. For example, pointers to your shared videos of you skydiving will get added to your personal taxonomy tree. Your taxonomy tree will be available, and mandatory, to get life insurance from a tier 1 company. Upon daily inspection of your tree by an insurance AI engine, a hazardous event is flagged. Notifications arrive from your life insurance company reminding you that ‘dangerous’ activities are not covered on your policy. Two infractions may drive up your premiums.

People Turn Toward “Data Banks” to Commoditize Purchase and User Behavior Profiles

Anyone who is anti “Big Brother”, this may not be the article for you, in fact, skip it. 🙂

The Pendulum Swings Away from GDPR

In the not so distant future, “Data Bank” companies consisting of Subject Matter Experts (SME) across all verticals may process your data feeds, collected from your purchase and user behavior profiles. Consumers will be encouraged to submit their data profiles into a Data Bank, which will offer incentives ranging from a reduction of insurance premiums to cash back rewards.

 

Everything from activity trackers, home automation, to vehicular automation data may be captured and aggregated.    The data collected can then be sliced and diced to provide macro and micro views of the information.    On the abstract, macro level the information may allow for demographic, statistical correlations, which may contribute to corporate strategy.

On a granular view, the data will provide “data banks” the opportunity to sift through data to perform analysis and correlations that lead to actionable information.

 

Is it secure? Do you care if a hacker steals your weight loss information? It may not be an issue if collected Purchase and User Behavior Profiles aggregate into a Blockchain general ledger. Data Curators and Aggregators work with SMEs to correlate the data into:

  • Canned, ‘intelligent’ reports targeted to specific subject matter, or across silos of data types
  • ‘Universes’ (i.e.  Business Objects) of data that may be ‘mined’ by consumer approved, ‘trusted’ third party companies, e.g. your insurance companies.
  • Actionable information based on AI subject matter rules engines

 

Consumers may have the option of sharing their personal data with specific companies by proxy, through a ‘data bank’ granular to the data point collected.  Sharing of Purchase and User Behavior Profiles:

  1. may lower [or raise] your insurance premiums
  2. may provide discounts on preventive health care products and services, e.g. vitamins to yoga classes
  3. may surface targeted, affordable medicine that redirects the doctor’s choice to an alternate. The MD would be contacted to validate the alternate.

The curated data collected may be harnessed by thousands of affinity groups to offer very discrete products and services. Purchase and User Behavior Profiles, and the information correlated from them, stretch beyond any consumer relationship experienced today.

 

At some point, health insurance companies may require you to wear a tracker to increase or slash premiums.  Auto Insurance companies may offer discounts for access to car smart data to make sure suggested maintenance guidelines for service are met.

You may approve your “data bank” to give access to specific soliciting government agencies or private research firms looking to analyze data for their studies. You may qualify for incentives based on the demographic, abstracted data points collected; the incentives provided may be tax credits or paid studies.

 

Purchase and User Behavior Profiles:  Adoption and Affordability

For ‘Data Banks’ to collect this data, consumers need Internet of Things (IoT) enabled devices, which can be cost prohibitive. Here are a few ways to increase their adoption:

  1. [US] tax coupons to enable the buyer, at the time of purchase, to save money. For example, a 100 USD discount applied at the time of purchase of an Activity Tracker, with the stipulation that you may agree, at some point, to participate in a study.
  2. Government subsidies: offset the cost of aggregating and archiving Purchase and Behavioral profiles through annual tax deductions. Today, tax incentives may allow you to purchase an IoT device if the cost is an itemized medical tax deduction, such as an Activity Tracker that monitors your heart rate, if your medical condition requires it.
  3. Auto, Life, Homeowners, and Health policyholders may qualify for additional insurance deductions
  4. Affinity branded IoT devices: an organization such as the American Lung Association may sell a logo branded Activity Tracker. People may sponsor the owner of the tracking pedometer to raise funds for the cause.

The World Bank has a repository of data, World DataBank, which seems to store a large depth of information:

“World Bank Open Data: free and open access to data about development in countries around the globe.”

Here is the article that inspired me to write this article:

http://www.marketwatch.com/story/you-might-be-wearing-a-health-tracker-at-work-one-day-2015-03-11

Privacy and Data Protection Creates Data Markets

Initiatives such as the General Data Protection Regulation (GDPR), and other privacy initiatives which seek to restrict access to your data to you as the “owner”, create, as a byproduct, opportunities for you to sell your data.

Blockchain: Purchase, and User Behavior Profiles

As your “vault”, “Data Banks” will collect and maintain your two primary datasets:

  1. As a consumer of goods and services, a Purchase Profile is established and evolves over time. Online purchases are automatically collected, curated, appended with metadata, and stored in a data vault [Blockchain]. “Offline” purchases may, at some point, become hybrid [on/off] line purchases, with advances in traditional monetary exchanges, and would follow the online transaction model.
  2. User Behavior (UB) profiles, both on and offline, will be collected and stored for analytical purposes. A user behavior “session” is a use case of activity where YOU are the prime actor. Each session would create a single UB transaction, which is also stored in a “Data Vault”. UB use cases may not lead to any purchases.

These datasets, wholly owned by the consumer, are safely stored, propagated, and immutable with a solution such as a Blockchain general ledger.
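
The “immutable” property comes from chaining each record to the hash of the one before it, so that altering an old record invalidates everything after it. The sketch below shows only that hash chain; it assumes a single in-memory ledger and leaves out the distribution and consensus that a real blockchain adds. The class and field names are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS_HASH = "0" * 64


def _digest(payload: dict, previous_hash: str) -> str:
    """SHA-256 over the record and the hash of the record before it."""
    body = json.dumps({"payload": payload, "previous_hash": previous_hash},
                      sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


@dataclass
class Ledger:
    """Append-only list of purchase or user behavior transactions."""
    blocks: list = field(default_factory=list)

    def append(self, payload: dict) -> dict:
        previous_hash = self.blocks[-1]["hash"] if self.blocks else GENESIS_HASH
        block = {
            "payload": payload,
            "previous_hash": previous_hash,
            "hash": _digest(payload, previous_hash),
        }
        self.blocks.append(block)
        return block

    def is_valid(self) -> bool:
        """Recompute every hash; any edited record breaks the chain."""
        previous_hash = GENESIS_HASH
        for block in self.blocks:
            if block["previous_hash"] != previous_hash:
                return False
            if block["hash"] != _digest(block["payload"], previous_hash):
                return False
            previous_hash = block["hash"]
        return True
```

A Purchase Profile entry and a UB session would both be appended as payloads; `is_valid()` then detects any later change to either.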

Business Intelligence, Analogies, and Articulation of Data on Mediums

As I was reading the article from the New York Times, As Boom Lures App Creators, Tough Part Is Making a Living, it struck me that the typical doom and gloom story about getting rich quick by creating applications for Tablets is true of any start-up company, be it a restaurant, clothing shop, or other. You have an idea, Sally has an idea, and so does Fred, and the likelihood that everyone will be elated about every bar, restaurant, clothing store or application is ridiculous. It’s simple economics and opportunity cost: you cannot go to every restaurant in parallel every night. Spending one USD trades off an opportunity to spend it somewhere else. One area I suspect has massive opportunities in the coming weeks, months, and years is Business Intelligence, Analogies, and Articulation of Data on a Tablet medium. Yes, it is true, there are established players in the marketplace, but being established also makes you less nimble for change. Being able to look at a client’s Data Warehouse, and create mediums for analogies expressing where their customers have been spending their money and why, and help predict trends in a KISS fashion for any level of a business organization, is key. That is why the innate talents of user interface design, user interface engineering, or, as it was called way back, industrial design, matter. In short, part of the appetite for corporate spending will always come from the question of how to make more money with the product just bought: Return on Investment (ROI). Business Intelligence is one area I have been studying for years, and as everyone knows, we all find it difficult to express or analogize thoughts, and specifically to dive into ‘data’ and turn it into information a CEO or business analyst can understand, and then turn that ‘information’ into a new marketing campaign; hence, business intelligence. Until we can all read minds and transfer like for like information, BI, and improving upon this space, will be an area to derive income.


WordPress Shortcode API to Cloud Storage to Sell Any Digital Intellectual Property.

I was browsing, going through bills, and thinking about my other article on Google Docs and their new API, where you could use them as a data warehouse, when it occurred to me: why can’t we have a public API for all the Cloud Storage systems like Amazon Web Services (AWS) S3 (or Box.com), create a plugin for WordPress, add E-Commerce, and now you have your own place to sell digital music, or any digital intellectual property, or host your own OLTP or OLAP database?

And my bro, Fat Panda, might have been thinking the same thing.  He’s one step behind, but he will catch on.  I will try to update for ‘the cheap seats’ in a bit.

For the cheap seats: even with static files stored up in the cloud, you can use a model similar to Google Docs <-> Google Fusion, where you add tabular data to storage, then read, overwrite, or update it using a home made table locking mechanism, and essentially use the cloud as a data warehouse, or even a database. Microsoft seems to have a lead on transactional and analytical storage with Microsoft Azure, which is relational in nature in the cloud, but it can be so much simpler than that with cloud storage. If it is not implemented with ‘row’ locking, there is an issue with row level, high volume OLTP (On Line Transaction Processing), but not so much with OLAP (On Line Analytic Processing), where you analyze the way your business does business and profit more from your consumer data. The methods to implement row level locking for OLTP systems on tabular data stored in cloud storage like AWS or Box.Net are easy to implement, and will remind you of old school alternatives that supplement the AutoNumber columns in MS Access or Identity columns in SQL Server. At the end of the day, you can either sell digital intellectual property from a WordPress implementation, or run your entire business with a robust cloud database solution for OLTP or OLAP systems using flat file storage! Why go through all this when Amazon’s AWS and Microsoft Azure have started, or will yearn to start, building these solutions in parallel? Cost effective solutions, and the entire database arena monopolized by Oracle, IBM, Microsoft, and MySQL just got extended to a whole lot of database vendors. It may take a while, but we already know the big gorilla in the room, Google, is the first to strike in this game as a non-traditional database vendor and cloud storage provider, with their updated Google Docs API and, optionally, usage of their Fusion application.
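
A home made table lock plus a hand-rolled AutoNumber can be sketched with nothing more than a lock file next to the table file. The example uses a local directory as a stand-in for cloud storage and locks the whole table rather than individual rows; it assumes the storage offers an atomic “create only if absent” operation, which not every cloud store guarantees. Function names and the CSV layout are hypothetical.

```python
import csv
import os
import time
from contextlib import contextmanager


@contextmanager
def table_lock(table_path, timeout=5.0, poll=0.05):
    """Hold an exclusive lock on a flat file table by creating `<table>.lock`."""
    lock_path = table_path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if another writer already holds the lock.
            handle = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not lock {table_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(handle)
        os.remove(lock_path)


def insert_row(table_path, values):
    """Append a row with the next id, standing in for an AutoNumber/Identity column."""
    with table_lock(table_path):
        rows = []
        if os.path.exists(table_path):
            with open(table_path, newline="") as source:
                rows = list(csv.reader(source))
        next_id = max((int(row[0]) for row in rows), default=0) + 1
        with open(table_path, "a", newline="") as target:
            csv.writer(target).writerow([next_id, *values])
        return next_id
```

Because the id is computed while the lock is held, two writers cannot be handed the same number. Row level locking would follow the same pattern with one lock object per row key, at the cost of many more storage calls.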

Tablet Developers Make Business Intelligence Tools using Google as a Data Warehouse: Competing with Oracle, IBM, and Microsoft SQL Server

And, he shoots, and scores.  I called it, sort of.  Google came out of the closet today as a data warehouse vendor, at least they need a community of developers to connect the dots to help build an amazing Business Intelligence suite.

Google came out with a Google Docs API today, which supports languages from Objective-C (iOS) and C# to Java, so you can use Google as your Data Warehouse for any size business. All you need to do is write an ETL program which uploads and downloads tables between your local database and Google Docs, and create your own Business Intelligence User Interface for the creation and viewing of Charts & Graphs. It looks like they’ve changed strategies, or this was the plan all along.
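
The extract half of such an ETL program is straightforward to sketch. The example below pulls a table out of a local SQLite database as CSV text; the upload step is deliberately a placeholder that writes into a dictionary, because the actual Google Docs API calls are not shown here and would depend on the client library used. The function names are hypothetical.

```python
import csv
import io
import sqlite3


def extract_table_as_csv(connection, table):
    """Read every row of `table` and return it as CSV text with a header row.

    The table name is interpolated into the SQL, so only pass trusted names.
    """
    cursor = connection.execute(f'SELECT * FROM "{table}"')
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()


def upload(name, csv_text, store):
    """Placeholder for the upload step; a real driver would call the storage API here."""
    store[name] = csv_text
```

A nightly ETL would loop over the tables of interest, call `extract_table_as_csv` for each, and hand the result to the real upload call.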

Initially I thought that Google Fusion was going to be the table editing tool to manipulate your data that was transferred from your transactional database using the Google Docs API.  Today they released a Google Docs API, and developers can create their own ETL drivers and a Business Intelligence User Interface that can run on any platform, from an Android tablet to an iPad or Windows tablet.

A few days ago, I wrote an article in which it looked like they were going to use a tool called Google Fusion, which was in Beta at the time, to manipulate tabular data, and eventually extend it to create common BI components such as graphs, charts, editable tables, etc.

A few gotchas: Google Docs on the Apple iPad is version 1.1.1, released 9/28/12, so we are talking very early days, and the Google Docs API was released today.   I would imagine that since you can also use C#, someone can make a Windows desktop application to manipulate the data tables and create and view graphs, so a Windows tablet can be used.  The API also has Java compatibility, and Java is write once, run anywhere, so from any Unix box or any other platform, wherever your transactional database lives, a developer is able to write a driver to transfer the data to Google Docs dynamically, and then use the Google Docs API for Business Intelligence.  You can even write an ETL driver that does nothing but rapidly transfer data, like an ODBC or JDBC driver, and use any business intelligence tools you have on your desktop, or run it as a nightly ETL.  However, I can see developers creating business intelligence tools on Android, iPad, or Windows tablets to modify tables, create and view charts, etc., using custom BI tool sets, and their data warehouse now becomes Google Docs.

Please reference an article I wrote a few days back, “Google is Going to be the Next Public and Private Data Warehouse”.

At that time, 10/13/2012, Google Fusion was marked as Beta.  Google has since stripped off the word Beta, but it doesn’t matter.  It’s even better with the Google API to Google Docs.  Google Fusion could be your starter User Interface; however, if Android, iOS (Apple iPad), and Windows developers really embrace this API, all of the big database companies like IBM, Oracle, and Microsoft may have their market share eroded to some extent, if not a great extent.

Update 10/19:

Hey Gs (Guys and Gals), I forgot to mention, you can also perhaps make your own video or music streaming applications, using the basic get and send file calls that other companies such as AWS, Box, etc. already offer. It’s a simple get / send API, so I’m not sure if it’s applicable to ‘streaming’ at this stage; it may be just another storage location in the ‘cloud’, which would be quite boring.  Although thinking of it now, aren’t all the get / send cloud solutions potential data warehouses using ETL and the APIs discussed and published above?  Also, it’s ironic that Google would be competing with itself and YouTube if it became a file share that can ‘stream’ videos.

PostgreSQL / local database and SOA, mid tier for cloud solutions to improve performance

In an article I read from the NY Times, Salesforce.com may be making a play to banish Oracle as a supported platform. However, the system which might be interesting would be one where PostgreSQL, or an in-memory database, acts as a local cache for the transaction based system, and clears the local database records/cache after it uploads the ‘staged’ data from the local database to a cloud database, where the data is ultimately stored. The activities on the local database should be fast, and the cloud database gives you: a) data that may be transferred to any cloud based solution vendor(s), if necessary, provided an SOA is built on top of the local database and communicates with the cloud via APIs; b) a local data-mart, if the data is not transferred in real time, i.e. use a nightly transformation and have access to “day of” BI on a limited set of local data; c) again, transaction performance and data segregation of the warehouse. This architecture is already in use at many firms, but I wanted to call it out. Another option is to use two cloud database solutions, one ‘local’ to your region and one globally dispersed for performance and redundancy, connected by an ETL, although I am not convinced this would be a great architecture.  The second cloud tier can be a transformation from the first for regulatory archiving, if required by law for finance or by DR (Disaster Recovery) policy.
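The stage-locally, upload, then clear pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite stands in for the local PostgreSQL or in-memory database, `upload_batch` is a hypothetical callable for whatever cloud API is in use, and the table and function names are invented for the example. The one design point it shows is that local rows are deleted only after the upload has succeeded.

```python
import sqlite3


def open_staging(path=":memory:"):
    """Open the local staging database (SQLite used as a stand-in)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staged ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    return conn


def record_transaction(conn, payload):
    """Fast local write; nothing touches the cloud here."""
    with conn:
        conn.execute("INSERT INTO staged (payload) VALUES (?)", (payload,))


def flush_to_cloud(conn, upload_batch):
    """Upload all staged rows, then clear them from the local cache."""
    rows = conn.execute(
        "SELECT id, payload FROM staged ORDER BY id"
    ).fetchall()
    if not rows:
        return 0
    # upload_batch is assumed to raise on failure, in which case the
    # staged rows stay in place and can be retried later.
    upload_batch([payload for _, payload in rows])
    with conn:
        # Delete only what was uploaded; rows staged in the meantime
        # have higher ids and are left for the next flush.
        conn.execute("DELETE FROM staged WHERE id <= ?", (rows[-1][0],))
    return len(rows)
```

Running `flush_to_cloud` on a timer gives near real time transfer, while running it nightly leaves the “day of” data available locally in between.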

Google is Going to be the Next Public and Private Data Warehouse

In an article I wrote a while back, Google to venture into Cloud, provide Open Source APIs, assist small businesses to be Cloud Solutions Integrators, I was talking in the abstract.  But on the Google site, buried way down in their menus, under ‘More’ and then the ‘Even More’ option, at the bottom left of the page you will see Innovation, Fusion Tables (Beta).  Google is advanced and ready to compete with the database vendors, with a user friendly UI, better than I thought.  They currently provide a way to upload data to Google Drive; the user then imports the data from Google Drive and, using table views and Business Intelligence tools, can manipulate and share the data.  The amount of data allowed to be uploaded into tables seems limitless.  They state that the product is still in Beta, and publicly they show users uploading and linking to Google data rather than connecting to external data sources, such as your sales transaction database.  Still, there may be an API in the works for 3rd parties to integrate with transactional systems through direct connections, using drivers such as ODBC or JDBC, to stream data and not just work with uploaded Google data.  However, this may be their strategy: to host all of the data, and have a migration utility.  At this stage, they would like to house the data and own the cloud storage infrastructure.  The strategic mid-term goal, though, may be to allow you to house your RDBMS transaction data locally and stream and/or upload it into their data warehouse, apply Business Intelligence to manipulate the data, and then publish it in multiple formats, e.g. they would display the data for public or private consumption.  I can also see you being able to publish charts with commentary into your Google Plus stream for specific ‘Circles’.  Brilliant.  Hats off to you guys.
If Google allows streaming of the data, or what we call data transformations, from e.g. your sales transaction system to the Google data warehouse, then they would be competing with IBM, Oracle, and Microsoft.

Update: 12/26/12
After all of that profound scoping and keen insight, a developer told me in a chat that Google’s BigQuery does the job better.  I am curious why it has not taken off in the marketplace.  Anti-Trust?  Also, why then create an abstraction layer with other products like Fusion, and explicitly call out Google Docs?  Maybe that would help them transition into the market space with a different level of user, the consumer, or maybe the target user would be different, such as the small business.