Tag Archives: Data Warehouse

Popular Tweets from March, April, and May 2018

Tweet Activity Analytics

Using Twitter’s Analytics, I extracted my top Tweets from the last 91-day period.  During that time, my Tweets earned 42.2K impressions.  Looks like my statistics took a slump.

Summary:

  • 38 Link Clicks
  • 6 Retweets
  • 24 Likes
  • 14 Replies
March, April, May 2018 Tweets

Blended Data Warehouse SW/HW Solutions Phased Into the Cloud

Relational Database Solutions “In a Box”

Several relational database software vendors, such as IBM, Oracle, and Teradata, have developed proprietary data warehouse software that is tightly coupled with server hardware to maximize performance.  These solutions have been developed and refined as “on-prem” solutions for many years.

We’ve seen the rise of “Database (DW) as a Service” from companies like Amazon, which sells Redshift services.

Amazon Redshift is a fast, fully managed data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing Business Intelligence (BI) tools.  It allows you to run complex analytic queries against petabytes of structured data, using sophisticated query optimization, columnar storage on high-performance local disks, and massively parallel query execution. Most results come back in seconds.

RDB Complex Software/Hardware Maintenance

In recent times, the traditional relational database software vendors have shifted gears to become service providers, offering maximum performance from a solution hosted by the vendor in the Cloud.  On the positive side, the added complexity of configuring and tuning a blended software/hardware data warehouse has shifted from the client’s team resources, such as Database Administrators (DBAs), Network Administrators, and Unix/Windows Server Admins, to the database software service provider.  The complexity of tuning for scalability, and other maintenance challenges, shifts to the software vendor’s expertise, if that’s the abstraction you select.  There is some ambiguity in the delineation of responsibilities with the RDBMS vendors’ cloud offerings.

Total Cost of Ownership

Quantifying the total cost of ownership of a solution may be a bit tricky, especially if you’re trying to quantify the RDBMS hybrid software/hardware “on-prem” solution versus the same or similar capabilities brought to the client via “Database (DW) as a Service”.

“On-Prem”, RDB Client Hosted Solution

Several factors need to be considered when selecting ANY software and/or hardware to be hosted at the client site.

  • Infrastructure “when in Rome”
    • Organizations have a quantifiable cost for hosting physical or virtual servers in the client’s data center, which may be boiled down to a number that includes things like HVAC or new rack space.
    • Resources are used to maintain and monitor data center (DC) usage; there may be an abstracted/blended figure for them.
  • Database Administrators maintain and monitor RDB solutions.
    • Activities may range from RDB patches/upgrades to resizing/scaling the DB storage “containers”.
    • Application Database Admins/Developers may be required to maintain the data warehouse architecture as new requirements arrive, e.g. creating aggregate tables for BI analysis.
  • Network Administrators
    • Firewalls, VPN
    • Port Scanning
  • Windows/Unix Server Administrators
    • Antivirus
    • OS Patches

Trying to correlate these costs in some type of “Apples to Apples” comparison to the “Data Warehouse as a Service” may require accountants and technical folks to do extensive financial modeling.  Vendors, such as Oracle, offer a spectrum from fully managed services at one end to “Bare Metal”, essentially “Infra as a Service”, at the other.  The Oracle Exadata solution can be a significant investment, depending on how much redundancy and scalability you buy by leveraging Oracle Real Application Clusters (RAC).
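To make the “Apples to Apples” idea a little more concrete, here is a minimal sketch of the kind of model an accountant and a technologist might start from. Every class, field, and function name is hypothetical, the cost categories simply mirror the list above, and the model assumes flat annual costs with no discounting, which a real financial model would not.

```python
from dataclasses import dataclass


@dataclass
class OnPremCosts:
    """Annual on-prem cost inputs; every field is a hypothetical placeholder."""
    infrastructure: float         # data center share: HVAC, rack space
    dba_staff: float              # DBAs and application database developers
    network_staff: float          # firewalls, VPN, port scanning
    server_admin_staff: float     # antivirus, OS patches
    licenses_and_hardware: float  # software licenses and hardware upkeep

    def annual_total(self) -> float:
        return (self.infrastructure + self.dba_staff + self.network_staff
                + self.server_admin_staff + self.licenses_and_hardware)


def cumulative_cost(annual: float, upfront: float, years: int) -> float:
    """Total spend over a horizon: one-time cost plus a flat annual cost."""
    return upfront + annual * years


def cheaper_option(on_prem: OnPremCosts, on_prem_upfront: float,
                   service_annual_fee: float, migration_cost: float,
                   years: int) -> str:
    """Compare both options over the same horizon (ties go to the service)."""
    on_prem_total = cumulative_cost(on_prem.annual_total(), on_prem_upfront, years)
    service_total = cumulative_cost(service_annual_fee, migration_cost, years)
    return "on-prem" if on_prem_total < service_total else "service"
```

The hard part is not the arithmetic but agreeing on what goes into each input, which is where the blended figures from the data center and staffing teams come in.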

Support and Staffing Models for DW Cloud Vendors

In order for the traditional RDB software vendors to accommodate a “Data Warehouse as a Service” model, they may need to significantly increase staff across a variety of technical disciplines, as outlined above for the client “On-Prem” model.  A significant ramp-up of staff, and the organizational challenges of developing and implementing a support model, may have relational database vendors ask: should they leverage a top tier consulting agency, such as Accenture or Deloitte, to define, implement, and refine a managed service?  It’s certainly a tall order to go from a software vendor to offering large scale services.  Because these agencies have global corporate footprints and positive track records implementing managed services of all types, it’s an attractive proposition for both the RDB vendor and the consulting agency that wins the bid.  On some level, the DW service billing models don’t seem to make sense.  Any consulting agency that implements a DW managed service would be responsible for ensuring ROI both for the RDB vendor and their clients.  The arrangement may be opaque to the end client leveraging the Data Warehouse as a Service, but certainly, the quality of service provided should be nothing less than if it were implemented by the RDB vendor itself.  If the end game for the RDB vendor is for the consulting agency to implement and mature the service, then at some point bring the service in-house, it could help to keep costs down while maturing the managed service.

Oracle Exadata

Here are URLs for reference to understand the capabilities that are realized through Oracle’s managed services.

https://cloud.oracle.com/en_US/database

https://cloud.oracle.com/en_US/database/exadata/features

https://www.oracle.com/engineered-systems/exadata/index.html

Teradata

https://www.teradata.com/products-and-services/intellicloud

https://www.teradata.com/products-and-services/cloud-overview

Teradata

DB2

https://www.ibm.com/cloud/db2-warehouse-on-cloud

IBM Mainframe

Note: The opinions shared here are my own.

FinTech: End to End Framework for Client, Intermediary, and Institutional Services

Is it all about being the most convenient payment processing partner, with an affinity to the payment processing brand?  It’s a good place to start: the Amazon Payments partner program.

FinTech noun : an economic industry composed of companies that use technology to make financial systems more efficient

Throughout my career, I’ve worked with several financial services teams to engineer, test, and deploy solutions.  Here is a brief list of the FinTech solutions I helped construct, test, and deploy:

  1. 3K Global Investment Bankers – proprietary CRM platform, including Business Analytics, Business Objects Universe.
  2. Equity Research platform, crafted based on business expertise.
    • Custom UI for research analysts enabled them to create their research and push it into the workflow.
    • Based on a set of rules, the ‘locked down’ part of the report would “Build Disclosures”, e.g. analyst holds 10% of the company.
    • Custom Documentum workflow would route research to the distribution channels, or direct research to legal review.
  3. (Multiple Financial Org.) Data Warehouse middleware solutions to assist organizations in managing and monitoring usage of their DW.
  4. Global Derivatives firm: migration of mainframe system to C# client/server platform.
  5. Investment Bankers and Equity Capital Markets (ECMG): built trading platform so teams may collaborate on Deals/Trades.
  6. Global Asset Management Firm: onboarding and fund management solutions, custom UI and workflows in SharePoint.

*****

A “Transaction Management Solution” targets a mixture of FinTech services, primarily “Payments” Processing.

Target State Capabilities of a Transaction Management Solution:

  1. Fraud Detection:  The ability to identify and prevent fraud exists within many levels of the transaction from facilitators of EFT to credit monitoring and scoring agencies.  Every touch point of a transaction has its own perspective of possible fraud, and must be evaluated to the extent it can be.
    • Business experts (SMEs) and technologists continue to expand the practical applications of Artificial Intelligence (AI) every day.  Extensive AI fraud detection applications exist today, incorporating human-populated rules engines and AI machine learning (independent rule creation).
  2. Consumer “Financial Insurance” Products
    • Observing a business transaction end to end may provide visibility into areas of transaction risk.  Process and/or technology may be adopted or augmented to minimize the risk.
      • E.g. the eBay auction process carries a risk when currency and merchandise change hands.  A “delayed payment”, holding funds until the merchandise has been exchanged, minimizes the risk, and was implemented using PayPal.
    • Such products move through the Discovery, Development, and Delivery phases of the product lifecycle, converting concept to product.
  3. Transaction Data Usage for Analytics
    • The client initiating the transaction, the intermediary parties, and the destination of funds may all tell ‘a story’ about the transaction.
    • Every party within a transaction, beginning to end, may benefit from the use of the transaction data using analytics.
      • e.g. Quicken – personal finance management tool; collects, parses, and augments transaction data to provide client analytics in the form of charts / graphs, and reports.
    • Clear, consistent, and comprehensive data set available at every point in the transaction lifecycle, regardless of platform.
      • e.g. funds transferred between financial institutions may have descriptions that are not user friendly, or may not be actionable, e.g. cryptic name, and no contact details.
      • Normalizing data may occur at an abstracted layer
    • Abstracted, and aggregated data used for analytics
      • e.g. average car price given specs XYZ;
      • e.g. 2. avg. credit score in a particular zip code.
    • Continued growth opportunities, and challenges
      • e.g. data privacy v. allowable aggregated data
  4. Affinity Brand Opportunities for a Transaction Management Solution
    • eWallet affinity brand promotions
      • e.g. rules based on the transaction’s items: no shipping
      • e.g. 2. “Cash Back” rewards and/or Market Points
      • e.g. 3. Optional “Fundraiser” options at time of purchase.
  5. Credit Umbrella: Monitoring Use Case
    • Transparency into newly activated accounts enables the Transaction Management Solution (TMS) to trigger a rule to email the card holder, if eligible, to add the card to their eWallet.
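The “human populated Rules Engine” idea from the list above can be sketched in a few lines. This is only an illustration under assumptions: the `Event` and `Rule` types, the attribute names, and the 10,000 review threshold are all hypothetical, and the actions return strings rather than really sending email.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Event:
    kind: str          # e.g. "account_activated" or "payment"
    account_id: str
    attributes: dict


@dataclass
class Rule:
    name: str
    condition: Callable[[Event], bool]
    action: Callable[[Event], str]


def evaluate(rules: list[Rule], event: Event) -> list[str]:
    """Run every rule whose condition matches and collect the action results."""
    return [rule.action(event) for rule in rules if rule.condition(event)]


# Rules a business expert might populate by hand (all values hypothetical).
RULES = [
    Rule(
        name="invite_to_ewallet",
        condition=lambda e: e.kind == "account_activated"
        and e.attributes.get("ewallet_eligible", False),
        action=lambda e: f"email:{e.account_id}:add-card-to-ewallet",
    ),
    Rule(
        name="flag_large_payment",
        condition=lambda e: e.kind == "payment"
        and e.attributes.get("amount", 0) > 10_000,
        action=lambda e: f"review:{e.account_id}:large-payment",
    ),
]
```

Each touch point of a transaction could run its own rule list over the same event, which is one way to picture “every touch point has its own perspective of possible fraud”.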

Is Intuit an acquisition target because of Quicken’s capabilities to provide users consistent reporting of transactions across all sources?  I just found this note in Wiki while writing this post:

Quicken is a personal finance management tool developed by Intuit, Inc. On March 3, 2016, Intuit announced plans to sell Quicken to H.I.G. Capital. Terms of the sale were not disclosed.[1]

For quite some time, companies have attempted to tread in this space with mixed results, either through acquisition or build out of their existing platforms.  There seem to be significant opportunities within the services, software, and infrastructure areas.  It will be interesting to see how it all plays out.

Inhibitors to enclosing a transaction within an end to end Transaction Management Solution (TMS):

  • Higher level of risk (e.g. business, regulatory) from expanding service offerings
  • Stretching too thin, beyond the core vision, and losing sight of it
  • Transforming a tech company into a hybrid financial services company
  • Automation and streamlining of processes may derive efficiencies that lead to a reduction in staff / workforce
  • Multiple platforms performing functions provides redundant capabilities, reduced risk, and more consumer choices

 Those inhibitors haven’t stopped these firms:

Payments Ecosystem

 

WordPress Shortcode API to Cloud Storage to Sell Any Digital Intellectual Property.

So, I was browsing, going through bills, and thinking about my other article on Google Docs and their new API, where you could use them as a data warehouse, when it occurred to me: why can’t we have a public API for all the Cloud Storage systems like Amazon Web Services (AWS) S3 (or Box.com), create a plugin for WordPress, and add E-Commerce?  You would then have your own place to sell digital music or any digital intellectual property, or to host your own OLTP or OLAP database.

And my bro, Fat Panda, might have been thinking the same thing.  He’s one step behind, but he will catch on.  I will try to update for ‘the cheap seats’ in a bit.

For the cheap seats: even with static files stored up in the cloud, you can use a model similar to Google Docs <-> Google Fusion, where you add tabular data to storage, then read, overwrite, or update it using a homemade table locking mechanism, and essentially use the cloud as a data warehouse, or even a database.  Microsoft seems to have a lead on transactional and analytical storage with Microsoft Azure, which is relational in nature in the cloud, but it can be so much simpler than that with plain cloud storage.  If it is not implemented with ‘row’ locking, there is an issue with row level, high volume OLTP (On Line Transaction Processing).  With OLAP (On Line Analytic Processing), where you analyze the way your business does business and profit more from your consumer data, not so much.

The methods to implement row level locking for OLTP systems on tabular data stored in cloud storage like AWS or Box.net are easy to implement, and will remind you of old school alternatives that supplement the AutoNumber columns in MS Access or Identity columns in SQL Server.  At the end of the day, you can either sell digital intellectual property from a WordPress implementation, or run your entire business with a robust cloud database solution for OLTP or OLAP systems using flat file storage!

Why go through all this when Amazon AWS and Microsoft Azure have built, or will yearn to start building, these solutions in parallel?  Because of cost effective solutions, and because the entire database arena, monopolized by Oracle, IBM, Microsoft, and MySQL, just got extended to a whole lot of database vendors.  It may take a while, but we already know that the big gorilla in the room, Google, is the first to strike in this game as a non-traditional database vendor and cloud storage provider, with their updated Google Docs API and optional usage of their Fusion application.
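Here is a minimal sketch of the homemade row locking and AutoNumber-style counter described above. It uses a local directory as a stand-in for cloud storage, and all function and file names are hypothetical. The big assumption is that the storage layer offers an atomic “create only if absent” operation; the local filesystem does, but you would need to confirm the equivalent for whichever cloud storage API you use.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def row_lock(store_dir: str, table: str, row_id: int,
             timeout: float = 5.0, poll: float = 0.05):
    """Hold an exclusive lock on one row by creating a marker file atomically."""
    lock_path = os.path.join(store_dir, f"{table}.{row_id}.lock")
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only one
            # writer at a time can create the marker.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"row {row_id} of {table} is locked")
            time.sleep(poll)
    try:
        yield
    finally:
        os.remove(lock_path)


def next_id(store_dir: str, table: str) -> int:
    """Hand out the next key from a counter file, like a homemade AutoNumber."""
    counter_path = os.path.join(store_dir, f"{table}.seq")
    # Row id 0 is reserved here as the lock that guards the counter itself.
    with row_lock(store_dir, table, 0):
        try:
            with open(counter_path) as f:
                current = int(f.read().strip() or 0)
        except FileNotFoundError:
            current = 0
        with open(counter_path, "w") as f:
            f.write(str(current + 1))
        return current + 1
```

A writer would wrap each row update in `with row_lock(...)`. A crashed writer leaves a stale marker behind, so a real version would also need lock expiry, which this sketch leaves out.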

Tablet Developers Make Business Intelligence Tools using Google as a Data Warehouse: Competing with Oracle, IBM, and Microsoft SQL Server

And, he shoots, and scores.  I called it, sort of.  Google came out of the closet today as a data warehouse vendor; at the least, they need a community of developers to connect the dots to help build an amazing Business Intelligence suite.

Google came out with a Google Docs API today, which supports languages from Objective-C (iOS) and C# to Java, so you can use Google as your Data Warehouse for any size business.  All you need to do is write an ETL program that uploads and downloads tables between your local database and Google Docs, and create your own Business Intelligence User Interface for the creation and viewing of Charts & Graphs.  It looks like they’ve changed strategies, or this was the plan all along.

Initially, I thought that Google Fusion was going to be the table editing tool to manipulate the data transferred from your transactional database using the Google Docs API.  Today they released a Google Docs API, and developers can create their own ETL drivers and a Business Intelligence User Interface that can run on any platform: an Android tablet, iPad, or Windows tablet.

A few days ago, I wrote an article in which it looked like Google was going to use a tool called Google Fusion, in Beta at the time, to manipulate tabular data, and eventually extend it to create common BI components, such as graphs, charts, editable tables, etc.

A few gotchas: Google Docs on Apple iPad is version 1.1.1, released 9/28/12, so we are talking very early days, and the Google Docs API was released today.  I would imagine that since you can also use C#, someone can make a Windows desktop application to manipulate the data tables and create and view graphs, so a Windows tablet can be used.  The API also has Java compatibility, and Java is write once, run anywhere, so from any Unix box, or any platform where your transactional database lives, a developer is able to write a driver to transfer the data to Google Docs dynamically, and then use the Google Docs API for Business Intelligence.  You can even write an ETL driver that does nothing but rapidly transfer data, like an ODBC or JDBC driver, and use any business intelligence tools you have on your desktop, or run a nightly ETL.  However, I can see developers creating business intelligence tools on Android, iPad, or Windows tablets to modify tables, create and view charts, etc., using custom BI tool sets, and their data warehouse now becomes Google Docs.
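The ETL driver idea above boils down to “dump a table, push it to storage”. Here is a minimal sketch of the extract half, using SQLite as a stand-in for the local transactional database. The upload step is deliberately left as a hypothetical `upload(name, body)` callable you would supply, because I am not going to guess at the actual Google Docs API calls here.

```python
import csv
import io
import sqlite3
from typing import Callable, Iterable


def extract_table_as_csv(conn: sqlite3.Connection, table: str) -> str:
    """Dump one table, header row included, to CSV text."""
    # Table names cannot be bound as SQL parameters, so check the catalog first.
    known = {row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    if table not in known:
        raise ValueError(f"unknown table: {table}")
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(col[0] for col in cursor.description)
    writer.writerows(cursor)
    return buffer.getvalue()


def run_etl(conn: sqlite3.Connection, tables: Iterable[str],
            upload: Callable[[str, str], None]) -> None:
    """Extract each table and hand it to the caller-supplied upload function."""
    for table in tables:
        upload(f"{table}.csv", extract_table_as_csv(conn, table))
```

Scheduling `run_etl` nightly gives you the batch version; calling it per change gets closer to the “dynamic” driver, at the cost of many small uploads.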

Please reference an article I wrote a few days back, “Google is Going to be the Next Public and Private Data Warehouse“.

At that time, 10/13/2012, Google Fusion was marked as Beta.  Google has since stripped off the word Beta, but it doesn’t matter.  It’s even better with the Google API to Google Docs.  Google Fusion could be your starter User Interface; however, if your Android, iOS (Apple iPad), and Windows developers really embrace this API, all of the big database companies like IBM, Oracle, and Microsoft may have their market share eroded to some extent, if not a great extent.

Update 10/19:

Hey Gs (Guys and Gals), I forgot to mention, you can also perhaps make your own video or music streaming applications using the basic get and send file calls, which other companies such as AWS, Box, etc. already offer.  It’s a simple get / send API, so I’m not sure if it’s applicable to ‘streaming’ at this stage; it may be just another storage location in the ‘cloud’, which would be quite boring.  Although thinking of it now, aren’t all the put / send cloud solutions potential data warehouses using ETL and the APIs discussed and published above?  Also, it’s ironic that Google would be competing with itself, with YouTube, if this were used as a file share to ‘stream’ videos.

PostgreSQL / local database and SOA, mid tier for cloud solutions to improve performance

In an article I read from the NY Times, Salesforce.com may be making a play to banish Oracle as a supported platform.  However, the system that might be interesting is one where PostgreSQL, or an in-memory database, acts as a local cache for the transaction based system and clears the local database records/cache after it uploads the ‘staged’ data from the local database to a cloud database, where the data is ultimately stored.  The activities on the local database should be fast, and if an SOA is built on top of the local database and communicates with the cloud via APIs, the architecture offers three benefits: a) the cloud data may be transformed for any cloud based solution vendor(s), if necessary; b) it enables a local data-mart if the data is not transferred in real time, i.e. use a nightly transformation and have access to “day of” BI on a limited set of local data; c) again, transaction performance and data segregation of the warehouse.  This architecture is already in use at many firms, but I wanted to call it out.  Another option is to use two cloud database solutions, one ‘local’ to your region, and one globally dispersed for performance and redundancy using an ETL, although I am not convinced this would be a great architecture.  The second cloud tier can be a transformation from the first for regulatory archiving, if required by law for either finance or DR (Disaster Recovery) policy.

Google is Going to be the Next Public and Private Data Warehouse

In an article I wrote a while back, Google to venture into Cloud, provide Open Source APIs, assist small businesses to be Cloud Solutions Integrators, I was talking in the abstract.  Then I saw it on the Google site, buried way down in their menus: under ‘More’, select the ‘Even More’ option, and at the bottom left of the page you will see Innovation, Fusion Tables (Beta).  Google is advanced and ready to compete with the database vendors, with a user friendly UI, better than I thought.  They currently provide a way to upload data to a Google Drive; the user then imports the data from the Google Drive and, using table views and Business Intelligence tools, manipulates and shares the data.  The data allowed to be uploaded into tables seems limitless.

They state Fusion Tables is still in Beta, and publicly they show users uploading and linking to Google data rather than connecting to external data sources, such as your sales transaction database.  Still, there may be an API in the works for 3rd parties to integrate through direct connections, using drivers such as ODBC or JDBC, with transactional systems to stream data and not just uploaded Google data.  However, this may be their strategy: to host all of the data, and have a migration utility.  At this stage, they would like to house the data and have the cloud storage infrastructure.  The strategic mid-term goal may be to allow you to house your RDBMS transaction data locally, and stream and/or upload it into their data warehouse to apply Business Intelligence to manipulate the data, and then publish it in multiple formats, e.g. they would display the data for public or private consumption.  I can also see you then publishing charts with commentary into your Google Plus stream with specific ‘Circles’.  Brilliant.  Hats off to you guys.
If Google allows streaming of the data, or what we call data transformations, from e.g. your sales transaction system to the Google data warehouse, then they would be competing with IBM, Oracle, and Microsoft.

Update: 12/26/12
After all of that profound scoping and keen insight, a developer told me over chat that Google’s BigQuery does the job better.  I am curious why it has not taken off in the marketplace.  Anti-trust?  Also, why then create an abstraction layer with other products like Fusion and call out Google Docs explicitly?  Maybe that would help them transition into the market space with a different level of user, the consumer, or a different target user, such as the small business.

Big Data Creates Opportunities for Small to Midsize Retail Vendors

Big Data Creates Opportunities for Small to Midsize Retail Vendors through Collective Affinity Marketing outside Financial Institutions.

In the Harvard Business Review, there is an article, Will Big Data Kill All but the Biggest Retailers?  One idea to mitigate that risk is to create a collective of independent retailers under affinity programs, such as charities, where every Nth part of a customer’s purchase goes to the charity to reach specific goals as defined by the consumer.  Merchants, as part of this program, decide their own caps, or monetary participation levels.  Consumers belong to an affinity group, but it’s not limited to a particular credit card.  The key is that this transaction data is available to all participating merchants for the affinity.  Transaction data spans all merchants within the affinity, and not just the transactions executed with a single merchant.
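As a small worked example of the merchant cap idea above, here is a sketch that reads “every N part of the purchase” as a 1/N share, which is my assumption, and limits it by whatever is left under the merchant’s own cap. The function and parameter names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def charity_contribution(purchase: Decimal, n: int,
                         merchant_cap: Decimal,
                         already_given: Decimal) -> Decimal:
    """Return the 1/N share of a purchase, limited by the merchant's cap."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    share = (purchase / n).quantize(CENT, rounding=ROUND_HALF_UP)
    remaining = max(merchant_cap - already_given, Decimal("0"))
    return min(share, remaining)
```

With N = 20, a 50.00 purchase yields 2.50 for the charity until the merchant’s cap is reached, after which the contribution shrinks to whatever remains and then to zero.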

Using trusted, independent marketing data warehouses, independent retail vendors share ‘big data’, enabling them to compete by utilizing the same pool of consumer [habitual] spending data.

Affinity marketing data companies can empower their retail clients/vendors with the tools for Business Intelligence and pull from the collection of consumer data.  Trusted, independent marketing data warehouses sprout up to collect consumer data and enable their retail vendor clients to mine the data.

These trusted loyalty affinity data warehouses are not affiliated with a single financial institution, as previously implemented with credit cards, but are more in line with, or analogous to, supermarket style loyalty programs.  However, all independent retail vendors may participate, OR the affinity programs may cap membership to retail vendors from small to mid-size companies.

Note: Data obfuscation could be applied so customer identification on fields like social security number will not be transparent, limiting any liabilities for fraud.
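One common way to do this kind of obfuscation is a keyed hash, so that merchants can still join records on the same customer without ever seeing the raw value. The sketch below is an illustration under assumptions: the field names are made up, and the secret key would have to live with the trusted data warehouse, not with the participating merchants.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 digest (HMAC)."""
    normalized = value.replace("-", "").strip()
    return hmac.new(secret_key, normalized.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def obfuscate_record(record: dict, sensitive_fields: set[str],
                     secret_key: bytes) -> dict:
    """Return a copy of the record with the sensitive fields pseudonymized."""
    return {
        field: pseudonymize(str(value), secret_key)
        if field in sensitive_fields else value
        for field, value in record.items()
    }
```

The same input always maps to the same digest under one key, so spending habits can still be aggregated per customer. A plain unkeyed hash would not be enough for something as guessable as a social security number, which is why the key matters.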