Last week I attended the annual Data Warehousing (TDWI) event in Orlando. It is a fantastic, intimate gathering of top experts, and plenty of novices, in data warehousing and analytics. The event is focused squarely on education, with only a small amount of vendor hype.
Some interesting new things came out of the conference, along with plenty of improvements to older analytic concepts and technologies.
The major takeaways were:
Data visualization has gone completely mainstream.
Clearly, presenting data visually, rather than in tables or as printable PDFs, is the dominant trend, although in healthcare (HC) we still see a lot of visually challenged information presentation. Really, there is no excuse, given the availability of easy-to-use, low-cost visualization tools. Time to move beyond boring tables.
The two flavors of data visualization sit along a continuum rather than forming distinct markets. Products range from tools for high-powered end-user analysts to at-a-glance dashboards that support production-type activities. The products are beginning to overlap, with each covering both ends of the visualization spectrum. Vendors with dashboards have made them interactive, and vendors with complex data discovery and modeling approaches now offer mass deployment of analytic conclusions, converting them to end-user dashboards with one click.
The other sign of data visualization's popularity is that the large BI vendors are catching on. SAP/BusinessObjects is offering a free, very useful desktop data visualization tool (Lumira), with a paid version adding more features (basically database connectors). IBM/Cognos is about to release a new data visualization tool as well. Then there is a growing number of third-party data visualization tools that continue to improve (e.g. Yellowfin, Tableau, QlikView, DOMO). Plus, cloud BI vendor Birst is getting on the visualization bandwagon too.
Finally, rationalization of the whole so-called “big data” market.
People are finally realizing that most users, especially in healthcare, do not have a big data problem. Most of the selling points for big data can be easily accomplished with traditional analytics tools and database technology. Last year I wrote a Monthly Update titled “Big Data for the Rest of Us.” If I had to do it again, it would be “Big Data NOT for the Rest of Us.”
It seems the appropriate entry point for real big data technology (basically Hadoop and variants) is when you have a lot, and I mean a lot, of unstructured data being produced and stored. This means, de facto, machine-generated data, and most likely unstructured machine-generated data. I am talking about systems that generate 200,000 transactions per second and up. Certainly no EHR or LIS in even the largest IDN is producing these kinds of data volumes. Even Healthcare.gov, as it stumbles to life, will never reach that kind of volume.
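To put that threshold in perspective, here is a quick back-of-the-envelope sketch in Python. The 200,000 transactions per second figure comes from the paragraph above; the daily event count used for comparison is a made-up placeholder, not a measurement from any real EHR or LIS, so swap in your own numbers.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def transactions_per_day(rate_per_second: int) -> int:
    """Daily volume produced by a system sustaining the given rate."""
    return rate_per_second * SECONDS_PER_DAY


def sustained_rate(events_per_day: int) -> float:
    """Average events per second implied by a daily event count."""
    return events_per_day / SECONDS_PER_DAY


# Threshold quoted in the text for "real" big data territory.
BIG_DATA_RATE = 200_000

# Hypothetical daily event count for comparison (illustrative only).
HYPOTHETICAL_EVENTS_PER_DAY = 5_000_000

print(f"{transactions_per_day(BIG_DATA_RATE):,} transactions per day at the threshold")
print(f"{sustained_rate(HYPOTHETICAL_EVENTS_PER_DAY):.1f} events per second for the hypothetical system")
```

At the quoted threshold, a system produces more than 17 billion transactions a day. A system would need daily volumes in that range before it belonged in the same conversation.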
In healthcare, Big Data is mostly a big distraction – more hype than reality.
Database engineers are busy creating mind-blowing architectures.
There were lots of new ideas in performance architecture and cloud solutions. MemSQL was there showing mind-bending performance: hundreds of thousands of insertions per second into a relational database while simultaneously executing complex multimillion-row queries in seconds, all across a geographically dispersed cluster of cheap Linux servers. Treasure Data had its innovative agent-based cloud database product on display. This architecture greatly reduces the latency of hauling data from the generating systems to the analytics processes.
TDWI continues to amaze and is never boring. I just love this type of event, where you can rub elbows with the best minds the analytics world has to offer in a forum that is not sales oriented. Truly a great event.