The volume of data in the digital universe is growing exponentially. We already have over two zettabytes (two billion terabytes) of data, and estimates suggest that by 2020 we will be generating more than ten times that amount every year.
The technology at our disposal to process, store, and analyze all of this data is also improving very quickly. Since 2004, when Google published its groundbreaking work on the MapReduce paradigm, the Big Data community has created Apache Hadoop and dozens of related products spanning everything from machine learning to high-performance ETL.
Recently, however, a new challenge has emerged. The early adopters of this powerful Big Data technology were the familiar consumer Internet companies of Silicon Valley like Yahoo, Facebook, Amazon, and LinkedIn. These companies were created essentially from scratch, and so were able to build software like Apache Hadoop into the very core of their offerings.
Within the last few years, this Big Data technology has seen rapid adoption by more “grown-up” entities across every industry, including financial services, retail, healthcare, and city government. These businesses and governments already have massive platforms, such as the mainframe, that have been built up over decades to handle their data systems of record and incredibly large transaction volumes. The new Big Data technology isn’t very useful if it can’t ingest and process the most important data stored in these systems!
But how do you efficiently and securely sync mainframe data to Apache Hadoop without disrupting existing high-performance mainframe systems? And how does a large enterprise manage and secure all of the Big Data technology sprawl being deployed? The early adopters of Big Data technology didn’t have to confront these challenges, since they didn’t have mainframes and their infrastructure environments were much less complex. For most enterprises, however, these turn out to be hard problems to solve.
As software vendors, we’ve made a tremendous amount of progress already, and there’s a lot more to come, so please stay tuned. I recently gave a talk at CA World about the state of big data and how we might tackle some of the challenges that lie ahead. Hit play to hear more.
This video is part of a new online video speaker series called Luminaries Take 5, which was kicked off at CA World 2013. The five-minute video segments, featuring technology industry visionaries, offer a provocative perspective on how disruptive technologies will drive innovation in today’s organizations. Other Luminaries Take 5 video segments can be found on ca.com and YouTube.