Many businesses have turned to big data to gain insight into their operations and build a better understanding of their customers, but the technology for reporting on this data has been deployed in a wide variety of settings. Use cases range from tracking urban agriculture to monitoring the spread of viral infections. Just as important as how big data is being used is how large it actually is. Here are some relevant statistics that show just how quickly information in the modern world is exploding:
- IDC predicts that the global digital universe will have grown by a factor of 300 between 2005 and 2020
- The amount of stored data is expected to double every two years in the same time period, reaching 40,000 exabytes
- Businesses will store and manage the majority (80%) of information in the digital universe
For context, the amount of storage required to produce the CGI effects in the movie Avatar was about one petabyte. If that’s not impressive enough, researchers estimated in 2007 that the sum of all stored human information in the world was close to 295 exabytes. In other words, the amount of digital information in 2020 will exceed everything humanity had stored in 2007 by a factor of roughly 135.
Of course, most businesses aren’t just collecting and creating information. The promise of big data lies in the ability to use it, so it’s worth noting two other key factors: where data is coming from and how much is being spent on it. Wikibon lists numerous statistics from various sources, but here are some relevant highlights:
- Google processed 20,000 terabytes of data daily in 2008
- IDC estimates that businesses will conduct 450 billion transactions every day in 2020
- AT&T's largest database contains 312 terabytes of data
- 30 billion pieces of content are uploaded to Facebook daily
- Wikibon predicts that big data spending will reach $50 billion by 2017
- Poor quality data costs organizations between 20% and 35% of their operating revenue
IT teams are going to be challenged over the next five years to build resilience and availability into the storage systems that manage big data. Reporting on this data will be one issue; managing it will be another. Unfortunately, management is lagging in both spending and awareness.
Most organizations have expressed interest in big data, with researchers predicting that 90% of Fortune 500 companies will invest in big data initiatives in the coming years. Despite widespread interest, the volume growth of stored information greatly exceeds the rate at which IT budgets are increasing—40% and 5%, respectively.
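To see how quickly those two growth rates diverge, here is an illustrative compounding projection. The 40% and 5% annual rates come from the text; the five-year horizon and the index baseline of 100 are assumptions for illustration only:

```python
# Illustrative projection: stored data volume growing 40% per year vs.
# an IT budget growing 5% per year, both indexed to 100 at year zero.
data_index, budget_index = 100.0, 100.0

for year in range(1, 6):
    data_index *= 1.40
    budget_index *= 1.05
    print(f"Year {year}: data {data_index:.0f}, budget {budget_index:.0f}")

# After five years, data has grown ~5.4x while the budget has grown ~1.28x,
# so the storage demand per budget dollar has roughly quadrupled.
```

The point of the sketch is simply that a linear-feeling gap (40% vs. 5%) compounds into a multiplicative one within a normal budget-planning horizon.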
Luckily, IT teams can prepare for the impending data deluge by getting a handle on the data they currently have and establishing best practices for organizing the disk space they already own. Since continually buying disk space gets expensive, inefficient, and inexact, software that automates data storage and management is the strategic alternative. Robot Space, for example, automatically cleans up redundant data and prevents runaway disk space growth in the IFS (the integrated file system). Robot Space also keeps systems stable with rule-based maintenance and even predicts future disk space requirements based on historical data, making it easier to plan and budget for what’s coming.
So, just how big is big data? Big enough to demand a proactive solution that eliminates wasted space before it happens.