Big Data: Big Deal or Just a Big Headache?
Organizations worldwide have caught the big data bug: the promise of more informed decision-making is pushing them toward analytics-driven insight. Big data is usually defined by the three Vs—volume, velocity, and variety—but modern demand has elevated one of those Vs above the others: velocity.
Data velocity holds significant implications for IBM i operators due to the way table partitioning works on the platform. When using legacy software, such as Query/400, users can only search a single partition at a time. While it is possible to construct multiple queries to perform searches, this can be time-intensive and will ultimately delay time-to-value. Before addressing this challenge, let's explore whether big data is even worth the time (and money) in the first place.
How is Big Data Being Used?
It's tough not to notice the prevalence of big data, and the interest is more than technology-journalist hype. According to Gartner, 42 percent of IT leaders are making investments in this area. The caveat is that few organizations are prepared for the big data demands their technology departments will soon face. As Gartner put it:
"Most organizations are still in the early stages, and few have thought through an enterprise approach or realized the profound impact that big data will have on their infrastructure, organizations and industries."
Even though many organizations have begun exploring big data, the question remains of what they're ultimately getting out of it. The media may pay a lot of attention to the use of analytics in marketing environments, but big data's potential value extends across numerous sectors and disciplines. For instance, a 2012 report from IDC suggested that vendors in the big data market would find opportunities at every level of technology—software, infrastructure, and services. IDC also pointed to several non-traditional areas that have already benefited from big data:
- Life sciences research
- Equipment monitoring
- Legal discovery
- Traffic flow optimization
- Healthcare outcome analysis
- Supply chain management
Big Data: Use Cases with Big Results
While many organizations are still in the planning stages of their big data initiatives, others have already achieved significant results. For example, Kaiser Permanente, a large HMO provider, implemented a big data exchange system across numerous medical facilities to share information and improve diagnosis and treatment. As McKinsey reported, this resulted in $1 billion in savings, since patients did not have to visit the office as often and doctors did not have to perform as many tests to arrive at an accurate diagnosis.
Another case for big data comes from Morgan Stanley. Using traditional tools, the firm had difficulty determining the connection between web events and database problems. However, analytics allowed its IT department to perform comprehensive log analysis and correlate events with database read/write errors.
"Let's assume there is a market event," said Morgan Stanley's Gary Bhattacharjee. "Now we have entire traceability in terms of who did what, when and how, what caused issues, and what kind of data is being transacted. We can tie the front office with what is going on in the back office, and what data goes haywire."
As these cases illustrate, big data is not just for marketers, and numerous industries can benefit from exploring such initiatives. For those on IBM i, it is critical to return to that essential V: velocity.
Big Data on the i
As previously mentioned, Query/400 only queries one file partition at a time. This can be annoying for general data access, but it's a big migraine when you're considering big data-level volumes. Imagine the challenge of using multiple queries to explore hundreds of terabytes of information.
This is not the only issue with relying entirely on legacy software. Even assuming such tools could more easily provide access to multiple partitions, they rely on the Classic Query Engine (CQE), which is much slower than the newer SQL Query Engine (SQE). In other words, without additional tools to facilitate data access, one of the core Vs of big data will be missing on the i.
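The contrast is easiest to see in SQL terms: the legacy approach runs one query per partition and combines the results by hand, while modern SQL can span every partition in a single statement. DB2 for i has its own partitioning syntax, so the sketch below is only an illustration using Python's built-in sqlite3 module, with partitions modeled as separate tables; the table and column names (orders_2022, orders_2023, region, amount) are invented for the example.

```python
import sqlite3

# Illustrative sketch only: models IBM i "partitions" as separate tables.
# DB2 for i partitioning and member syntax differs from this.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
for year, rows in [("2022", [("EAST", 100), ("WEST", 250)]),
                   ("2023", [("EAST", 300), ("WEST", 150)])]:
    cur.execute(f"CREATE TABLE orders_{year} (region TEXT, amount INT)")
    cur.executemany(f"INSERT INTO orders_{year} VALUES (?, ?)", rows)

# Legacy style: one query per partition, results stitched together manually.
per_partition = [
    cur.execute(f"SELECT SUM(amount) FROM orders_{y}").fetchone()[0]
    for y in ("2022", "2023")
]

# Modern style: a single SQL statement spans all partitions at once.
cur.execute("""
    SELECT region, SUM(amount)
    FROM (SELECT * FROM orders_2022
          UNION ALL
          SELECT * FROM orders_2023)
    GROUP BY region
    ORDER BY region
""")
combined = cur.fetchall()
print(per_partition)   # [350, 450]
print(combined)        # [('EAST', 400), ('WEST', 400)]
```

The single-statement version is not just less typing: it lets the query engine plan one scan and one aggregation across all the data, which is exactly the kind of work the SQE is built to optimize.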
Tools such as Sequel Data Access close the big data gap by making data access both faster and easier. Sequel is a great tool for big data on IBM i because it provides:
- Access to multiple partitioned files
- Visualizations for complex data sets
- Graphical user interface options for non-technical users
These core features ensure that organizations can handle a huge amount of data (volume) faster than they would be able to using traditional solutions (velocity).
Successful big data also depends on how quickly end users can actually put data to work, which is why accessibility tools that reduce technical barriers are important. While it may be tempting to strictly control who gets to run reports and access particular files, the growing number of employees who depend on big data has placed too much pressure on IT. Instead, IT should empower end users with data and adopt the tools that will both secure and facilitate self-service.
Is it Time to Move Forward with Big Data?
Big data can present new obstacles without the right tools to handle it. However, research has shown that organizations that leverage large-scale analytics create new opportunities and efficiencies and make much better decisions. It may be tempting to put off your big data project—after all, other organizations have only begun launching these initiatives. But that is precisely what makes now a good time to address these challenges and integrate big data solutions into your enterprise sooner rather than later.