Used the right way, Structured Query Language (SQL) can be an advantage for any organization's operations. Used the wrong way, it can be a headache for IT professionals.
A majority of computing operations depend on the ability to access information from a database. That's where SQL queries come in.
So the efficiency of SQL queries in DB2 can dramatically impact the tasks running on IBM i.
But as organizations store larger volumes of information (and leverage data-intensive applications), speed becomes a much greater concern. How can operators improve DB2 performance?
Kent Milligan, a Senior DB2 Consultant at IBM, has a lot to say on the subject.
Here's what you need to know to better understand the factors that affect DB2 performance on the i—and how to keep your SQL queries running smoothly.
SQL Requests on IBM i
DB2 runs an SQL request in three primary stages: optimize, open, and run.
The optimizer analyzes the request and examines your system to determine the best way to run the query. This information is stored in a plan template for the later phases.
The plan template is passed to the open phase, which builds the run-time structures DB2 requires to follow that blueprint.
Finally, the run phase executes those outlined processes and displays a result to the user.
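As a rough sketch of the optimize-once, run-many idea behind these stages, here is a minimal example using Python's sqlite3 module (not DB2, but the principle carries over): a parameterized statement lets the engine build one plan and reuse it for every set of values, much as DB2 reuses plan templates and run-time structures. The table and column names are made up for illustration.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, qty INTEGER)")

# With a parameter marker, the engine can optimize the statement once
# and reuse the resulting plan for every set of values.
stmt = "INSERT INTO items (name, qty) VALUES (?, ?)"
conn.executemany(stmt, [("bolt", 100), ("nut", 250), ("washer", 500)])

# The run phase is what actually produces the rows the user sees.
rows = conn.execute("SELECT name, qty FROM items ORDER BY qty").fetchall()
print(rows)
```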
How These Queries Work
The first two stages generally happen quickly, though they may take longer for particularly large databases. Many factors influence how quickly a particular request runs, such as how heavily the hardware is already utilized and the speed of the CPU.
One way to tune DB2 performance is to look at how the applications themselves perform scans.
An IBM developerWorks blog post by DB2 consultant Dave Beulke highlighted several practices that can improve overall system performance and, in turn, request run times:
- Avoid unnecessary table space scans
- Limit the number of prefetch and index scans used
- Use techniques such as limited expressions
A table space scan is one of the most resource-intensive DB2 paths. Excessively using these and other taxing processes such as prefetch and index scans can consume extra CPU and limit application throughput.
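The cost difference between a full scan and an index probe is easy to demonstrate with any SQL engine's explain facility (on DB2 for i, Visual Explain surfaces the same information). A minimal sketch using Python's sqlite3, with table and index names made up for illustration:

```python
import sqlite3

# Illustrative schema; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders (customer, total) VALUES (?, ?)",
                 [("acme", 10.0), ("beta", 20.0), ("acme", 30.0)])

query = "SELECT total FROM orders WHERE customer = 'acme'"

# With no supporting index, the engine has no choice but a full scan.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][3]

# After adding an index, the engine can probe it instead of reading every row.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()[0][3]

print(plan_before)  # a SCAN step
print(plan_after)   # a SEARCH step using idx_orders_customer
```

The same review process applies at scale: find the statements whose plans show scans over large tables, then decide whether an index or a rewrite is cheaper.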
Beulke cited an example of one client's system review, which found that one in ten of the organization's applications performed table space scans, far more than industry best practices recommend.
The drawback to these strategies is that they require a thorough review of your organization's software, and they may necessitate programming changes to make applications interact more efficiently with DB2.
IT teams that are already strapped for time (and what IT team isn't?) may find it difficult to make room in the schedule for such an extensive review.
Fortunately, there are several DB2 performance tools that are easy to leverage and allow organizations to make improvements more quickly.
DB2: A Primer on Performance
There have historically been two engines for processing SQL queries on IBM i. The first is the classic query engine (CQE), which was designed to process queries that came from non-SQL interfaces; it served as a stopgap as SQL's popularity grew, and today it is used less and less. The second is the SQL Query Engine (SQE), which was built specifically for SQL requests.
SQE offers significant performance improvements over CQE.
Milligan offered the example of a customer that converted its Query/400 reports into SQL statements; the time required to run the reports dropped from 90 minutes to 20 to 30 seconds. Engine choice is less of a concern for organizations on IBM i 7.1 or later, where all SQL requests are processed by SQE.
Common DB2 Performance Tools
There are several handy tools for monitoring and improving query performance in DB2.
The plan caching tool made its debut in V5R4 and has been enhanced in major releases since then.
This tool can perform SQL performance analysis without running any other processes on the system. It provides a graphical interface that shows the plan templates the optimizer creates, and you can view all of the plans stored in the cache at a given time, with the option to filter by user or other criteria.
One thing to keep in mind: the data is highly dynamic, so users may want to use the snapshot functionality to perform analysis on static information.
The data in the plan cache can be used to identify the longest-running statements and isolate performance issues, or to examine statements that are currently active on the system for real-time analysis.
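To give a flavor of the kind of analysis a snapshot enables: once plan cache data has been captured to a table, ranking statements by accumulated run time is a simple SELECT. The table and column names below are hypothetical, since real snapshot schemas are generated by the tooling:

```sql
-- Hypothetical snapshot table and columns, for illustration only;
-- actual plan cache snapshot schemas are generated by the tooling.
SELECT statement_text,
       times_run,
       total_run_time
  FROM mylib.plan_cache_snapshot
 ORDER BY total_run_time DESC
 FETCH FIRST 10 ROWS ONLY;
```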
IBM introduced the index advisor with V5R4 to automatically record index advice from the optimizer. Its graphical interface makes it easier to investigate whether indexing is the source of query performance issues.
While beneficial, the tool is not meant to replace human insight.
You will always have a more robust understanding of the jobs running on your system than any automated tool, but the advisor can supplement that knowledge.
Milligan cited the example of a customer with one index that had been advised two million times. That is a strong indication the index should be created, as long as it fits the tasks that are run most often.
With 6.1, IBM added a condenser to consolidate overlapping index key advice.
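On recent IBM i releases, the advisor's accumulated advice can also be queried directly through SQL via the QSYS2.SYSIXADV table. Exact column names can vary by release, so treat this as a sketch to verify against your system's documentation:

```sql
-- Rank outstanding index advice by how often it has been generated.
-- (QSYS2.SYSIXADV column names may vary by release; verify before use.)
SELECT table_schema,
       table_name,
       key_columns_advised,
       times_advised
  FROM qsys2.sysixadv
 ORDER BY times_advised DESC
 FETCH FIRST 10 ROWS ONLY;
```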
Creating many new indexes, however, can contribute to index sprawl and wasted disk storage.
The index evaluator analyzes existing indexes and identifies ones that are not used.
This tool is especially valuable for complex databases. For instance, the optimizer may decline to use a particular index to run a query, yet still leverage some of that index's information for improved query optimization.
The evaluator therefore tracks when indexes are used both in queries and for stored statistics.
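That usage data is also surfaced through SQL catalogs. A hedged sketch against QSYS2.SYSPARTITIONINDEXSTAT, looking for indexes with no recorded query or statistics use (view and column names as found in recent IBM i releases; verify them against your release's documentation before relying on this):

```sql
-- Candidate indexes for removal: never used to run a query and never
-- used for statistics. Verify view/column names for your release.
SELECT index_schema,
       index_name,
       last_query_use,
       last_statistics_use
  FROM qsys2.syspartitionindexstat
 WHERE last_query_use IS NULL
   AND last_statistics_use IS NULL;
```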
Complete Visibility Over Your Database
The actual performance gains from each of these tools may vary depending on the types of queries being run and the complexity of the database in question.
Regularly using them will ensure that operators can maintain complete oversight of their systems and address performance issues more quickly.