Disk space is among the most precious of all IBM i resources. Because it is inherently expensive and often susceptible to rapid consumption when problems arise, disk dramas, and how to avoid them, are always top of mind for managers of demanding systems environments. They are all too aware that the consequences of disk issues can quickly radiate across a system, manifesting as poor performance, a lack of application availability for users, and, in extreme cases, even lost data.
The impact of disk problems on the bottom line of the IT budget is determined by two important factors. The first is how successfully disk can be carved up and portioned out to hungry objects, libraries, and jobs. The second is how quickly a problem can be identified and resolved during an escalating disk issue. If a cursory glance at your budget reveals that you're spending more on sudden disk-related issues than you'd care to, it's time to revise your approach to disk and evaluate whether you're doing all you can to eliminate these problems.
Part of the challenge in dealing with disk issues successfully is knowing the root cause that triggers rapid consumption, and that can mean looking for the cause behind the cause. Is a surge in disk being caused by a looping job? (Just one of the possibilities.) If so, which one? (Thousands of possibilities!) Has a user ASP overflowed? (Another possibility.) If so, which one? Which user? (Multiple possibilities!) For as long as it takes to investigate these possibilities, you still have the main problem of disk consumption spiraling ever upwards, unless you throw more disk at the problem to quite literally buy yourself additional investigation time. Ouch. There goes the budget, again. So the real question becomes: how do you break this pattern?
The answer is to make good use of the one thing that's more valuable than disk: real-time insight. With this at your disposal, you not only know about fluctuating disk levels as they happen, but also about the associated problems and their root causes, before any of them has a chance to impact disk utilization, users, data, or availability. A solution like Robot Monitor allows administrators to stay vigilant against potential threats to disk by monitoring broadly, covering the areas where issues commonly take hold. Our top five recommended areas to monitor for a proactive, best-practice approach to disk are journal receivers, temporary storage, ASPs, important files, and looping jobs.
When used in conjunction with thresholds and alerts, real-time monitoring takes real-time insight to the next level. Whether the problem presents as a status change, such as a journal receiver that has suddenly become inactive, or as an unusual or unexpected surge in a definable metric, such as used space in an ASP, temporary storage consumption, or a suspicious spike in the history file, thresholds help administrators identify these problems at an early stage. Fast investigation completes the process by immediately pinpointing the underlying job, subsystem, or user causing the issue that, left unchecked, would impact disk. The combination is powerful: not only are problems stopped in their tracks before they reach a critical level, but investigation time is virtually eliminated, reduced in some cases to a single click for all the detail required for immediate resolution.
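To make the threshold idea concrete, here is a minimal sketch of the warning/critical pattern described above. This is illustrative pseudologic only, not Robot Monitor's actual API; the metric name, sample values, and threshold levels are all hypothetical assumptions for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Threshold:
    metric: str       # hypothetical metric label, e.g. "ASP 1 used %"
    warning: float    # level at which an early-warning alert fires
    critical: float   # level at which an escalated alert fires

def evaluate(t: Threshold, value: float) -> Optional[str]:
    """Return an alert level when a reading crosses a threshold, else None."""
    if value >= t.critical:
        return "CRITICAL"
    if value >= t.warning:
        return "WARNING"
    return None

# Hypothetical readings: ASP utilization climbing across three samples.
asp_used = Threshold("ASP 1 used %", warning=80.0, critical=90.0)
for reading in (72.4, 83.1, 91.6):
    level = evaluate(asp_used, reading)
    if level:
        print(f"{level}: {asp_used.metric} at {reading}")
```

The point of the two-tier design is exactly what the text argues: the warning threshold surfaces a surge while it is still investigable, long before the critical level where users and availability are affected.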