Disk Danger Zones: How to Avoid the Top Five Most Expensive Disk Issues

Introduction

Many IT managers are all too familiar with the “disk or die” ultimatums that a disgruntled IBM i can issue. They’ve felt the resulting fallout of poor performance, lost data, and most commonly, the additional cost that can quickly run into many thousands of dollars of unplanned expense. 

A system equipped to monitor disk space effectively avoids the unnecessary expense of purchasing additional disk simply to buy time to investigate the underlying issues that cause disk-related problems.

If you’ve been held hostage to your system’s insatiable appetite for disk and are looking for alternative solutions to soothe the savage beast, look no further than your approach to disk monitoring. This guide identifies the critical areas that can lead to unplanned disk expense and helps you establish a bulletproof roadmap to bypass future disk problems.

Disk Hit List: Top Five

Problems with files, journal receivers, data queues, or spooled files have the potential to impact disk availability, placing the availability of whole applications and even the entire system at risk. Immediately catching instances of runaway disk and isolating the cause are the two preliminary objectives of optimal disk management.

The following are areas that most frequently cause disk-related problems. When left unattended, these issues could have the greatest impact on your IBM i—and your budget.

ITEM: ISSUE (bulleted lines list the IMPACT)

JOURNAL RECEIVERS: A journal receiver has become inactive.
  • Performance is degraded
  • High availability switchover fails
  • Data is lost
  • User productivity drops
  • Regulatory compliance is breached

TEMPORARY STORAGE: There is a sudden spike in temporary storage, which requires a thorough investigation.
  • Time spent identifying which job caused the spike
  • Time spent identifying who submitted the job
  • Time spent identifying the subsystem where the job was located
  • Disk is rapidly consumed in the meantime
  • System availability is at risk

AUXILIARY STORAGE POOLS (ASP): There is an ASP overflow.
  • Data is at risk of loss if the system fails
  • Breach in regulatory compliance
  • Disruption to the user community
  • Loss of productivity
  • Additional impact on human and disk resources

IMPORTANT FILES: There is a sudden surge in file size.
  • Degradation of system performance
  • User productivity reduced during an important processing period
  • Additional disk space required for use before scheduled purge

QTEMP OBJECTS: Erring jobs are storing massive amounts of data in QTEMP libraries.
  • Drop in available disk space
  • Costly resolution, as QTEMP issues are hard to diagnose
  • Erring jobs lose data

 

Counting the Cost of Disk

The top five causes of disk issues not only incur the expense of additional disk usage, but also demand additional man-hours, reduce productivity for the organization’s user community, and could cause system downtime. This could generate financial penalties for managed environments where service-level agreements (SLAs) are not met. In calculating the true cost of disk issues, a company should account for all of these factors.

Any disk monitoring solution should pay particular attention to these five areas and notify systems administrators in real time whenever there is a change in individual size, collective quantity, percentage statistics, or status. Doing so puts operators in a proactive position to manage and respond to disk space threats and safeguards against the unnecessary expense of avoidable disk usage.
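As a rough illustration of this kind of check, the following Python sketch evaluates size, count, percentage, and status readings against thresholds. The `Threshold` class, the `check` function, and the metric names are hypothetical and are not part of any IBM i or Robot Monitor interface; a real monitor would collect its readings from the system itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    """Hypothetical rule: alert when a metric reaches a limit or a bad status."""
    metric: str                       # e.g. "size_mb", "count", "percent_used", "status"
    limit: float = 0.0                # used for numeric metrics
    bad_status: Optional[str] = None  # used only for status metrics


def check(name, readings, thresholds):
    """Return a list of alert messages for one monitored item."""
    alerts = []
    for t in thresholds:
        value = readings.get(t.metric)
        if value is None:
            continue  # no reading for this metric, nothing to compare
        if t.bad_status is not None:
            if value == t.bad_status:
                alerts.append(f"{name}: {t.metric} is {value}")
        elif value >= t.limit:
            alerts.append(f"{name}: {t.metric} {value} >= {t.limit}")
    return alerts


# Illustrative values only.
rules = [Threshold("percent_used", 90.0), Threshold("status", bad_status="OVERFLOW")]
print(check("ASP 1", {"percent_used": 91.5, "status": "ACTIVE"}, rules))
```

Keeping each rule as data rather than code means the same function can cover all five problem areas by changing only the metric name and limit.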

Why Real-Time Insight Resolves Issues

With real-time insight into sudden changes in disk and an effective means of pinpointing the cause, unexplained temporary storage spikes no longer require time-consuming manual investigation.
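One way to picture how a spike can be traced to its source is to compare two samples of per-job temporary storage and rank the jobs by growth. The Python sketch below does that under stated assumptions: the function name, the job identifiers, and the megabyte figures are all hypothetical, and how the samples are collected is left out.

```python
def largest_temp_storage_growth(before, after, top=3):
    """Rank jobs by growth in temporary storage between two samples.

    `before` and `after` map a hypothetical job identifier to megabytes of
    temporary storage. Jobs missing from `before` are treated as new jobs
    that started at 0 MB.
    """
    growth = {job: mb - before.get(job, 0) for job, mb in after.items()}
    ranked = sorted(growth.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Illustrative values only.
earlier = {"123456/JSMITH/NIGHTLY": 100, "123457/QUSER/REPORT": 50}
later = {"123456/JSMITH/NIGHTLY": 120, "123457/QUSER/REPORT": 900}
print(largest_temp_storage_growth(earlier, later, top=1))
```

If the job identifier carries the user profile, as in this made-up format, the same result also answers who submitted the job.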

Real-time monitoring eliminates the chance of data loss following a system crash. It also prevents system and user ASP overflow and makes it possible to carefully monitor and maintain independent auxiliary storage pools (IASPs) to ensure they do not breach their limits and compromise data. 

The availability and audit-ready status of files in the audit journal is crucial for compliance with Sarbanes-Oxley (SOX) and other regulations, including the Payment Card Industry Data Security Standard (PCI DSS) and the Minimum Internal Control Standards (MICS). If a looping job causes the journal receivers to continually fill up, systems administrators may be forced to make additional disk available to these files. Real-time monitoring gives you the more economical option of running the receivers to tape while the problem is resolved.

Commonly, issues impacting disk can form an epicenter of trouble that radiates out to other areas of the system. This is why it is not sufficient to monitor disk usage in isolation; you must also monitor all the elements that could impact disk.

Requirements for Disk Monitoring

A quick review of your present disk monitoring requirements can help to evaluate just how vulnerable your system is to the top five problem areas. Without the following criteria in place, IT managers will have little option but to resort to costly additional disk purchases and spend additional staff resources in order to resolve problems manually.

Requirements for Resolving the Top Five Issues Impacting Disk

  1. Real-time awareness of inactive journal receivers, temporary storage issues, ASP issues, the status of important files, and QTEMP objects
  2. Detailed information about jobs causing disk issues and the user profile under which they are active
  3. A fast means of checking all receivers at all times
  4. Alerts and thresholds attached to each ASP and IASP
  5. An immediate view of the free and used space for ASPs and IASPs
  6. Immediate knowledge of the number of files that may be building in a particular library at any given time
  7. Real-time thresholds and alerts for files that are prone to quick growth, including history files and management collection objects (created by IBM i Performance Collection)
  8. Real-time identification of looping jobs to minimize manual efforts

How to Buy Time and Save Disk with Robot Monitor

In addition to fulfilling the requirements that will safeguard against the top five issues, Robot Monitor performance monitoring software has threshold capabilities that trigger alerts about impending disk space issues, giving you time to resolve issues before users are negatively impacted.

Robot Monitor provides a huge range of monitoring metrics that can be applied across your IBM i to protect disk space. Any system object with a variable parameter can be monitored by size, number, or percentage, including data queues, files, records, journal receivers, network files, objects, spooled files, and ASPs. 

ASP Busy monitors can also keep track of associated disk issues that impact performance rather than usage alone. Automated tasks can then be put in place to run routine operations that keep disk resource usage at acceptable levels, such as deleting certain files after they reach a predetermined number, percentage, or size.
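The count-based cleanup described above can be sketched in a few lines of Python. This is only an illustration of the selection logic, not how Robot Monitor implements it: the function name and the `(name, created)` tuples are hypothetical, and the sketch only selects candidates without deleting anything.

```python
def select_for_purge(files, max_count):
    """Return the names of the oldest files beyond `max_count`.

    `files` is a list of (name, created) pairs, where `created` is any
    sortable timestamp. The newest `max_count` files are kept.
    """
    if max_count < 0:
        raise ValueError("max_count must not be negative")
    newest_first = sorted(files, key=lambda f: f[1], reverse=True)
    return [name for name, _ in newest_first[max_count:]]


# Illustrative values only: keep the two newest files, purge the rest.
spooled = [("RPT001", 1), ("RPT003", 3), ("RPT002", 2)]
print(select_for_purge(spooled, 2))
```

Separating the selection from the deletion makes it easy to review what a rule would remove before it runs unattended.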

Disk Information Available in Robot Monitor

The levels in the table below run from closest to hardware (top) to closest to users (bottom).

LEVEL (ROBOT MONITOR CAPABILITY): bulleted lines list the INFORMATION PROVIDED

PHYSICAL DISK (YES)
  • Physical disk failure
  • Cache battery remaining lifetime

RAID (YES)
  • Loss of RAID protection

ASP (YES)
  • ASP capacity used
  • ASP disk busy percent
  • ASP status

IMPORTANT FILES (YES)
  • Temporary storage consumption

QTEMP OBJECTS (YES)
  • Object count and object size
  • QTEMP object count and object size

OBJECT TYPE-SPECIFIC (YES)
  • Journal receiver status
  • Total journal receiver size
  • Output queue status
  • Spooled files count
  • Records count
  • Deleted records count
  • Remote journal lag

JOB (YES)
  • Disk I/O

 

In addition to real- and near-real-time information, Robot Monitor tracks object-level disk usage in daily intervals, enabling granular analysis and long-term reporting on disk space consumption.

Call us at 800-328-1000 or email [email protected] to set up a personal consultation to review your current setup and see how Robot Monitor can help you achieve your monitoring goals.