As a system engineer, I’ve done hundreds of demos and helped many companies set up their IBM i monitoring. After 13 years in this business, I’ve developed a pretty good idea of what people want to be most aware of on IBM i, and it’s this: Are my applications running OK?
This makes sense because, ultimately, IBM i is a very well-made tool for running your applications. In turn, you want to keep your applications running because applications are tools to make your end users happy.
In theory, asking whether your end users or customers are satisfied would be the best gauge of success, but trying to steer your business on that information alone would be too time-consuming and subjective. Instead, you turn to your applications and ask them how they are doing. That’s monitoring in a nutshell.
Monitoring can be done quickly and repeatedly, especially when a program does it. By contrast, your organization could never hope to survey your end users every 30 seconds.
Job Status Tells All
An application on IBM i is made up of jobs. There are many metrics that you can measure for those jobs, including:
- CPU usage
- Disk I/O
- Page faults
- Temporary storage
These are all valid and important metrics, but on IBM i we are able to know the single most important piece of information about a job: its job status.
Knowing a job’s CPU usage, disk I/O, temporary storage, and so on is fine, but what you really want to know is how the job is doing. The idea of the job status is so radical—Windows has nothing that can compare—and yet so simple that we sometimes take it for granted.
Application Monitoring Must-Haves
Now, with your application monitoring solution, you’re monitoring the job status, CPU usage, disk I/O, page faults, and temporary storage for your application’s jobs.
You’re checking the system’s disk to make sure it has enough space left for your application to move and do what it needs to do in order to make your end users happy.
You’re also monitoring things like the size of the journal receivers that your application fills up and using SQL-based monitoring to pull error information from your application’s files.
On top of that, when a job has an issue, you need to know right away. If you’re using a good application monitoring solution, you will.
For example, this is what a job in good shape looks like in Robot Monitor:
TRNPUSH is the name of a job related to the Misys Equation banking application. Robot Monitor has automatically translated the job’s good status (e.g., Dequeue Wait or Delay Wait) into the color green and the text “OK”.
If the job has an issue and goes into (Inquiry) Message Wait, this is what Robot Monitor shows within 30 seconds:
As you can see, the bad job status (e.g., MSGW) has been translated to the color red and a different text (“Error”), and the job has been flagged with an icon. In addition, a message was sent to an IBM i message queue:
If you’re also using a message management solution, the message can be pushed out automatically via email or SMS.
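To make the status translation described above concrete, here is a minimal sketch in Python. It is not how Robot Monitor is implemented; it only illustrates the idea of turning a raw job status into a color and a short text. DEQW, DLYW, and MSGW are the standard IBM i abbreviations for Dequeue Wait, Delay Wait, and Message Wait, but the function name, the groupings, and the gray "Unknown" fallback are my own assumptions.

```python
from typing import NamedTuple


class Display(NamedTuple):
    """What an operator would see for a job: a color and a short text."""
    color: str
    text: str


# Assumption for this sketch: these waits mean the job is idle but healthy.
GOOD_STATUSES = {"DEQW", "DLYW"}
# Assumption for this sketch: a message wait means the job is stuck on an inquiry.
BAD_STATUSES = {"MSGW"}


def translate_status(job_status: str) -> Display:
    """Translate a raw IBM i job status code into a color and text."""
    code = job_status.strip().upper()
    if code in BAD_STATUSES:
        return Display("red", "Error")
    if code in GOOD_STATUSES:
        return Display("green", "OK")
    # Hypothetical fallback: statuses this sketch does not know are not guessed at.
    return Display("gray", "Unknown")
```

Under these assumptions, a job like TRNPUSH sitting in DEQW would come back as green and “OK”, and the same job in MSGW would come back as red and “Error”.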
When it comes to notification, these are the three must-haves for your application monitoring solution:
- It must be quick.
- It must be clear.
- It must be multi-channel.
In addition, your solution’s method of monitoring should also be easy on resources, so you can have this kind of check performed on many jobs, again and again, without your applications suffering.
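The multi-channel requirement can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any product: the channel names, `format_alert`, and `notify_all` are hypothetical, and each channel is simply a function that accepts a message. The one design choice worth noting is that a failure on one channel does not stop delivery on the others.

```python
from typing import Callable, Dict, List

# A channel is any function that delivers a text message (hypothetical interface).
Channel = Callable[[str], None]


def format_alert(job_name: str, status: str) -> str:
    """Build a short, clear alert text for one job."""
    return f"Job {job_name} needs attention: status {status}"


def notify_all(message: str, channels: Dict[str, Channel]) -> List[str]:
    """Send the message to every channel and return the names that failed."""
    failed: List[str] = []
    for name, send in channels.items():
        try:
            send(message)
        except Exception:
            # Record the failure and keep going so the other channels still fire.
            failed.append(name)
    return failed


if __name__ == "__main__":
    # Stand-in channels that just print; real ones would write to a message
    # queue, send an email, or send an SMS.
    channels = {
        "message_queue": lambda msg: print("QUEUE:", msg),
        "email": lambda msg: print("EMAIL:", msg),
        "sms": lambda msg: print("SMS:", msg),
    }
    notify_all(format_alert("TRNPUSH", "MSGW"), channels)
```

Returning the list of failed channels lets the caller decide whether to retry or escalate, which matters when the notification itself is what you depend on.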
A Cautionary Tale
In the demos that I do, showing these three must-haves is one of the biggest “Ah ha!” moments for many people. This is probably because they are still suffering from what I call “Watcher in the Dungeon” syndrome. It goes like this:
Down in a dungeon deep below your business, behind a great number of reinforced doors, sits the Watcher. The Watcher’s job is to watch your applications. He is very good at this kind of subterranean green-screen monitoring. Whenever something goes wrong with a job, he sees it instantly. He then makes a note on a piece of paper, which he drops through a small slot in the door.
Twice a day, someone on the other side of the door picks up the pieces of paper, passes through the great number of reinforced doors, and ascends the stairs that separate the dungeon from the business above to hand the pieces of paper to you, thus making you aware of your application’s problem.
Unfortunately, this kind of twice-a-day, delayed notification is the best that checklist-based manual monitoring can do. Feels a little bit like the Dark Ages, doesn’t it? Conversely, you’ve now seen how a sophisticated monitoring solution like Robot Monitor brings color and clarity to your IBM i monitoring, from job status through to application availability, and we’ve barely scratched the surface.
The moral of the story? Don’t dungeon-monitor your applications.