Based in Las Cruces, New Mexico, and El Paso, Texas, Mesilla Valley Transportation is one of the largest locally owned truckload carriers in the U.S. The company began in 1982 and specializes in time-sensitive service between major manufacturing areas in the U.S., Canada, and Mexico.
Message Management Was a Long Haul
For Peter Scordamaglia, IBM i Administrator for Mesilla Valley, message management had become a time-consuming, round-the-clock problem. He and his counterpart were losing sleep over recurring overnight application outages, and the company was incurring higher costs or missing valuable business opportunities, for example when an outage hit the EDI system or the connection to the truck routing software.
The team was using a basic but functional product to monitor QSYSOPR—watching for jobs, lines, and devices—but it was generating thousands of messages each day, making it difficult to determine which messages needed action, and which were simply noise.
With 30 years of in-the-trenches IT experience, Scordamaglia figured there had to be a way to apply logic and automation that would make the response process simpler and easily repeatable. “If I have to do the same thing three or more times, it should be automated,” he said. “There were simple reactions the old software could take, but to get to the complex or programmatically intuitive reactions, it just couldn’t do what I wanted.”
Getting a Handle on an Unruly Message Queue
It turned out that 40 percent of the message activity for the IBM i platform was in the form of CPF9897 and CPF9898 notices generated from the transportation management system (TMS). But there was no way for Scordamaglia to react to these messages efficiently.
“That was the biggest hurdle,” he said. “For example, we were having connectivity problems to our routing software. This mission-critical system calculates the most efficient trucking route for drivers and the associated costs from point A to point B. Several times a week, typically at night, it stopped working.”
“Waking up every couple of hours to make sure the system was functioning was terrible,” Scordamaglia said. He knew he needed to automate the process so when a problem occurred, the program could make several attempts to rectify the situation before notifying him.
Implementing Monitoring and Automation That Goes the Distance
Scordamaglia and the team did their due diligence in looking for the right message queue monitoring solution. After going through an extensive RFP process with demos, they ultimately selected Robot Console from HelpSystems. The support ecosystem, the ability to have an IBM i expert on the other end of the phone, and the bounty of related IBM i software, as well as solutions for other platforms, made for a strong business case.
The value came quickly. With Robot Console, Scordamaglia could manage messages by exception, filtering out the noise and getting to the important items buried within thousands of messages. “That third or fourth night when the QSYSOPR message center had 40 lines in it instead of 40,000—that was the moment when I knew this easily would do everything I wanted. That alone was a night and day difference,” he said.
With Robot Console, Scordamaglia was able to see the important things that were going wrong and hide the messages that meant nothing. “Robot allowed me to filter the noise away, and I could start to focus on the messages that were left to start to see the patterns within those,” he said. It was a matter of picking the ones he saw most frequently and thinking logically about how to solve them with automation.
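The general idea of managing messages by exception can be sketched in a few lines of Python. This is only an illustration of the approach described above, not Robot Console's rule engine or API: the `Message` type, the function names, and the `HYP…` message IDs are all hypothetical placeholders.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A simplified queue message (hypothetical shape, for illustration only)."""
    msg_id: str
    text: str


def by_exception(messages, noise_ids):
    """Keep only the messages whose ID is NOT on the caller's noise list."""
    return [m for m in messages if m.msg_id not in noise_ids]


def most_frequent(messages, top=3):
    """Rank the remaining message IDs by how often they occur."""
    return Counter(m.msg_id for m in messages).most_common(top)


# Hypothetical sample data: the IDs below are placeholders, not real IBM i IDs.
queue = [
    Message("HYP0001", "informational chatter"),
    Message("HYP0002", "job waiting for reply"),
    Message("HYP0002", "job waiting for reply"),
    Message("HYP0003", "duplicate record write attempted"),
]
remaining = by_exception(queue, noise_ids={"HYP0001"})
candidates = most_frequent(remaining, top=1)
```

Once the noise is gone, the most frequent remaining IDs are the natural first candidates for an automated response.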
“I probably have 45 message rules that I’ve created,” he said. These run the gamut from ‘job waiting for reply’ and ‘don’t escalate’ inquiry messages to autocorrective actions for attempts to write duplicate records, and many more.
The add-on Robot Alert functionality has also made time-saving improvements in notifying the right people about important situations as they arise. Sometimes this means the after-hours help desk needs to be notified in addition to Scordamaglia’s team.
In particular, they like that Robot Console stops escalating alerts once the condition is gone. The previous software continued to escalate and notify people even after the issue was resolved, until someone acknowledged the condition, and that someone was almost always a member of the IBM i administrative team.
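The difference between the two escalation behaviors can be shown with a small Python sketch. It is a simplified model written for this article, not the logic of either product: `escalate`, `condition_active`, and the contact names are hypothetical.

```python
def escalate(condition_active, contacts, notify):
    """Walk an escalation list, stopping as soon as the condition clears.

    condition_active: callable returning True while the problem persists.
    contacts: ordered list of people to notify (hypothetical names).
    notify: callable that delivers one alert to one contact.
    Returns the list of contacts that were actually notified.
    """
    notified = []
    for contact in contacts:
        # Re-check before every step so a resolved issue pages no one else.
        if not condition_active():
            break
        notify(contact)
        notified.append(contact)
    return notified


# Hypothetical scenario: the problem clears after the first two checks.
checks = iter([True, True, False])
sent = []
result = escalate(
    condition_active=lambda: next(checks),
    contacts=["on-call admin", "help desk", "team lead"],
    notify=sent.append,
)
```

In this run only the first two contacts are alerted. A design that waits for a manual acknowledgment instead of re-checking the condition would have gone on to page the third.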
Achieving Critical Uptime for EDI and Other Systems
Other departments are benefitting from Robot Console as well. It’s now preventing errors that could shut down systems such as EDI and other mission-critical, third-party applications that interface with IBM i. If the team sees a particular error message in QSYSOPR, they can notify a specific admin or the TMS expert directly. For example, if the fuel system is down, the fuel analyst gets notified. Scordamaglia and his team have complete trust and peace of mind in the capability and reliability of Robot Console. There are no more overnight checks or unplanned downtime with these applications.
One of the biggest impacts has been with the EDI system. “The industry has a requirement for those companies using EDI that if they send you an EDI request, you must respond to that request within one hour or you lose the chance at that business,” Scordamaglia said. “So, if the EDI system has been offline from 2 a.m. to 7 a.m. or later, that’s five hours of potential business lost.” Now, the automated recovery process attempts to call the failed program to restart it. If this doesn’t work after three tries, the team is notified to take direct action.
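The retry-then-notify pattern described here is straightforward to express in code. The following Python sketch assumes a restart routine that reports success or failure; `recover_or_notify`, `restart`, and `notify` are hypothetical names, and the real recovery at Mesilla Valley runs through Robot Console rather than a script like this.

```python
from typing import Callable


def recover_or_notify(
    restart: Callable[[], bool],
    notify: Callable[[str], None],
    max_tries: int = 3,
) -> bool:
    """Try to restart a failed program; alert a human only if every try fails.

    restart: hypothetical callable that returns True when the restart worked.
    notify: hypothetical callable that sends an alert to the team.
    """
    for _ in range(max_tries):
        try:
            if restart():
                return True  # Recovered without waking anyone up.
        except Exception:
            # Treat an error during restart as one more failed attempt.
            pass
    notify(f"Restart failed after {max_tries} tries; direct action needed")
    return False


# Hypothetical scenario: the restart fails twice, then succeeds.
outcomes = iter([False, False, True])
alerts = []
recovered = recover_or_notify(lambda: next(outcomes), alerts.append)
```

Here the third attempt succeeds, so no alert is sent. Had all three failed, the team would have received exactly one notification.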
Driving Efficiency and Cost Savings to the Tune of $100,000 Per Week
Scordamaglia figures he’s personally saved about 80 hours over a three-month period with Robot Console. “I was checking subsystem and overall system functionality every four or five hours just to give myself peace of mind,” he said of the previously uptime-challenged integration points. He was also spending two hours almost every Sunday dealing with a consistent problem with the mileage product in addition to all the time lost during the work week sifting through messages.
He estimates that Robot Console is reducing potential revenue loss for Mesilla Valley by approximately $100,000 per week by keeping the TMS up and running consistently and ready to respond to tendered EDI jobs and similar requests.
“To me it’s simple logic—it’s how we should attack life!” Scordamaglia said of Robot Console’s message monitoring and automation prowess. “That’s what this whole thing is about: being able to logically divide up the problems and automate. That’s everything that Robot gave me.”
Maintain critical system uptime with Robot Console message monitoring software for IBM i.