Quick Q&A with DR Expert Debbie Saugen
Tom Huntington: By now, word has spread that Debbie Saugen has left IBM and brought her considerable experience to HelpSystems. But who is Debbie Saugen? What would you say is your IT mission?
Debbie Saugen: Having spent more than 37 years with IBM as Technical Owner of IBM i Backup/Recovery and the National Lead of IBM i Resiliency Services, I’m very passionate about business continuity on IBM i. It’s my desire that every customer I work with has the best solution for their business and is recovery ready. For myself, it’s absolutely the best feeling when an IBM i customer’s recovery is flawless and performed in the desired amount of time or less.
TH: Tell us a little about your new role at HelpSystems. What are HelpSystems business continuity services?
DS: My new role at HelpSystems is Director of Business Continuity Services. HelpSystems business continuity services are designed to provide expert advice, audits, and assistance in the IBM i business continuity arena. The services include backup and recovery assessments, business continuity architecture reviews, disaster recovery and role swap testing, and monthly auditing.
TH: We’re excited to have you on our team, and we hope our customers will see the unique value your expertise adds in this area. What are some examples of the items you would be able to review for them?
DS: Thanks Tom, happy to be here! I will review their daily, weekly, and monthly backups to evaluate their recovery readiness and identify any backup and recovery issues. I’ll also investigate alternative technologies to improve RTOs and RPOs and provide a detailed report of my findings along with recommendations to improve their current backup and recovery process.
TH: You mention recovery time objectives (RTO) and recovery point objectives (RPO). Why are these important?
DS: RTO refers to how long it will take a business to get up and running once it goes down due to a number of reasons, including hardware failure, software failure, user error, or a natural disaster. Recovering your data from tape on IBM i typically takes 12 to 24 hours once your tapes become available. Recovery with replicated systems is quicker, typically 15 minutes to a few hours.
RPO refers to the age of the data that must be recovered in order to resume normal business operations. This could be minutes, hours, or days and is a determining factor in how often a business needs to back up or replicate its data.
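To make the two definitions concrete, here is a minimal Python sketch of how a shop might check a strategy against its targets. It assumes the worst-case data loss equals the time between backups or replication cycles. The function names, the 4-hour RPO, the 8-hour RTO, and the one-minute replication interval are hypothetical values chosen for illustration. The 12-hour tape recovery and 15-minute replicated recovery are the low ends of the typical ranges quoted above.

```python
from datetime import timedelta

def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    # Assumption: worst-case data loss equals the gap between backups
    # (or replication cycles), so that gap must not exceed the RPO.
    return backup_interval <= rpo

def meets_rto(estimated_recovery: timedelta, rto: timedelta) -> bool:
    # The estimated time to restore service must not exceed the RTO.
    return estimated_recovery <= rto

# Hypothetical business targets, not taken from the interview.
rpo = timedelta(hours=4)
rto = timedelta(hours=8)

# Nightly tape backup: 24 hours between saves, 12-hour recovery.
nightly_tape = (
    meets_rpo(timedelta(hours=24), rpo),
    meets_rto(timedelta(hours=12), rto),
)

# Replicated system: assumed one-minute replication lag, 15-minute recovery.
replication = (
    meets_rpo(timedelta(minutes=1), rpo),
    meets_rto(timedelta(minutes=15), rto),
)

print(nightly_tape)  # (False, False)
print(replication)   # (True, True)
```

With these assumed targets, the nightly tape strategy misses both objectives and the replicated system meets both. Real figures would come from a recovery test, not from an estimate.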
TH: Speaking of backing up, I’ve heard you say many times that a backup is not a backup without testing. Why is that?
DS: You should design your backup strategy based on the size of your backup window. You should also design your recovery strategy while you are designing your backup strategy to ensure that your backup strategy meets your system recovery needs.
The final step in designing a backup strategy is to test a full system recovery. This is the only way to verify that you’ve designed a good backup strategy that will meet your system recovery needs. Your business may depend on your ability to recover your system.
TH: You’ve been helping IBM i shops recover for many years now. What kind of things have you seen that cause IBM i backups to go awry?
DS: I’ve seen situations where critical data was not being backed up because it was never included in the backup or because there were object locks when the backup occurred. Some customers never back up everything that would be required to perform a full system recovery because they simply do not have the backup window. Unfortunately, sometimes this isn’t brought to the attention of the business until an actual disaster occurs.
Where IBM i shops have added a virtual tape library (VTL), I’ve seen teams turn off saving the access paths so that the save will complete within their backup window. The problem with this is that the access paths must then be rebuilt during recovery, so the system recovery no longer meets the customer’s RTO.
There are alternative strategies and solutions that can be implemented to resolve these issues. Companies go out of business when their data can’t be recovered or they suffer huge losses because the data cannot be recovered in a timely manner.
TH: Interesting observation about VTL. What are some other technology problems you’ve seen?
DS: Tape media for backup on IBM i has become extremely fast and is still the least expensive solution, but it needs to be managed and handled appropriately. I’ve seen unlabeled tapes, broken tapes, tapes inserted in a drive with yellow sticky labels, melted tapes, and wet tapes.
An important part of any recovery strategy is storing the tapes in a safe yet accessible location. You must ensure that the tape volumes have external labels and are well organized so you can locate them easily when the time comes. For disaster recovery, you should store a complete set of your backups at a safe, accessible location away from your site.
TH: What role does/should high availability have in business continuity?
DS: High availability is the key to eliminating IBM i data loss and to reducing recovery times to one hour or less. It can also eliminate your backup windows since it provides the capability to perform your tape backups on the replicated system.
TH: How long is a typical business continuity services engagement?
DS: A typical engagement to review your daily, weekly, and monthly backups and evaluate your recovery readiness is two days. A business continuity architecture review for existing backup and recovery issues—which includes an investigation into alternative technologies that could improve backup and recovery processes as well as your RTO and RPO—is five days. I can also assist during your disaster recovery testing or role swap testing, which typically is one to two days.
TH: Thanks so much for talking with me, Debbie! One last question: how does a customer get started?
Partner with business continuity experts to gain confidence in your backup strategy and ensure that you are ready to recover when disaster strikes.