Virtual I/O Server (VIOS) has been around since 2004 and was initially available on IBM POWER5 servers running AIX and Linux workloads. It first surfaced for IBM i workloads on IBM Power Systems in 2008 with the announcement of POWER6 and is now considered a standard in many organizations running IBM i, AIX, and Linux workloads.
What Is VIOS?
VIOS is a special-purpose logical partition that hosts physical I/O resources and virtualizes them for client logical partitions (LPARs). It really comes into its own when implementing external storage (SAN) and the increasingly popular and powerful Live Partition Mobility (LPM). It helps maximize the use of physical resources that are often underutilized on the system.
Typically, you would have one or more VIOS partitions on each physical server. VIOS looks like AIX, but the command set available to the padmin user is heavily restricted, and the interface is often unlike anything administrators have previously encountered. Over time, many functions that traditionally required the VIOS command line have become available through the Hardware Management Console (HMC).
Is VIOS Important?
Yes! It gives you the flexibility of being able to share resources across multiple client LPARs. By sharing resources such as fibre channel adapters, network adapters, and external SAN-housed disks, you can dramatically reduce the physical footprint and the power consumed—very relevant in today’s green-thinking age.
By implementing pairs of VIOS partitions, you can also build in redundancy so that a single VIOS failure does not take client LPARs down with it. Learn more with the IBM developerWorks VIOS cheat sheet.
What About VIOS Monitoring?
You should absolutely be monitoring VIOS.
It is good practice to build some redundancy into your VIOS configuration so that client LPARs have the ability to use multiple VIOS. Once you’ve implemented this, in the event of a VIOS failure—or if you were to take it down for planned maintenance—the client LPARs would remain active.
But how would you know if one of the pair had failed? Worse still, what would the consequences be if the second in the pair failed before the first was back online? You would suffer from client LPAR failures resulting in the downtime of key business-critical applications.
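The scenario above can be reduced to a small decision rule. The following is a minimal sketch, not a feature of VIOS or the HMC: the `Severity` levels and the `pair_severity` function are hypothetical names, and it assumes some other check already tells you whether each VIOS in the pair is up.

```python
from enum import Enum


class Severity(Enum):
    """Hypothetical alert levels used only for this illustration."""
    OK = "ok"
    WARNING = "warning"
    CRITICAL = "critical"


def pair_severity(vios_a_up: bool, vios_b_up: bool) -> Severity:
    """Classify the state of a redundant VIOS pair.

    Both up   -> OK: redundancy is intact.
    One up    -> WARNING: clients still run, but redundancy is lost.
    Both down -> CRITICAL: client LPARs lose their virtual I/O.
    """
    if vios_a_up and vios_b_up:
        return Severity.OK
    if vios_a_up or vios_b_up:
        return Severity.WARNING
    return Severity.CRITICAL


if __name__ == "__main__":
    # One VIOS is down for maintenance: clients stay up, but raise a warning.
    print(pair_severity(True, False))
```

The point of the middle state is that a single failure is invisible to the client LPARs, so it has to be raised by monitoring rather than noticed by users.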
What Should I Monitor?
As VIOS owns the hardware, it’s imperative that you monitor for permanent hardware errors. In addition, a rising number of temporary hardware errors gives you a valuable early warning of potentially catastrophic situations.
The underlying layer of software used to manage the VIOS partition is extremely restricted in what it lets you do. This layer should remain largely unchanged, so any anomalies should be escalated immediately.
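One way to express this rule is sketched below. It assumes the error log entries have already been collected and classified by some other means; the `ErrorEvent` record, its fields, and the `assess_hardware_errors` function are hypothetical names for illustration, not part of any VIOS tooling.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ErrorEvent:
    """Hypothetical, already-classified error log entry."""
    hardware: bool   # True if the entry is a hardware error
    permanent: bool  # True for a permanent error, False for a temporary one


def count_temporary_hw(events: Sequence[ErrorEvent]) -> int:
    """Count temporary hardware errors in a batch of events."""
    return sum(1 for e in events if e.hardware and not e.permanent)


def assess_hardware_errors(previous: Sequence[ErrorEvent],
                           current: Sequence[ErrorEvent]) -> str:
    """Compare two monitoring intervals and return an alert level.

    'critical' if the current interval has any permanent hardware error,
    'warning'  if temporary hardware errors increased since the previous one,
    'ok'       otherwise.
    """
    if any(e.hardware and e.permanent for e in current):
        return "critical"
    if count_temporary_hw(current) > count_temporary_hw(previous):
        return "warning"
    return "ok"


if __name__ == "__main__":
    last_hour = [ErrorEvent(hardware=True, permanent=False)]
    this_hour = [ErrorEvent(hardware=True, permanent=False)] * 3
    print(assess_hardware_errors(last_hour, this_hour))  # temporary errors rose
```

Comparing intervals rather than using a fixed count is a design choice here: it flags the upward trend the text describes, whatever the normal background level is on a given system.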
Although VIOS requires minimal processing power, a sustained high percentage of processor utilization over a period of time can be a tell-tale sign of a looping process. When monitoring processor activity, short spikes should be ignored: that is simply the processor doing its job.
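The difference between a spike and sustained load can be made concrete with a simple rule: alert only when every one of the most recent samples is above a threshold. This is a minimal sketch under assumed values; the function name, the 90% threshold, and the window of five samples are illustrative choices, not recommendations from IBM.

```python
from typing import Sequence


def sustained_high_cpu(samples: Sequence[float],
                       threshold: float = 90.0,
                       window: int = 5) -> bool:
    """Return True only if the last `window` samples are all >= threshold.

    `samples` are CPU utilization percentages in chronological order.
    A single spike never triggers an alert, because one lower sample
    inside the window is enough to return False.
    """
    if window <= 0:
        raise ValueError("window must be a positive number of samples")
    if len(samples) < window:
        return False  # not enough history to call the load sustained
    return all(s >= threshold for s in samples[-window:])


if __name__ == "__main__":
    spike = [12.0, 15.0, 99.0, 14.0, 11.0]
    looping = [20.0, 95.0, 97.0, 96.0, 98.0, 99.0]
    print(sustained_high_cpu(spike))    # False: one spike is ignored
    print(sustained_high_cpu(looping))  # True: five high samples in a row
```

The window length multiplied by the sampling interval defines what "sustained" means, so both should be tuned together.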
Both paging space used and paging space available should be closely monitored, as either could affect the response times of the VIOS partition and, ultimately, its availability.
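Because the two measures are checked independently, a monitor can report on each separately. The sketch below assumes the percentage used and the free space in MB have already been gathered; the function name and both default thresholds are hypothetical and would need tuning for a real system.

```python
from typing import List


def paging_alerts(percent_used: float,
                  free_mb: float,
                  max_percent_used: float = 70.0,
                  min_free_mb: float = 512.0) -> List[str]:
    """Return one message per paging condition that breaches its limit."""
    alerts = []
    if percent_used > max_percent_used:
        alerts.append(
            f"paging space used {percent_used:.0f}% exceeds {max_percent_used:.0f}%"
        )
    if free_mb < min_free_mb:
        alerts.append(
            f"paging space available {free_mb:.0f} MB is below {min_free_mb:.0f} MB"
        )
    return alerts


if __name__ == "__main__":
    for message in paging_alerts(percent_used=85.0, free_mb=300.0):
        print(message)
```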
VIOS writes a good deal of information to log files, and the downside is that those log files take up valuable space. Over time, the space they occupy increases and, unless tracked, could lead to a full file system.
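Tracking growth makes it possible to estimate how long a file system has left. The following is a rough sketch that assumes one usage reading per day and straight-line growth between the first and last reading; `days_until_full` is a hypothetical helper, and real log growth is rarely this regular.

```python
from typing import Optional, Sequence


def days_until_full(daily_used_mb: Sequence[float],
                    capacity_mb: float) -> Optional[float]:
    """Estimate days until a file system fills, from daily usage readings.

    Uses the average daily growth between the first and last reading.
    Returns None when there are fewer than two readings or when usage
    is flat or shrinking, since no fill date can be projected.
    """
    if len(daily_used_mb) < 2:
        return None
    growth_per_day = (daily_used_mb[-1] - daily_used_mb[0]) / (len(daily_used_mb) - 1)
    if growth_per_day <= 0:
        return None
    remaining_mb = max(capacity_mb - daily_used_mb[-1], 0.0)
    return remaining_mb / growth_per_day


if __name__ == "__main__":
    readings = [100.0, 110.0, 120.0]  # MB used on three consecutive days
    print(days_until_full(readings, capacity_mb=200.0))
```

A projection like this complements a plain percentage threshold: it can raise a warning while the file system still looks comfortably empty but is filling quickly.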