Virtualization Best Practices
There are two common avenues organizations can follow to exploit virtualization. The first is primarily motivated by cost and environmental savings. The second is one of dynamic routing of transactions and images to provide massively scalable operating environments.
If your motivator is primarily cost savings and you aren’t pursuing the massively scalable environments using tools such as DRS, then there are significant additional savings to be had after your initial consolidation effort. Perhaps consolidating virtual machine instances—running more than one app per virtual machine—is something you should consider.
Unfortunately, many people don’t realize that virtualization is not a cure-all. There is still substantial money left on the table after the work. Yes, virtualization enables many quick wins, provides substantial cost and environmental benefits to the business and is easy to accomplish, but it doesn’t eliminate costs associated with staffing, management and software, all of which comprise a substantial part of IT budgets. In fact, many virtualization adopters have exacerbated the problem with an explosion of new system images, all on the assumption that they are “free” because no additional hardware is needed, forgetting that hardware is only a part of the IT cost equation. The explosion of new images is commonly called “virtual server sprawl” and it can have serious implications for IT budgets from a staffing, management and software license cost perspective.
It may be the case that your organization needs dynamic routing to improve customer service during peak periods, such as holiday seasons for retail establishments. Or you may need to facilitate rapid responses to changing business conditions in order to gain competitive advantage. If that is the case, costs are not as important as flexibility and speed to market. Whether you employ dynamic routing or are using virtualization to save costs, the first part of this paper will help you understand the components of costs associated with managing and sustaining virtual servers. The remaining sections, which are about consolidating virtual machines, will help those interested in further reducing costs to understand the work that needs to be accomplished to be successful.
So how do we extend the gains we accomplished by implementing a virtualization strategy? The good news is that you can reuse most, if not all, of the work you did implementing virtualization, adding more in-depth analyses to take virtualization to the next level: consolidating images. The more detailed work entails understanding how the various applications and services interact with each other and the IT resources they consume. The savings come from at least three areas: software licenses, management tasks and staffing resources.
According to a recent Forrester study “The State of Enterprise IT Budgets: 2009,” software licenses, on average, comprise about 20% of the IT budget. There are many different aspects to consider when analyzing the impacts and risks of software licensing. Arguably the easiest to address is the operating system (OS) itself: Windows-based, UNIX, Solaris, and Linux to name just a few.
Each image requires an OS to run programs and deliver IT services. In addition, each image requires the use of third-party products: software programs that enhance the usability of the operating system and provide essential value-added services such as security, backup and recovery and management capabilities. Examples of third-party products include antivirus/firewall, backup/restore, remote management, tape management, performance monitoring, heartbeat monitoring, troubleshooting, and patch management/change propagation utilities.
Taking software licensing considerations a step further, there are application program aspects as well. Examples are vendor-supplied database systems, transaction processing systems, financial applications (chargeback, general ledger) and business-specific applications such as Customer Relationship Management (CRM), order entry, claims adjustment, fulfillment and distribution applications, to name just a few.
Unless unlimited enterprise-wide contracts are negotiated with your vendors, limits on the use of operating systems, third-party products and business applications are in place, and managing them can be daunting, especially if the numbers and types of systems are many and vary during the year. However, in most cases, software costs can be made more manageable, and reduced, by consolidating images.
There are numerous tasks associated with managing the many and assorted virtualized images. These tasks generate work (overhead and staffing), which in turn generates costs. Some of the more common tasks performed by System Administrators, Technical Support and Computer Operations on each image are:
- Monitoring: If you don’t want to wait for calls from your customers reporting service outages, you need to monitor systems. Generally there are two types of monitoring: heartbeat and performance. Heartbeat monitoring ensures the server is operational. Performance monitoring ensures that applications and IT services are being delivered at expected levels.
- Maintenance: Operating system, third-party and application software do not remain static. Updates are continually made to improve security, reliability, and functionality. This means coordinating the work and managing the change. The more images you have, the more changes need to be made.
- Security: Depending upon the number of users, access control can be a labor-intensive task. In addition to managing users and their access to information, security staff also manage anti-virus utilities and firewall configurations. Both are important to keeping information available to business users, secure and away from unauthorized eyes.
- Back-up and Recovery: Backing up information on a regular basis is just plain good business practice, and virtual machines are no exception. Work is required to manage the backup/recovery tasks for every virtual or physical server and to maintain the DR plan, which must be revisited as changes are made to the environment.
- Tuning: Tuning is the process by which you identify ways to improve the performance of systems and applications and commission the implementation of the efficiencies. This work usually only gets initiated when performance problems are experienced. Tuning can substantially reduce the amount of computing resources required to perform a specific piece of work.
- Reporting: Since maintenance tasks are being performed by people, reporting needs to be done to provide status and results. Reporting is generally performed on a monthly basis and compares actual results against expectations. Reporting also gives management a good idea of the health of images and systems, usually reflected in the number of trouble tickets generated.
As you can see, there is more to maintaining systems than meets the eye and the work is not trivial.
According to “The State of Enterprise IT Budgets: 2009” by Forrester, staffing, on average, comprises about 30% of the IT capital budget. It takes people to perform the previously mentioned maintenance tasks. It also takes people to perform troubleshooting when things go wrong. There are rules-of-thumb for support personnel ratios — usually expressed in terms of servers per system administrator. With virtualization, that ratio is usually expressed in terms of virtual server images per system administrator. The ratio is dependent upon the different workloads and activity levels of the servers. Hardware and software age and maintenance levels also affect the ratios.
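As a rough illustration of how an images-per-administrator ratio translates into staffing, the following minimal sketch rounds the image count up to whole positions. The function names and the ratio of 40 images per administrator used in the usage note are hypothetical; substitute the ratio that reflects your own workloads and maintenance levels.

```python
import math


def admins_needed(image_count: int, images_per_admin: int) -> int:
    """Administrator positions required, rounded up to whole positions."""
    if images_per_admin <= 0:
        raise ValueError("images_per_admin must be positive")
    return math.ceil(image_count / images_per_admin)


def staffing_reduction(images_before: int, images_after: int,
                       images_per_admin: int) -> int:
    """Difference in administrator positions before and after consolidation."""
    return (admins_needed(images_before, images_per_admin)
            - admins_needed(images_after, images_per_admin))
```

For example, under the assumed ratio of 40, calling `staffing_reduction(200, 120, 40)` compares five positions before consolidation with three afterwards.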
So as you can see, each image, independent of the hardware platform, requires some level of care and feeding that generates both work and cost. By using proven consolidation techniques, you can reduce the number of images, providing additional savings to the organization. It takes work, but the additional efficiencies and cost savings to the business make the work well worthwhile. Let’s look at the steps needed to reach your goal of reducing the number of images needed by your organization.
If you have reached this point, I assume you are looking to find more savings in your virtualized infrastructure. The following steps are derived from best practices and success stories we have seen both in our customer base and in the industry as a whole.
As in any successful endeavor, planning is important. The first step is to review the virtualization planning data your organization uncovered during the virtualization project. It will contain some of the same data needed for virtual consolidations – things like hours of operation and maintenance windows. Reviewing this information will help you set your consolidation goals.
Besides defining goals, you also need to define metrics for measuring success. Generally these metrics cover number of servers decommissioned, software license reduction and cost savings.
Your consolidation plan should, at a minimum, contain the following steps:
- Survey, measure and inventory
- Select promising candidates
- Determine best fit – easiest through use of analytic modeling
- Commission the work – it takes teamwork to do the work
- Monitor progress, measure results and publish them
Once the plan is completed, the work approved and the project team assembled, you can start the consolidation work.
Survey, Measure and Inventory
Before you can start consolidating servers, you need to know what you are dealing with. You need to develop an inventory of the images, the resources they consume (and when), maintenance requirements and times/days of operation. Without that information, you cannot accurately select candidate systems for consolidation, and the Predict Consolidation Effects step below will take substantially longer.
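One way to keep this inventory in a form that later steps can analyze is a simple record per image. The sketch below is only an illustration: the class name, field names and the 0.5 "busy" threshold are assumptions, not a prescribed schema, and a real inventory would likely also track I/O, network usage and installed software.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """One inventory entry for a virtual server image (illustrative fields)."""
    name: str
    os_name: str
    # Average CPU utilization (0.0 to 1.0) per hour of day; index 0 is midnight.
    hourly_cpu: list[float]
    memory_gb: float
    # Hours of the day (0-23) in which maintenance may be performed.
    maintenance_hours: set[int] = field(default_factory=set)
    # Days of operation, for example {"Mon", "Tue"}.
    operating_days: set[str] = field(default_factory=set)

    def busy_hours(self, threshold: float = 0.5) -> list[int]:
        """Hours in which CPU utilization meets or exceeds the threshold."""
        return [h for h, u in enumerate(self.hourly_cpu) if u >= threshold]

    def peak_cpu(self) -> float:
        """Highest hourly CPU utilization recorded for this image."""
        return max(self.hourly_cpu)
```

Recording resource usage by hour, rather than as a single daily average, is what makes it possible to spot images whose busy periods do not overlap.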
You need to measure current service delivery metrics. Without the “before” metrics, you have no way of measuring your success once the consolidation work has been completed. Ideally you would provide the same or better levels of service than before the work. Sometimes, however, it may be acceptable to slightly reduce levels of service if really substantial reductions in platforms can be accomplished, resulting in remarkable cost savings.
To accomplish software license reductions, we recommend that you start by working with your contracts administrator to understand the licensing limits of current agreements and the possible impacts of reducing or increasing the number of machines and images that can be propagated. It is also important to remember that it is cheaper to do your homework upfront and manage to the numbers than face Digital Millennium Copyright Act (DMCA) penalties or additional vendor fees sometime in the future. Once the bounds and limits of your licensing are known, you can plan steps to manage and reduce their costs through image consolidation.
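Once your contracts administrator has supplied the licensed limits, comparing them with what is actually deployed is simple bookkeeping. The following sketch assumes you can express both as counts per product; the function names and product keys are hypothetical.

```python
def license_headroom(limits: dict[str, int],
                     in_use: dict[str, int]) -> dict[str, int]:
    """Remaining licensed instances per product; negative means over the limit."""
    return {product: limits[product] - in_use.get(product, 0)
            for product in limits}


def over_limit(limits: dict[str, int], in_use: dict[str, int]) -> list[str]:
    """Products whose deployed count exceeds the licensed limit, sorted by name."""
    headroom = license_headroom(limits, in_use)
    return sorted(product for product, room in headroom.items() if room < 0)
```

Running the same comparison against the image counts you expect after consolidation shows which agreements could be reduced at renewal.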
Select the Candidates
Now that you have the inventory, you can start the process of finding candidates. The ideal candidates would be two different applications, one that runs during the business day and one that runs at night. Not-so-ideal candidates would be similar network-intensive or I/O-intensive applications that would compete for the same resources during the same time periods. You will find some “perfect” matches, but the majority of consolidations will involve applications with similar operating conditions. In those cases, some of the important considerations for finding best fit are:
- Application & OS maintenance considerations — hours & frequency
- Hours of operation
- User profiles and locations
- DLLs and other similar operating system components
- CPU/memory/buffer usage profiles
- Business impacts to take server(s) down for some reason
- Choreographing software version changes
- Staffing impacts and costs if all servers have to be updated concurrently — i.e. OS or database version upgrade
- Peak usage cycles and times
- Contention potential
- Impacts as result of growth
- Disaster and component recovery impacts and considerations
Use your experience to judge each set of candidates. Experience often plays a big role in selecting the right systems to combine. In watching the ebb and flow of work on systems over the years, you gain a feel, or intuition, for how work flows and interacts with different applications. These experiences can help you winnow out potential candidates and reduce the amount of analysis work.
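A first-pass screen can complement that experience. The sketch below looks only at hourly CPU profiles and ranks pairs of images by the peak of their combined utilization, so a daytime application paired with a nighttime one scores well. It is an assumption-laden illustration: the 0.8 ceiling and the function names are hypothetical, and the other considerations listed above (maintenance windows, memory, recovery impacts and so on) still need to be checked separately.

```python
from itertools import combinations


def combined_peak(profile_a: list[float], profile_b: list[float]) -> float:
    """Peak of the hour-by-hour sum of two utilization profiles."""
    return max(a + b for a, b in zip(profile_a, profile_b))


def rank_pairs(profiles: dict[str, list[float]],
               ceiling: float = 0.8) -> list[tuple[str, str, float]]:
    """Candidate pairs whose combined peak stays at or below the ceiling.

    Pairs are returned with the lowest combined peak first.
    """
    scored = [
        (a, b, combined_peak(profiles[a], profiles[b]))
        for a, b in combinations(sorted(profiles), 2)
    ]
    return sorted((s for s in scored if s[2] <= ceiling), key=lambda s: s[2])
```

Pairs that peak in the same hours are filtered out by the ceiling, which mirrors the "not-so-ideal" candidates described earlier.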
Once you have finished your selection process, you should have a pretty large list of potential opportunities for consolidation and reduction of images. It is now time to analyze the information you gathered on the images and predict service performance after the candidate consolidations have taken place.
Predict Consolidation Effects
This is arguably the most important step in the process. The goal of this step is to predict the effects of consolidating services and applications onto a single virtual server or a cluster of virtual servers. Some of the work should be done at a small scale in the testing or QA lab. Performance monitors should be active, capturing data from the tests and saving it for further analysis. Once all pertinent data is collected, analytic modeling tools should be employed to take the small-scale test lab performance/usage data and predict how systems will perform when workloads are scaled to production transaction levels on production systems.
Why pre-production predictions? Evaluating the same diverse set of scenarios with a brute-force load-testing approach would call for extensive effort in terms of software licenses, allocation of hardware resources and many person-hours. Using the power of software-driven analytic modeling, predictions become swift and inexpensive – and won’t limit the number of different scenarios you are able to evaluate.
Analytic modeling permits you to verify that your selections will play well together. And using analytic modeling you can easily find any bottlenecks and points of contention that might negate this iteration and one of your choices. Modeling also permits you to perform sensitivity analyses to see the impacts of different levels of business growth on the consolidated image.
From our experience, you will change some of your consolidation and configuration choices because of information analytic modeling reveals. Once you have validated performance using modeling, you can develop a list of your final selections and proceed with the work of consolidation.
Commission Consolidation Work
By now you should have a good comfort level with how applications will perform once consolidated. It is time to proceed with consolidating your final selections. Since you usually cannot do all the work by yourself, it is necessary to commission a project team consisting of members of the affected application and support teams. Moving applications requires careful coordination, especially where critical changes such as database location descriptions (AKA pointers) are concerned, so comprehensive project plans will be necessary.
Once project plans are completed and change control processes navigated, you are ready to begin the physical work of consolidation. Each member of the project team will have their relocation tasks. Each team member should be very clear on their tasks and the coordination/timing of the moves. Your capacity planning team should have a monitoring role in the work, validating service performance soon after the consolidation tasks have been completed to ensure the original service goals are met and to quickly address any unforeseen performance problems.
Monitor, Measure, Report
If you remember, in the very first step we chose metrics and measured levels of service before we started. That information now comes into play. We take the same measurements on the combined systems and compare the before and after results. Ideally, service quality and performance will be unchanged or improved. You now take that information and develop your management reports. By publishing results to management, you build credibility in your work. It is also a good public relations opportunity, permitting you to improve relationships with business units and other IT departments.
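The before-and-after comparison can be sketched as follows, assuming each metric is a response time where lower is better. The 5% tolerance, the status labels and the metric names are hypothetical choices for illustration; use whatever thresholds your service goals define.

```python
def compare_service_levels(before: dict[str, float],
                           after: dict[str, float],
                           tolerance: float = 0.05) -> dict[str, tuple[float, str]]:
    """Compare response-time metrics measured before and after consolidation.

    Returns metric -> (fractional change, status). Changes within the
    tolerance band are reported as "unchanged".
    """
    report = {}
    for metric, baseline in before.items():
        change = (after[metric] - baseline) / baseline
        if change > tolerance:
            status = "degraded"
        elif change < -tolerance:
            status = "improved"
        else:
            status = "unchanged"
        report[metric] = (change, status)
    return report
```

The resulting table of changes and statuses can feed directly into the management report.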
This may sound like a lot of work, and it is, but you can reduce risks of potential contention problems through careful planning. Cost savings can be considerable, easily justifying the work. The work requires patience and discipline, but a well-run project can provide impressive returns to the business.