While searching for additional info on the Dynamic Memory feature of Hyper-V 2012, I came across various articles that discuss this improved Hyper-V v3 feature. Dan Stolts of the IT Pro Guru Blog published a good article on the concept of Dynamic Memory and how Hyper-V uses this technique to offer more flexible memory options for running virtual machines.
Dynamic Memory is frequently compared to vSphere memory optimization techniques. Unfortunately, quite a few misconceptions are going around on this topic (including in Dan Stolts’ article), so I thought it would be interesting to investigate the subject and publish the results on viktorious.nl. Time to clear up some things!
Read on to learn more about Dynamic Memory and how it compares to vSphere 5.1 memory optimization techniques…
Hyper-V 2012 Dynamic Memory
Dynamic Memory, as stated before, is built into Hyper-V version 3, which is part of Windows Server 2012. It is an improved version of the technology first available in Hyper-V version 2. It’s a bit oversimplified, but one could compare Dynamic Memory (as a concept) with vSphere’s memory ballooning feature. This comparison is not 100% valid…but it gives you an impression of what it does.
Dynamic Memory is used to reallocate memory between virtual machines that are running on a Hyper-V host. The idea behind this concept is to give virtual machines only the memory they need. So, if a virtual machine has a peak usage of 4 GB but only for 5% of the time, Hyper-V is capable of reallocating part of this virtual machine’s memory to other virtual machines for the remaining 95% of the time. Important here: Hyper-V will never over-commit, that is, it will never provide more memory to the virtual machines than is physically available. Dynamic Memory is just about memory reallocation.
The advantage here is that if the memory peaks of the virtual machines are spread out over time, Hyper-V can move memory to the virtual machines that are peaking and take it away from virtual machines that are less memory demanding at that moment.
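To make the "reallocate, but never over-commit" idea concrete, here is a minimal Python sketch. The function name, the VM names and the proportional scale-down rule are assumptions made for illustration only; this is not Hyper-V's actual balancing algorithm, which also takes buffer and weight settings into account.

```python
def reallocate(physical_mb: int, demand_mb: dict) -> dict:
    """Assign memory by demand without ever exceeding physical memory.

    Hypothetical helper: proportional scaling is an assumption of this sketch.
    """
    total = sum(demand_mb.values())
    if total <= physical_mb:
        # Enough physical memory: every VM gets what it currently needs.
        return dict(demand_mb)
    # Demand exceeds the host: scale every VM down so the sum still fits.
    return {vm: need * physical_mb // total for vm, need in demand_mb.items()}


HOST_MB = 8192
# Peaks at different times: both situations fit in the same 8 GB host.
morning = reallocate(HOST_MB, {"vm-a": 6144, "vm-b": 2048})
evening = reallocate(HOST_MB, {"vm-a": 2048, "vm-b": 6144})
# Peaks at the same time: nobody gets the full peak, but nothing is over-committed.
clash = reallocate(HOST_MB, {"vm-a": 6144, "vm-b": 6144})
```

As long as the peaks do not coincide, both virtual machines get their full peak amount from the same physical memory; when they do coincide, the total handed out is still capped at what the host physically has.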
Note: Dynamic memory is only available for Windows based virtual machines. More specifically: Windows 2003 SP2 and higher, Windows 7 and Windows 8. Linux cannot leverage the technology.
Dynamic Memory Configurable Options
To set some boundaries, you have the following configurable options for Dynamic Memory:
- Startup RAM – This is the amount of memory the virtual machine will be started with.
- Minimum RAM – This is the lowest amount of memory the guest OS will be allowed to use. Hyper-V will never take away more memory than allowed by this value. This is a new setting for Windows 2012 Dynamic Memory. Minimum RAM is typically the same as or lower than Startup RAM.
- Maximum RAM – This is the largest amount of RAM that Hyper-V will give to the guest; it determines the maximum share for a virtual machine. Maximum RAM is typically higher than Startup RAM (and of course Minimum RAM).
- Memory buffer – The memory buffer specifies how much extra memory a virtual machine will get on top of the memory actually required by the virtual machine (OS + applications).
- Memory weight – Memory weight determines the relative priority of a virtual machine when memory is scarce and Hyper-V cannot satisfy all memory requests. Compare this to ‘memory shares’ in vSphere.
You will need knowledge of the memory requirements of the operating system and applications running in the guest to set the correct values for all the virtual machines that are using Dynamic Memory. Dynamic Memory will only work after configuring these values. A virtual machine will start up with “Startup RAM” GB of memory; this amount of memory can be increased to a maximum of “Maximum RAM” GB of memory. Decreasing the available memory is achieved through the synthetic memory driver, which should be available in the guest OS. You can compare this driver with the balloon driver available on ESXi.
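The way these settings interact can be sketched in a few lines of Python. The class and function names below are hypothetical, the default weight is an arbitrary illustrative value, and the "demand plus buffer percentage, clamped between minimum and maximum" rule is a simplification for illustration rather than Hyper-V's exact calculation.

```python
from dataclasses import dataclass


@dataclass
class DynamicMemorySettings:
    """Hypothetical model of the per-VM Dynamic Memory settings (values in MB)."""
    startup_mb: int
    minimum_mb: int
    maximum_mb: int
    buffer_percent: int   # extra headroom on top of the current demand
    weight: int = 5000    # relative priority; illustrative value only

    def validate(self) -> None:
        # Minimum RAM <= Startup RAM <= Maximum RAM, as described in the list above.
        if not (self.minimum_mb <= self.startup_mb <= self.maximum_mb):
            raise ValueError("expected minimum <= startup <= maximum")


def target_memory_mb(settings: DynamicMemorySettings, demand_mb: int) -> int:
    """Demand plus buffer, clamped to the configured minimum and maximum."""
    wanted = demand_mb + demand_mb * settings.buffer_percent // 100
    return max(settings.minimum_mb, min(settings.maximum_mb, wanted))


vm = DynamicMemorySettings(startup_mb=1024, minimum_mb=512,
                           maximum_mb=4096, buffer_percent=20)
vm.validate()
```

With these example values, a guest that needs 1000 MB would be targeted at 1200 MB (demand plus a 20% buffer), an almost idle guest would never drop below 512 MB, and a very busy guest would never grow beyond 4096 MB.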
Hypervisor Swapping: Smart Paging
On top of this (new in W2K12) Hyper-V can also leverage a memory swapping technique called “Smart Paging”. This technology will leverage disk resources as additional, temporary memory, but only when more memory is required to restart a virtual machine:
This approach has both advantages and drawbacks. It provides a reliable way to keep the virtual machines running when no physical memory is available. However, it can degrade virtual machine performance because disk access speeds are much slower than memory access speeds.
To minimize the performance impact of Smart Paging, Hyper-V uses it only when all of the following occur:
- The virtual machine is being restarted.
- No physical memory is available.
- No memory can be reclaimed from other virtual machines that are running on the host.
Smart Paging closes the gap between Minimum RAM and Startup RAM when the configured Startup RAM is not available in physical RAM while a virtual machine is rebooting. In this case Hyper-V will use the disk as a temporary source of virtual machine RAM. This degrades performance, but only during startup…assuming, that is, that the booting virtual machine needs more memory during startup than during normal operations.
As a VMware administrator you might think at this point: why do I have to configure all these Dynamic Memory settings? Well, because Hyper-V will never over-commit (apart from Smart Paging in case of a VM reboot), the memory available to all virtual machines together will never be more than the physically available memory minus memory overhead (e.g. the Windows parent OS that’s running on your host). So, Hyper-V has to do a calculation before you can power up a virtual machine.
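The three conditions quoted above can be read as a single yes/no check. The Python sketch below is only an illustration: the function and parameter names are hypothetical, and "no physical memory is available" is modelled as "free memory cannot cover Startup RAM", which is an assumption of this sketch.

```python
def should_use_smart_paging(is_restart: bool,
                            free_mb: int,
                            reclaimable_mb: int,
                            startup_mb: int) -> bool:
    """Return True only when all three Smart Paging conditions hold."""
    # Condition 1: the virtual machine is being restarted.
    if not is_restart:
        return False
    # Condition 2: free physical memory cannot cover Startup RAM.
    memory_short = free_mb < startup_mb
    # Condition 3: reclaiming memory from other running VMs does not help either.
    cannot_reclaim = free_mb + reclaimable_mb < startup_mb
    return memory_short and cannot_reclaim
```

A restart with 256 MB free, nothing to reclaim and 1024 MB Startup RAM would fall back to disk; the same restart with 1024 MB reclaimable from other virtual machines would not.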
Note: You can set (or pre-configure) most of these (almost similar) settings in a VMware environment as well (think of memory shares, memory reservation). In most cases this is not a best practice in a vSphere environment because it reduces flexibility. Read on to learn more about this.
If you do NOT use Dynamic Memory, because you didn’t configure it or because you are running Linux as a guest OS, the configured memory will be ‘reserved’ for the full 100% (remember, the memory has to be available in physical RAM). When using Dynamic Memory, you’re actually creating a dynamic memory reservation. You set the minimum and maximum reservation for memory usage, and Hyper-V will then automatically move or reallocate memory between these boundaries depending on the memory pressure on the host.
vSphere 5.1 Memory Techniques
Now let’s talk about memory techniques available in vSphere 5.1. vSphere 5.1 leverages four technologies that are available for all guest OSes (there’s no restriction here) running on ESXi. You might think four techniques? I thought there were only three? Well, actually this changed quite a while ago.
The four technologies are:
Transparent Page (Memory) Sharing
Transparent Page Sharing (TPS) is a memory deduplication technique. TPS leverages CPU idle time (so the performance impact is negligible) and scans ESXi memory pages. When identical memory pages are found, these pages will be shared. This means that one physical memory page can be (and will be) used several times by the same and/or different virtual machines. I’ve seen some posts from people worrying about security and performance in this case:
“However, in practice, you have just opened a HUGE security hole [by enabling TPS] because now two different machines are sharing the same memory space. This is not allowed in high security environments and is risky in all environments. Additionally this is very expensive. In order to use this technology, the hypervisor must hash all the memory pages and then compare the hashes. If the hashes are identical, it then has to compare the actual pages to make sure they are the same. Then when a guest actually rights to that shared space the hypervisor, will page fault and the error handler for the page fault will create a local copy of the page, then allow the right to happen on its own dedicated memory page. This is incredibly expensive in terms of CPU utilization and yes, you guessed it, expensive in memory utilization while doing all these comparisons.”
Well, let’s first establish that shared memory pages are always read-only, of course, never read/write…anything else is conceptually impossible. Is there still a security risk? Well, the question here is more fundamental: because physical memory is shared by definition in a virtual environment (this goes for both Hyper-V and ESXi), the question you should ask yourself is “Do I trust the memory manager of my virtualization platform?” TPS does not introduce an additional risk here; the memory manager of ESXi just includes advanced functionality which can deduplicate memory. On top of that, vSphere 5.0 has achieved Common Criteria Certification at EAL4+ (5.1 is in evaluation) and VMware has been deploying this technique for several years. More info on EAL here.
Because searching for duplicate memory pages leverages CPU idle time, the performance penalty for this process is negligible. When a write is executed to a shared memory page, ESXi will run a Copy on Write (COW) action to create a unique memory page again. The steps are: write the memory page and change the memory pointer…incredibly expensive? Not really, because we had to write the memory page anyway. It is just a smart mechanism offering additional memory capacity at (very) little cost. TPS will decrease actual memory pressure on the ESXi host.
Some people worry that TPS is not so effective anymore because a modern OS leverages so-called “large pages”. There’s some information on this in this KB article:
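The hash, compare, share and copy-on-write steps discussed above can be illustrated with a toy Python model. Everything here is an assumption made for illustration: the class and attribute names are hypothetical, SHA-1 is just a convenient hash, and sharing happens immediately on write instead of in a background scan during CPU idle time as ESXi does.

```python
import hashlib


class SharedPageStore:
    """Toy model of page sharing: identical pages are backed by one frame."""

    def __init__(self) -> None:
        self.frames = {}     # frame id -> page content
        self.refcount = {}   # frame id -> number of guest pages mapped to it
        self.by_hash = {}    # content hash -> frame id
        self.mapping = {}    # (vm, guest page number) -> frame id
        self._next_frame = 0

    def _release(self, key) -> None:
        """Drop one guest page's reference; free the frame when unused."""
        old = self.mapping.pop(key, None)
        if old is None:
            return
        self.refcount[old] -= 1
        if self.refcount[old] == 0:
            content = self.frames.pop(old)
            del self.refcount[old]
            digest = hashlib.sha1(content).hexdigest()
            if self.by_hash.get(digest) == old:
                del self.by_hash[digest]

    def write(self, vm: str, page_no: int, content: bytes) -> None:
        key = (vm, page_no)
        # Copy on write: a shared frame is never modified in place.
        # The writer gives up its reference and gets its own frame below.
        self._release(key)
        digest = hashlib.sha1(content).hexdigest()
        candidate = self.by_hash.get(digest)
        # A hash match is only a hint; confirm with a full comparison.
        if candidate is not None and self.frames[candidate] == content:
            self.refcount[candidate] += 1
            self.mapping[key] = candidate
            return
        frame = self._next_frame
        self._next_frame += 1
        self.frames[frame] = content
        self.refcount[frame] = 1
        self.by_hash[digest] = frame
        self.mapping[key] = frame


store = SharedPageStore()
zero_page = bytes(4096)
store.write("vm1", 0, zero_page)
store.write("vm2", 0, zero_page)              # shares the frame with vm1
frames_when_shared = len(store.frames)
store.write("vm2", 0, b"\x01" * 4096)         # vm2 writes: it gets a private frame
frames_after_write = len(store.frames)
```

Note that the write by vm2 never touches the frame that vm1 still maps, which is why sharing read-only pages does not let one virtual machine change another's memory.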
“In hardware-assisted memory virtualization systems, ESX will preferentially back guest physical pages with large host physical pages (2MB contiguous memory region instead of 4KB for regular pages) for better performance. If there is not a sufficient 2MB contiguous memory region in the host (for example, due to memory overcommitment or fragmentation), ESX will still back guest memory using small pages (4KB). ESX will not share large physical pages because:
- The probability of finding two large pages that are identical is very low.
- The overhead of performing a bit-by-bit comparison for a 2MB page is much higher than for a 4KB page.”
One final word on TPS: although TPS is a technique to over-commit, this doesn’t mean the actual physical memory load is 100%. Suppose your policy is a maximum physical memory load of 80%. If TPS lets you hand out (as an example) 20% more memory within that 80%, the same virtual machines would produce a 96% memory load (80% × 1.2) if TPS did nothing…get the point?
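The arithmetic behind this example, written out as a small Python sketch. The function name is hypothetical, and the 80% and 20% figures are simply the example values from the paragraph above.

```python
def load_without_sharing(load_with_tps_percent: int, extra_percent: int) -> float:
    """Memory load the same VMs would cause if no pages were shared."""
    return load_with_tps_percent * (100 + extra_percent) / 100


# 80% physical load with TPS, 20% extra memory handed out thanks to sharing.
equivalent_load = load_without_sharing(80, 20)
```

In other words, the host stays at its 80% policy limit while the virtual machines together hold as much memory as would otherwise have required 96% of physical RAM.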
Memory Ballooning
Memory ballooning is the ESXi memory reclamation technique and can be compared to the Dynamic Memory option in Windows Server 2012. Memory ballooning only kicks in when there’s high memory pressure on the host, and it can lend memory from one virtual machine to another. Ballooning uses VMware Tools and the guest OS’s own swapping (just like Dynamic Memory) to free up guest OS memory and lend it to another virtual machine that needs more memory. Ballooning works for both Windows and Linux guests as long as VMware Tools is installed. The guest OS is in a much better position than the ESXi hypervisor to decide which memory regions it can give up without impacting the performance of key processes running in the VM.
At this point you might think: in Windows Server 2012 I have several options to tune Dynamic Memory behavior…what options does ESXi offer? Well, you can use memory reservation and memory shares to change ballooning and swapping behavior. A memory reservation guarantees the availability of memory for a virtual machine; with memory shares you can set a relative priority on virtual machines when memory is scarce. Note: by default, ESXi will never balloon more than 65% of the configured memory.
In a normal situation I would advise not to configure reservations and shares. The ESXi hypervisor (VMkernel) is an intelligent hypervisor which performs very well with the default reservation and shares settings. You should only change these settings when you want to change memory priorities or guarantee resources for a particular virtual machine. Sticking to the defaults makes management of the environment a lot easier and reduces administrative overhead.
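How the 65% default and a memory reservation bound what ballooning can reclaim can be sketched as follows. This is a simplified Python illustration: the constant and function names are hypothetical, and the rule that reserved memory is never reclaimed is applied here as a plain upper bound, which is an assumption of this sketch rather than a description of the VMkernel's exact logic.

```python
BALLOON_MAX_PERCENT = 65  # default ceiling mentioned in the text


def max_balloon_mb(configured_mb: int, reservation_mb: int = 0) -> int:
    """Upper bound on the memory ballooning could reclaim from one VM."""
    # Never more than 65% of the configured memory (integer math, rounded down).
    by_default_cap = configured_mb * BALLOON_MAX_PERCENT // 100
    # Reserved memory is guaranteed to the VM, so only the rest is reclaimable.
    above_reservation = max(0, configured_mb - reservation_mb)
    return min(by_default_cap, above_reservation)
```

For an 8 GB virtual machine without a reservation this gives 5324 MB; with a 6 GB reservation only the remaining 2048 MB could be ballooned.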
VMkernel (Hypervisor) Swapping leveraging the vswp file
When memory pressure on a host is too high and both TPS and ballooning cannot satisfy the memory requirements, ESXi can use hypervisor swapping. In this case memory pages are swapped to disk. This swapping of course has a performance impact, although with the introduction of SSDs (and swapping to this type of disk) ideas about this type of swapping are changing, because the performance degradation might be acceptable. VMware introduced the concept of “swap to host cache” in vSphere 5; a good article about this technology is available on yellow-bricks.com.
When an ESXi host shows VMkernel swapping, this can be OK for a short period of time…when it lasts longer you might want to increase the available memory. Especially with swapping, always look at “swap out” and “swap in” activity, because this is what degrades performance. Memory pages just ‘sleeping’ in VMkernel swap have less performance impact.
Memory Compression (yes, that’s number 4)
vSphere 4.1 introduced the concept of memory compression, an additional technology to reduce hypervisor swapping. The idea is to delay the need to swap hypervisor pages by compressing VM memory pages that are candidates for swap to disk. Compressing and decompressing is faster than performing disk I/O operations. Memory compression will only take place when there’s contention for physical memory resources.
Both Windows Server 2012 Hyper-V and ESXi 5.1 leverage some intelligent memory optimization techniques. Where Hyper-V version 3 has Dynamic Memory and Smart Paging, ESXi 5.1 can leverage Transparent Page Sharing, guest ballooning, hypervisor swapping and memory compression.
Hyper-V Dynamic Memory is designed to offer flexibility by moving the available memory around between the virtual machines, with Smart Paging ready to do some paging to disk in case of a VM reboot in a situation of high memory pressure. VMware has memory deduplication on board. This feature enables you to run more virtual machines in the same amount of memory and thus achieve a higher consolidation ratio (so you will need fewer physical servers or less memory in your servers). Although recently there has been some discussion about the effectiveness and the actual increase in capacity, my personal experience is that TPS will succeed in sharing at least 20% of the memory between virtual machines. On top of that VMware also uses ballooning, swapping and compression techniques to guarantee maximum flexibility.
From an administrative standpoint there’s a different approach to memory optimization. Hyper-V requires you to pre-configure Dynamic Memory settings, whereas the ESXi memory techniques work right out of the box…while still allowing you to set certain values depending on the exact use case.
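The "compress first, swap only if that does not help" idea can be sketched in Python with the standard zlib module. The function name is hypothetical, and the rule that a 4 KB page is only kept compressed when it shrinks to half a page or less is an assumption chosen for this illustration; zlib is simply a stand-in for whatever compressor the VMkernel uses.

```python
import os
import zlib
from typing import Tuple

PAGE_SIZE = 4096
COMPRESSED_SLOT = PAGE_SIZE // 2  # assumed threshold for this sketch


def reclaim_page(page: bytes) -> Tuple[str, bytes]:
    """Decide whether a swap candidate is kept compressed in RAM or swapped."""
    compressed = zlib.compress(page)
    if len(compressed) <= COMPRESSED_SLOT:
        # Cheap to bring back later: decompress from memory, no disk I/O.
        return "compressed", compressed
    # Compression does not pay off: fall back to hypervisor swapping.
    return "swapped", page


zero_page = bytes(PAGE_SIZE)          # highly compressible
random_page = os.urandom(PAGE_SIZE)   # effectively incompressible
zero_result = reclaim_page(zero_page)
random_result = reclaim_page(random_page)
```

A page full of zeros compresses to a few bytes and stays in memory, while a page of random data does not shrink and would still go to the swap file.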
I hope this article helped you to better understand memory optimization techniques. I am looking forward to your comments!
For further reading I suggest you take a look at these articles:
- Virtual Memory Management: Dynamic Memory-Much Different Than Memory Over Commit – Dan Stolts (got some inspiration from this article)
- Hypervisor Memory Management Done Right – Eric Horschman (got some inspiration from this article as well)
- Swap to host cache aka swap to SSD? – Duncan Epping
- Hyper-V Dynamic Memory Overview – Microsoft
- Windows 7 Transparent Page Sharing and the ASLR story – Andre Leibovici
- vSphere 5 memory management explained (worth a read!): part 1 and part 2 – Erik Scholten of vmguru.nl
- And some good discussion here: My Frustration with HyperV, do you really save anything? – Justin Paul