VMware vSphere Troubleshooting

Memory statistics

VMware vSphere hosts are designed to use memory as efficiently as they use other resources. Resource management policies in the vSphere host govern how memory is allocated to the virtual machines it hosts. This allocation is based on each virtual machine's allocated memory setting and the current system load. The vSphere host exposes a range of memory statistics that can be viewed using esxtop. Before we look at those statistics, let's briefly review how a vSphere host manages its memory.

Figure 2.6

Memory management in a vSphere host

Before we take a look at vSphere host memory metrics, I will briefly walk you through memory management in a vSphere host. Understanding these concepts will help you interpret the metrics displayed by esxtop. A vSphere host reclaims memory from virtual machines in order to support memory overcommitment.

Memory overcommitment

The vSphere host system reserves physical memory to guarantee delivery of memory to all the running virtual machines. Overcommitment is the technique that lets the host allocate more memory to virtual machines than it physically has. The memory of a vSphere host is considered overcommitted when the total amount of virtual machine physical memory exceeds the total physical memory of the vSphere host. For example, let's say you have a host with 8 GB of physical memory and you are running five virtual machines with 2 GB each: the virtual machines are configured with 10 GB in total, 2 GB more than the host has, so the host is overcommitted. Overcommitment allows the vSphere host system to improve and balance the usage of physical memory. I will not discuss this in further detail as it is outside the scope of this book.
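The arithmetic behind the example above can be sketched in a few lines of Python. This is only an illustration of the definition given in the text; the function name overcommit_ratio is my own and is not part of any VMware tool.

```python
def overcommit_ratio(host_memory_gb, vm_memory_gb):
    """Return total configured VM memory divided by host physical memory."""
    if host_memory_gb <= 0:
        raise ValueError("host memory must be positive")
    return sum(vm_memory_gb) / host_memory_gb


# The example from the text: an 8 GB host running five 2 GB virtual machines.
ratio = overcommit_ratio(8, [2, 2, 2, 2, 2])

# A ratio above 1.0 means the virtual machines are configured with more
# memory than the host physically has, so the host is overcommitted.
is_overcommitted = ratio > 1.0
print(f"ratio={ratio:.2f}, overcommitted={is_overcommitted}")
```

A ratio of exactly 1.0 or below means every virtual machine could be backed by physical memory at the same time, ignoring overhead memory.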

Memory overhead

Virtual machines incur two types of memory overhead: the extra time (time overhead) needed to access a virtual machine's memory, and a specific amount of overhead memory that is required to power on the virtual machine. The total amount of memory for a virtual machine depends on the number of vCPUs, the allocated memory, and the overhead memory for that virtual machine. Once the virtual machine starts running, the overhead memory can differ from the values shown in Table 2.2.

Total memory for 1 vCPU VM = allocated memory + overhead memory
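As a quick worked instance of this formula, here is a minimal Python sketch. The allocated and overhead figures are hypothetical placeholders chosen for illustration; take real overhead values from the VMware documentation for your release.

```python
def total_vm_memory_mb(allocated_mb, overhead_mb):
    """Total memory for a VM = allocated memory + overhead memory."""
    return allocated_mb + overhead_mb


# Hypothetical values for illustration only (not taken from the VMware table).
allocated_mb = 1024   # memory configured for a 1 vCPU virtual machine
overhead_mb = 26.0    # assumed overhead memory needed to power it on

total_mb = total_vm_memory_mb(allocated_mb, overhead_mb)
print(f"Total memory required: {total_mb} MB")
```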

You need to be aware of this overhead to troubleshoot memory overhead problems. The following table is taken from the VMware vSphere 5.1 documentation, and its sample values were collected with MMU enabled for the virtual machines. Actual overhead values can differ slightly from those listed in the table.

Table - 2.2 Memory Overhead for each VCPU

Transparent page sharing

I will briefly describe transparent page sharing (TPS), which lets VMware ESXi systems use physical memory more efficiently. Let's say several of your virtual machines run the same OS; many of their memory pages will have identical contents. The ESXi host can use TPS to reclaim the redundant copies and keep a single memory page that is shared among all those virtual machines. This reduces host memory consumption and lets the host sustain a higher level of memory overcommitment.
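The idea of keeping one copy of identical pages can be illustrated with a toy model in Python. This is not how ESXi implements TPS internally; it is a simplified sketch that treats a content hash as the page identity, and the names share_pages and vm_pages are my own.

```python
import hashlib

PAGE_SIZE = 4096  # bytes per page in this toy model


def share_pages(vm_pages):
    """Count total pages and the unique pages left after sharing.

    vm_pages maps a VM name to a list of page contents (bytes).
    """
    stored = {}  # content hash -> single shared copy of the page
    total = 0
    for pages in vm_pages.values():
        for page in pages:
            total += 1
            digest = hashlib.sha256(page).digest()
            # Simplification: equal hashes are treated as identical pages.
            stored.setdefault(digest, page)
    return total, len(stored)


zero_page = bytes(PAGE_SIZE)       # a page filled with zeros
os_page = b"\x90" * PAGE_SIZE      # stands in for a common OS code page
vm_pages = {
    "vm1": [zero_page, os_page, b"a" * PAGE_SIZE],
    "vm2": [zero_page, os_page, b"b" * PAGE_SIZE],
}

total, unique = share_pages(vm_pages)
saved = total - unique
print(f"{total} pages requested, {unique} stored, {saved} reclaimed")
```

Here the two virtual machines share the zero page and the common OS page, so two of the six pages can be reclaimed.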

Tip

TPS between virtual machines was enabled by default in earlier vSphere releases. Later updates to versions 5.0, 5.1, and 5.5 disable it by default, and so do vSphere 6.0 and later releases. It can be re-enabled from the vSphere Advanced Settings.

Ballooning

Ballooning is a memory reclamation technique in which the hypervisor signals running virtual machines that it is low on memory. A vSphere host uses a memory balloon driver called vmmemctl, installed with VMware Tools in the guest virtual machines, to reclaim free memory. When the host needs to reclaim memory from a virtual machine, it sets a target balloon size and the vmmemctl driver inflates the balloon by allocating guest physical pages inside the guest; the host can then reclaim the physical memory backing those pages. Because the guest operating system chooses which pages to give up, the memory reclaimed is what the guest considers least valuable.

Memory compression

VMware vSphere hosts use a compression cache within physical memory to hold compressed pages instead of swapping these pages out to disk. Memory compression performs better than swapping because the host only needs to decompress a page directly from memory instead of reading it back from a disk, which is much slower.
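The decision between compressing a page and swapping it can be sketched as follows. This is a toy model only: it uses zlib as a stand-in compressor and assumes a page is kept in the compression cache when it shrinks to half its size or less. Both choices are assumptions for illustration, not a description of the ESXi implementation.

```python
import random
import zlib

PAGE_SIZE = 4096
# Assumption for this sketch: keep a page in the cache only if it
# compresses to at most half of its original size.
MAX_COMPRESSED_SIZE = PAGE_SIZE // 2


def place_page(page):
    """Decide where a reclaimed page goes in this toy model."""
    compressed = zlib.compress(page)
    if len(compressed) <= MAX_COMPRESSED_SIZE:
        return "compression cache"
    return "swap to disk"


zero_page = bytes(PAGE_SIZE)  # highly compressible

rng = random.Random(0)        # fixed seed so the example is repeatable
noisy_page = bytes(rng.getrandbits(8) for _ in range(PAGE_SIZE))  # incompressible

print(place_page(zero_page))
print(place_page(noisy_page))
```

Pages that compress well stay in memory and can be restored with a fast decompression, while pages that do not compress fall back to the slower swap path.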

Reference: http://www.vmware.com/files/pdf/mem_mgmt_perf_vsphere5.pdf

Esxtop for memory statistics

Let's use esxtop to view memory metrics:

Connect to a vSphere host using SSH and log in as root or an administrative user.
At the command prompt, type esxtop without any flags.
Press m to go to the memory screen. This screen displays detailed information about memory usage.
Enable the additional memory statistics field MCTL in esxtop as follows:
Press f to go to the Current Field Order screen.
Press j to enable MCTL memory statistics, and press it again to remove this field.
Press the Esc key to return to the esxtop memory statistics screen.
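If you would rather capture these statistics to a file than watch them interactively, esxtop also has a batch mode (for example, esxtop -b -n 1 > mem.csv) that writes CSV output. The following Python sketch filters the memory-related columns from such a file. The sample header names, the host name esx01, and the values are hypothetical; check the headers in your own capture before relying on them.

```python
import csv
import io


def memory_columns(csv_text, keyword="Memory"):
    """Return one dict per sample row, keeping only columns whose
    header contains the given keyword."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if keyword in name]
    return [{header[i]: row[i] for i in wanted} for row in reader]


# Hypothetical capture with one timestamp column, one memory counter,
# and one CPU counter. Real captures contain many more columns.
SAMPLE = "\n".join([
    r'"(PDH-CSV 4.0) (UTC)(0)","\\esx01\Memory\Free MBytes","\\esx01\Physical Cpu(_Total)\% Util Time"',
    r'"01/01/2015 10:00:00","5120","12.5"',
])

result = memory_columns(SAMPLE)
for row in result:
    for name, value in row.items():
        print(name, "=", value)
```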

Figure 2.7

Table – 2.3 Important Memory Metrics

Diagnosing memory blockage

The following four host free memory states are very important when diagnosing memory bottlenecks and memory overcommitment: high, soft, hard, and low. Each state corresponds to a threshold of free host memory, and the threshold values depend on how much physical memory the vSphere host has.

The threshold for the high state is represented by minfree, as you can see in the following screenshot. The VMkernel always keeps some amount of memory free, and minfree shows that amount.

Figure 2.8

As page sharing is enabled in the vSphere host system by default, the host manages to reclaim some memory with very little overhead. It uses the memory state to decide when to reclaim physical memory through ballooning or swapping. Once a vSphere host gets low on memory resources, it tries to reclaim memory that has already been allocated to virtual machines. When that happens, the aforementioned metrics can be examined to determine whether the host is actively reclaiming memory.

As highlighted in Figure 2.8, the vSphere host is reporting the high state, which means it does not presently have memory contention. If this changes to the soft state, the host will use ballooning to reclaim memory. If it changes to the hard state, the host will use compression and swapping to reclaim memory. Finally, if the host shows the low state, all memory reclamation methods (ballooning, compression, swapping) are used together to reclaim memory.
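The relationship between the four states and the reclamation techniques can be summarized in a small Python sketch. The mapping follows the description above. The threshold fractions (64, 32, and 16 percent of minfree) are the values commonly cited for ESXi 5.x and are an assumption here, so verify them for your release; the sketch also ignores the hysteresis a real host applies when moving back to a higher state.

```python
# Reclamation techniques per memory state, as described in the text.
RECLAMATION_BY_STATE = {
    "high": [],                                        # no memory contention
    "soft": ["ballooning"],
    "hard": ["compression", "swapping"],
    "low": ["ballooning", "compression", "swapping"],
}


def host_state(free_mb, minfree_mb, soft=0.64, hard=0.32, low=0.16):
    """Classify free host memory into a state.

    The soft/hard/low fractions of minfree are assumed defaults,
    not values read from the host.
    """
    if free_mb >= minfree_mb * soft:
        return "high"
    if free_mb >= minfree_mb * hard:
        return "soft"
    if free_mb >= minfree_mb * low:
        return "hard"
    return "low"


# Hypothetical host with minfree of 900 MB and several free-memory readings.
for free_mb in (1000, 500, 200, 100):
    state = host_state(free_mb, minfree_mb=900)
    print(free_mb, state, RECLAMATION_BY_STATE[state])
```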

Your host should not be swapping memory, as that has a negative effect on the performance of both the virtual machines and the vSphere host itself. This can be monitored from the vCenter performance charts that I will cover later in the chapter. Swapping and ballooning activity should be as low as possible on a healthy vSphere host system. Whenever you see a vSphere host reporting the soft state, the host has a memory contention problem. Ballooning can be viewed through the MCTL? and MCTLSZ metrics; as mentioned in the previous topic, Esxtop for memory statistics, enable the j field to display them. This observation can quickly tell you whether your vSphere host system's memory is in good shape or whether it's time for a memory upgrade.

Heavy memory swapping is also bad for a vSphere host. If a host keeps swapping memory actively, virtual machine performance will degrade. You can monitor the %SWPWT field to see whether a virtual machine is being affected by swapping. This field is not on the memory screen but can be found on the CPU screen. As I explained earlier, %SWPWT shows the percentage of time a virtual machine spends waiting for its swapped pages to be brought back into memory.

Figure 2.9

You can see in the preceding figure that %SWPWT is 3.57. This is the percentage of time the virtual machine spends waiting for its memory pages to be swapped in, and it will affect the virtual machine's performance. The threshold value for this field is 5, but any value above zero is not ideal for virtual machine performance. If it reaches 5, the cause needs to be investigated closely.
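The rule of thumb above is easy to encode if you are checking many virtual machines. This is a minimal sketch of the thresholds stated in the text; the function name and the labels it returns are my own.

```python
SWPWT_THRESHOLD = 5.0  # threshold value given in the text


def assess_swpwt(percent):
    """Classify a %SWPWT reading using the guidance in the text."""
    if percent < 0:
        raise ValueError("%SWPWT cannot be negative")
    if percent == 0:
        return "ok"
    if percent < SWPWT_THRESHOLD:
        return "not ideal"
    return "investigate"


# The reading from Figure 2.9.
print(assess_swpwt(3.57))
```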

You can troubleshoot this by examining why memory is overcommitted and whether its allocation among virtual machines matches the available memory resources. You should also check that the memory balloon driver is correctly installed in the virtual machines, so that the host can reclaim memory through ballooning before it has to resort to swapping. You should always install VMware Tools on the virtual machines to ensure that the balloon driver is installed. The MCTL? column (Figure 2.3) can be used to check whether the balloon driver is installed. In Figure 2.3, N in MCTL? indicates that the balloon driver is not installed in the virtual machine, while Y indicates that it is installed. The MCTLSZ column shows how far the balloon is inflated within the virtual machine. If the value in the MCTLSZ column is 200 MB, the balloon driver has reclaimed 200 MB of memory from that virtual machine.
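The checks on MCTL? and MCTLSZ can be expressed as a short Python sketch. The row layout, the key names, and the virtual machine names are hypothetical; the sketch assumes you have already copied the values from the esxtop memory screen into a list of dictionaries.

```python
def balloon_report(rows):
    """Summarize balloon driver status from esxtop-style readings.

    Each row is a dict with 'name', 'mctl' ('Y' or 'N'), and 'mctlsz_mb'.
    Returns the VMs without a balloon driver and the VMs currently ballooned.
    """
    missing_driver = [r["name"] for r in rows if r["mctl"] != "Y"]
    ballooned = {r["name"]: r["mctlsz_mb"] for r in rows if r["mctlsz_mb"] > 0}
    return missing_driver, ballooned


# Hypothetical readings for three virtual machines.
rows = [
    {"name": "web01", "mctl": "Y", "mctlsz_mb": 200.0},
    {"name": "db01", "mctl": "Y", "mctlsz_mb": 0.0},
    {"name": "test01", "mctl": "N", "mctlsz_mb": 0.0},
]

missing_driver, ballooned = balloon_report(rows)
print("Install VMware Tools on:", missing_driver)
print("Memory reclaimed by ballooning (MB):", ballooned)
```

In this sample, test01 needs VMware Tools installed, and 200 MB has been reclaimed from web01 through ballooning.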

Tip

The minfree value can be tuned with the Mem.MinFreePct advanced setting.