Virtual SLES11SP1 shows bad IO performance when accessing raw disks

  • 7009616
  • 24-Oct-2011
  • 16-Feb-2015

Environment

SUSE Linux Enterprise Server 11 Service Pack 1
VMware ESX Server 4.1
Iometer


Situation

When raw disks are attached to a SLES 11 SP1 guest running on VMware ESX 4.1 and throughput is measured with Iometer (32 outstanding I/Os, 8K blocks), the guest shows:

Random Read: 150 IOps
Random Write: 1800 IOps
Seq. Read: 2000 IOps

compared to a Windows VM with

Random Read: 6000 IOps
Random Write: 7000 IOps


Resolution

By default, the kernel uses the cfq I/O scheduler, which performs I/O optimizations designed primarily for locally attached disks. When a disk is provided by a storage system, these optimizations are redundant because the storage system optimizes I/O itself. Using the cfq scheduler on such disks can reduce I/O performance and creates unnecessary overhead on the server. This symptom is not necessarily specific to VMware and may also be seen on other hypervisors such as Hyper-V, Xen or KVM.
To address this behavior, edit /boot/grub/menu.lst and add elevator=noop to the kernel line. This turns off the I/O optimization inside the guest, so the virtual machine's I/O is passed to the underlying system essentially in FIFO (first in, first out) order. With this setting, all I/O optimization is done on the storage system.

Example of /boot/grub/menu.lst:

###Don't change this comment - YaST2 identifier: Original name: linux###
title SUSE Linux Enterprise Server 11 SP1 - 2.6.32.46-0.3
    root (hd0,1)
    kernel /boot/vmlinuz-2.6.32.46-0.3-pae root=/dev/system/root resume=/dev/system/swap splash=verbose showopts elevator=noop
    initrd /boot/initrd-2.6.32.46-0.3-pae
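
After booting with the edited entry, the kernel command line in /proc/cmdline should contain elevator=noop. The following is a minimal sketch for listing the active scheduler of every block device. The function name list_schedulers and the SYSFS_BLOCK variable are illustrative and not part of any SLES tool. The variable exists only so the loop can be pointed at a directory other than /sys/block.

```shell
#!/bin/sh
# Sketch: list the active I/O scheduler of every block device.
# SYSFS_BLOCK (hypothetical helper variable) defaults to /sys/block.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

list_schedulers() {
    for f in "$SYSFS_BLOCK"/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#"$SYSFS_BLOCK"/}   # strip the leading directory
        dev=${dev%%/*}             # keep only the device name
        # The kernel marks the active scheduler with square brackets,
        # e.g. "noop anticipatory deadline [cfq]".
        active=$(sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$f")
        echo "$dev: ${active:-none}"
    done
}

# Usage on a running system:
#   cat /proc/cmdline
#   list_schedulers
```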

Instead of rebooting the system to apply this change, the new I/O scheduler can be applied to a single device at runtime to check whether it has an effect on performance. Execute as root:

echo SCHEDNAME > /sys/block/DEV/queue/scheduler

Replace SCHEDNAME with the scheduler name (for example, noop) and DEV with the device the scheduler should be used on (for example, sda).
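
The runtime switch described above can be wrapped in two small helper functions, shown here as a sketch only. The names active_scheduler and set_scheduler, the SYSFS_BLOCK variable, and the device name sdb in the usage comment are assumptions for illustration. Writing to the scheduler file requires root, and the change does not persist across a reboot.

```shell
#!/bin/sh
# Sketch: show and switch the I/O scheduler of one block device at runtime.
# SYSFS_BLOCK (hypothetical helper variable) defaults to /sys/block.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

# Print the active scheduler of device $1; the kernel marks it with
# square brackets, e.g. "noop anticipatory deadline [cfq]" gives "cfq".
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$SYSFS_BLOCK/$1/queue/scheduler"
}

# Select scheduler $2 for device $1 (needs root on a real system).
set_scheduler() {
    echo "$2" > "$SYSFS_BLOCK/$1/queue/scheduler"
}

# Usage as root, assuming the raw disk is sdb:
#   active_scheduler sdb
#   set_scheduler sdb noop
#   active_scheduler sdb
```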

In addition, append the elevator=noop option to the DEFAULT_APPEND variable in /etc/sysconfig/bootloader. This will make sure the option gets added if a new kernel gets installed.
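
A quick way to confirm that the option is present is to search the DEFAULT_APPEND line for it. The sketch below assumes the variable is written on a single line starting with DEFAULT_APPEND=, as is usual for sysconfig files. The function name default_append_has_noop is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed if the DEFAULT_APPEND line of the file given as $1
# contains elevator=noop. Assumes a single-line DEFAULT_APPEND="..." entry.
default_append_has_noop() {
    grep '^DEFAULT_APPEND=' "$1" | grep -q 'elevator=noop'
}

# Usage:
#   default_append_has_noop /etc/sysconfig/bootloader && echo "option is set"
```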

Additional Information

For more information see:

/usr/src/linux/Documentation/kernel-parameters.txt
/usr/src/linux/Documentation/block/switching-sched.txt
/usr/src/linux/Documentation/block/as-iosched.txt
/usr/src/linux/Documentation/block/deadline-iosched.txt

These files are provided by the kernel-source RPM, which is available on the installation media or via the regular SLES online update repositories.
