Network performance problems and network connectivity issues with virtual machines.

  • 7017597
  • 10-May-2016
  • 13-May-2016

Environment

SUSE Linux Enterprise Server 12
SUSE Linux Enterprise Server 11
Novell Open Enterprise Server 11 (OES 11) Linux

Situation

The customer runs VMware virtual machines with TCP segmentation offload (TSO) enabled.  The problems seem to be worse when copying large amounts of data.  After connections are reset, communication seems to work intermittently.
 
The customer is seeing various communication problems: connections sometimes work and sometimes hang or time out, the network seems slow, and large amounts of data cannot be copied.  LAN traces show packet loss, multiple retransmissions of data, out-of-sequence packets, and duplicate acknowledgments.

Resolution

Disable TSO for virtual NICs
 
You can check whether the setting is "on" by running (replace eth0 with the name of your interface):  ethtool -k eth0 | grep tcp-segmentation-offload
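
The check above can also be wrapped in a small helper for use in scripts. This is only a sketch: the function name tso_state is hypothetical, and it assumes that ethtool -k prints a line of the form "tcp-segmentation-offload: on" or "tcp-segmentation-offload: off", optionally followed by a marker such as "[fixed]".

```shell
#!/bin/sh
# tso_state is a hypothetical helper name, not part of ethtool.
# It prints "on" or "off" for the given interface (default: eth0).
tso_state() {
    ethtool -k "${1:-eth0}" 2>/dev/null |
        awk -F': *' '$1 == "tcp-segmentation-offload" {
            # Keep only the first word so "off [fixed]" becomes "off".
            split($2, state, " ")
            print state[1]
        }'
}
```

For example, `tso_state eth0` could be run before and after changing the setting to confirm that the change took effect.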
 
You can disable TSO like this:  ethtool -K eth0 tso off
 
A setting changed with ethtool does not survive a reboot, so the command needs to be added to a script, such as /etc/init.d/after.local, so that it is run each time the server boots.
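
As a rough illustration, the lines added to /etc/init.d/after.local might look like the following. The variable TSO_INTERFACES and the function disable_tso are hypothetical names introduced for this sketch, and eth0 is only an example; list the virtual NICs that actually exist on the server.

```shell
#!/bin/sh
# Sketch of lines for /etc/init.d/after.local.
# TSO_INTERFACES is a hypothetical variable; eth0 is an example name.
TSO_INTERFACES="eth0"

# disable_tso is a hypothetical helper that turns TSO off on each
# interface passed as an argument and reports the result.
disable_tso() {
    for iface in "$@"; do
        if ethtool -K "$iface" tso off; then
            echo "TSO disabled on $iface"
        else
            echo "could not disable TSO on $iface" >&2
        fi
    done
}

# Left unquoted on purpose so that a space-separated list is split
# into one argument per interface.
disable_tso $TSO_INTERFACES
```

Reporting failures instead of aborting keeps the rest of the boot script running if one interface is missing or does not support the change.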

Note: A lowercase -k shows the current setting and an uppercase -K changes the setting.

Cause

This is a problem with the virtual network adapter drivers supplied by VMware.

Additional Information

Using TSO on physical machine NICs improves performance by reducing the CPU and kernel overhead for TCP/IP network operations, because that work is offloaded to the LAN adapter.  Virtual machines have virtual (software) LAN adapters, so the benefit is smaller than on actual physical hardware.  When some of the TCP functions are offloaded to the virtual LAN adapter, the kernel still has more CPU cycles available than when it performs the TCP segmentation itself.  However, some of the virtual LAN adapters for VMware have problems using TSO, and this causes communication problems.
 
If TSO is enabled on a network interface, that network interface divides larger data chunks into TCP segments.
If TSO is disabled, the CPU performs segmentation for TCP.
On virtual machines the operating system's network interface is virtual, so the hardware versus software advantage is negated.
TSO needs to be disabled on virtual interfaces.
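
To give a sense of how much segmentation work is involved, the number of TCP segments needed for one chunk of data can be estimated with a short calculation. This is an illustrative sketch only: the function name segment_count is hypothetical, and the default MSS of 1460 bytes assumes a 1500-byte Ethernet MTU with 40 bytes of IPv4 and TCP headers and no TCP options.

```shell
#!/bin/sh
# segment_count is a hypothetical helper: ceil(bytes / mss).
segment_count() {
    bytes=$1
    mss=${2:-1460}   # assumed MSS, see the note above
    echo $(( (bytes + mss - 1) / mss ))
}

segment_count 65536   # a 64 KiB chunk: prints 45
```

With TSO enabled the interface produces these segments; with TSO disabled the CPU does.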