Thursday, November 21, 2013

vmxnet3 and Windows Operating System and network problems?

Lately some customers have been complaining about network-related problems on virtual machines running Windows, with strange behavior such as the following:
- Web page rendering issues for websites hosted on IIS.
- Web page buttons that load with a prompt to open a file.
- Network-related SQL errors from the web applications.
- A page that loads successfully in one browser but not in a different browser on the same computer.


Debugging the problem:

We ran Wireshark, with mixed results.
On further investigation we were able to replicate the problem, with random results, through an external network on the Org VDC's Edge gateway.
My first question was whether users could replicate this locally within the vApp, on its own network, by accessing the web page through localhost. It was eventually reproduced there as well.


We also found that disabling and re-enabling the network card on the VM seemed to resolve the network issues, but they came back after a while. This led me to believe the issue was related to the network card and the Windows OS, so I looked into the Chimney offload feature.

I remembered that a few years ago we had problems with remote desktop connections being slow for users from one international office to another, which I was able to track down to the "Windows auto tuning" feature. So I had to give this a go as well.

Resolution:

We disabled both settings with the following commands:

Netsh int tcp set global chimney=Disabled 
Netsh int tcp set global autotuninglevel=Disabled 

After about 5 minutes we started seeing the network running without problems.
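
To confirm that the two settings really changed, the output of `netsh int tcp show global` can be checked. The snippet below is only a rough sketch of how that check could be scripted in Python. The function names are made up for illustration, and the sample text is an assumed example of what the command prints on a Windows 7 / 2008 R2 era guest. The labels differ between Windows versions and display languages, so treat the keys as placeholders.

```python
import subprocess

def parse_tcp_globals(text):
    """Turn 'Label : value' lines from netsh output into a dict (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" : ")
        if sep:  # skip headings, rulers and blank lines
            settings[name.strip()] = value.strip().lower()
    return settings

def read_tcp_globals():
    """Run netsh on the Windows guest and parse what it prints."""
    result = subprocess.run(
        ["netsh", "int", "tcp", "show", "global"],
        capture_output=True, text=True, check=True,
    )
    return parse_tcp_globals(result.stdout)

# Assumed sample output, for illustration only; real labels may differ.
SAMPLE = """\
Querying active state...

TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : enabled
Chimney Offload State               : disabled
Receive Window Auto-Tuning Level    : disabled
Add-On Congestion Control Provider  : none
ECN Capability                      : disabled
RFC 1323 Timestamps                 : disabled
"""

sample_settings = parse_tcp_globals(SAMPLE)
```

On a real guest, `read_tcp_globals()` would be called instead of parsing the sample, and the Chimney and auto-tuning entries compared against `"disabled"`.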

I contacted VMware about this and was told the following:

The KB that says VMXNET3 doesn't support TOE (TCP offload engine) is only internal at the moment. But basically they recommend that the following options be turned off in the OS:

Netsh int tcp set global rss=Disabled
Netsh int tcp set global chimney=Disabled 
Netsh int tcp set global autotuninglevel=Disabled 
Netsh int tcp set global congestionprovider=None
    For Windows 8 & 2012 this command is deprecated, so use: Netsh int tcp set supplemental custom congestionprovider=None
Netsh int tcp set global ecncapability=Disabled 
Netsh int ip set global taskoffload=disabled 
Netsh int tcp set global timestamps=Disabled
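
If the list above has to be applied to many VMs, it can be wrapped in a small script. What follows is a minimal Python sketch, not anything VMware provided: `build_commands`, `apply_commands` and the `windows_8_or_2012` flag are hypothetical names, and the only thing it does is swap the congestion provider command as described in the note above. It must be run from an elevated prompt on the guest.

```python
import subprocess

# Commands taken from the list above (netsh keywords are not case-sensitive).
LEGACY_CONGESTION = "netsh int tcp set global congestionprovider=none"
NEWER_CONGESTION = "netsh int tcp set supplemental custom congestionprovider=none"

def build_commands(windows_8_or_2012=False):
    """Return the netsh commands to run, in order (hypothetical helper)."""
    congestion = NEWER_CONGESTION if windows_8_or_2012 else LEGACY_CONGESTION
    return [
        "netsh int tcp set global rss=disabled",
        "netsh int tcp set global chimney=disabled",
        "netsh int tcp set global autotuninglevel=disabled",
        congestion,
        "netsh int tcp set global ecncapability=disabled",
        "netsh int ip set global taskoffload=disabled",
        "netsh int tcp set global timestamps=disabled",
    ]

def apply_commands(commands):
    """Run each command and stop at the first one that fails."""
    for command in commands:
        print("Running:", command)
        subprocess.run(command.split(), check=True)

older_guest = build_commands()
newer_guest = build_commands(windows_8_or_2012=True)
```

Calling `apply_commands(build_commands())` would then apply the settings on an older guest. Keeping the list separate from the code that runs it makes it easy to print the commands first and review them before changing anything.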

From a practical point of view, offloading part of the TCP stack to a network card makes sense in the physical world but not so much in the virtual world. Hopefully VMware will address this in upcoming adapter improvements.

Links: