Friday, July 22, 2016

WinSCP connection to VCSA failed: "Received too large SFTP packet. Max supported packet size is 1024000 B"


When you try to connect to your VCSA with WinSCP, you might get the error "Received too large SFTP packet. Max supported packet size is 1024000 B".


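This happens when a login script prints text as the session starts. WinSCP reads the first 4 bytes from the server as the SFTP packet length, so the first 4 characters of that text are cast into a number, which ends up far larger than the maximum supported packet size.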
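
To make that concrete, here is a minimal Python sketch of the cast. It assumes the length field is a big-endian unsigned 32-bit integer, as in the SSH wire format; the banner text and the function name are made up for illustration and are not taken from a real VCSA.

```python
import struct

# Limit quoted in the WinSCP error message, in bytes.
MAX_PACKET_SIZE = 1024000


def apparent_packet_length(first_bytes: bytes) -> int:
    """Read the first 4 bytes as a big-endian unsigned 32-bit length."""
    if len(first_bytes) < 4:
        raise ValueError("need at least 4 bytes")
    return struct.unpack(">I", first_bytes[:4])[0]


# Hypothetical text printed by a login script before the SFTP session starts.
banner = b"Welcome to the appliance\n"
length = apparent_packet_length(banner)

print(f"First 4 bytes {banner[:4]!r} read as a length of {length} bytes")
print("Too large for WinSCP:", length > MAX_PACKET_SIZE)
```

Any printable text gives a number in the hundreds of millions or more, because each ASCII character lands in one byte of the 32-bit value. That is why the client gives up immediately.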

To fix the problem you can usually move the commands that print text to a startup script that only runs for interactive sessions, or just remove them completely. On the VCSA the scenario is different, however: the default shell has changed from bash to appliancesh.

VMware's resolution is to use the SCP file protocol through the bash shell.  However, after I changed to SCP I received the following error (while the default shell was not set to bash):





This was fixed after changing the default shell.  I am using a newly created user account that can be used to access the server through WinSCP.  Just remember that you will have to modify the permissions on your files to copy them if you go down this route.  Alternatively, you can use the root account and temporarily change its shell from the appliance shell to bash to access the server with WinSCP. Entirely up to you.


>shell.set --enabled True
>shell
>useradd winscp
>passwd winscp
>visudo (add user with root access)
>chsh -s /bin/bash winscp


If you are using root, you temporarily change its shell to bash and then return it to the appliance shell when you are done.
To return:
>chsh -s /bin/appliancesh useraccount







Links:

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2107727

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2115983

https://winscp.net/eng/docs/message_large_packet



Thursday, July 21, 2016

VSAN - migrate VSAN cluster to new vCenter Server

I recently had to perform a VSAN cluster migration from one vCenter Server to another. This sounds like a daunting task but ended up being very simple and straightforward, because VSAN's architecture does not rely on vCenter Server for its normal operation (nice one, VMware!). As a bonus, the VMs do not need to be powered off and do not lose any connectivity.

Steps to perform:
  • Deploy a new vCenter Server and create a vSphere Cluster
  • Enable VSAN on the cluster.
  • Install the VSAN license and assign it to the cluster.
  • Disconnect one of the ESXi hosts from your existing VSAN Cluster
  • Add previously disconnected Host to the new VSAN Cluster on your new vCenter Server.
    • You will get a warning on the VSAN Configuration page stating that there is a "Misconfiguration detected". This is normal, because the ESXi host cannot communicate with the other hosts in the cluster it was configured with.
  • Add the rest of the ESXi hosts.
  • After all the ESXi hosts are added back, the warning should disappear.


links:



VSAN upgrade - Dell PowerEdge servers

I have been meaning to write up a VSAN upgrade on Dell R730xd servers with the PERC H730, which I recently completed at a customer.  This is not going to be a lengthy discussion of the topic; I primarily want to provide some information on the tasks I had to perform for the upgrade to VSAN 6.2.

  1. The VSAN on-disk metadata upgrade is equivalent to doing a SAN array firmware upgrade and therefore requires a good backup and recovery strategy to be in place before you proceed.
  2. Migrate VMs off the host.
  3. Place the host into maintenance mode.
    1. Use whatever method updates the firmware quickest, for VSAN's sake. Normally that is the Dell FTP update, if the network can be configured for it.
    2. When you put a host into maintenance mode and choose the option to "ensure accessibility", it doesn't migrate all the components off, only enough to keep the objects accessible, so the storage policies may be in violation.  A timer starts when you power it off, and if the host isn't back in the VSAN cluster after 60 minutes, it begins to rebuild that host's data elsewhere in the cluster.  If you know it will take longer than 60 minutes, or wherever possible, select full data migration.
    3. You can view the resync using the RVC command "vsan.resync_dashboard <cluster/host>".
  4. Change the advanced settings required for the PERC H730:
    1. https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2144936
    2. esxcfg-advcfg -s 100000 /LSOM/diskIoTimeout
    3. esxcfg-advcfg -s 4 /LSOM/diskIoRetryFactor
  5. Upgrade the lsi_mr3 driver. VUM makes this easy!
  6. Log in to the DRAC and perform the following firmware upgrades:
  7. Upgrade the backplane expander (BP13G+EXP 0:1)
    1. Firmware version 1.09 -> 3.03
  8. Upgrade the PERC H730 firmware
    1. 25.3.0.0016 -> 25.4.0.0017
  9. Log in to the Lifecycle Controller and set/verify the BIOS configuration settings for the controller:
    1. https://elgwhoppo.com/2015/08/27/how-to-configure-perc-h730-raid-cards-for-vmware-vsan/
    2. Disk cache for non-RAID = disabled
    3. BIOS mode = pause on errors
    4. Controller mode = HBA (non-RAID)
  10. After all hosts are upgraded, verify VSAN cluster functionality and the other prerequisites:
    1. Verify that there are no stranded objects on the VSAN datastores by running the python script on each host.
    2. Verify persistent log storage for the VSAN trace files.
    3. Verify that the PERC H730 advanced settings changed earlier (diskIoTimeout and diskIoRetryFactor) are still set!
  11. Place each host into maintenance mode again.
  12. Upgrade the ESXi host to 6.0U2.
  13. Upgrade the on-disk format to V3.
    1. This task runs for a very long time and has a lot of sub-steps which take place in the background.  It also migrates the data off each disk group in order to recreate it as V3.  This has no impact on the VMs.
    2. This process is repeated for all disk groups.
  14. Verify that all disk groups are upgraded to V3.
  15. Completed

I ran into some serious trouble and had a resync task that ran for over a week due to a VSAN 6.0 issue (KB 2141386) that appears under heavy storage utilization.  The only way to fix this was to put the host into maintenance mode with full data migration, then destroy and recreate the disk group.

Also ALWAYS check the VMware HCL to make sure your firmware is compatible. I can never say this enough since it is super important.

This particular VSAN 6.0 was running with outdated firmware for both the backplane and the PERC H730. I also found that the controller was set to RAID for the disks instead of non-RAID (passthrough or HBA mode).


Links:

VMware has a kick-ass KB on best practices for the Dell PERC H730 in a VSAN implementation. Links provided below.

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2109665

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2144614

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2144936


https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2141386




VSAN upgrade - prerequisites

This is just my list of prerequisites for a VSAN upgrade.  Most of these items are applicable to a new install as well, but the list is more focused on an upgrade.

Please feel free to provide feedback; I would love to add your experiences to my list.

Prerequisites:

  • The VSAN on-disk metadata upgrade is equivalent to doing a SAN array firmware upgrade and therefore requires a good backup and recovery strategy to be in place before we can proceed.


Links:



Friday, July 15, 2016

Dell rack servers - Upgrade firmware using Dell repository manager

I recently had to perform some firmware upgrades for a customer on their Dell R710 and R730xd servers.  As you all know, there are multiple ways to successfully upgrade the firmware, and I want to touch on the upgrade through a bootable virtual CD, optical CD or USB, since this was the only method available to me at the time.

Firmware upgrade methods available:
  • Upgrade using a bootable Linux ISO
  • Upgrade using the Server Update Utility (SUU) ISO/folder with the Dell Lifecycle Controller
  • Upgrade using the Dell FTP site with the Lifecycle Controller

All of these methods have some great information out there on the Dell website as well as on blogs, but I wanted to just go through my steps using the bootable Linux ISO, and primarily how to create that ISO.

My preferred method is using the Dell FTP site with the Lifecycle Controller, but this is not always possible, especially if you have trunked ports and have to specify a VLAN (in later iDRAC firmware it is now possible to specify a VLAN!).
The reason the FTP site method is better, in my opinion, is that the firmware comparison is done upfront and only the firmware for outdated components is downloaded. This shortens the firmware upgrade process considerably compared to the bootable ISO, which compares every single component. (This only applies when you use the bundle, which I do in most instances, since who wants to manually go through every single component and check which is required for your server? :)


Steps: