VMware VSAN – Storage Virtualisation

VMware Virtual SAN (VSAN)

VMware have clearly laid out plans for the complete convergence of the IT infrastructure, as seen with the release of NSX and VSAN at VMworld 2013. I’ve highlighted some important features of VMware’s new storage solution ‘VSAN’.

VMware VSAN is fully integrated with VMware vSphere, automatically aggregating the local storage of hosts in a cluster and presenting it as block-level shared storage for virtual machines. Its function is to provide both HA (High Availability) and scale-out storage features.

This is leaps ahead of the VMware VSA (Virtual Storage Appliance), released in 2011 and aimed at SMBs with two to three hosts.

Here is a quick reminder of some of the strict requirements & limitations we had with the VSA.

  • Does not support memory overcommitment.
  • Once installed, you cannot add additional storage to the VSA cluster.
  • The VSA can only be configured in a two or three node cluster (which can’t be changed after install).
  • You cannot run vCenter on the back-end storage supported by the VSA cluster.
  • Requires a minimum of 4 NICs (two for front-end and two for back-end communication).
    • Total IP addresses required for a 2 node cluster = 11.
    • Total IP addresses required for a 3 node cluster = 14.
  • Requires greenfield hosts (they cannot be hosting VMs prior to the VSA install).

VSAN Requirements:

vCenter Server

  • VSAN requires vCenter Server 5.5 (VSAN is also supported on the new vCenter Server Appliance!).
  • VSAN is configured and monitored using the vSphere 5.5 Web Client.

Host/Storage:

  • Each vSphere 5.5 host participating in the VSAN cluster requires a disk controller (SAS/SATA RAID controller).
  • Pass-through mode is required because VSAN ‘talks’ directly to the SSDs and HDDs.
  • Disks to be used should not have any RAID configuration applied (parity/striping is handled by VSAN).
  • Check to make sure that your controller is listed on the HCL.
  • Each host participating in the VSAN cluster must have at least one SSD and one HDD. (The SSD provides read/write cache for I/O to the backing HDDs, similar to conventional cache on a storage processor. Note: the SSDs do not contribute to the size of the VSAN datastore; they are used for cache operations only. See the sketch after this list.)
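To make the SSD/HDD split concrete, here is a minimal Python sketch (my own illustration, not a VMware tool; the host names and disk sizes are made up) that totals the raw capacity a cluster would contribute to the VSAN datastore, counting HDDs only and treating the SSDs purely as cache:

```python
# Illustrative sketch: raw VSAN datastore capacity comes from HDDs only;
# SSDs act as cache and add nothing to the datastore size.
# Disk sizes are in GB; the host layouts below are made-up examples.

hosts = {
    "esxi01": {"ssd_gb": [200],      "hdd_gb": [900, 900, 900]},
    "esxi02": {"ssd_gb": [200],      "hdd_gb": [900, 900, 900]},
    "esxi03": {"ssd_gb": [200, 200], "hdd_gb": [900, 900, 900, 900]},
}

def validate_disk_groups(cluster):
    """Each host needs at least one SSD and one HDD to participate."""
    for host, disks in cluster.items():
        if not disks["ssd_gb"] or not disks["hdd_gb"]:
            raise ValueError(f"{host} needs at least one SSD and one HDD")

def raw_datastore_capacity_gb(cluster):
    """Sum HDD capacity across all hosts; ignore SSDs (cache only)."""
    return sum(sum(disks["hdd_gb"]) for disks in cluster.values())

validate_disk_groups(hosts)
print(f"Raw VSAN datastore capacity: {raw_datastore_capacity_gb(hosts)} GB")
```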

Note: the beta version will have an 8-host limit; this figure will be adjusted when VSAN goes GA.

Host Network Interface Cards:

  • Each host participating in the VSAN cluster must have at least one 1Gb NIC; however, VMware recommends 10Gb CNAs for VMs with high workloads.
  • VSAN is supported on both standard and distributed virtual switches.
  • A VMkernel port must be created for VSAN communication; this is used for inter-cluster-node communication and for read/write operations for virtual machines residing on hosts belonging to a VSAN cluster.

VSAN Performance

As with any storage solution, understanding your IOPS requirement is paramount, and to effectively achieve the necessary IOPS for the solution you need to understand what your workloads are. vSphere VSAN uses SSDs as read and write cache for I/O, which significantly improves performance, although we are yet to see any real-world numbers.
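As a rough illustration of that sizing exercise, the Python sketch below (my own back-of-the-envelope model, with made-up per-VM workload figures) aggregates front-end IOPS and splits them into reads and writes, which is the sort of number you would weigh against the SSD cache tier and the backing HDDs:

```python
# Back-of-the-envelope IOPS sizing: aggregate per-VM front-end IOPS and
# split into reads/writes. All workload figures are illustrative only.

workloads = [
    # (vm_count, iops_per_vm, read_ratio)
    (20, 50, 0.7),   # e.g. general server VMs
    (5, 400, 0.6),   # e.g. database VMs
    (50, 15, 0.8),   # e.g. VDI desktops
]

total_iops = sum(count * iops for count, iops, _ in workloads)
read_iops = sum(count * iops * ratio for count, iops, ratio in workloads)
write_iops = total_iops - read_iops

print(f"Front-end IOPS: {total_iops:.0f} "
      f"(reads ~{read_iops:.0f}, writes ~{write_iops:.0f})")
```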

SSD-based Read Caching on the VSAN:

This is a cache of the disk blocks accessed by virtual machines, specifically the blocks used by the applications running in the virtual machines. The cached blocks might not reside on the same host as the virtual machine itself. vSphere VSAN mirrors a directory of cached blocks between hosts in a VSAN cluster; if the virtual machine is using cache that is not local to its vSphere host, the interconnect (the VSAN VMkernel port) is used to retrieve the cached blocks from the SSD on the host that does hold them. This is why VMware recommends 10Gb CNAs: to reduce latency.

Note: If the required disk blocks are not in cache on any of the hosts, the data is retrieved directly from the backing HDDs.
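A simplified way to picture that read path is the Python sketch below; it is purely a conceptual model of the behaviour described above (the host names and block IDs are invented), not how VSAN is implemented internally:

```python
# Conceptual model of the VSAN read path described above; not real VSAN code.

local_ssd_cache = {("vm01.vmdk", 42): b"...cached block..."}  # blocks cached on this host's SSD
cache_directory = {("vm01.vmdk", 7): "esxi02"}                # mirrored map: block -> host holding it in SSD

def fetch_from_remote_ssd(host, block_id):
    # Stand-in for a read over the VSAN VMkernel interconnect
    # (hence the 10Gb recommendation).
    print(f"fetching {block_id} from SSD cache on {host} over the VSAN network")
    return b"...remote cached block..."

def read_from_backing_hdd(block_id):
    # Stand-in for a read from the HDD capacity tier.
    print(f"cache miss on every host, reading {block_id} from backing HDDs")
    return b"...block from HDD..."

def read_block(block_id):
    if block_id in local_ssd_cache:            # 1. local SSD read cache
        return local_ssd_cache[block_id]
    if block_id in cache_directory:            # 2. another host's SSD holds it
        return fetch_from_remote_ssd(cache_directory[block_id], block_id)
    return read_from_backing_hdd(block_id)     # 3. fall back to HDD

read_block(("vm01.vmdk", 7))
```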

SSD Write Caching on the VSAN:

The SSD is also used as a write cache to reduce I/O latency. Because write operations go to the SSD first, a copy of the data must exist elsewhere in the VSAN cluster in case of node failure, so that write operations committed to cache are not lost. When a write operation is initiated by an application in the virtual machine, the write is mirrored both to the local cache and to remote hosts in the VSAN cluster.

Note: Write operations must be committed to SSD on the hosts before they are acknowledged and later committed to the HDDs.
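The write path can be sketched the same way; the Python below is again a conceptual model of the behaviour described above (the replica host names are invented), not VSAN internals. The write is committed to the local SSD cache and mirrored to the replica hosts’ SSDs, the acknowledgement happens only once those copies exist, and destaging to HDD happens afterwards:

```python
# Conceptual model of the VSAN write path: acknowledge only after the write
# is committed to SSD cache on the local host and on its replica hosts.

replica_hosts = ["esxi02", "esxi03"]   # hosts holding replicas of this object

def commit_to_ssd(host, block_id, data):
    # Stand-in for writing the block into that host's SSD write cache.
    print(f"committed {block_id} to SSD cache on {host}")
    return True

def destage_to_hdd(block_id, data):
    # Happens later, after the guest has already been acknowledged.
    print(f"destaging {block_id} to backing HDDs")

def write_block(block_id, data):
    committed = [commit_to_ssd("local-host", block_id, data)]
    committed += [commit_to_ssd(h, block_id, data) for h in replica_hosts]
    if all(committed):
        # Acknowledge only once every SSD copy exists.
        print(f"acknowledging write of {block_id} to the guest")
    destage_to_hdd(block_id, data)

write_block(("vm01.vmdk", 7), b"new data")
```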

Availability

vSphere VSAN uses RAIN (Reliable Array of Independent Nodes) for object-level redundancy, so in a converged infrastructure you can survive the loss of an underlying component (NIC port(s), a disk, or a vSphere host).

In the past, when defining HA admission control policies, we defined enforcement as (a) ‘Host failures the cluster tolerates’, (b) ‘Percentage of cluster resources reserved…’ and (c) ‘Specify failover hosts’. vSphere administrators can now define how many host, network or disk failures a virtual machine can tolerate in a VSAN cluster.

Note: In the event of a node failure there is no need for all of the data to be migrated to other nodes in the cluster, because copies of virtual machines (replicas) reside on multiple nodes in the cluster.

Note: For an object to be accessible in VSAN, more than 50% of its components must be accessible.
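That 50% rule can be expressed as a one-line check; the component counts in the sketch below are invented for illustration:

```python
# Illustrative quorum check: a VSAN object stays accessible only while
# more than 50% of its components (replicas + witnesses) are reachable.

def object_accessible(total_components, accessible_components):
    return accessible_components > total_components / 2

print(object_accessible(total_components=3, accessible_components=2))  # True  (2 of 3 > 50%)
print(object_accessible(total_components=4, accessible_components=2))  # False (exactly 50% is not enough)
```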

Configurable Options

NumberOfFailuresToTolerate – This allows us to define the number of failures (network or disk) the cluster can tolerate while still maintaining availability. If this is set, it specifies that configurations must contain at least NumberOfFailuresToTolerate + 1 replicas.

Note: VMware state that ‘any disk failure on a single host is treated as a “failure” for this metric. Therefore, the object cannot persist if there is a disk failure on host01 and another disk failure on host 02 when you have NumberOfFailuresToTolerate set to 1.’

Number of Disk Stripes per Object – This defines the number of physical disks across which each replica of a storage object is distributed. A value higher than 1 might result in better performance if read caching is not effective, but it will also result in greater use of system resources. VMware state that the default disk stripe number of 1 should meet the requirements of most, if not all, virtual machine workloads, and recommend changing the disk striping value only when a small number of high-performance virtual machines are running.
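Putting those two policy settings together, a rough way to reason about an object’s layout is the sketch below. It is my own simplification (it ignores witness components) based on the rule of NumberOfFailuresToTolerate + 1 replicas, each striped across the configured number of disks:

```python
# Rough model of how the two policy values shape an object's layout:
#   replicas        = NumberOfFailuresToTolerate + 1
#   data components = replicas * NumberOfDiskStripesPerObject
# (Witness components are ignored here for simplicity.)

def object_layout(failures_to_tolerate=1, stripe_width=1):
    replicas = failures_to_tolerate + 1
    data_components = replicas * stripe_width
    return replicas, data_components

for ftt, stripes in [(1, 1), (1, 2), (2, 1)]:
    replicas, components = object_layout(ftt, stripes)
    print(f"FTT={ftt}, stripes={stripes} -> {replicas} replicas, "
          f"{components} data components")
```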

Flash Read Cache Reservation – The amount of flash reserved as read cache for the storage object, specified as a percentage of the logical size of the virtual machine disk storage object. If a cache reservation is not defined, the VSAN scheduler manages ‘fair’ cache allocation.

Object Space Reservation – This defines the percentage of the logical size of the storage object that should be reserved on the HDDs during initialisation. VSAN datastores are thin provisioned by default; ObjectSpaceReservation is the amount of disk space reserved on the VSAN datastore, specified as a percentage of the VM disk.
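As a worked example of both percentage-based settings (the numbers are invented): a 100 GB virtual disk with a 1% flash read cache reservation and a 20% object space reservation would reserve roughly 1 GB of SSD and 20 GB of HDD capacity up front:

```python
# Worked example: both reservations are percentages of the VMDK's logical size.

def reservations_gb(vmdk_logical_gb, flash_read_cache_pct, object_space_pct):
    flash_reserved = vmdk_logical_gb * flash_read_cache_pct / 100   # carved from SSD
    space_reserved = vmdk_logical_gb * object_space_pct / 100       # reserved on HDD
    return flash_reserved, space_reserved

flash, space = reservations_gb(vmdk_logical_gb=100,
                               flash_read_cache_pct=1,
                               object_space_pct=20)
print(f"Flash read cache reserved: {flash:.0f} GB, HDD space reserved: {space:.0f} GB")
```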

Note: VMware state that if you provision a virtual machine and select the disk format to be either thick lazy zeroed or eager zeroed, this overrides the ObjectSpaceReservation setting in the virtual machine storage policy.

Force Provisioning (disabled by default) – This is an override to forcibly provision an object even if the capabilities specified in the virtual machine storage policy cannot be satisfied. VSAN will attempt to bring that object into compliance if and when resources become available.

Virtual Machine Policies & Compliance:

When a VSAN datastore is created, a set of policies is defined to identify the capabilities of the underlying environment (high-performance disks, NL disks, etc.). The administrator uses a virtual machine storage policy to classify the application workload requirements. Using VM storage policies, administrators can specify a set of required storage capabilities for a virtual machine, or more specifically a set of requirements for the application running in the virtual machine. When a virtual machine is deployed, if the requirements in the VM storage policy attached to the virtual machine can be met by the VSAN datastore, the VSAN datastore will be identified as compliant.
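Conceptually, that compliance check boils down to comparing the requirements in the VM storage policy against the capabilities the VSAN datastore advertises. The Python sketch below is my own illustration of the idea (the capability names and values are invented), not the actual SPBM implementation:

```python
# Conceptual sketch of storage-policy compliance: a VM is compliant when every
# requirement in its storage policy can be satisfied by the datastore.

datastore_capabilities = {
    "numberOfFailuresToTolerate": 3,    # maximum the datastore can satisfy
    "stripeWidth": 12,
    "flashReadCacheReservationPct": 100,
}

vm_storage_policy = {
    "numberOfFailuresToTolerate": 1,
    "stripeWidth": 2,
}

def is_compliant(policy, capabilities):
    return all(requirement in capabilities and capabilities[requirement] >= value
               for requirement, value in policy.items())

print("Compliant" if is_compliant(vm_storage_policy, datastore_capabilities)
      else "Not compliant")
```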

Here is the whitepaper released by VMware: VMware_Virtual_SAN_Whats_New.pdf

I’m looking forward to revisiting this post after the public beta is released with some more information on how it works.

You can register for the beta here: http://www.vmware.com/vsan-beta-register

[Diagram: VMware Virtual SAN overview]