Windows Server

Covering: Microsoft Server Operating Systems

WSFC/MSCS vSphere 6.x Enhancements

For those who aren’t aware, VMware has released an updated Microsoft WSFC Setup and Deployment Guide for vSphere 6.x.
In a previous blog post I covered Microsoft Clustering Design Implications in vSphere 5.x. Fundamentally, the deployment of WSFC has not changed significantly; however, there are a couple of new features that I wanted to cover here.
New Features and Requirements:
  • vMotion is supported for a cluster of virtual machines across physical hosts (CAB deployment) with pass-through RDMs. Note: you must use VM hardware version 11.
    • VMware recommends updating the heartbeat time-out ‘SameSubnetThreshold’ registry value to 10. Additional info can be found on the MS Failover Clustering and NLB Team Blog and in VMware’s updated WSFC Setup and Deployment Guide.
    • The vMotion network must be 10Gbps Ethernet.
      • A 1Gbps Ethernet link for vMotion of MSCS virtual machines is not supported.
        • Fair enough, but most customer deployments using 10GbE also share that link with other workloads, often using NIOC to prioritise traffic to production workloads. So it’s not clear whether the minimum requirement is strictly 10GbE, or simply more bandwidth than 1GbE can provide.
    • vMotion is supported for Windows Server 2008 SP2 and above. Windows Server 2003 is not supported.
    • SCSI bus sharing mode set to Physical.
  • ESXi 6.0 supports PSP_RR for Windows Server 2008 SP2 and above releases (same as ESXi 5.5 but with restrictions)
    • Shared quorum or data disks must be provisioned to the guest in pass-through RDM mode only.
  • All hosts must be running ESXi 6.x
    • Mixed-mode operation with older ESXi revisions is not supported.
    • Rolling upgrades of cluster hosts from previous versions of ESXi to ESXi 6.x are not supported.
  • MSCS (Windows Server Failover Clustering (WSFC)) is supported with VMware Virtual SAN (VSAN) version 6.1 and later. See VSAN 6.1 What’s New.
  • In vSphere 6.0, VMware introduced support for using Windows Server Failover Clustering or Microsoft Server Failover Clustering to protect a Windows-based vCenter Server.
Recommendations:
  • Modifying the MSCS heartbeat time-out: An MSCS virtual machine can stall for a few seconds during vMotion. If the stall time exceeds the heartbeat time-out interval, then the guest cluster considers the node down and this can lead to unnecessary failover.
    • VMware recommends changing the DWORD ‘SameSubnetThreshold’ registry value in each WSFC node to 10 (see the PowerShell sketch after this list).
  • VMware also warns against deploying WSFC in vSphere environments with memory overcommitment. Overcommitment (and, worse, active memory reclamation such as compression and swapping) can increase virtual machine I/O latency, potentially triggering failover. Set memory reservations if you are concerned this may affect your WSFC/MSCS nodes.
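The following PowerShell sketch illustrates the heartbeat recommendation above using the FailoverClusters module (available on Windows Server 2008 R2 and later); on older releases the equivalent change is made with cluster.exe or directly in the registry, so treat this as a hedged example and verify against the VMware guide.

# Raise the WSFC heartbeat threshold as recommended for vMotion (run on a cluster node)
Import-Module FailoverClusters

# Show the current heartbeat settings for the local cluster
Get-Cluster | Format-List SameSubnetDelay, SameSubnetThreshold, CrossSubnetDelay, CrossSubnetThreshold

# Raise the same-subnet threshold to 10 missed heartbeats, per the recommendation above
(Get-Cluster).SameSubnetThreshold = 10

# Verify the change
(Get-Cluster).SameSubnetThreshold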

Not Supported / Limitations:

  • No Storage vMotion for VMs that are configured with shared disks.
  • No support for WSFC on NFS.
  • Running WSFC nodes on different ESXi versions (a pity, as this would have been ideal for ESXi 5.x to ESXi 6.x upgrades).
  • Can’t use WSFC in conjunction with VMware FT.
  • NPIV not supported.
  • Server 2012 Storage Spaces are not supported.

Infrastructure Design & Project Framework

Successfully planning, designing and implementing a virtualisation project can be a very rewarding experience. Whether you are working alone or in a team, you may find the task initially daunting, be unsure of where to start, or lack an appropriate framework to work from. Hopefully this information will support you if you have been given the task, or, if you have already completed a virtualisation project, help you identify ways to make the next upgrade or implementation more efficient.

Infrastructure design is a deep subject with many facets and interlinking dependencies between design choices. The four pillars – compute (see my compute design post), storage (see my storage design post), networking and management – can be very complex to integrate successfully when considering all the options. A great deal of emphasis should be placed on understanding design decisions, as poor planning can lead to additional costs, a project that fails to meet organisational goals and, ultimately, a failure to deliver. Through each part of the design process it is important to validate design decisions against the requirements identified during the information gathering process.

Furthermore, design decisions should be continually evaluated against infrastructure qualities such as availability, manageability, performance, recoverability and security.

Project Framework

Use the following key points/stages to plan and build your project:

1. Information Gathering
2. Current State, Future State and Gap Analysis
3. Conceptual, Logical & Physical Design Process
4. Migration and Implementation Planning
5. Functional Testing / Quality Assurance
6. Continuous Improvement
7. Monitoring Performance, Availability and Capacity

1. Information Gathering: Information should be gathered from stakeholders / C-level executives, application owners and subject matter experts to define and identify:

  • The project scope / project boundaries, for example: upgrade the VMware vSphere infrastructure at the organisation’s central European offices only.
  • Project goals – what does the organisation want to achieve? For example, reduce the physical server footprint by 25% before the end of the financial year.
  • Service Level Agreements (SLA), Service Level Objectives (SLO), Recovery Time Objectives (RTO), Recovery Point Objectives (RPO) and Maximum Tolerable Downtime (MTD).
  • Key Performance Indicators (KPIs) relating to application response times.
  • Any requirements, both functional and non-functional, e.g. regulatory compliance – HIPAA, SOX, PCI etc. Understand the impact on the design of meeting HIPAA compliance (a US standard, but acknowledged under the EU ISO/IEC 13335-1:2004 information protection guidelines), which states that data and communications must be encrypted (HTTPS, SSL, IPsec, SSH). A functional requirement specifies something the design must do, for example support 5000 virtual machines, whereas a non-functional requirement specifies how the system should behave, for example: workloads deemed business critical must not be subject to resource starvation (CPU, memory, network, disk) and must be protected using appropriate mechanisms.
  • Constraints: limit design choices, based on the data consolidated from the information gathering exercise. An example could be that you need to use the organisation’s existing NFS storage solution. A functional requirement may be that the intended workload you need to virtualise is MS Exchange. Currently, virtualising MS Exchange on NFS is not supported – if the customer had a requirement to virtualise MS Exchange but only had an NFS-based storage solution, the proposal would lead to an unsupported configuration. Replacing the storage solution may not be feasible and may be out of scope for financial reasons.
  • Risks: put simply, risks are defined by the probability of a threat, the vulnerability of an asset to that threat, and the impact it would have if it occurred. Risks throughout the project must therefore be recorded and mitigated, regardless of which aspect of the project they apply to. An example risk: a project is aimed at a datacentre that doesn’t have enough capacity to meet the anticipated infrastructure requirements. The datacentre facilities team is working on adding additional power but, due to planning issues, may not be able to meet the deadlines set by the customer. This risk would therefore need to be documented and mitigated to minimise or remove the chance of it occurring.
  • Assumptions: the identification or classification of a design feature without validation. For example: in a multi-site proposal, the bandwidth available for datastore replication is assumed to be sufficient to support the stated recovery time objectives. If the site link has existing responsibilities, how will the additional replication traffic affect existing operations? During the design phase you may identify further assumptions, each of which must be documented and validated before proceeding.

2. Current State, Future State and Gap Analysis:

  • Identifying the current state can be done by conducting an audit of the existing infrastructure, obtaining infrastructure diagrams and system documentation, and holding workshops with SMEs and application owners.
  • A future state analysis is performed after the current state analysis and typically outlines where the organisation will be at the end of the project’s lifecycle.
  • A gap analysis outlines how the project will move from the current state to the future state and more importantly, what is needed by the organization to get there.

3. Conceptual, Logical & Physical Design Process:

  • A conceptual design identifies how the solution is intended to achieve its goals either through text, graphical block diagrams or both.
  • A logical design must focus on the relationships between the infrastructure components – typically it does not contain vendor names or physical details such as the amount of storage or compute capacity available.
  • A physical design provides a detailed description of the solutions implemented to achieve the project goals. For example: how the intended host design would mitigate against a single point of failure.
  • Get stakeholder approval on design decisions before moving to the implementation phase. Throughout the design process you should continually evaluate design decisions against the goal requirements and the infrastructure qualities (Availability, Manageability, Performance, Recoverability, Security).
    • Availability: typically concerned with uptime and calculated as a percentage based on the organisation’s service level agreements (SLA). The key point is mitigating against single points of failure across all components; your aim is to build resiliency into your design. Availability is calculated as a percentage or 9s value: [Availability % = ((minutes in a year – average annual downtime in minutes) / minutes in a year) × 100] (a worked example follows this list).
    • Manageability: concerned with the operating expenditure of the proposed solution or object. How well will the solution scale, and how easily can it be managed, implemented, upgraded and patched?
    • Performance: how the system will deliver the required performance metrics, typically aligned to the organisation’s KPIs and focused on workload requirements: response times, latency etc.
    • Recoverability: RTO/RPO/MTD. Recovery Time Objective: the time frame associated with service recovery. Recovery Point Objective: how much data loss is acceptable? Maximum Tolerable Downtime: a value derived from the business which defines the total amount of time that a business process can be disrupted without causing any unacceptable consequences.
    • Security: compliance and access control. How best can you protect the asset or workload from intruders or DoS attacks? More importantly, what are the consequences/risks of your design decisions?
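As a quick worked example of the availability formula above (the downtime figure is hypothetical): four hours of unplanned downtime in a year gives ((525,600 – 240) / 525,600) × 100 ≈ 99.95%, i.e. a little better than ‘three nines’. The same calculation in PowerShell:

# Worked example of the availability formula (hypothetical downtime figure)
$minutesInYear = 365 * 24 * 60          # 525,600
$annualDowntimeMinutes = 240            # e.g. four hours of unplanned downtime in the year
$availability = (($minutesInYear - $annualDowntimeMinutes) / $minutesInYear) * 100
"{0:N3} %" -f $availability             # 99.954 %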

4. Migration and Implementation Planning:

  • Identify low-risk virtualisation targets within the organisation and migrate these first. This is beneficial in achieving early ROI, building confidence and assisting other operational aspects of future workload migrations.
  • Work with application owners to create milestones and migration schedules.
  • Arrange downtime outside of peak operating hours, and ensure you have up-to-date and fully documented rollback and recovery procedures.
  • Do not simply accept and adopt best practices; understand why they are required and their impact on the design.

Additional Guidelines: 

  • Create service dependency mappings: These are used to identify the impact of something unexpected and how best to protect the workload in the event of disaster. DNS for example plays an important role in any infrastructure – if this was provided through MS Active Directory in an all virtualised environment, what impact would the failure of this have on your applications, end users, external customers? How can you best mitigate the risks of this failing?
  • Plan for performance then capacity: If you base your design decisions on capacity you may find that as the infrastructure grows you start experiencing performance related issues.  This is primarily attributed to poor storage design, having insufficient drives to meet the combined workload I/O requirements.
  • Analyse workload performance and include capacity planning percentage to account for growth.
  • What are the types of workloads to be virtualised – Oracle, SQL, Java etc.? Ensure you understand and follow best practices for virtualised environments, reviewing and challenging where appropriate. Oracle, for example, has very strict guidelines on what it deems as cluster boundaries, which can impact your Oracle licensing agreement.
  • Don’t assume something cannot be virtualised due to an assumed issue.
  • Benchmarking applications before they are virtualised can be valuable for determining whether a post-virtualisation issue is a configuration problem (see the Perfmon sketch after this list).
  • When virtualising new applications check with the application vendor regarding any virtualisation recommendations. Be mindful of oversubscribing resources to workloads that won’t necessarily benefit from it. “Right sizing” virtual machines is an important part of your virtualisation project. This can be challenging as application vendors set specific requirements around CPU and memory.
    • For existing applications be aware of oversized virtual machines and adjust resources based on actual usage.
  • What mechanisms will you use to guarantee predictable levels of performance during periods of contention? See vSphere NIOC and SIOC.
  • VARs/partners may be able to provide the necessary tools to assess current workloads, examples of which include VMware Capacity Planner (can capture performance information for Windows/Linux), IOStat, Windows Perfmon, vscsiStats, vRealize Operations Manager…
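To make the benchmarking point above concrete, here is a minimal PowerShell sketch that uses Windows Perfmon counters via Get-Counter to capture a quick CPU, memory and disk-latency baseline before a server is virtualised. The counter set, sampling interval and output path are illustrative only; a real baseline should run across representative business cycles.

# Minimal pre-virtualisation baseline using Perfmon counters (illustrative counters and interval)
$counters = @(
    '\Processor(_Total)\% Processor Time',
    '\Memory\Available MBytes',
    '\PhysicalDisk(_Total)\Avg. Disk sec/Read',
    '\PhysicalDisk(_Total)\Avg. Disk sec/Write'
)

# 60 samples, 5 seconds apart (roughly 5 minutes) - extend this window for a meaningful baseline
Get-Counter -Counter $counters -SampleInterval 5 -MaxSamples 60 |
    Export-Counter -Path C:\Temp\baseline.blg -FileFormat BLG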

5. Functional Testing / Quality Assurance :

This is a very important part of your design as it allows you to validate your configuration decisions and ensure the configurational aspects of the design are implemented as documented. This stage is also used to confirm that the design meets both functional and non-functional requirements. Essentially the process maps the expected outcome against actual results.

  • Functional testing is concerned with exercising core component function. For example: can the VM/workload run on the proposed infrastructure?
  • Non-functional testing is concerned with exercising application functionality using a combination of invalid inputs, unexpected operating conditions and other “out-of-bound” scenarios. This testing is designed to evaluate the readiness of a system according to several criteria not covered by functional testing. Test examples include vSphere HA, FT, vMotion, performance, security…

6. Continuous Improvement:

The ITIL framework is aimed at maximising the ability of IT to provide services that are cost effective and meet the expectations and requirements of the organisation and customers. This is therefore supported by streamlining service delivery and supporting processes by developing and documenting repeatable procedures. The ITIL Framework CSI (Continual Service Improvement) provides a simple seven-step process to follow.

Stage 1: Define what you should measure
Stage 2: Define what you currently measure
Stage 3: Gather the data
Stage 4: Processing of the data
Stage 5: Analysis of the data
Stage 6: Presentation of the information
Stage 7: Implementation of corrective action

  • Workloads rarely remain static. The virtualised environment will need constant assessment to ensure service levels are met and KPIs are being achieved. You may have to adjust memory and CPU as application requirements increase or decrease. Monitoring is an important part of the process and can help you identify areas which need attention. Use built-in alarms to identify storage latency and vCPU ready times; these can easily be set to alert you to an issue (see the PowerCLI sketch after this list).
  • Establish a patching procedure (hosts, vApps, VMs, appliances, 3rd-party extensible devices).
  • Use vSphere Update Manager to upgrade hosts, VMware Tools and virtual appliances. This goes deeper than just the hypervisor – ensure storage devices, switches, HBAs and firmware are kept up to date and in line with vendor guidelines.
  • Support proactive performance adjustments and tuning; when analysing issues, determine the root cause, plan corrective action, remediate, then re-assess.
  • Document troubleshooting procedures.
  • Use automation to reduce operational overheads.
  • Maintain a database of configuration items (these are components that make up the infrastructure), their status, lifecycle, support plan, relationships and which department assumes responsibility for them when something goes wrong.
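As an illustration of the monitoring point above, the sketch below uses VMware PowerCLI (Get-Stat) to convert the cpu.ready.summation counter into an approximate ready-time percentage per VM; the 10% warning threshold, the vCenter name and the real-time sampling assumptions are illustrative and should be tuned for your environment.

# Flag VMs whose CPU ready time looks high (PowerCLI sketch; threshold and names are illustrative)
Connect-VIServer -Server vcenter.example.local      # hypothetical vCenter name

$intervalSeconds = 20                               # real-time statistics use 20-second samples
foreach ($vm in Get-VM) {
    # The aggregate instance ('') sums ready time across all of the VM's vCPUs
    $ready = Get-Stat -Entity $vm -Stat 'cpu.ready.summation' -Realtime -MaxSamples 15 |
        Where-Object { $_.Instance -eq '' } |
        Measure-Object -Property Value -Average
    # summation is in milliseconds per sample; divide by the vCPU count for a per-vCPU figure
    $readyPct = ($ready.Average / ($intervalSeconds * 1000)) * 100 / $vm.NumCpu
    if ($readyPct -gt 10) {
        "{0}: approx {1:N1}% CPU ready per vCPU - investigate" -f $vm.Name, $readyPct
    }
}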

7. Monitoring Performance, Availability and Capacity:

  • Ensure the optimal and cost-effective use of the IT infrastructure to meet current and future business needs. Match resources to workloads that require a specific level of service. Locate business-critical workloads on datastores backed by tier 1 replicated volumes, on infrastructure that mitigates against single points of failure.
  • Make use of built-in tools for infrastructure monitoring and have a process for managing / monitoring service levels.
  • Monitor not only the virtual machines but the underlying infrastructure, using the built-in tools already mentioned above, to monitor latency.
  • Performance and capacity reports should include hosts/clusters, datastores and resource pools.
  • Monitor and report on usage trends at all levels, compute, storage and networking.
  • Scripts for monitoring environment health (see Alan Renouf’s vCheck script).
  • A comprehensive capacity plan uses information gathered from day-to-day tuning of VMware performance, current demand, modeling and application sizing (future demand).

Additional Service Management Tasks:

  • Integrate the virtual infrastructure into your configuration and change management procedures.
  • Ensure staff are trained to support the infrastructure – investment here is key in ensuring a) staff are not frustrated supporting an environment they don’t understand and b) the business gets the most out of their investment.
  • Develop and schedule maintenance plans to ensure the environment can be updated and is running optimally.
  • Plan and perform daily, weekly and monthly maintenance tasks. For example: search for unconsolidated snapshots; review VMFS volumes for space in use and available capacity (anything with less than 10% available space should be reviewed); check logical drive space on hosts; check whether any temporary VMs can be turned off or deleted. Monthly maintenance tasks: create a capacity report for the environment and distribute it to IT and management, update your VM templates, and review the VMware website for patches, vulnerabilities and bug fixes. A PowerCLI sketch covering some of these checks follows.
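Here is a possible PowerCLI sketch for two of the checks above (aged snapshots and datastores with less than 10% free space); the cmdlets are standard PowerCLI, but the age threshold, server name and output format are just examples.

# Weekly maintenance sketch: aged snapshots and nearly full datastores (PowerCLI)
Connect-VIServer -Server vcenter.example.local      # hypothetical vCenter name

# Snapshots older than 7 days (threshold is an example - align with your snapshot policy)
Get-VM | Get-Snapshot | Where-Object { $_.Created -lt (Get-Date).AddDays(-7) } |
    Select-Object VM, Name, Created, SizeGB

# Datastores with less than 10% free space
Get-Datastore | Where-Object { ($_.FreeSpaceGB / $_.CapacityGB) -lt 0.1 } |
    Select-Object Name, @{N='FreePct'; E={ [math]::Round(($_.FreeSpaceGB / $_.CapacityGB) * 100, 1) }}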

Reference Documentation:

Conceptual, Logical, Physical: It Is Simple, by John A. Zachman
Leveraging ITIL to Manage Your Virtual Environment, by Laurent Mandorla, Fredrik Hallgårde, BearingPoint, Inc.
Performance Best Practices for VMware vSphere
ITIL v3 Framework, Service Management Guide
Control Objectives for Information and Related Technology (COBIT) framework, by ISACA
Oracle Databases on VMware vSphere Best Practices Guide
VMware vSphere Monitoring and Performance Guide

Microsoft Clustering with VMware vSphere Design Guide

Update: 28/01/2016 – Post updated to reflect vSphere 6.x enhancements.

Microsoft Cluster Services (MSCS) has been around since the days of Windows NT4, providing high availability to tier 1 applications such as MS Exchange and MS SQL Server. With the release of Server 2008, Microsoft Cluster Services was renamed Windows Server Failover Clustering (WSFC), with several updates.
This post will focus on the design choices when implementing clustering with VMware vSphere, proposing alternatives along the way. It is not intended as a ‘step-by-step install guide’ for MSCS/WSFC.

vSphere 5, 6.x Integration – MSCS/WSFC Design Guide

Update: What’s new in vSphere 6

  • vMotion supported for cluster of virtual machines across physical hosts (CAB deployment).
  • ESXi 6.0 supports PSP_RR for Windows Server 2008 SP2 and above releases of MS Windows Server.
  • MSCS / WSFC is supported with VMware Virtual SAN (VSAN) version 6.1.
  • WSFC supported on Windows deployments of the VMware vCenter Server.
  • Be sure to review the WSFC VMware KB page for updates on supported configurations.

Requirements:

  • Virtual disk formats should be thick provisioned eager zeroed.
  • Update: vMotion is supported in ESXi 6.x only; for a cluster of virtual machines across physical hosts (CAB) with pass-through RDMs you must use VM hardware version 11.
    • VMware recommends updating the heartbeat time-out ‘SameSubnetThreshold’ registry value to 10. Additional info can be found on the MS Failover Clustering and NLB Team Blog.
    • The vMotion network must be a 10Gbps Ethernet link. 1Gbps Ethernet link for vMotion of WSFC virtual machines is not supported.
  • Synchronise time with a PDCe/NTP server (disable host based time synchronisation using VMware tools).
  • WSFC/MSCS requires a private or heartbeat network for cluster communication.
  • Modify the Windows registry disk time-out value to 60 seconds or more (HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk\TimeOutValue); a PowerShell example follows this requirements list.
  • Guest operating system and SCSI adapter support requirements:
    Operating System | SCSI Adapter
    Windows 2003 SP1 or higher | LSI Logic Parallel
    Windows 2008 SP2 or higher | LSI Logic SAS
    Windows 2008 R2 SP1 and higher | LSI Logic SAS
    Windows 2012 and above | LSI Logic SAS
  • A shared storage drive (quorum) that is presented to all hosts in the environment that might host the MSCS/WSFC virtual machines.
  • Quorum/Shared Storage Requirements:

    Storage | Cluster in a Box (CIB) | Cluster Across Boxes (CAB) | Physical and Virtual Machine
    Virtual disks (VMDKs) | Yes (recommended) | No | No
    Physical mode raw device mapping | No | Yes (recommended) | Yes
    Virtual mode raw device mapping | Yes | No | No
    In-guest iSCSI | Yes | Yes | Yes
    In-guest SMB 3.0 | Yes (Server 2012 only) | Yes (Server 2012 only) | Yes (Server 2012 only)
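A minimal PowerShell sketch for the disk time-out requirement noted above; the registry path is the one quoted in the requirement, the change takes effect after a reboot, and VMware Tools may already have set this value, so check before changing it.

# Set the Windows disk I/O time-out to 60 seconds (path and value from the requirement above)
$path = 'HKLM:\SYSTEM\CurrentControlSet\Services\Disk'
Set-ItemProperty -Path $path -Name TimeOutValue -Value 60 -Type DWord

# Verify
(Get-ItemProperty -Path $path -Name TimeOutValue).TimeOutValue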

Limitations:

  • Windows 2000 VMs are no longer supported from vSphere 4.1 onwards; Windows 2003 SP2, 2008 R2, 2012 and 2012 R2 are supported.
  • Five-node clusters are possible (only two nodes in vSphere 5.0).
  • You must use at least VM hardware version 7. Update: for vMotion support in ESXi 6.x use VM hardware version 11.
  • Shared disks need to be thick provisioned, eager zeroed.
  • Only fibre channel SANs are supported; iSCSI, Fibre Channel over Ethernet (FCoE) and NFS shared storage aren’t. Update: in vSphere 5.5 and 6.x the iSCSI and FCoE limitations have been lifted – restrictions apply, see VMware KB 2052238.
  • There is no support for vMotion/Storage vMotion; any attempt to vMotion a VM will fail and may result in a node failing over. Technically vMotion is possible when using an iSCSI initiator inside the guest VM to connect the shared disk. Update: vMotion in a CAB deployment is supported in ESXi 6.x – see requirements.
  • NPIV and Round Robin multipathing are not recommended when using vSphere native multipathing. Third-party multipathing using round robin may be supported, but check with your storage vendor. Note: in vSphere 5.5 PSP_RR is supported with restrictions. Update: PSP_RR is also now supported with ESXi 6.x.
  • WSFC/MSCS is not supported with vSphere FT.
  • Increasing the size of the disks and hot-adding CPU or memory is not supported.
  • Memory overcommitment is not recommended; overcommitment can be disruptive to the clustering mechanisms (optionally set VM memory reservations).
  • Paravirtualised SCSI controllers are not currently supported (this may be lifted – check the VMware compatibility guides).
  • Pausing or resuming the VM state is not supported.

Use Cases:

  • Invariably it depends on the application; the application needs to be cluster-aware, and not all applications support Microsoft Cluster Services.
  • Microsoft Exchange Server
    • See the MSCS alternative with Exchange 2010 Database Availability Groups (Cluster Continuous Replication – CCR, and Standby Continuous Replication – SCR).
  • Microsoft SQL Server
    • As an alternative to MSCS/WSFC, use SQL Server AlwaysOn Availability Groups.
  • Web, DHCP, file and print services

Implementation Options:

Before we look at the various implementation options, it may be worth covering some of the basic requirements of an MSCS cluster. A typical clustering setup includes the following:

  1. Drives that are shared between clustered nodes – a shared drive that is accessible to all nodes in the cluster. This ‘shared’ drive is also known as the quorum disk.
  2. A private heartbeat network that the nodes can use for node-to-node communication.
  3. A public network so the virtual machines can communicate with the rest of the network.
  • Cluster In A Box (CIB): the clustered virtual machines run on the same ESXi host. The shared disks or quorum (either local or remote) are shared between the virtual machines. CIB can be used in test or development scenarios; this solution provides no protection in the event of hardware failure. You can use VMDKs (SCSI bus sharing set to virtual mode) or Raw Device Mappings (RDMs); RDMs are beneficial if you decide to migrate one of the VMs to another host.

[Diagram: Cluster in a Box (CIB)]

  • Cluster Across Boxes (CAB): MSCS is deployed to two VMs running on two different ESXi hosts. This protects against both software and hardware failures. Physical RDMs are the recommended disk choice. Shared storage/quorum should be located on a fibre channel SAN or accessed via an in-guest iSCSI initiator (be aware of the impact with the latter).

[Diagram: Cluster Across Boxes (CAB)]

  • Virtual Machine + Physical: [VM N + 1 physical] clusters allow one MSCS cluster node to run natively on a physical server while the other runs as a virtual machine. This mode can be used to migrate from a physical two-node deployment to a virtualised environment. Physical RDMs are the recommended disk option here. Shared storage/quorum should be located on a fibre channel SAN or accessed via an in-guest iSCSI initiator (again, be aware of the impact of using in-guest iSCSI).

[Diagram: Virtual Machine + Physical]

SCSI/Disk Configuration Parameters:

  • SCSI Controller Settings:
    • Disk types : An option when you add a new disk. You have the choice of VMDK, virtual RDM – (virtual compatibility mode), or physical RDM (physical compatibility mode).
    • SCSI bus-sharing setting: virtual sharing policy or physical sharing policy or none.
      • The SCSI bus-sharing setting needs to be edited after VM creation.
  • SCSI bus sharing values:
    • None: used for disks that aren’t shared in the cluster (between VMs), such as the VMs’ boot drives, temp drives and page files.
    • Virtual: Use this value for cluster in box deployments (CIB).
    • Physical: Recommended for cluster across box or physical and virtual deployments (CAB, & virtual + physical).
  • Raw Device Mapping (RDM) used by quorum drive (see requirements for deployment use cases).
  • RDM options:
    • Virtual RDM mode (non-pass through mode)
      • Here the hardware characteristics of the LUN are hidden from the virtual machine; the VMkernel only sends read/write I/O to the LUN.
    • Physical RDM mode (pass-through mode)
      • The VMkernel passes all SCSI commands to the LUN, with the exception of the REPORT LUNS command, so the VMkernel can isolate the LUN to the virtual machine. This mode is useful for SAN management agents, SAN snapshots, FC SAN backup and other SCSI target-based tools.
  • ESXi 5.x and 6.x use a different technique to determine whether Raw Device Mapped (RDM) LUNs are used for MSCS cluster devices, introducing a configuration flag that marks each device participating in an MSCS cluster as “perennially reserved”. For ESXi hosts hosting passive MSCS nodes with RDM LUNs, use the esxcli command to mark the device as perennially reserved: esxcli storage core device setconfig -d <naa.id> --perennially-reserved=true. See KB 1016106 for more information (a PowerCLI sketch follows this list).
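The esxcli command above can also be scripted across hosts with PowerCLI’s Get-EsxCli -V2 interface. The sketch below is a hedged example: the argument and output property names (device, perenniallyreserved, IsPerenniallyReserved) follow the esxcli namespace but should be confirmed against your PowerCLI/ESXi version, and the cluster name and NAA ID are placeholders.

# Mark an RDM LUN as perennially reserved on every host in a cluster (PowerCLI sketch)
$naaId = 'naa.xxxxxxxxxxxxxxxx'                                   # placeholder - substitute the RDM's NAA ID

foreach ($vmhost in Get-Cluster 'MSCS-Cluster' | Get-VMHost) {    # 'MSCS-Cluster' is a hypothetical name
    $esxcli = Get-EsxCli -VMHost $vmhost -V2
    $arguments = $esxcli.storage.core.device.setconfig.CreateArgs()
    $arguments.device = $naaId
    $arguments.perenniallyreserved = $true
    $esxcli.storage.core.device.setconfig.Invoke($arguments)

    # Confirm the flag on this host
    $esxcli.storage.core.device.list.Invoke(@{device = $naaId}) |
        Select-Object Device, IsPerenniallyReserved
}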

Design Impact:

  • MSCS requires a private or heartbeat network for cluster communication.
    • This means adding a second virtual NIC to the VM, which will be used for heartbeat communication between the virtual machines. This potentially means separate virtual machine port groups (it is recommended to use a separate VLAN per port group for L2 segmentation).
  • If using in-guest iSCSI initiators, be aware that SCSI encapsulation is performed over the virtual machine network. The recommendation is to separate this traffic from regular VM traffic (ensure you identify and dedicate bandwidth accordingly).
  • If you currently overcommit on resources, identify the root cause and address any issues (CPU/memory); setting reservations is highly recommended in such scenarios. If you intend to use reservations, have a look at reservations using resource pools versus reservations at the VM level (inappropriate use of reservations or badly architected resource pools can make the problem worse).
  • Factor in the application workload requirements; often, providing high availability is only part of the solution. Account for the application workload (CPU/memory/network/IOPS/bandwidth) in order to meet KPIs or SLAs.
    • If you intend to reserve memory or CPU, factor in the impact to the HA slot size calculation. For example: if you use ‘Admission Control, host failures the cluster can tolerate’. Larger slot sizes impact consolidation ratios.
  • You cannot use RDMs on local storage (see limitations), shared storage is required. Virtual storage appliances on local disks using iSCSI can be looped back to support this (not recommended for tier 1 applications).
  • Account for HA/DRS
    • Using vSphere DRS does not mean high availability is assured; DRS is used to balance workloads across the HA cluster. Note: for MSCS deployments, DRS cannot vMotion clustered VMs (vMotion is supported in vSphere 6.x, allowing for DRS automation); DRS can make recommendations at power-on for VM placement.
    • DRS anti-affinity rules should be created prior to powering on the clustered VM’s using raw device mappings.
    • To make sure that HA/DRS clustering functions don’t interfere with MSCS, you need to apply the following settings:
      • VMware recommends setting the clustered VMs’ individual DRS automation level to partially automated (for VM placement only). You should use ‘must run’ rules here; also set the advanced DRS setting ForceAffinityPoweron to 0.
  • For CIB deployments create VM-to-VM affinity rules to keep them together.
  • For CAB deployments create VM-to-VM anti-affinity rules to keep them apart. These should be ‘Must Run’ rules as there is no point in having two nodes on the same ESXi host.
  • Physical N+1 VMs don’t need any special affinity rules as one of the nodes is virtual and the other physical, unless you have a specific requirement to create such rules.
  • Important: vSphere DRS also needs additional ‘Host-to-VM’ rule groups, because HA doesn’t consider the ‘VM-to-VM’ rules when restarting VMs in the event of hardware failure.
    • For CIB deployments, VMs must be in the same VM DRS group, which must be assigned to a host DRS group containing two hosts using a ‘must run on hosts in group’ rule.
    • For CAB deployments, VMs must be in different VM DRS groups, and the VMs must be assigned to different host DRS groups using ‘must run on hosts in group’ rules (see the PowerCLI sketch after this list).
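Here is a possible PowerCLI sketch for the CAB rule set described above (anti-affinity plus VM/host groups). New-DrsClusterGroup and New-DrsVMHostRule require a reasonably recent PowerCLI release, and all of the cluster, VM and host names used are hypothetical; real host groups would normally contain more than one host.

# CAB example: keep the two WSFC nodes apart and pin each to its own host group (PowerCLI sketch)
$cluster = Get-Cluster 'Prod-Cluster'                    # hypothetical cluster name
$nodeA   = Get-VM 'WSFC-Node-A'                          # hypothetical VM names
$nodeB   = Get-VM 'WSFC-Node-B'

# VM-to-VM anti-affinity: never place the cluster nodes on the same host
New-DrsRule -Cluster $cluster -Name 'WSFC-AntiAffinity' -KeepTogether:$false -VM $nodeA, $nodeB

# VM and host DRS groups plus 'must run on hosts in group' rules (these also constrain HA restarts)
$vmGroupA   = New-DrsClusterGroup -Cluster $cluster -Name 'WSFC-VMs-A'   -VM $nodeA
$vmGroupB   = New-DrsClusterGroup -Cluster $cluster -Name 'WSFC-VMs-B'   -VM $nodeB
$hostGroupA = New-DrsClusterGroup -Cluster $cluster -Name 'WSFC-Hosts-A' -VMHost (Get-VMHost 'esx01.example.local')
$hostGroupB = New-DrsClusterGroup -Cluster $cluster -Name 'WSFC-Hosts-B' -VMHost (Get-VMHost 'esx02.example.local')
New-DrsVMHostRule -Cluster $cluster -Name 'WSFC-A-to-HostsA' -VMGroup $vmGroupA -VMHostGroup $hostGroupA -Type MustRunOn
New-DrsVMHostRule -Cluster $cluster -Name 'WSFC-B-to-HostsB' -VMGroup $vmGroupB -VMHostGroup $hostGroupB -Type MustRunOn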

Recoverability:

  • If you are using physical mode RDMs you will not be able to take advantage of VM-level snapshots, which are leveraged by the vSphere APIs for Data Protection (VADP). In scenarios using physical mode RDMs you may want to investigate SAN-level snapshots (with VSS integration) or in-guest backup agents.
    • SAN snapshots with physical RDMs allow you to take advantage of array-based snapshot technologies, if your SAN supports them.
      • In-guest backup agents can be used, but be aware that you may need to create a separate virtual machine port group for backups, unless you want backup traffic transported over your VM network! It is also worth noting that this doesn’t provide you with a backup of the virtual machine configuration (the .vmx file or any of the vmware.log information) in the event you need to restore a VM.
    • Account for the impact of MSCS at your disaster recovery site – do you plan on a full MSCS implementation (CAB/physical N+1)?
    • What type of infrastructure will be available to support the workload at your recovery site? Is your recovery site cold/hot or a regional data centre used by other parts of the business?
    • Have you accounted for the clustered application itself? What steps need to be taken to ensure the application is accessible to users/customers?
    • Adhere to the RTO/RPO – MTD requirements for the application.
    • Lastly it goes without saying make sure your disaster recovery documentation is up to date.

Benefits:

  • If architected correctly with VMware vSphere, a virtual implementation can meet the demands for tier 1 applications.
  • Can reduce hardware and software costs by virtualising current WSFC deployments.
  • Leveraging DRS for VM placement:
    • Use VM affinity and anti-affinity rules to help avoid clustered VMs landing on the same host during power-on operations or host failures.
    • VM-to-Host rules (DRS groups) can be used to locate specific VMs on particular hosts – ideal for use on blade servers, where the VMs can be housed on hosts from different blade chassis or racks.

Drawbacks:

  • When using RDMs, remember that each presented RDM consumes one LUN ID, and the maximum is 256 per ESXi host.
  • Taking into account the limitations stated above, in my opinion the biggest drawback is the high operational overhead involved in looking after virtual WSFC clusters. WSFC solutions can be complex to manage – keep the solution simple! Think about the operational cost to support the environment whilst meeting the availability requirements.

Conclusion:

If you are looking at virtualising your MS clusters, vSphere is a great choice with many features to support your decision (HA, DRS anti-affinity rules). However, before making any decisions, an assessment of the design implications should be performed in order to identify how they affect availability, manageability, performance, recoverability and security. Migrating from physical to virtual instances of MSCS/WSFC may also offer a reduction in hardware and software costs (depending on your licensing model). It is also worth looking at other solutions which could be implemented, for example the native high availability features of VMware vSphere (VM and application monitoring, HA, Fault Tolerance); these can provide a very good alternative to MS clustering solutions.

In the end, the decision to use WSFC essentially comes down to the workload availability requirements defined by the customer or business owner; this will ultimately drive the decision behind your strategy.


HP 3PAR / Windows Server 2008/2012 Boot from SAN Guide

Use Case: 

  • Store operating systems on the SAN; generally this provides higher availability, redundancy and recoverability, depending on the RAID and SAN configuration.
  • In diskless server builds, to reduce power consumption by having no internal disks.
  • Blade architectures, where internal disks aren’t large enough to hold the application and OS (not so much of an issue now, considering the density of modern 3.5/2.5″ disks).

Benefits:

  • Minimize system downtime, perhaps a critical component such as a processor, memory, or host bus adapter fails and needs to be replaced. You need only swap the hardware and reconfigure the HBA’s BIOS, switch zoning, and host-port definitions on the storage processors.
  • Enable rapid deployment scenarios.
  • Boot from SAN alleviates the necessity for each server to have its own direct-attached disk, eliminating internal disks as a potential point for failure. Thin diskless servers also take up less rack space, require less power, and are generally less expensive because they have fewer hardware components.
  • Centralised management: when operating system images are stored on networked disks, all upgrades and fixes can be managed at a centralised location. Changes made to disks in a storage array are readily accessible by each server (this also brings benefits in capacity planning, as you typically have a holistic view of your SAN environment).
  • All the boot information and production data stored on local SAN ‘A’ can be replicated to local SAN ‘B’ (see 3PAR Peer Persistence) or one at a geographically dispersed disaster recovery site. If a disaster destroys functionality of the servers at the primary site, the remote site can take over with minimal downtime.
  • Recovery from server failures is simplified in a SAN environment. With the help of snapshots, mirrors of a failed server can be recovered quickly by booting from the original copy of its image. As a result, boot from SAN can greatly reduce the time required for server recovery.

Risks:

  • With older Windows operating systems (Windows 2003) it was recommended that the boot LUN be on a separate SCSI bus from the shared LUNs, to avoid issues with SCSI bus resets disrupting I/O and causing a BSOD. In Windows 2008/2012 this is not an issue; boot LUNs can share the same bus/path.
  • Financial risk: understand that CAPEX costs can be higher than if you were to boot off local disks (additional HBAs, cabling and SFPs). If you calculate the £-per-GB cost of a typical high-end SAN versus the cost of mirrored local drives, it can be a lot more. Do you have enough physical capacity in your array to support this? If not, you will need to buy more disks and increase throughput/IOPS.

Potential Design Impact:

  • If a host/node swaps out pages frequently, this could result in heavy I/O traversing your storage fabric, which may negatively impact services (especially latency-critical apps). This might not be apparent if you have a few servers, but what if you have many with a boot-from-SAN (BFS) requirement? This can be mitigated to some extent by moving page files to local disks or installing more memory. If this is a SQL server, investigate using the ‘Lock Pages in Memory’ function to prevent SQL from paging workloads out unnecessarily (let SQL manage its working set size, and check the buffer pool to RAM ratio too).
  • Migrating the OS to boot from SAN can, in some situations, have a negative impact on the array or fabric switches, potentially causing contention (check ISL fan-in ratios). This is more of an issue with iSCSI/NFS than FC due to the nature of the protocol. Ensure that your fabric switches (core/access) have enough bandwidth to supply I/O demands. In some situations even the storage processors could be overwhelmed (check host queue depth settings – this allows the host to throttle back I/O).
  • Check that you have enough FC/FCoE/iSCSI/NFS uplinks to service any high-I/O workloads. Certainly the most opted-for solution is to increase array-side cache, but this is often the most costly option and doesn’t really address the root cause of any latency or throughput constraints.
  • Be mindful of boot storms after an outage or in VDI deployments; you may have to selectively boot tier 1 apps in phases (bear in mind tier 1 application dependencies such as DNS, LDAP or Active Directory servers – they need to be started first). Review your tier 1 app service dependencies and their IOPS requirements (see the point above for throughput considerations).

Key Points (3PAR):

  • The boot LUN should be the lowest-ordered LUN number that is exported to the host (3PAR recommended); however, some arrays assign LUN 0 to the controller, in which case LUN 1 can be used.
  • NOTE: With the introduction of the Microsoft Storport driver, booting from a SAN has become less problematic. Refer to http://support.microsoft.com/kb/305547.
  • For the initial boot, restrict the host to a single path connection on the 3PAR array. Only a single path should be available on the HP 3PAR StoreServ Storage and a single path on the host to the VLUN that will be the boot volume (this can be changed after the host has booted and you have installed the MPIO driver).
  • It goes without saying check that your SAN, FC switch, server & HBA cards are running the latest firmware.
  • Ensure appropriate zoning techniques are applied (see my best practice guide)
  • If you are using clustering ensure nodes in a cluster have sole access to the boot LUN (1:1 mapping), using LUN masking (array side).
  • Server-side HBA configuration (this can vary depending on the HBA vendor – check your documentation).
  • Use soft zoning (zoning per pWWN); generally this is a requisite for HP 3PAR, but in terms of booting from SAN it also provides more flexibility. However, if the HBA card fails you will need to update the LUN masking and soft zoning configurations.

 3PAR: Creating & Exporting Virtual Volumes

Virtual volumes are the only data layer visible to hosts. After devising a plan for allocating space for host servers on the HP 3PAR StoreServ Storage, create the VVs for eventual export as LUNs to the Windows Server 2012/2008 host server.

You can create volumes that are provisioned from one or more common provisioning groups (CPGs). Volumes can be fully provisioned from a CPG or can be thinly provisioned. You can optionally specify a CPG for snapshot space for fully provisioned volumes. (Don’t forget that if your requirements change and you need to convert these volumes to thin-provisioned volumes, or vice versa, you can use the 3PAR system tune operation.)

Using the HP 3PAR Management Console:

  1. From the menu bar, select: Actions→Provisioning→Virtual Volume→Create Virtual Volume
  2. Use the Create Virtual Volume wizard to create a base volume.
  3. Select one of the following options from the allocation list: ‘Fully Provisioned’ / ‘Thinly Provisioned’

Next, perform soft zoning / LUN masking; see the key point mentioned earlier about only presenting a single path to the host. After you have installed the MPIO drivers (post OS install), present the additional paths.

Configuring Brocade HBA to boot from SAN:

  1. Check and enable the HBA BIOS (the BIOS must be disabled for arrays that are not configured for boot from SAN).
  2. Enable one of the following boot LUN options:
  • Auto Discover – when enabled, boot information, such as the location of the boot LUN, is provided by the fabric (this is the default value).
  • Flash Values – the HBA obtains the boot LUN information from flash memory.
  • First LUN – the host boots from the first LUN visible to the HBA that is discovered in the fabric.
  3. Select a boot device from the discovered targets.
  4. Save the changes and exit.

Configuring Emulex HBA to boot from SAN:

  1. Boot the Windows Server 2012/2008 system following the instructions in the BootBios update manual.
  2. Press Alt+E. For each Emulex adapter, set the following parameters:
  3. Select Configure the Adapter’s Parameters.
  4. Select Enable or Disable the BIOS; for SAN boot, ensure that the BIOS is enabled.
  5. Press Esc to return to the previous menu.
  6. Select Auto Scan Setting; set the parameter to First LUN 0 Device; press Esc to return to the previous menu.
  7. Select Topology.
  8. Select Fabric Point to Point for fabric configurations.
  9. Select FC-AL for direct connect configurations.
  10. Press Esc to return to the previous menu if you need to set up other adapters. When you are finished, press x to exit and reboot.

Configuring Qlogic HBA to boot from SAN:

Note: use the QLogic HBA Fast!UTIL utility to configure the HBA. Record the Adapter Port Name WWPN for creating the host definition in the 3PAR IMC (however, if the server is zoned correctly you should see the HBA pWWNs when adding a new host).

  1. Boot the server; as the server is booting, press the Alt+Q or Ctrl+Q keys when the HBA BIOS prompt appears.
  2. In the Fast!UTIL utility, click Select Host Adapter and then select the appropriate adapter.
  3. Click Configuration Settings→Adapter Settings.
  4. In the Adapter Settings window, set the following: Host Adapter BIOS: Enabled; Spinup Delay: Disabled; Connection Option: 0 for direct connect, 1 for fabric.
  5. Press Esc to exit this window.
  6. Click Selectable Boot Settings. In the Selectable Boot Settings window, set Selectable Boot Device to Disabled.
  7. Press Esc twice to exit; when you are asked whether to save NVRAM settings, click Yes.

Connecting Multiple Paths for Fibre Channel SAN Boot

After the Windows Server 2012/2008 host has completely booted and is online, connect additional paths to the fabric or directly to the HP 3PAR disk storage system by completing the following tasks.

  1. On the HP 3PAR StoreServ Storage, issue createhost -add <hostname> <WWN> to add the additional paths to the defined HP 3PAR StoreServ Storage host definition.
  2. On the Windows Server 2012/2008 host, scan for new devices.
  3. Reboot the Windows Server 2012/2008 system.
  4. Install the following patch: KB2849097.
  5. Set up multipathing; install the following patches: KB2406705 and KB2522766.

Windows Server 2008, Server 2012 implementation steps:

On the first Windows Server 2012 or Windows Server 2008 reboot following an HP 3PAR array firmware upgrade, whether a major upgrade or an MU update within the same release family, the Windows server will mark the HP 3PAR LUNs “offline.”

This issue occurs only in the following configurations:

  1. HP 3PAR LUNs on Windows standalone servers.
  2. HP 3PAR LUNs that are used in Microsoft Failover Clustering and are not configured as “shared storage” on the Windows failover cluster. If HP 3PAR LUNs that are used in Microsoft Failover Clustering are configured as shared storage, then they will not experience the same problem (that is, be marked offline) as in a Windows standalone-server configuration.

When the HP 3PAR LUNs are marked offline, you must follow these steps so that applications can access the HP 3PAR LUNs again:

  1. Click Computer Management→Disk Management.
  2. Right-click each of the HP 3PAR LUNs.
  3. Set the LUN online.

HP recommends the execution of Microsoft KB2849097 on every Windows Server 2008/2012 host connected to an HP 3PAR array prior to performing an initial array firmware upgrade. Subsequently, the script contained in KB2849097 will have to be rerun on a host each time new HP 3PAR LUNs are exported to that host.

KB2849097 is a Microsoft PowerShell script designed to modify the Partmgr Attributes registry value that is located at: HKLM\System\CurrentControlSet\Enum\SCSI\<device>\<instance>\DeviceParameters\Partmgr.

NOTE: The following procedure will ensure proper execution of KB2849097, which will prevent the HP 3PAR LUNs from being marked offline when the Windows server is rebooted following an array firmware upgrade.

Save the following script as ‘.ps1’ on your system:

$val = 0
$vendor = Read-Host "Enter Vendor String"

# Find the Device Parameters key for every SCSI disk whose hardware ID matches the vendor string
$devIDs = Get-ChildItem "HKLM:\SYSTEM\CurrentControlSet\Enum\SCSI\Disk*Ven_$vendor*\*\Device Parameters\"

foreach ($id in $devIDs)
{
    $error.Clear()
    $regpath = $id.PSPath + "\Partmgr\"
    Set-ItemProperty -Path $regpath -Name Attributes -Value $val -ErrorAction SilentlyContinue

    if ($error) # didn't find the path, create it and try again
    {
        New-Item -Path $id.PSPath -Name Partmgr
        Set-ItemProperty -Path $regpath -Name Attributes -Value $val -ErrorAction SilentlyContinue
        $error.Clear()
    }

    # Echo the resulting Attributes value so each device can be verified
    Get-ItemProperty -Path $regpath -Name Attributes -ErrorAction SilentlyContinue | Select Attributes | fl | Out-String -Stream
}

Windows Server 2008/2012 requires the PowerShell execution policy to be changed to RemoteSigned to allow execution of external scripts. This must be done before the script is executed. To change the PowerShell execution policy, open the PowerShell console and issue the following command:

Set-ExecutionPolicy RemoteSigned

You might be prompted to confirm this action by pressing y.

The next step is to save the script as a .ps1 file to a convenient location and execute it by issuing the following command in a PowerShell console window:

C:\ps_script.ps1

The above command assumes that the script has been saved to C:\ under the name ps_script.ps1.

You will then be prompted to provide a Vendor String, which is used to distinguish between different vendor types. The script will only modify those devices whose Vendor String matches the one that has been entered into the prompt. Enter 3PAR in the prompt to allow the script to be executed on all HP 3PAR LUNs currently presented to the host as shown in the output below:

Enter Vendor String: 3PAR

The script will then run through all HP 3PAR LUNs currently presented to the host and set the Attributes registry value to 0. In order to verify that the Attributes value for all HP 3PAR LUNs were properly modified, issue the following command:

Get-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Enum\SCSI\Disk*Ven_3PARdata*\*\Device Parameters\Partmgr" -Name Attributes

The Attributes value should be set to 0, as shown in the example below (if so, you are good to go):

PSPath       : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\Disk&Ven_3PARdata&Prod_VV\5&381f35e2&0&00014f\Device Parameters\Partmgr
PSParentPath : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\Disk&Ven_3PARdata&Prod_VV\5&381f35e2&0&00014f\Device Parameters
PSChildName  : Partmgr
PSDrive      : HKLM
PSProvider   : Microsoft.PowerShell.Core\Registry
Attributes   : 0

Setting up Multipathing (Windows 2008/2012)

For high-availability storage with load balancing of I/O and improved system and application performance, Windows Server 2012/2008 requires the native Microsoft MPIO and the StorPort miniport driver.

Configuring Microsoft MPIO for HP 3PAR storage: to resolve issues with MPIO path failover, it is recommended that hotfixes KB2406705 and KB2522766 be installed for all versions of Windows Server 2008 up to and including Windows Server 2008 R2 SP1.

Windows Server 2008, Windows Server 2008 SP1, and Windows Server 2008 SP2 also require that hotfix KB968287 be installed to resolve an issue with MPIO path failover. All three patches (KB2522766, KB968287, KB2406705) are required for non-R2 versions of Windows Server 2008. Only two patches (KB2522766 and KB2406705) are required for R2 versions of Windows Server 2008.

  1. If you have not already done so, check HBA vendor documentation for any required support drivers, and install them.
  2. If necessary, install the StorPort miniport driver.
  3. If the MPIO feature is not enabled, open the Server Manager and install the MPIO feature. This will require a reboot.
  4. After rebooting, open the Windows Administrative Tools and click MPIO.
  5. In the MPIO-ed Devices tab, click the Add button; the Add MPIO Support popup appears.
  6. In the Device Hardware ID: text box, enter 3PARdataVV, and click OK.
  7. Reboot as directed, (You can also use MPIO-cli to add 3PARdataVV).

The command is:

mpclaim -r -I -d "3PARdataVV"
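On Windows Server 2012, the same claim and the Round Robin default policy can also be scripted with the built-in MPIO PowerShell cmdlets; the sketch below is a hedged example (older Windows Server 2008 releases use mpclaim as shown above), and a reboot is still required after claiming the device.

# Claim HP 3PAR devices for MPIO and default the load-balancing policy to Round Robin
# (Windows Server 2012 MPIO module sketch - a reboot is required after New-MSDSMSupportedHW)
Import-Module MPIO

# Equivalent of adding "3PARdataVV" in the MPIO-ed Devices tab: 8-character vendor ID plus product ID
New-MSDSMSupportedHW -VendorId '3PARdata' -ProductId 'VV'

# Default new MPIO disks to Round Robin (RR)
Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy RR

# Review what has been claimed
Get-MSDSMSupportedHW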

Configuring MPIO for Round Robin

Note from HP: Windows Server 2008 server connected to an HP 3PAR StoreServ Storage running HP 3PAR OS 2.2.x or later requires that the multipath policy be set to Round Robin.

Windows Server 2012 or Windows Server 2008 R2 servers do not need to change the multipath policy, as it defaults to Round Robin. If the server is running any supported Windows Server 2008 version prior to Windows Server 2008 R2, and the server is connected to an HP 3PAR array that is running HP 3PAR OS 2.2.x, the multipath policy will default to failover and must be changed to Round Robin. However, if the OS version on the HP 3PAR array is HP 3PAR OS 2.3.x or later, then you must use HP 3PAR OS host persona 1 for Windows Server 2008 R2 or host persona 2 for Windows Server 2008 non-R2 so that the multipath policy defaults to Round Robin.

To verify the default MPIO policy, follow these steps:

  1. In the Server Manager, click Diagnostics; select Device Manager. Expand the Disk drives list.
  2. Right-click an HP 3PAR drive to display its Properties window and select the MPIO tab.
  3. Select Round Robin from the drop-down menu.

Reference Documents:

HP 3PAR Windows Server 2008/2012 Implementation Guide