VMware

Troubleshooting Storage Performance

Properly assessing the workload requirements is key to good storage design. This process should be reviewed continuously as requirements change and the infrastructure expands – I/O characteristics are rarely static. In relation to performance, a good storage design should account for growth in both workload and throughput. In this post we will cover some of the potential areas affecting performance and look at common solutions.

Principal Areas Affecting SAN Performance

  • Failure to correctly assess and understand the workload characteristics and requirements (IOPS, bandwidth, I/O profiles – random vs sequential).
  • Subsystem bottlenecks such as the choice of physical disks (whether you have enough back-end IOPS to support your workloads).
    • Drive response time, defined as the time it takes for a disk to execute an I/O request.
      • Nearline (NL) drives provide less I/O performance than 10K/15K SAS disks.
      • Response time = (queue length + 1) x service time per task.
    • Slow array response times might be caused by a number of issues, including:
      • SAN fabric fan-in ratios / increasing contention causing throughput bottlenecks.
      • Front-end cache saturation (memory storing I/Os before they are de-staged to the backend drives).
      • Heavy workloads
  • Large workload I/O sizes can saturate interfaces impacting performance.
    • To identify VM I/O sizes use the vscsiStats tool; Cormac Hogan explains it nicely.
  • Poor change control mechanisms and workload awareness. Beware of unnecessary operations being performed on the array. New SAN – “Wouldn’t it be cool if we run IOmeter @ 100% random ops to see what this thing can really do!” – not if it’s in production.
  • I/O path saturation, incorrectly configured path selection policies.
  • Poor choice of RAID configuration for the supporting workloads. There is a penalty for every write operation performed; this varies depending on the RAID policy (see the table below – a worked example follows it). Note that RAID rebuilds can impact performance and, depending on the size of the disk, can take a long time to complete.
RAID Level          Write I/O Penalty
RAID 1              2
RAID 5              4
RAID 6              6
RAID 10             2
RAID-DP (NetApp)    2
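
To put these penalties in context, here is a minimal Python sketch of the back-end I/O cost per RAID level; the 5,000 IOPS / 30% write workload is a hypothetical example rather than a figure from this post.

# Back-end IOPS consumed by a front-end workload under different RAID write penalties.
RAID_WRITE_PENALTY = {"RAID 1": 2, "RAID 5": 4, "RAID 6": 6, "RAID 10": 2, "RAID-DP": 2}

def backend_iops(frontend_iops, write_ratio, raid_level):
    # Reads cost one back-end I/O each; writes cost the RAID write penalty.
    reads = frontend_iops * (1 - write_ratio)
    writes = frontend_iops * write_ratio
    return reads + writes * RAID_WRITE_PENALTY[raid_level]

for level in RAID_WRITE_PENALTY:  # hypothetical 5,000 IOPS workload, 30% writes
    print(f"{level}: {backend_iops(5000, 0.30, level):,.0f} back-end IOPS")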

Fabric Switching: IP & FC Impact on Performance

FC Fabric

Check physical switches for:

  • CRC errors, indicating code violations within FC frames.
    • Potential causes: faulty SFPs or damaged cables. The impact is additional CPU overhead caused by retransmissions.
  • Loss of Sync – indicative of multiple code violations (character corruption), typically due to incompatible speeds between initiators and targets (don’t forget your ISLs).
    • Loss of Sync errors can also be caused by faulty SFPs or HBAs.
  • Class 3 discards – no acknowledgement from the receiver; FC frames can be discarded due to port congestion (check ISL oversubscription).
  • Exchange completion times. An ‘exchange’ is made up of frames and sequences. Frames include information about source and destination IDs and the originating exchange and sequence IDs. Often a true measure of performance in switched fabric environments is how long each exchange takes to complete (specifically read/write operations). It’s important to note that latency here cascades up the stack, ultimately impacting the application.
    • Issues affecting exchange rates: HBAs (check firmware), the physical server, number of hops, disk speeds, interfaces, configuration issues and I/O transaction sizes (ref. I/O sizes: see the Brocade doc on buffer credit management).
  • Check fan-in/fan-out ratios and ensure there is enough bandwidth available – see Class 3 discards (above).

IP Fabric

  • The switch should be able to process and forward a continuous stream of data at full wire speed (1 Gbps) on all ports simultaneously, also known as ‘non-blocking’.
  • Adequate port buffering – this is working space for inbound (ingress) and outbound (egress) traffic. There are two modes: 1) shared port buffering, 2) dedicated port buffering. Shared port buffering dynamically allocates memory to ports as needed, whereas dedicated port buffering has a fixed memory amount per port.
  • Best practice: disable spanning tree on initiator and target ports. If you have to use STP, enable Rapid Spanning Tree to reduce convergence time and set the ports to immediately forward frames.
  • Check switch vendor recommendations regarding flow control (preventing buffer overflow, reducing retransmissions).
  • Check switch vendor recommendations for storm control.
  • Stacking / trunking – check that ISLs are not oversubscribed and be aware of backplane oversubscription.
  • Both IP and FC storage traffic can be susceptible to congestion; however, when an iSCSI path is overloaded, the TCP/IP protocol drops packets and requires them to be resent. This is typically caused by oversubscription of paths or low port buffers.
  • Minimise the number of hops between initiators and targets; traffic should not be routed and initiators and targets should sit on the same subnet.

Lastly, if your switches have the capability, use performance counters to identify times of peak workload; you may be able to cross-reference this information with other latency alarms set up across your datacentre (see VMware Log Insight).

Calculating functional IOPs of the array

This is important as it will give you a good idea of the performance capabilities taking into account the RAID write penalty.

First we need to understand the raw IOPS of the array; this is calculated as per-drive IOPS x n (the total number of drives in the array).

Second, we need to understand the I/O cross-section – read% vs write%. This information can be obtained from your SAN management tools or by looking at throughput for reads and writes.

Formula: Total Reads (KBps) + Total Writes (KBps) = Total Throughput in KBps

340,000 KBps + 100,000 KBps = 440,000 KBps
340,000 / 440,000 = 77.27% Read
100,000 / 440,000 = 22.72% Write
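
The same split expressed as a quick Python check, using the throughput figures from the example above:

reads_kbps, writes_kbps = 340_000, 100_000
total_kbps = reads_kbps + writes_kbps
print(f"{reads_kbps / total_kbps:.2%} read, {writes_kbps / total_kbps:.2%} write")  # ~77.27% read, ~22.73% write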

We can then use this information to determine the functional IOPS of the array; this value is key in assessing whether or not the SAN is up to the job.

Functional IOPS = (Raw IOPS x Read%) + ((Raw IOPS x Write%) / RAID Write Penalty)
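
As a quick sanity check, the formula drops into a few lines of Python. The 24-drive array at roughly 180 IOPS per drive and the RAID 5 policy below are assumed figures for illustration only; the read/write split comes from the example above.

# Functional IOPS = (Raw IOPS x Read%) + ((Raw IOPS x Write%) / RAID write penalty)
def functional_iops(raw_iops, read_pct, write_pct, raid_write_penalty):
    return (raw_iops * read_pct) + (raw_iops * write_pct) / raid_write_penalty

raw = 24 * 180  # assumed: 24 x 15K SAS drives at ~180 IOPS each
print(f"{functional_iops(raw, 0.7727, 0.2273, raid_write_penalty=4):,.0f} functional IOPS (RAID 5)")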

Calculating Throughput Requirements

MBps = (Peak VM Workload IOPS * KB per IO) /1024 (use the VMware vscsiStats tool to output workload IO sizes)

MBps = (2000 * 8) /1024 = 15.625MBps

In the above example, where the I/O workload requirement is 2000 IOPS, we would need roughly 16 MBps (~128 Mbps @ 8 KB per I/O) of throughput to satisfy that requirement.
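
For reference, the same throughput arithmetic in Python; the 2,000 IOPS and 8 KB inputs are the example values above, so substitute the I/O sizes reported by vscsiStats for your own workloads.

# MBps = (peak IOPS x KB per I/O) / 1024
def required_throughput_mbps(peak_iops, io_size_kb):
    return (peak_iops * io_size_kb) / 1024

mbps = required_throughput_mbps(2000, 8)
print(f"{mbps:.3f} MBps ({mbps * 8:.0f} Mbps on the wire)")  # 15.625 MBps, ~125 Mbps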

Note: esxtop can be used to determine SAN performance problems impacting hosts. Here are a couple of latency counters that should be monitored closely:

Value      Description
CMDS/s     The total number of commands per second, including IOPS (Input/Output Operations Per Second) and other SCSI commands such as SCSI reservations, locks, vendor string requests and unit attention commands being sent to, or coming from, the device or virtual machine being monitored. In most cases CMDS/s = IOPS unless there are a lot of metadata operations (such as SCSI reservations).
DAVG/cmd   The average response time in milliseconds per command being sent to the device. Warning threshold = 25. High numbers (greater than 15-20 ms) represent a slow or overworked array.
KAVG/cmd   The amount of time the command spends in the VMkernel. Warning threshold = 2. High numbers (greater than 2 ms) represent either an overworked array or an overworked host.
GAVG/cmd   The response time as perceived by the guest operating system, calculated with the formula GAVG = DAVG + KAVG. Warning threshold = 25.
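
As an illustration only, here is a small Python helper that applies the warning thresholds from the table above to a set of sample readings; the sample numbers are invented, and in practice you would pull them from esxtop batch output.

# Apply the esxtop warning thresholds from the table above (DAVG 25 ms, KAVG 2 ms, GAVG 25 ms).
WARN_MS = {"DAVG": 25, "KAVG": 2, "GAVG": 25}

def check_latency(davg_ms, kavg_ms):
    gavg_ms = davg_ms + kavg_ms  # GAVG = DAVG + KAVG
    readings = {"DAVG": davg_ms, "KAVG": kavg_ms, "GAVG": gavg_ms}
    return {name: ("WARN" if value > WARN_MS[name] else "ok") for name, value in readings.items()}

print(check_latency(davg_ms=30.0, kavg_ms=0.5))  # slow or overworked array
print(check_latency(davg_ms=5.0, kavg_ms=4.0))   # commands queueing in the VMkernel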

Calculating I/O Size and I/O Profile : Random vs Sequential

See the VMware vscsiStats tool, DTrace for Linux, or SQLIO for Windows.

ESX host monitoring

Have a look at the following VMware KB – Using esxtop to identify storage performance issues for ESX / ESXi (multiple versions) (1008205)

Virtual Machine Monitoring

  • Set up latency alarms within VMware vCenter to monitor virtual machine total disk latency.
    • The default is 75 ms; however, this should be adjusted depending on your KPIs (e.g. <= 25 ms).

Host SCSI Latency

  • To reduce latency on the host, ensure that the sum of active commands from all virtual machines does not consistently exceed the LUN queue depth (see the sketch after this list).
  • Either increase the LUN queue depth or move virtual machines to another LUN.
    • Observe vendor best practices for adjusting queue length on HBAs.
  • Work with the storage teams to ensure that each LUN is comprised of multiple physical disks.
  • To reduce latency on the array, reduce the maximum number of outstanding I/O commands to the shared LUN.
    • Distribute workloads across datastores.
    • Reduce VM to datastore ratios / reduce logical contention.
  • The maximum number of outstanding I/O commands that the array is capable of handling varies by array configuration; see my post on storage design considerations: Link
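
The queue-depth guidance above boils down to simple arithmetic. Below is a rough sketch; the default queue depth of 32 and the per-VM command counts are assumptions for illustration, not recommendations.

# Will the sum of active commands exceed the LUN queue depth? All figures are assumptions.
def lun_queue_headroom(vms_on_datastore, avg_active_cmds_per_vm, lun_queue_depth=32):
    outstanding = vms_on_datastore * avg_active_cmds_per_vm
    return lun_queue_depth - outstanding  # negative = commands queue on the host (KAVG rises)

print(lun_queue_headroom(vms_on_datastore=15, avg_active_cmds_per_vm=3))  # -13: oversubscribed
print(lun_queue_headroom(vms_on_datastore=8, avg_active_cmds_per_vm=2))   # 16: headroom available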

SCSI Reservations

A SCSI reservation is a lock operation on the LUN, preventing I/O from other operations. Frequent lock operations add latency, measured in milliseconds.

  • Operations performed which can create metadata locks:
    • Creating a VMFS datastore
    • Expanding a VMFS datastore onto additional extents
    • Powering on a virtual machine
    • Acquiring a lock on a file
    • Creating or deleting a file
    • Creating a template
    • Deploying a virtual machine from a template
    • Creating a new virtual machine
    • Migrating a virtual machine with vMotion
    • Growing a file, for example, a snapshot file or a thin provisioned virtual disk
    • For the zeroed thick type of virtual disk the reservation is required only when zeroing the blocks.

In vSphere 4.x VMware released VAAI (vStorage APIs for Array Integration); these primitives include ATS (Atomic Test & Set), which offloads locking to the array and reduces SCSI reservation contention.

  • For compatibility, check the VMware HCL and your SAN vendor’s documentation.

From the host, the following commands can be used to check VAAI status:

ESX 4.x

# esxcfg-scsidevs -l | egrep "Display Name:|VAAI Status:"

ESX5.x

# esxcli storage core device vaai status get

Note: You may need to install the binaries on the ESX host if these have not been included. The binaries come in the form of VIBs; a reboot of the host will be required after installation.

VAAI changes are logged in the VMkernel logs at /var/log/vmkernel (for ESXi 5.0, /var/log/vmkernel.log) or /var/log/messages.

VMware also provide a great KB for troubleshooting SCSI reservation conflicts on VMware Infrastructure: VMware KB 1005009. For enabling and troubleshooting VAAI, see VMware KB 1021976.

Possible Solutions

Correctly plan and identify the workload characteristics and requirements:

  • Analyse total throughput, identify I/O profiles (read versus write percentages), I/O types (sequential vs random) and particularly I/O size.
  • Understand the workload characteristics; in most cases database workloads are inherently more I/O intensive than, say, the same number of web servers.
  • To reduce contention and latency, avoid oversubscribing by:
    • Spreading workloads across different LUNs (use VMware SDRS to automate workload balancing across datastores).
    • Identifying front-end bottlenecks and balancing I/O across different storage processors and storage processor ports.
    • Use different FC or Ethernet paths.
    • Use multiple ESXi storage HBAs and HBA Ports.
    • Set appropriate pathing policies based on storage vendor best practices.
    • Sometimes poor storage performance can be attributed to bad host and VM management: check VM disk alignment and identify whether VMs are consistently swapping to your SAN, contributing unnecessary I/O across your storage fabric; use local server-side storage to home VM swap if possible.
      • In the case of the latter, reduce memory overcommitment or increase physical RAM/hosts.
  • Set appropriate queue depth values on HBA adapters, follow vendor recommendations. Observe impact to consolidation ratios specifically the number of VMs in a VMFS datastore. Setting queue depths too high can have a negative impact on performance (very large queue depths 64+, tend to just mask issues and are indicative of a larger performance problem).
  • When distributing workloads with a large number of transactions per second, you can also reduce latency by reducing the number of hops in the storage path.
  • Increase I/O front-end ports if possible (some SAN devices support the installation of additional HBA cards).
  • Check if target and initiator ports have negotiated correctly (check port properties on SAN fabric switches).
  • For NFS/iSCSI – use good quality switches and preferably dedicated stacks (1GbE – Dell PowerConnect 6200s, Cisco 3750s); if you are using 10GbE, use NIOC to prevent other I/O operations from impacting your storage I/O.
  • Not directly related to performance but ensure proper L2 isolation between iSCSI traffic and general network traffic.
  • Calculate I/O cost to support the workload, factoring RAID write penalty.
    • Increase the number of drives to meet requirements.
  • Upgrade controller/SP cache; however, in the event the cache is saturated you will be reliant on the back-end performance of the underlying drives. Front-end cache is not a fix for poor back-end storage design.
    • To prevent noisy neighbours from using all available front-end cache, look at array-side tools to create I/O policies for different workloads, prioritising those that are business critical.
    • VMware Storage IO Control (SIOC) can be used when latency impacts VM workloads by prioritising IO (only invoked during periods of contention).
  • Investigate the potential use of SSDs to absorb intensive I/O operations (server-side cache may also help – see PernixData FVP for VMs).
  • Investigate sub-lun tiering mechanisms to move hot blocks of data to drives with faster performance characteristics and less used blocks to slower storage (EMC FAST, HP 3PAR AO).
  • Use array multipathing policies, either native or third-party such as EMC PowerPath. These policies can help by distributing I/O across all available storage paths in a more effective manner.
  • I/O Analyzer (a VMware fling) can measure storage performance in a virtual environment and help diagnose storage performance concerns. I/O Analyzer is supplied as an easy-to-deploy virtual appliance.
  • VMware Log Insight can aggregate and perform deep analysis of system logs identifying trends in many metrics in a fully customisable package. This is particularly useful when investigating storage related problems. http://www.vmware.com/products/vcenter-log-insight/

Recommended Reading

VMware KB / Troubleshooting Storage Performance Issues with VMware Products

Brocade FOS Admin Guide / Buffer Credit Management 

VMware Blog / Troubleshooting Storage Performance Queues

IBM Redbooks / Storage Area Networking

VCAP5-DCD Certification

The VMware Certified Advanced Professional 5 – Data Center Design (VCAP5-DCD) certification is designed for IT architects who design and integrate VMware solutions in multi-site, large enterprise, virtualised environments.

The VCAP-DCD focuses on a deep understanding of the design principles and methodologies behind datacentre virtualisation. This certification relies on your ability to understand and decipher both functional and non-functional requirements, risks, assumptions and constraints. Before undertaking it, you should have a thorough understanding of the following areas: availability and security, storage/network design, disaster recovery (and, by extension, business impact analysis and business continuity), dependency mapping, automation and service management.

Key in this process is understanding how design decisions relating to availability, manageability, performance, recoverability and security impact the overall design.

I have put together some of the vSphere 5 best practices referenced in the exam blueprint; I hope you find the information helpful if you are considering taking this exam: vSphere 5 Design Best Practise Guide.pdf

In preparation for this exam, I found the following books very useful along with the documents provided in the exam blueprint.

  • VMware vSphere 5.1 Clustering Deepdive – Duncan Epping & Frank Denneman
  • VMware vSphere Design 2nd Edition – Scott Lowe & Forbes Guthrie
  • Managing and Optimising VMware vSphere Deployments – Sean Crookston & Harley Stagner
  • Virtualising Microsoft Business Critical Applications on VMware vSphere – Matt Liebowitz & Alex Fontana
  • VCAP-DCD Official Cert Guide – Paul McSharry
  • ITIL v3 Handbook – UK Office of Government & Commerce

Here is the VCAP-DCD certification requirements road map:


As mentioned before, the blueprint is key! Here is a URL export of all the tools/resources the blueprint targets; this should save time trawling through the PDF.

Section 1 – Create a vSphere Conceptual Design
Objective 1.1 – Gather and analyze business requirements
VMware Virtualization Case Studies
Five Steps to Determine When to Virtualize Your Servers
Functional vs. Non-Functional Requirements
Conceptual, Logical, Physical:  It is Simple

Objective 1.2 – Gather and analyze application requirements
VMware Cost-Per-Application Calculator
VMware Virtualizing Oracle Kit
VMware Virtualizing Exchange Kit
VMware Virtualizing SQL Kit
VMware Virtualizing SAP Kit
VMware Virtualizing Enterprise Java Kit
Business and Financial Benefits of Virtualization: Customer Benchmarking Study

Objective 1.3 – Determine Risks, Constraints, and Assumptions
Developing Your Virtualization Strategy and Deployment Plan

Section 2 – Create a vSphere Logical Design from an Existing Conceptual Design
Objective 2.1 – Map Business Requirements to the Logical Design
Conceptual, Logical, Physical:  It is Simple
VMware vSphere  Basics Guide
What’s New in VMware vSphere 5
Functional vs. Non-Functional Requirements
ITIL v3 Introduction and Overview

Objective 2.2 – Map Service Dependencies
Datacenter Operational Excellence Through Automated Application Discovery & Dependency Mapping

Objective 2.3 – Build Availability Requirements into the Logical Design
Improving Business Continuity with VMware Virtualization Solution Brief
VMware High Availability Deployment Best Practices
vSphere Availability Guide

Objective 2.4 – Build Manageability Requirements into the Logical Design
Optimizing Your VMware Environment
Four Keys to Managing Your VMware Environment
Operational Readiness Assessment
Operational Readiness Assessment Tool

Objective 2.5 – Build Performance Requirements into the Logical Design
Proven Practice: Implementing ITIL v3 Capacity Management in a VMware environment
vSphere Monitoring and Performance Guide

Objective 2.6 – Build Recoverability Requirements into the Logical Design
VMware vCenter™  Site Recovery Manager Evaluation Guide
A Practical Guide to Business Continuity and Disaster Recovery with VMware Infrastructure
Mastering Disaster Recovery: Business Continuity and Disaster Recovery Whitepaper
Designing Backup Solutions for VMware vSphere

Objective 2.7 – Build Security Requirements into the Logical Design
vSphere Security Guide
Developing Your Virtualization Strategy and Deployment Plan
Achieving Compliance in a Virtualized Environment
Infrastructure Security:  Getting to the Bottom of Compliance in the Cloud
Securing the Cloud

Section 3 – Create a vSphere Physical Design from an Existing Logical Design
Objective 3.1 – Transition from a Logical Design to a vSphere 5 Physical Design
Conceptual, Logical, Physical:  It is Simple
vSphere Server and Host Management Guide
vSphere Virtual Machine Administration Guide

Objective 3.2 – Create a vSphere 5 Physical Network Design from an Existing Logical Design
vSphere Server and Host Management Guide
vSphere Installation and Setup Guide
vMotion Architecture, Performance and Best Practices in VMware vSphere 5
VMware vSphere™: Deployment Methods for the VMware® vNetwork Distributed Switch
vNetwork Distributed Switch: Migration and Configuration
Guidelines for Implementing VMware vSphere with the Cisco Nexus 1000V Virtual Switch
VMware® Network I/O Control: Architecture, Performance and Best Practices

Objective 3.3 – Create a vSphere 5 Physical Storage Design from an Existing Logical Design
Fibre Channel SAN Configuration Guide
iSCSI SAN Configuration Guide
vSphere Installation and Setup Guide
Performance Implications of Storage I/O Control–Enabled NFS Datastores in VMware vSphere® 5.0
Managing Performance Variance of Applications Using Storage I/O Control
VMware Virtual Machine File System: Technical Overview and Best Practices

Objective 3.4 – Determine Appropriate Compute Resources for a vSphere 5 Physical Design
vSphere Server and Host Management Guide
vSphere Installation and Setup Guide
vSphere Resource Management Guide

Objective 3.5 – Determine Virtual Machine Configuration for a vSphere 5 Physical Design
vSphere Server and Host Management Guide
Virtual Machine Administration Guide
Best Practices for Performance Tuning of Latency-Sensitive Workloads in vSphere VMs
Virtualizing a Windows Active Directory Domain Infrastructure
Guest Operating System Installation Guide

Objective 3.6 – Determine Data Center Management Options for a vSphere 5 Physical Design
vSphere Monitoring and Performance Guide
vCenter Server and Host Management Guide
VMware vCenter Update Manager 5.0 Performance and Best Practices

Section 4 – Implementation Planning
Objective 4.1 – Create and Execute a Validation Plan
vSphere Server and Host Management Guide
Validation Test Plan

Objective 4.2 – Create an Implementation Plan
vSphere Server and Host Management Guide
Operational Test Requirement Cases

Objective 4.3 – Create an Installation Guide
vSphere Server and Host Management Guide