SAN Implementation

HP SN6000 / QLogic SANbox 5000 switch CLI install guide

Note: All IP addresses and host-names have been edited to protect the innocent!

So the day has arrived and you’re itching to get this all connected up. After spending half an hour unpacking all the switches, individually boxed SFPs and cables, let’s begin.

This guide is based on a stacked topology of 4 switches per fabric. Thanks to the mesh design, a switch failure takes out only one segment of the stack. This will result in an MPIO path failure for any HBAs attached to that switch. However, provided you have correctly cabled and zoned the host and storage to both fabrics, you should not suffer any outages.

HP SN6000_Stacked

For the initial setup you will need to connect to the fabric switches via a terminal connection; after some configuration changes we will be able to access the switch via SSH.

The default admin password is ‘password’

1. Initial setup, Configuring an IPv4 address:

At the session prompt enter:
SN6000 FC Switch #> admin start
SN6000 FC Switch (admin) #> set setup system ipv4
A list of attributes with formatting and current values will follow.
Enter a new value or simply press the ENTER key to accept the current value.
If you wish to terminate this process before reaching the end of the list
press ‘q’ or ‘Q’ and the ENTER key to do so.
Current Values: (system default values)
EthIPv4NetworkEnable      True
EthIPv4NetworkDiscovery  Static
EthIPv4NetworkAddress    10.0.0.1
EthIPv4NetworkMask         255.0.0.0
EthIPv4GatewayAddress   10.0.0.254

New Value (press ENTER to accept current value, ‘q’ to quit, ‘n’ for none):
EthIPv4NetworkEnable      (True / False)                                         :  True
EthIPv4NetworkDiscovery (1=Static, 2=Bootp, 3=Dhcp, 4=Rarp)   :  Static
EthIPv4NetworkAddress    (dot-notated IP Address)                       :  172.16.20.1
EthIPv4NetworkMask         (dot-notated IP Address)                       :  255.255.255.0
EthIPv4GatewayAddress   (dot-notated IPv4 Address)                   :  172.16.20.254

Do you want to save and activate this system setup? (y/n): [n] y
System setup saved and activated.

Don’t worry about the time being incorrect; we will rectify this later…
Log Msg: [Sat Jan 1 07:40:26.522 UTC 2000][C][8400.003C][Switch][Network setup is changing – may lose connection – admin being released automatically]
SN6000 FC Switch (admin) #>
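
Before typing the new values in, it can be worth checking that the gateway actually sits inside the subnet you are about to assign. Here is a minimal Python sketch using only the standard library. The helper name `gateway_in_subnet` is made up for this example, and the addresses are the ones from the session above:

```python
import ipaddress


def gateway_in_subnet(address: str, netmask: str, gateway: str) -> bool:
    """Return True if the gateway lies in the subnet defined by address/netmask."""
    interface = ipaddress.ip_interface(f"{address}/{netmask}")
    return ipaddress.ip_address(gateway) in interface.network


# Values entered for EthIPv4NetworkAddress, EthIPv4NetworkMask and
# EthIPv4GatewayAddress in the session above.
print(gateway_in_subnet("172.16.20.1", "255.255.255.0", "172.16.20.254"))  # True
```

If this prints False, the switch would be saved with a gateway it cannot reach, so re-check the mask and gateway before answering 'y' to the save prompt.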

2. Next up, changing the default admin password:

SN6000 FC Switch #>
SN6000 FC Switch #> admin start
SN6000 FC Switch (admin) #> passwd
Press ‘q’ and the ENTER key to abort this command.
account OLD password               : ********
account NEW password (8-20 chars)  :
please confirm account NEW password:
password has been changed.

SN6000 FC Switch (admin) #>

3. Setting the time zone. This is a pretty important step, not only for future auditing and access control: doing it now will also avoid issues with the certificate generation required for SSH connectivity.

SN6000 FC Switch #> admin start
SN6000 FC Switch (admin) #> set timezone
SN6000 FC Switch (admin) #>Europe/London

4. Setting up NTP

SN6000 FC Switch #> admin start
SN6000 FC Switch (admin) #> set setup system ntp
New Value (press ENTER to accept current value, ‘q’ to quit, ‘n’ for none):
NTPClientEnabled          (True / False)                                  :  true
NTPServerDiscovery      (1=Static, 2=Dhcp, 3=Dhcpv6)       :  1
NTPServerAddress        (hostname, IPv4, or IPv6 Address) :  Some NTP appliance IP address

5. Identify domain name servers
SN6000 FC Switch (admin) #> set setup system dns
A list of attributes with formatting and current values will follow. Enter a new value or simply press the ENTER key to accept the current value.

New Value (press ENTER to accept current value, ‘q’ to quit, ‘n’ for none):
DNSClientEnabled          (True / False)                :  true
DNSLocalHostname        (hostname)                   :  (enter switch name for example site # building # fabric# switch#)
DNSServerDiscovery      (1=Static, 2=Dhcp, 3=Dhcpv6) :  1
DNSServer1Address       (IPv4, or IPv6 Address)      :  172.16.100.1
DNSServer2Address       (IPv4, or IPv6 Address)      :  172.16.100.2
DNSServer3Address       (IPv4, or IPv6 Address)      :
DNSSearchListDiscovery  (1=Static, 2=Dhcp, 3=Dhcpv6) :  1
DNSSearchList1          (domain name)                :  vikernel.com
DNSSearchList2          (domain name)                :
DNSSearchList3          (domain name)                :
DNSSearchList4          (domain name)                :
DNSSearchList5          (domain name)                :
Do you want to save and activate this system setup? (y/n): [n] y
System setup saved and activated.

6. Enabling SSH

SN6000 FC Switch #> admin start
SN6000 FC Switch (admin) #> set setup services

TelnetEnabled       (True / False)        [True ]  Security best practice dictates that this should be disabled.
SSHEnabled          (True / False)       [False]  true
GUIMgmtEnabled      (True / False)   [True ]  true
SSLEnabled          (True / False)       [False]  true
EmbeddedGUIEnabled  (True / False)  [True ]  true
SNMPEnabled         (True / False)     [True ]  true
NTPEnabled          (True / False)       [False]  true
CIMEnabled          (True / False)       [True ]  true
FTPEnabled          (True / False)       [True ]  Security best practice dictates that this should be disabled, but you will need to enable it for any future firmware upgrades.
MgmtServerEnabled   (True / False) [True ]  true
CallHomeEnabled     (True / False)  [True ]  Unless you have this as a managed option from HP, I recommend disabling this!

When enabling SSL, please verify that the date/time settings on this switch and the workstation from where the SSL connection will be started match, and then a new certificate may need to be created to ensure a secure connection to this switch.

7. Configure SSL Self Signed Certificate

SN6000 FC Switch (admin) #> create certificate
The current date and time is Jan 18 15:57:33 GMT 2014.  This is the time used to stamp onto the certificate.  Is the date and time correct? (y/n): [n] y
Certificate generation successful.

SN6000 FC Switch (admin) #>

8.  Setting Switch Stack Principality

The principal switch is responsible for assigning domain IDs in the fabric should a domain ID conflict occur. This assumes that the domain IDs are not locked on the switches. If the domain IDs are locked when a domain ID conflict occurs, the ISL between the switches will be forced to isolate.

Which switch is selected to be the principal switch in a SAN depends first upon the value set for the principal switch priority and second upon the WWN (World Wide Name) of the switch.
The principal switch priority is a selectable value between 1 and 255 (1 being the highest priority). If the ‘principal switch priority’ is the same on several switches, the role of principal switch is given to the switch with the lowest WWN.
On HP SN6000H/QLogic switches, the current setting for the principal switch priority can be viewed through the use of the CLI ‘show config switch’ command.

SN6000 FC Switch #> show config switch
Configuration Name: default
——————-
Switch Configuration Information
——————————–
AdminState                 Online
BroadcastEnabled      True
InbandEnabled           True
FdmiEnabled              True
FdmiEntries                1000
DefaultDomainID       2 (0x2)
DomainIDLock           False
SymbolicName          SN6000 FC Switch
PrincipalPriority         254
ConfigDescription        Default Config
ConfigLastSavedBy      admin@OB-session4
ConfigLastSavedOn     Sat Jan  1 00:14:35 2000
InteropMode                 Standard

The principal switch priority is set from the CLI with the ‘set config switch’ command. The default setting is 254. It is **not** recommended that a value of 255 (the lowest priority) be used, since this value causes the switch to **never** attempt to become the principal switch, which can be a problem should the switch ever be the only switch in the fabric.
Admin configured values:

For example, with two independent fabrics of four stacked switches each, the principal priority values (middle column) might look something like this:
Fabric 1 Switch 1                       10         Fabric 2 Switch 1
Fabric 1 Switch 2                       20         Fabric 2 Switch 2
Fabric 1 Switch 3                       30         Fabric 2 Switch 3
Fabric 1 Switch 4                       40         Fabric 2 Switch 4
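
The selection rule described above (lowest priority value first, lowest WWN as the tie-breaker) can be sketched in a few lines of Python. This only illustrates the ordering and is not how the switch firmware implements it. The `Switch` class, the `elect_principal` helper, the switch names and all WWNs except the first are made up for the example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Switch:
    name: str
    priority: int  # PrincipalPriority: 1 is highest, 255 never becomes principal
    wwn: str       # colon-separated hex, e.g. "10:00:00:c0:12:34:55:aa"


def wwn_value(wwn: str) -> int:
    """Convert a colon-separated WWN into an integer so WWNs can be compared."""
    return int(wwn.replace(":", ""), 16)


def elect_principal(switches):
    """Return the switch that would win the election, or None if nobody stands."""
    # A switch with priority 255 never attempts to become principal.
    candidates = [s for s in switches if s.priority != 255]
    if not candidates:
        return None
    return min(candidates, key=lambda s: (s.priority, wwn_value(s.wwn)))


# Hypothetical fabric using the priorities from the table above.
fabric1 = [
    Switch("F1-SW1", 10, "10:00:00:c0:12:34:55:aa"),
    Switch("F1-SW2", 20, "10:00:00:c0:12:34:55:ab"),
    Switch("F1-SW3", 30, "10:00:00:c0:12:34:55:ac"),
    Switch("F1-SW4", 40, "10:00:00:c0:12:34:55:ad"),
]
print(elect_principal(fabric1).name)  # F1-SW1
```

Staggering the priorities like this means you always know which switch takes over as principal if the current one goes away, instead of leaving it to whichever WWN happens to be lowest.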

SN6000 FC Switch #> admin start
SN6000 FC Switch (admin) #> config edit
The config named default is being edited.
SN6000 FC Switch (admin-config) #> set config switch

AdminState                 (1=Online, 2=Offline, 3=Diagnostics)   [Online          ]
BroadcastEnabled      (True / False)                         [True             ]
InbandEnabled           (True / False)                         [True             ]
FdmiEnabled              (True / False)                         [True             ]
FdmiEntries                (decimal value, 0-1000)         [1000            ]
DefaultDomainID        (decimal value, 1-239)           [4                  ]
DomainIDLock            (True / False)                         [False           ]
SymbolicName           (string, max=32 chars)           [SN6000 FC Switch]
PrincipalPriority          (decimal value, 1-255)            [254             ]  40
ConfigDescription      (string, max=64 chars)            [Default Config  ]

Finished configuring attributes.
This configuration must be saved (see config save command) and
activated (see config activate command) before it can take effect.
To discard this configuration use the config cancel command.

SN6000 FC Switch (admin-config) #> config save
The config named default has been saved.
SN6000 FC Switch (admin) #>

Verifying which switch has actually been selected as the principal switch can be done by checking the value (true or false) of PrincipalSwitchRole using the CLI command ‘show switch’.
SN6000 FC Switch #> show switch

Switch Information
——————
SymbolicName                     SN6000 FC Switch
SwitchWWN                         10:00:00:c0:12:34:55:aa
BootVersion                          V1.12.5.92.0 (Mon Nov  2 10:29:59 2009)
CreditPool                             0
DomainID                              1 (0x1)
FirstPortAddress                   010000
FlashSize – MBytes               256
LogFilterLevel                       Info
MaxPorts                               24
NumberOfResets                  6
ReasonForLastReset            PowerUp
ActiveImageVersion – build date   V8.0.14.3.0 (Fri Dec 16 22:50:58 2011)
PendingImageVersion – build date  V8.0.14.3.0 (Fri Dec 16 22:50:58 2011)
ActiveConfiguration               default
AdminState                            Online
AdminModeActive                  False
BeaconOnStatus                    False
OperationalState                    Online
PrincipalSwitchRole               True
POSTFaultCode                     00000000
POSTStatus                           Passed
TestFaultCode                        00000000
TestStatus                              NeverRun
BoardTemp (1) – Degrees Celsius   23
BoardTemp (2) – Degrees Celsius   22
BoardTemp (3) – Degrees Celsius   20
SwitchTemperatureStatus           Normal

SN6000 FC Switch #>
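
If you capture this output from several switches (for example from an SSH session log), a small script saves you from eyeballing every listing. The following is a rough Python sketch, assuming each line is a field name followed by two or more spaces and then the value, as in the listing above. The function names are my own, and the sample is a shortened copy of that listing:

```python
import re


def parse_show_switch(text: str) -> dict:
    """Parse 'Name   Value' lines (two or more spaces between them) into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split only once so values containing double spaces stay intact.
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
        if len(parts) == 2:
            fields[parts[0]] = parts[1]
    return fields


def is_principal(text: str) -> bool:
    """Return True if the captured output reports PrincipalSwitchRole True."""
    return parse_show_switch(text).get("PrincipalSwitchRole", "").lower() == "true"


sample = """\
Switch Information
------------------
SymbolicName                     SN6000 FC Switch
SwitchWWN                         10:00:00:c0:12:34:55:aa
DomainID                              1 (0x1)
OperationalState                    Online
PrincipalSwitchRole               True
"""
print(is_principal(sample))  # True
```

Headings and separator lines contain no double-space gap, so they are simply skipped.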

9.  Changing Switch Symbolic Name (For reference only – this serves no administrative function)

SN6000 FC Switch #> admin start
SN6000 FC Switch (admin) #> config edit
The config named default is being edited.

SN6000 FC Switch (admin-config) #> set config switch

AdminState                 (1=Online, 2=Offline, 3=Diagnostics)   [Online          ]
BroadcastEnabled      (True / False)                         [True            ]
InbandEnabled           (True / False)                         [True            ]
FdmiEnabled              (True / False)                         [True            ]
FdmiEntries                (decimal value, 0-1000)         [1000            ]
DefaultDomainID        (decimal value, 1-239)           [4               ]
DomainIDLock            (True / False)                         [False           ]
SymbolicName            (string, max=32 chars)          [site # building # fabric# switch#]
PrincipalPriority           (decimal value, 1-255)           [254             ]  40
ConfigDescription        (string, max=64 chars)          [Default Config  ]

Finished configuring attributes.
This configuration must be saved (see config save command) and
activated (see config activate command) before it can take effect.
To discard this configuration use the config cancel command.

SN6000 FC Switch (admin-config) #> config save
The config named default has been saved.
SN6000 FC Switch (admin) #>

10.  Changing the Switch negotiated domain ID (if necessary)

The domain ID is a unique Fibre Channel identifier for the switch. The Fibre Channel address consists of the domain ID, port ID, and the Arbitrated Loop Physical Address (ALPA). The maximum number of switches within a fabric is 239, with each switch having a unique domain ID.

Switches come from the factory with the domain IDs unlocked. This means that if there is a domain ID conflict in the fabric, the switch with the highest principal priority (the principal switch) will reassign the conflicting domain IDs and establish the fabric. If you lock the domain ID on a switch and a domain ID conflict occurs, one of the switches will isolate as a separate fabric and the Logged-In LEDs on both switches will flash to show the affected ports. If you connect a new switch to an existing fabric with its domain ID unlocked and a domain conflict occurs, the new switch will isolate as a separate fabric. However, you can remedy this by resetting the new switch or taking it offline then back online. The principal switch will then reassign the domain ID and the switch will join the fabric.
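
To make the address layout concrete: a Fibre Channel address is 24 bits, with one byte each for the domain ID, the port ID and the ALPA. That is why the ‘show switch’ output earlier reports FirstPortAddress 010000 for domain ID 1. Here is a small Python sketch of packing and unpacking such an address; the function names are made up for illustration:

```python
def split_fc_address(address: str) -> dict:
    """Split a six-hex-digit FC address into its three one-byte fields."""
    if len(address) != 6:
        raise ValueError("expected six hex digits, e.g. '010000'")
    value = int(address, 16)
    return {
        "domain": (value >> 16) & 0xFF,
        "port": (value >> 8) & 0xFF,
        "alpa": value & 0xFF,
    }


def build_fc_address(domain: int, port: int, alpa: int) -> str:
    """Build the six-hex-digit address from domain ID, port ID and ALPA."""
    if not 1 <= domain <= 239:
        raise ValueError("domain ID must be between 1 and 239")
    if not (0 <= port <= 0xFF and 0 <= alpa <= 0xFF):
        raise ValueError("port ID and ALPA must each fit in one byte")
    return f"{domain:02x}{port:02x}{alpa:02x}"


print(split_fc_address("010000"))  # {'domain': 1, 'port': 0, 'alpa': 0}
print(build_fc_address(4, 10, 0))  # 040a00
```
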
Administratively defined domain IDs:

For example a stack topology of 4 switches with 2 fabrics would look like this:

Fabric 1 Switch 1                       1×1       Fabric 2 Switch 1
Fabric 1 Switch 2                       2×2       Fabric 2 Switch 2
Fabric 1 Switch 3                       3×3       Fabric 2 Switch 3
Fabric 1 Switch 4                       4×4       Fabric 2 Switch 4
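
Before committing a numbering scheme to eight switches, it is cheap to check it in a few lines of Python. The sketch below assumes a plan held as a dictionary per fabric. The structure, names and helper `validate_domain_plan` are hypothetical, and the IDs 1 to 4 are simply my reading of the table above. Domain IDs only have to be unique within a fabric, so two independent fabrics can reuse the same numbers:

```python
def validate_domain_plan(plan):
    """Return a list of problems found in a {fabric: {switch: domain_id}} plan."""
    problems = []
    for fabric, switches in plan.items():
        seen = {}  # domain_id -> first switch that claimed it in this fabric
        for switch, domain_id in switches.items():
            if not 1 <= domain_id <= 239:
                problems.append(
                    f"{fabric}/{switch}: domain ID {domain_id} is outside 1-239"
                )
            elif domain_id in seen:
                problems.append(
                    f"{fabric}/{switch}: domain ID {domain_id} "
                    f"already used by {seen[domain_id]}"
                )
            else:
                seen[domain_id] = switch
    return problems


plan = {
    "fabric1": {"switch1": 1, "switch2": 2, "switch3": 3, "switch4": 4},
    "fabric2": {"switch1": 1, "switch2": 2, "switch3": 3, "switch4": 4},
}
print(validate_domain_plan(plan))  # []
```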

SN6000 FC Switch #> admin start
SN6000 FC Switch (admin) #> config edit
The config named default is being edited.

SN6000 FC Switch (admin-config) #> set config switch

AdminState               (1=Online, 2=Offline, 3=Diagnostics)   [Online          ]
BroadcastEnabled    (True / False)                               [True            ]
InbandEnabled         (True / False)                               [True            ]
FdmiEnabled            (True / False)                               [True            ]
FdmiEntries              (decimal value, 0-1000)               [1000            ]
DefaultDomainID      (decimal value, 1-239)                 [              ]
DomainIDLock          (True / False)                               [False           ]
SymbolicName         (string, max=32 chars)                 [Fabric# Switch#]
PrincipalPriority        (decimal value, 1-255)                  [254             ]  40
ConfigDescription     (string, max=64 chars)                 [Default Config  ]

Finished configuring attributes.
This configuration must be saved (see config save command) and
activated (see config activate command) before it can take effect.
To discard this configuration use the config cancel command.

SN6000 FC Switch (admin-config) #> config save
The config named default has been saved.
SN6000 FC Switch (admin) #>

11.  Troubleshooting HTTP login issues

To prevent issues with logging into the switch via HTTP, make sure the SSL certificate is configured correctly. Check the time/date of the client computer against that of the switch; if NTP is not being enforced, perform a hard reset and then create a new certificate.

Perform a reboot of the switch:
SN6000 FC Switch (admin) #> hardreset

Then re-create the SSL certificate:
SN6000 FC Switch (admin) #> create certificate

You should now be able to access the switch via a web browser.

That’s pretty much it. Shut down the switches and proceed with stacking using either the copper 10-Gigabit/s XPAK or the XPAK LC optical connectors.

In subsequent posts we’ll have a look at the HP StorageWorks Fabric Manager.

3PAR Installation & Troubleshooting Documents

Various links gathered from HP resources. I found the section on troubleshooting most helpful.

Installation:

Installation page for the HP 3PAR StoreServ 7000 Storage
Installing the Service Processor page for the HP 3PAR StoreServ 7000 Storage
Verifying Setup and Powering On page for the HP 3PAR StoreServ 7000 Storage
Installing the Disk Drive page for the HP 3PAR StoreServ 7000 Storage
Installing the Enclosure page for the HP 3PAR StoreServ 7000 Storage
Installing the Rail Kit page for the HP 3PAR StoreServ 7000 Storage
Setting Up a Factory Integrated Storage System page for the HP 3PAR StoreServ 7000 Storage
Installing Storage System Components in an Existing Rack page for the HP 3PAR StoreServ 7000 Storage
Installing Storage System Software page for the HP 3PAR StoreServ 7000 Storage
Installing and Removing the Cable Restraint Shipping Brackets for the HP 3PAR StoreServ 7000 Storage

Troubleshooting:

Troubleshooting Storage System Setup for the HP 3PAR StoreServ 7000 Storage
Troubleshooting Storage System Components for the HP 3PAR StoreServ 7000 Storage

Overview/Specifications & Parts:

Overview page for the HP 3PAR StoreServ 7000 Storage
Specifications page for the HP 3PAR StoreServ 7000 Storage
Standard Models page for the HP 3PAR StoreServ 7000 Storage
Spare Parts page for the HP 3PAR StoreServ 7000 Storage
Option Parts page for the HP 3PAR StoreServ 7000 Storage
Identifying Components page for the HP 3PAR StoreServ 7000 Storage
LED Indicators page for the HP 3PAR StoreServ 7000 Storage
Cabling page for the HP 3PAR StoreServ 7000 Storage

HP 3PAR / Windows Server 2008/2012 Boot from SAN Guide

Use Case: 

  • Store operating systems on the SAN. This generally provides higher availability, redundancy & recovery, depending on the RAID & SAN configuration.
  • In diskless server builds to reduce power consumption by having no internal disks.
  • Blade architecture, where internal disks aren’t large enough to hold application and OS (not so much of an issue now considering the density of modern 3.5/2.5” disks).

Benefits:

  • Minimize system downtime. If a critical component such as a processor, memory, or host bus adapter fails and needs to be replaced, you need only swap the hardware and reconfigure the HBA’s BIOS, switch zoning, and host-port definitions on the storage processors.
  • Enable rapid deployment scenarios.
  • Boot from SAN alleviates the necessity for each server to have its own direct-attached disk, eliminating internal disks as a potential point for failure. Thin diskless servers also take up less rack space, require less power, and are generally less expensive because they have fewer hardware components.
  • Centralised management. When operating system images are stored on networked disks, all upgrades and fixes can be managed at a centralised location. Changes made to disks in a storage array are readily accessible by each server (this includes benefits in capacity planning, as you typically have a holistic view of your SAN environment).
  • All the boot information and production data stored on local SAN ‘A’ can be replicated to local SAN ‘B’ (see 3PAR Peer Persistence) or one at a geographically dispersed disaster recovery site. If a disaster destroys functionality of the servers at the primary site, the remote site can take over with minimal downtime.
  • Recovery from server failures is simplified in a SAN environment. With the help of snapshots, mirrors of a failed server can be recovered quickly by booting from the original copy of its image. As a result, boot from SAN can greatly reduce the time required for server recovery.

Risks:

  • With older Windows operating systems (Windows 2003) it was recommended that the boot LUN be on a separate SCSI bus from the shared LUNs, to avoid SCSI-bus resets disrupting I/O and causing a BSOD. In Windows 2008/2012 this is not an issue; boot LUNs can share the same bus/path.
  • Financial risk: understand that CAPEX costs can be higher than if you were to boot off your local disks (additional HBAs, cabling & SFPs). If you calculate the £ per GB cost of a typical high-end SAN versus the cost of mirrored drives, it can be a lot more. Do you have enough physical capacity in your array to support this? If not, you will need to buy more disks to increase throughput/IOPS.

Potential Design Impact:

  • If hosts/nodes swap out pages frequently, this could result in heavy I/O traversing your storage fabric, which may negatively impact services (especially latency-critical apps). This might not be apparent if you have a few servers, but what if you have many that have a BFS (Boot From SAN) requirement? This can be mitigated to some extent by moving page files to local disks or installing more memory. If this is a SQL server, investigate using the ‘Lock Pages in Memory’ function to prevent SQL from paging workloads out unnecessarily (let SQL manage its working set size, and check the buffer pool to RAM ratio too).
  • Migrating OS to BFS in some situations can have a negative impact on the array or fabric switches potentially causing contention (check ISL fan-in ratios). This is more of an issue with iSCSI/NFS than FC due to the nature of the protocol. Ensure that your fabric switches – core/access have enough bandwidth to supply I/O demands. In some situations even storage processors could be overwhelmed (check host queue depth settings – this allows the host to throttle back I/O).
  • Check that you have enough FC/FCOE/iSCSI/NFS uplinks to service any high I/O workloads. Certainly the most opted for solution is to increase array side cache, but this is often the most costly option and doesn’t really address the root cause of any latency or throughput constraints.
  • Be mindful of boot storms after an outage or in VDI deployments, you may have to selectively boot tier1 apps in phases (bear in mind tier 1 application dependencies such as DNS, LDAP or Active Directory servers, they need to be started first). Review your tier 1 app service dependencies and their IOPs requirements (see point above for throughput considerations).
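
The phased start-up mentioned in the last point is essentially a dependency ordering problem. Here is a minimal Python sketch of deriving boot phases from a dependency map, using the standard library's `graphlib` (Python 3.9 or later). The server names and their dependencies are invented for illustration, so substitute your own tier 1 services:

```python
from graphlib import TopologicalSorter


def boot_phases(dependencies):
    """Group servers into phases; each phase only needs earlier phases to be up."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the dependencies are circular
    phases = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything startable right now
        phases.append(ready)
        sorter.done(*ready)
    return phases


# Hypothetical map: each server lists what must be running before it boots.
deps = {
    "dns": set(),
    "active-directory": {"dns"},
    "ldap": {"dns"},
    "sql": {"active-directory"},
    "app": {"sql", "ldap"},
}
for number, phase in enumerate(boot_phases(deps), start=1):
    print(number, phase)
# 1 ['dns']
# 2 ['active-directory', 'ldap']
# 3 ['sql']
# 4 ['app']
```

Booting one phase at a time, and waiting for it to settle before starting the next, spreads the boot I/O over time instead of hitting the array all at once.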

Key Points (3PAR):

  • The Boot LUN should be the lowest-ordered LUN number that exports to the host (3PAR recommended), however some arrays assign LUN0 to the controller in which case LUN1 can be used.
  • NOTE: With the introduction of the Microsoft Storport driver, booting from a SAN has become less problematic. Refer to http://support.microsoft.com/kb/305547.
  • For the initial boot, restrict the host to a single path connection on the 3PAR array. Only a single-path should be available on the HP 3PAR StoreServ Storage and a single path on the host to the VLUN that will be the boot volume (this can be changed after the host has booted and you have installed the MPIO driver).
  • It goes without saying: check that your SAN, FC switch, server & HBA cards are running the latest firmware.
  • Ensure appropriate zoning techniques are applied (see my best practice guide)
  • If you are using clustering ensure nodes in a cluster have sole access to the boot LUN (1:1 mapping), using LUN masking (array side).
  • Server side HBA configuration (This can vary depending on HBA vendor – check your documentation)
  • Use soft zoning (zoning per pWWN), generally this is a requisite for HP 3PAR – but in terms of booting from SAN provides more flexibility. However, if the HBA card fails you will need to update LUN masking and soft zoning configurations.

 3PAR: Creating & Exporting Virtual Volumes

Virtual volumes are the only data layer visible to hosts. After devising a plan for allocating space for host servers on the HP 3PAR StoreServ Storage, create the VVs for eventual export as LUNs to the Windows Server 2012/2008 host server.

You can create volumes that are provisioned from one or more common provisioning groups (CPGs). Volumes can be fully provisioned from a CPG or can be thinly provisioned. You can optionally specify a CPG for snapshot space for fully-provisioned volumes. (Don’t forget, that if your requirements change – and you need to convert these volumes to thin provisioned volumes or vice-versa use the 3PAR System Tune operation).

Using the HP 3PAR Management Console :

  1. From the menu bar, select: Actions→Provisioning→Virtual Volume→Create Virtual Volume
  2. Use the Create Virtual Volume wizard to create a base volume.
  3. Select one of the following options from the allocation list: ‘Fully Provisioned’ / ‘Thinly Provisioned’

Next, perform soft zoning / LUN masking; see the key point mentioned earlier about presenting only a single path to the host. After you have installed the MPIO drivers (post OS install), present the additional paths.

Configuring Brocade HBA to boot from SAN:

  1. Check and enable the HBA BIOS (the BIOS must be disabled for arrays that are not configured for boot from SAN).
  2. Enable one of the following boot LUN options:
  • Auto Discover: When enabled, boot information, such as the location of the boot LUN, is provided by the fabric (this is the default value).
  • Flash Values: The HBA obtains the boot LUN information from flash memory.
  • First LUN: The host boots from the first LUN visible to the HBA that is discovered in the fabric.
  3. Select a boot device from discovered targets.
  4. Save changes and exit.

Configuring Emulex HBA to boot from SAN:

  1. Boot the Windows Server 2012/2008 system following the instructions in the BootBios update manual.
  2. Press Alt+E. For each Emulex adapter, set the following parameters:
  3. Select Configure the Adapter’s Parameters.
  4. Select Enable or Disable the BIOS; for SAN boot, ensure that the BIOS is enabled.
  5. Press Esc to return to the previous menu.
  6. Select Auto Scan Setting; set the parameter to First LUN 0 Device; press Esc to return to the previous menu.
  7. Select Topology.
  8. Select Fabric Point to Point for fabric configurations.
  9. Select FC-AL for direct connect configurations.
  10. Press Esc to return to the previous menu if you need to set up other adapters. When you are finished, press x to exit and reboot.

Configuring QLogic HBA to boot from SAN:

Note: use the QLogic HBA Fast!UTIL utility to configure the HBA. Record the Adapter Port Name (WWPN) for creating the host definition in the 3PAR IMC (however, if the server is zoned correctly you should see the HBA pWWNs when adding a new host).

  1. Boot the server; as the server is booting, press Alt+Q or Ctrl+Q when the HBA BIOS prompts appear.
  2. In the Fast!UTIL utility, click Select Host Adapter and then select the appropriate adapter.
  3. Click Configuration Settings→Adapter Settings.
  4. In the Adapter Settings window, set the following:
  • Host Adapter BIOS: Enabled
  • Spinup Delay: Disabled
  • Connection Option: 0 for direct connect, 1 for fabric
  5. Press Esc to exit this window.
  6. Click Selectable Boot Settings. In the Selectable Boot Settings window, set Selectable Boot Device to Disabled.
  7. Press Esc twice to exit; when you are asked whether to save NVRAM settings, click Yes.

Connecting Multiple Paths for Fibre Channel SAN Boot

After the Windows Server 2012/2008 host completely boots up and is online, connect additional paths to the fabric or the HP 3PAR disk storage system directly by completing the following tasks.

  1. On the HP 3PAR StoreServ Storage, issue createhost -add <hostname> <WWN> to add the additional paths to the defined HP 3PAR StoreServ Storage host definition.
  2. On the Windows Server 2012/2008 host scan for new devices.
  3. Reboot the Windows Server 2012/2008 system.
  4. Install the following patch: KB2849097
  5. Set up multipathing and install the following patches: KB2406705 and KB2522766

Windows Server 2008, Server 2012 implementation steps:

On the first Windows Server 2012 or Windows Server 2008 reboot following an HP 3PAR array firmware upgrade, whether a major upgrade or an MU update within the same release family, the Windows server will mark the HP 3PAR LUNs “offline.”

This issue occurs only in the following configurations:

  1. HP 3PAR LUNs on Windows standalone servers.
  2. HP 3PAR LUNs that are used in Microsoft Failover Clustering and are not configured as “shared storage” on the Windows failover cluster. If HP 3PAR LUNs that are used in Microsoft Failover Clustering are configured as shared storage, then they will not experience the same problem (that is, be marked offline) as in a Windows standalone-server configuration.

When the HP 3PAR LUNs are marked offline, you must follow these steps so that applications can access the HP 3PAR LUNs again:

  1. Click Computer Management→Disk Management.
  2. Right-click each of the HP 3PAR LUNs.
  3. Set the LUN online.

HP recommends the execution of Microsoft KB2849097 on every Windows Server 2008/2012 host connected to a HP 3PAR array prior to performing an initial array firmware upgrade. Subsequently, the script contained in KB2849097 will have to be rerun on a host each time new HP 3PAR LUNs are exported to that host.

KB2849097 is a Microsoft PowerShell script designed to modify the Partmgr Attributes registry value that is located at: HKLM\System\CurrentControlSet\Enum\SCSI\<device>\<instance>\Device Parameters\Partmgr.

NOTE: The following procedure will ensure proper execution of KB2849097, which will prevent the HP 3PAR LUNs from being marked offline when the Windows server is rebooted following an array firmware upgrade.

Save the following script as a ‘.ps1’ file on your system:

$val = 0
$vendor = Read-Host "Enter Vendor String"
$devIDs = Get-ChildItem "HKLM:\SYSTEM\CurrentControlSet\Enum\SCSI\Disk*Ven_$vendor*\*\Device Parameters\"

foreach ($id in $devIDs)
{
    $error.Clear()
    $regpath = $id.PSPath + "\Partmgr\"
    Set-ItemProperty -Path $regpath -Name Attributes -Value $val -ErrorAction SilentlyContinue

    if ($error) # didn't find the path, create it and try again
    {
        New-Item -Path $id.PSPath -Name Partmgr
        Set-ItemProperty -Path $regpath -Name Attributes -Value $val -ErrorAction SilentlyContinue
        $error.Clear()
    }

    Get-ItemProperty -Path $regpath -Name Attributes -ErrorAction SilentlyContinue | Select Attributes | fl | Out-String -Stream
}

Windows Server 2008/2012 requires the PowerShell execution policy to be changed to RemoteSigned to allow execution of external scripts. This must be done before the script is executed. To change the PowerShell execution policy, open the PowerShell console and issue the following command:

Set-ExecutionPolicy RemoteSigned

You might be prompted to confirm this action by pressing y.

The next step is to save the script as a .ps1 file to a convenient location and execute it by issuing the following command in a PowerShell console window:

C:\ps_script.ps1

The above command assumes that the script has been saved to C: under the name ps_script.ps1.

You will then be prompted to provide a Vendor String, which is used to distinguish between different vendor types. The script will only modify those devices whose Vendor String matches the one that has been entered into the prompt. Enter 3PAR in the prompt to allow the script to be executed on all HP 3PAR LUNs currently presented to the host as shown in the output below:

Enter Vendor String: 3PAR

The script will then run through all HP 3PAR LUNs currently presented to the host and set the Attributes registry value to 0. In order to verify that the Attributes value for all HP 3PAR LUNs was properly modified, issue the following command:

Get-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Enum\SCSI\Disk*Ven_3PARdata*\*\Device Parameters\Partmgr" -Name Attributes

The Attributes value should be set to 0 as shown in the example below:

PSPath : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\Disk&Ven_3PARdata&Prod_VV\5&381f35e2&0&00014f\Device Parameters\Partmgr
PSParentPath : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum\SCSI\Disk&Ven_3PARdata&Prod_VV\5&381f35e2&0&00014f\Device Parameters
PSChildName : Partmgr
PSDrive : HKLM
PSProvider : Microsoft.PowerShell.Core\Registry
Attributes : 0 (so you are good to go)

Setting up Multipathing (Windows 2008/2012)

For high-availability storage with load balancing of I/O and improved system and application performance, Windows Server 2012/2008 requires the native Microsoft MPIO and the StorPort miniport driver.

Configuring Microsoft MPIO for HP 3PAR Storage

To resolve issues with MPIO path failover, it is recommended that hotfixes KB2406705 and KB2522766 be installed for all versions of Windows Server 2008 up to and including Windows Server 2008 R2 SP1.

Windows Server 2008, Windows Server 2008 SP1, and Windows Server 2008 SP2 also require that hotfix KB968287 be installed to resolve an issue with MPIO path failover. All three patches (KB2522766, KB968287, KB2406705) are required for non-R2 versions of Windows Server 2008. Only two patches (KB2522766 and KB2406705) are required for R2 versions of Windows Server 2008.
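
The hotfix rules above are easy to mix up, so here is a minimal sketch that encodes them. The function name required_hotfixes and the is_r2 flag are my own illustration and not part of any HP or Microsoft tool; the KB numbers are the ones listed above.

```python
def required_hotfixes(is_r2):
    """Return the MPIO path-failover hotfixes for a Windows Server 2008 host.

    is_r2 is True for Windows Server 2008 R2 (up to and including R2 SP1)
    and False for Windows Server 2008, 2008 SP1 and 2008 SP2.
    """
    # Recommended on every Windows Server 2008 release up to R2 SP1.
    hotfixes = ["KB2406705", "KB2522766"]
    if not is_r2:
        # Non-R2 releases need one additional failover fix.
        hotfixes.append("KB968287")
    return hotfixes


print("2008 R2    :", required_hotfixes(is_r2=True))
print("2008 non-R2:", required_hotfixes(is_r2=False))
```

Compare the result against the installed updates on the host before moving on to the MPIO steps below.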

  1. If you have not already done so, check HBA vendor documentation for any required support drivers, and install them.
  2. If necessary, install the StorPort miniport driver.
  3. If the MPIO feature is not enabled, open the Server Manager and install the MPIO feature. This will require a reboot.
  4. After rebooting, open the Windows Administrative Tools and click MPIO.
  5. In the MPIO-ed Devices tab, click the Add button; the Add MPIO Support popup appears.
  6. In the Device Hardware ID: text box, enter 3PARdataVV, and click OK.
  7. Reboot as directed. (You can also use the MPIO command line tool, mpclaim, to add 3PARdataVV.)

The command is:

mpclaim -r -I -d "3PARdataVV"

Configuring MPIO for Round Robin

Note from HP: Windows Server 2008 server connected to an HP 3PAR StoreServ Storage running HP 3PAR OS 2.2.x or later requires that the multipath policy be set to Round Robin.

Windows Server 2012 and Windows Server 2008 R2 servers do not need the multipath policy changed, as it defaults to Round Robin. If the server is running any supported Windows Server 2008 version prior to Windows Server 2008 R2 and is connected to an HP 3PAR array running HP 3PAR OS 2.2.x, the multipath policy will default to failover and must be changed to Round Robin. However, if the array is running HP 3PAR OS 2.3.x or later, you must use HP 3PAR OS host persona 1 for Windows Server 2008 R2 or host persona 2 for Windows Server 2008 non-R2 so that the multipath policy defaults to Round Robin.
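
As a rough sketch, the decision described above for the Windows Server 2008 family could be written down like this. The function name and return strings are hypothetical, and treating every array OS older than 2.3 the same as 2.2.x is my assumption, not something HP states.

```python
def multipath_guidance(is_r2, array_os):
    """Suggest the Round Robin action for a Windows Server 2008 host.

    is_r2    -- True for Windows Server 2008 R2, False for non-R2
    array_os -- HP 3PAR OS version as a tuple, e.g. (2, 2, 4)
    """
    major_minor = tuple(array_os[:2])
    if major_minor >= (2, 3):
        # 2.3.x or later: the host persona makes the policy default to Round Robin.
        persona = 1 if is_r2 else 2
        return "use host persona %d; policy then defaults to Round Robin" % persona
    if is_r2:
        # R2 already defaults to Round Robin.
        return "no change needed; policy defaults to Round Robin"
    # Pre-R2 host on 2.2.x (assumption: anything older than 2.3 behaves the same).
    return "policy defaults to failover; change it to Round Robin manually"


print(multipath_guidance(is_r2=False, array_os=(2, 2, 4)))
print(multipath_guidance(is_r2=True, array_os=(2, 3, 1)))
```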

To verify the default MPIO policy, follow these steps:

  1. In the Server Manager, click Diagnostics; select Device Manager. Expand the Disk drives list.
  2. Right-click an HP 3PAR drive to display its Properties window and select the MPIO tab.
  3. Select Round Robin from the drop-down menu.

Reference Documents:

HP 3PAR Windows Server 2008 /2012 Implementation Guide

VMware VSAN – Storage Virtualisation

VMware Virtual SAN (VSAN)

VMware have clearly laid out plans for the complete convergence of the IT infrastructure, as seen with the release of NSX and VSAN at VMworld 2013. I’ve highlighted some important features of VMware’s new storage solution ‘VSAN’.

VMware VSAN is fully integrated with VMware vSphere, automatically aggregating local host storage in a cluster so that it can be presented as block level shared storage for virtual machines. Its function is to provide both HA (High Availability) and scale-out storage features.

This is leaps ahead of the VMware VSA (Virtual Storage Appliance) aimed at the SMB with 2-3 hosts released in 2011.

Here is a quick reminder of some of the strict requirements & limitations we had with the VSA.

  • Does not support memory overcommitment.
  • Once installed you cannot add additional storage to the VSA cluster.
  • The VSA can only be configured in a two or three node cluster (which can’t be changed after install).
  • You cannot run vCenter on the back-end storage supported by VSA cluster.
  • Requires a minimum of 4 NICs (two front-end : two back-end communication).
    • Total IP addresses required for a 2 Node cluster = 11.
    • Total IP addresses required for a 3 Node cluster = 14.
    • Requires greenfield hosts (cannot be hosting VM’s prior to VSA install).

VSAN Requirements:

vCenter Server

  • VSAN requires a vCenter server running vSphere 5.5 (VSAN is also supported on the new vCenter server appliance!).
  • VSAN is configured and monitored using the vSphere 5.5 Web Client.

Host/Storage:

  • Each vSphere 5.5 host participating in the VSAN cluster requires a disk controller (SAS/SATA – RAID Controller).
  • Passthrough mode is required because VSAN ‘talks’ directly to the SSD and HDD.
  • Disks to be used should not have any RAID configuration applied (Parity/Striping is looked after by the VSAN).
  • Check to make sure that your controller is listed in the HCL.
  • Each host participating in the VSAN cluster must have at least one SSD & HDD. (The SSD provides read/write cache for I/O to the backing HDD’s, similar to conventional cache on a storage processor. Note: the SSD’s do not contribute to the size of the VSAN datastore; they are used for cache operations only.)

Note: The beta version will have an 8 host limit; this figure will be adjusted when it’s GA.

Host Network Interface Cards:

  • Each host participating in the VSAN cluster must have at least one 1Gb NIC, however VMware recommends 10Gb CNA’s for VM’s with high workloads.
  • VSAN is supported on both standard virtual switches as well as distributed virtual switches.
  • A VMkernel port must be created for VSAN communication; this is used for inter-cluster-node communication and read/write operations for virtual machines residing on parent hosts belonging to a VSAN cluster.

VSAN Performance

As with any storage solution, understanding your IOPS requirement is paramount, and to achieve the necessary IOPS you need to understand what your workloads are. vSphere VSAN uses SSDs as read and write cache for I/O, which significantly improves performance, although we are yet to see any real world numbers.

SSD based Read Caching on the VSAN:

This is a cache of locally accessed disk blocks for virtual machines; specifically, these are blocks used by the application running on the virtual machine. The virtual machine might not be on the same host as the SSD that holds its cached blocks. vSphere VSAN mirrors a directory of cached blocks between hosts in a VSAN cluster. If the virtual machine is utilising cache not local to its vSphere host, the interconnect (VMkernel port) is used to retrieve the cached blocks from the host SSD that does hold the information. This is why VMware recommend using 10Gb CNA’s to reduce latency.

Note: If the required disk blocks are not in cache on any of the hosts, the information is retrieved directly from the backing HDD’s.
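
To make the read path a little more concrete, here is a toy model of the lookup order described above: local SSD cache, then a remote host's SSD cache over the interconnect, then the backing HDD. It is purely illustrative; the dictionaries, host names and the read_block function are invented for this sketch and say nothing about how VSAN is actually implemented.

```python
def read_block(block_id, local_host, cache_directory, ssd_cache, hdd):
    """Return (data, source) for a block, following the read path above."""
    # The mirrored directory tells every host which SSD (if any) caches the block.
    owner = cache_directory.get(block_id)
    if owner is None:
        # Not cached anywhere: read from the backing HDD.
        return hdd[block_id], "hdd"
    # Cached locally, or on another host reached over the VMkernel interconnect.
    source = "local-ssd" if owner == local_host else "remote-ssd"
    return ssd_cache[owner][block_id], source


# Hypothetical two-host cluster.
cache_directory = {"blk1": "esx01", "blk2": "esx02"}
ssd_cache = {"esx01": {"blk1": b"A"}, "esx02": {"blk2": b"B"}}
hdd = {"blk1": b"A", "blk2": b"B", "blk3": b"C"}

for blk in ("blk1", "blk2", "blk3"):
    print(blk, read_block(blk, "esx01", cache_directory, ssd_cache, hdd)[1])
```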

SSD Write Caching on the VSAN:

The SSD is also used for write cache to reduce I/O latency. Because write operations go to the SSD storage, a copy of the data must exist elsewhere in the VSAN cluster in case of node failure, so that writes held in cache are not lost. When a write operation is initiated by an application in the virtual machine, it is mirrored to both local cache and to remote hosts in the VSAN cluster.

Note: A write operation must be committed to SSD on the hosts before it is acknowledged and later committed to the HDD’s.

Availability

vSphere VSAN uses RAIN (Reliable Array of Independent Nodes) – object level redundancy, so in a converged infrastructure you can survive the loss of an underlying component (NIC port(s), disk, vSphere host).

In the past when defining HA admission control policies we defined enforcement policies as (a) ‘Host Failures the Cluster tolerates’, (b) ‘Percentage of cluster resources reserved…’  and (c) ‘Specify failover hosts’. vSphere administrators can now define how many host, network or disk failures a virtual machine can tolerate in a VSAN cluster.

Note: In the event of node failure there is no need for all of the data to be migrated to other nodes in the cluster (copies of virtual machines (replicas) reside on multiple nodes in the cluster).

Note: For an object to be accessible in VSAN, more than 50 % of its components must be accessible.
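
The "more than 50 %" rule is a strict majority check, which a couple of lines can illustrate. The function name is made up for this sketch and the component counts are arbitrary examples.

```python
def object_accessible(total_components, accessible_components):
    """True when strictly more than half of the object's components are reachable."""
    return accessible_components * 2 > total_components


print(object_accessible(3, 2))  # 2 of 3 components reachable
print(object_accessible(4, 2))  # exactly half is not enough
```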

Configurable Options

NumberOfFailuresToTolerate  – This allows us to define the ‘number of failures to tolerate’ – (network or disk) in the cluster and still maintain availability. If this is set, it specifies that configurations must contain at least NumberOfFailuresToTolerate +1 replicas.

Note: VMware state that ‘any disk failure on a single host is treated as a “failure” for this metric. Therefore, the object cannot persist if there is a disk failure on host01 and another disk failure on host 02 when you have NumberOfFailuresToTolerate set to 1.’

Number of Disk Stripes per Object – This defines the number of physical disks across which each replica of a storage object is distributed. A value higher than 1 might result in better performance if read caching is not effective, but it will also result in a greater use of system resources. VMware state that a default disk stripe number of 1 should meet the requirements of most if not all virtual machine workloads. VMware recommend that the disk striping value should change only when a small number of high-performance virtual machines are running.

Flash Read Cache Reservation  – The amount of flash reserved as read cache for the storage object. This is specified as a percentage of the logical size of the virtual machine disk storage object. If cache reservation is not defined the VSAN scheduler manages ‘fair’ cache allocation.

Object Space Reservation – This defines the percentage of the logical size of the storage object that should be reserved on the HDD’s during initialisation. VSAN datastores are thin provisioned by default, the ObjectSpaceReservation is the amount of disk space reserved on the VSAN datastore specified as a percentage of the VM disk.
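
Here is a back-of-the-envelope sketch of how NumberOfFailuresToTolerate and ObjectSpaceReservation might combine. It assumes the reservation applies to each replica separately, which is my reading rather than something the whitepaper spells out; the function name and the example figures are hypothetical.

```python
def reserved_capacity_gb(vmdk_size_gb, space_reservation_pct, failures_to_tolerate):
    """Estimate the HDD space reserved up front for one virtual disk object."""
    # NumberOfFailuresToTolerate + 1 replicas are required.
    replicas = failures_to_tolerate + 1
    # Reservation is a percentage of the logical size of the object.
    per_replica_gb = vmdk_size_gb * space_reservation_pct / 100
    # Assumption: every replica carries its own reservation.
    return replicas * per_replica_gb


# 100 GB disk, 50 % reservation, tolerate one failure.
print(reserved_capacity_gb(100, 50, 1))
```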

Note: VMware state that if you provision a virtual machine and select disk format to be either thick lazy or eager zeroed this setting overrides the ObjectSpaceReservation setting in the virtual machine storage policy.

Force provisioning (disabled by default) – This is an override to forcibly provision an object even if the capabilities in the virtual machine storage policy cannot be satisfied. VSAN will attempt to bring that object into compliance if and when resources become available.

Virtual Machine Policies & Compliance:

When a VSAN datastore is created, a set of policies is defined to identify the capabilities of the underlying environment (high performance disks, NL disks, etc.). The administrator uses a virtual machine storage policy to classify the application workload requirements. Using VM storage policies, administrators can specify a set of required storage capabilities for a virtual machine, or more specifically a set of requirements for the application running in the virtual machine. When a virtual machine is deployed, and the requirements in the VM storage policy attached to the virtual machine can be met by the VSAN datastore, the VSAN datastore will be identified as compliant.

Here is the whitepaper released by VMware : VMware_Virtual_SAN_Whats_New.pdf

I’m looking forward to revisiting this post after the public beta is released with some more information on how it works.

You can register for the beta here : http://www.vmware.com/vsan-beta-register


HP 3PAR StoreServ LDAP Integration

Securing Administrative Access to your HP 3PAR StoreServ

In this post I will describe how you can configure authentication via Active Directory, as well as limiting the LDAP search path used to resolve users. This will limit LDAP queries down to a particular Active Directory Organisational Unit, so you don’t have LDAP searches traversing your entire AD infrastructure.

The StoreServ uses RBAC (Role Based Access Control), which maps a user or group of users to an administrative role. It’s important to know that the authorisation group ‘super-map’ identifies users in the defined group with super user privileges.
Additional groups can be added to identify lower level access rights for operations performed on the StoreServ.

Note: Presently there is no way of adding additional LDAP servers for redundancy, highlighting a single point of failure, something which will hopefully be addressed in new releases. I would also recommend using the CLI as some of these attributes cannot be set using the user interface.

To login to the StoreServ MC if the configured LDAP server has failed or the IP address has changed, you will need to login using a local account configured on the StoreServ and specify an alternative server.

For the purposes of this demonstration, the ‘3PARAdmins’ AD security group has been created; the AD DN for this security group will be mapped to the StoreServ ‘super-map’ authorisation group. The user viGareth.H has been added to the AD security group.

The IP address of the StoreServ is : 172.16.20.1
The IP address of the Domain Controller is: 172.16.12.1
The FQDN name of the Domain Controller is: dc01.vikernel.com
Target Search OU = "OU=Resources,OU=ITSVD,OU=EMEA,DC=vikernel,DC=com"
Authorisation Group = "CN=3PARAdmins,OU=Security,OU=EMEA,DC=vikernel,DC=com"
Administrator = "CN=H\, viGareth,OU=Administrators,OU=Resources,OU=ITSVD,OU=EMEA,DC=vikernel,DC=com"

Note: When defining the ‘kerberos-realm’ it must be entered in upper case.

Now let’s configure the StoreServ to use Active Directory to authenticate users.

Login to the StoreServ via the CLI.
SSH to: 172.16.20.1 (StoreServ Cluster IP)

Enter the following:
3PAR-PRD01 cli% setauthparam -f ldap-server 172.16.12.1
3PAR-PRD01 cli% setauthparam -f ldap-server-hn dc01.vikernel.com
3PAR-PRD01 cli% setauthparam -f kerberos-realm VIKERNEL.COM
3PAR-PRD01 cli% setauthparam -f binding sasl
3PAR-PRD01 cli% setauthparam -f sasl-mechanism GSSAPI
3PAR-PRD01 cli% setauthparam -f accounts-dn "OU=Resources,OU=ITSVD,OU=EMEA,DC=vikernel,DC=com"
3PAR-PRD01 cli% setauthparam -f account-obj user
3PAR-PRD01 cli% setauthparam -f account-name-attr sAMAccountName
3PAR-PRD01 cli% setauthparam -f memberof-attr memberOf
3PAR-PRD01 cli% setauthparam -f super-map "CN=3PARAdmins,OU=Security,OU=EMEA,DC=vikernel,DC=com"
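
If you have several arrays to configure, the command list above can be generated from one set of values. The following is only a sketch: build_setauthparam_commands is a helper I made up, it just prints the same setauthparam lines shown above, and quoting any value that contains a comma or space is my assumption about what the CLI expects for DNs.

```python
def build_setauthparam_commands(settings):
    """Turn a mapping of auth parameters into setauthparam command lines."""
    commands = []
    for name, value in settings.items():
        if name == "kerberos-realm":
            # The realm must be entered in upper case.
            value = value.upper()
        if "," in value or " " in value:
            # Assumption: wrap DNs in quotes so they are passed as one argument.
            value = '"%s"' % value
        commands.append("setauthparam -f %s %s" % (name, value))
    return commands


settings = {
    "ldap-server": "172.16.12.1",
    "ldap-server-hn": "dc01.vikernel.com",
    "kerberos-realm": "vikernel.com",
    "binding": "sasl",
    "sasl-mechanism": "GSSAPI",
    "accounts-dn": "OU=Resources,OU=ITSVD,OU=EMEA,DC=vikernel,DC=com",
    "account-obj": "user",
    "account-name-attr": "sAMAccountName",
    "memberof-attr": "memberOf",
    "super-map": "CN=3PARAdmins,OU=Security,OU=EMEA,DC=vikernel,DC=com",
}

for command in build_setauthparam_commands(settings):
    print(command)
```

Paste the printed lines into the SSH session, or review them first if your DNs contain unusual characters.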

Troubleshooting:

To check that an AD user can be resolved, use the 3PAR ‘checkpassword user’ command. This is useful for troubleshooting login issues; for instance, Kerberos tickets rely on accurate time synchronisation, and inaccurate time will prohibit access via LDAP. Here is what happens when the time between the DC and StoreServ is not set correctly.

3PAR-PRD01 cli% checkpassword viGareth.H
password:
+ attempting authentication and authorization using system-local data
+ authentication denied: unknown username
+ attempting authentication and authorization using LDAP
+ using Kerberos configuration file:
[domain_realm]
dc01.vikernel.com = VIKERNEL.COM
[realms]
VIKERNEL.COM = {
kdc = dc01.vikernel.com
}
+ temporarily setting name-to-address mapping: dc01.vikernel.com -> 172.16.12.1
+ attempting to obtain credentials for "viGareth.H@VIKERNEL.COM"
+ Kerberos credentials denied: Clock skew too great
user viGareth.H is not authenticated or not authorized
3PAR-PRD01 cli%

How to check the time on the StoreServ array:
3PAR-PRD01 cli% showdate
Node Date
0    2013-07-01 22:52:17 BST (Europe/London)
1    2013-07-01 22:52:35 BST (Europe/London)
2    2013-07-01 22:52:33 BST (Europe/London)
3    2013-07-01 22:52:30 BST (Europe/London)
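
A quick way to spot the problem is to compare each node's time with the domain controller's. The sketch below parses showdate output like the sample above and flags nodes that are further out than 5 minutes, which is the default Kerberos tolerance in Active Directory. The function names and the dc_time value are my own; the timestamps are treated as naive local times in the same time zone.

```python
from datetime import datetime, timedelta

MAX_SKEW = timedelta(minutes=5)  # default AD Kerberos clock tolerance


def parse_showdate(output):
    """Map node number to datetime from 'showdate' output."""
    times = {}
    for line in output.splitlines():
        parts = line.split()
        # Data rows start with the node number; the header row does not.
        if len(parts) >= 3 and parts[0].isdigit():
            stamp = "%s %s" % (parts[1], parts[2])
            times[int(parts[0])] = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return times


def nodes_out_of_sync(node_times, dc_time, max_skew=MAX_SKEW):
    """Return the nodes whose clock differs from the DC by more than max_skew."""
    return [n for n, t in sorted(node_times.items()) if abs(t - dc_time) > max_skew]


sample = """Node Date
0    2013-07-01 22:52:17 BST (Europe/London)
1    2013-07-01 22:52:35 BST (Europe/London)
2    2013-07-01 22:52:33 BST (Europe/London)
3    2013-07-01 22:52:30 BST (Europe/London)"""

dc_time = datetime(2013, 7, 1, 15, 0, 0)  # hypothetical time read from the DC
print(nodes_out_of_sync(parse_showdate(sample), dc_time))
```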

Setting the time on the StoreServ array:

Set the correct date/time in MMDDhhmmCCYY format (month, day, hours, minutes, century, year):
3PAR-PRD01 cli% setdate 070115002013
Node 0 time set to 2013-07-01 15:00:00 BST
Node 1 time set to 2013-07-01 15:00:00 BST
Node 2 time set to 2013-07-01 15:00:00 BST
Node 3 time set to 2013-07-01 15:00:00 BST
3PAR-PRD01 cli%
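
The MMDDhhmmCCYY string is easy to get wrong by hand, so here is a small sketch that builds it from a Python datetime. The helper name setdate_argument is hypothetical; it only formats the value and does not talk to the array.

```python
from datetime import datetime


def setdate_argument(when):
    """Format a datetime as MMDDhhmmCCYY for the setdate command."""
    # %m month, %d day, %H hours (24h), %M minutes, %Y four-digit year (CCYY)
    return when.strftime("%m%d%H%M%Y")


print("setdate " + setdate_argument(datetime(2013, 7, 1, 15, 0)))
```

For 1 July 2013 at 15:00 this produces the same 070115002013 value used in the session above.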

Now let’s run the checkpassword command again to validate that the clock is set correctly.

3PAR-PRD01 cli% checkpassword viGareth.H
password
+ attempting authentication and authorization using system-local data
+ attempting authentication and authorization using LDAP
+ using Kerberos configuration file:
[domain_realm]
dc01.vikernel.com = VIKERNEL.COM
[realms]
VIKERNEL.COM = {
kdc = dc01.vikernel.com
}
+ temporarily setting name-to-address mapping: dc01.vikernel.com -> 172.16.12.1
+ attempting to obtain credentials for "viGareth.H@VIKERNEL.COM"
+ connecting to LDAP server using URI: ldap://dc01.vikernel.com
+ binding to user "viGareth.H" with SASL mechanism GSSAPI
+ searching LDAP using:
search base: OU=Resources,OU=ITSVD,OU=EMEA,DC=vikernel,DC=com
scope: sub
filter: (&(objectClass=user)(sAMAccountName=viGareth.H))
for attributes: memberOf
+ search result DN: CN=H\, viGareth,OU=Administrators,OU=Resources,OU=ITSVD,OU=EMEA,DC=vikernel,DC=com
user viGareth.H is authenticated and authorized

Reference Documents

http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c02985998/c02985998.pdf
http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c03735391/c03735391.pdf
http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c03618134/c03618134.pdf