From the ESX Configuration Guide.
When you perform VMFS datastore management operations, vCenter Server uses default storage filters. The filters help you to avoid storage corruption by retrieving only the storage devices, or LUNs, that can be used for a particular operation. Unsuitable LUNs are not displayed for selection. You can turn off the filters to view all LUNs.
Before making any changes to the LUN filters, consult with the VMware support team. You can turn off the filters only if you have other methods to prevent LUN corruption.
- In the vSphere Client, select Administration > vCenter Server Settings.
- In the settings list, select Advanced Settings.
- In the Key text box, type one of the keys from the table below.
| Key | Filter Name |
| --- | --- |
| config.vpxd.filter.vmfsFilter | VMFS Filter |
| config.vpxd.filter.rdmFilter | RDM Filter |
| config.vpxd.filter.SameHostAndTransportsFilter | Same Host and Transports Filter |
| config.vpxd.filter.hostRescanFilter | Host Rescan Filter |
NOTE If you turn off the Host Rescan Filter, your hosts continue to perform a rescan each time you present a new LUN to a host or a cluster.
- In the Value text box, type False for the specified key.
- Click Add.
- Click OK.
You are not required to restart the vCenter Server system.
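For reference, the four filter keys and the value that disables each one can be collected in one place. A minimal sketch (the keys are from the table above; actually applying them still happens through the vSphere Client steps, or through an SDK against a live vCenter, which is out of scope here):

```python
# The four storage-filter keys; setting any of them to "False" disables
# that filter (they behave as "True" when the key is not present).
storage_filters = {
    "config.vpxd.filter.vmfsFilter": "False",
    "config.vpxd.filter.rdmFilter": "False",
    "config.vpxd.filter.SameHostAndTransportsFilter": "False",
    "config.vpxd.filter.hostRescanFilter": "False",
}

for key, value in storage_filters.items():
    print(f"{key} = {value}")
```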
Read Performance Characterization of VMFS and RDM Using a SAN. Although it was written for ESX 3.5, its findings still hold true. The conclusion from the document is:
VMware ESX Server offers two options for disk access management—VMFS and RDM. Both options provide clustered file system features such as user‐friendly persistent names, distributed file locking, and file permissions. Both VMFS and RDM allow you to migrate a virtual machine using VMotion. This study compares the performance characteristics of both options and finds only minor differences in performance. For random workloads, VMFS and RDM produce similar I/O throughput. For sequential workloads with small I/O block sizes, RDM provides a small increase in throughput compared to VMFS. However, the performance gap decreases as the I/O block size increases. For all workloads, RDM has slightly better CPU cost.
The test results described in this study show that VMFS and RDM provide similar I/O throughput for most of the workloads we tested. The small differences in I/O performance we observed were with the virtual machine running CPU‐saturated. The differences seen in these studies would therefore be minimized in real life workloads because most applications do not usually drive virtual machines to their full capacity. Most enterprise applications can, therefore, use either VMFS or RDM for configuring virtual disks when run in a virtual machine.
However, there are a few cases that require use of raw disks. Backup applications that use such inherent SAN features as snapshots or clustering applications (for both data and quorum disks) require raw disks. RDM is recommended for these cases. We recommend use of RDM for these cases not for performance reasons but because these applications require lower level disk control.
And read Use RDMs for Practical Reasons and Not Performance Reasons too.
There is a section in the ESX Configuration Guide that is relevant.
Obviously the choice of storage vendor and the underlying technologies play a part here but there are some general guidelines that apply regardless. VMware themselves have a short page on this which I have copied below:
Many of the best practices for physical storage environments also apply to virtual storage environments. It is best to keep in mind the following rules of thumb when configuring your virtual storage infrastructure:
Configure and size storage resources for optimal I/O performance first, then for storage capacity.
This means that you should consider throughput capability and not just capacity. Imagine a very large parking lot with only one lane of traffic for an exit. Regardless of capacity, throughput is affected. It’s critical to take into consideration the size and storage resources necessary to handle your volume of traffic—as well as the total capacity.
Aggregate application I/O requirements for the environment and size them accordingly.
As you consolidate multiple workloads onto a set of ESX servers that have a shared pool of storage, don’t exceed the total throughput capacity of that storage resource. Looking at the throughput characterization of the physical environment prior to virtualization can help you predict what throughput each workload will generate in the virtual environment.
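To make the aggregation step concrete, here is a minimal sketch. The workload names, IOPS figures and array ceiling are invented for illustration; substitute your own measurements:

```python
# Hypothetical per-workload IOPS, measured on the physical servers
# prior to virtualisation (example numbers only).
workloads = {"mail": 400, "db": 1200, "web": 150, "file": 250}

array_capacity_iops = 2500  # assumed throughput ceiling of the shared storage

total = sum(workloads.values())
headroom = array_capacity_iops - total
print(f"Aggregate demand: {total} IOPS, headroom: {headroom} IOPS")
assert total <= array_capacity_iops, "shared storage would be oversubscribed"
```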
Base your storage choices on your I/O workload.
Use an aggregation of the measured workload to determine what protocol, redundancy protection and array features to use, rather than using an estimate. The best results come from measuring your applications’ I/O throughput and capacity for a period of several days prior to moving them to a virtualized environment.
Remember that pooling storage resources increases utilization and simplifies management, but can lead to contention.
There are significant benefits to pooling storage resources, including increased storage resource utilization and ease of management. However, at times, heavy workloads can have an impact on performance. It’s a good idea to use a shared VMFS volume for most virtual disks, but consider placing heavy I/O virtual disks on a dedicated VMFS volume or an RDM to reduce the effects of contention.
As far as vendor specific configuration goes, NetApp’s TR-3428 document is worth a read and maybe also this document from EMC. On the subject of EMC, Alan Renouf and Simon Seagrave ran a session at a recent London VMUG meeting that may also be of interest. Find out about it here.
There isn’t a single rule for this – there are more like thousands of rules! Basically have an idea of what workloads VMs are generating in terms of IO and try to balance them out but also bear in mind that write intensive loads will perform better on RAID 10 than on RAID 5 but RAID 10 uses more disks than RAID 5 does.
Whilst not specifically related to RAID, and although it focuses on EMC storage, Optimal VM Placement offers some interesting thoughts and mentions the alarms that can be set in vCenter that are useful for monitoring problems:
- VM Disk Usage (KBps)
- Total Disk Latency (ms)
- VM Disk Abort
- VM Disk Resets
Also, as a rule of thumb, if a server consistently generates a certain number of IOPS as either reads or writes on physical hardware, it will probably generate the same on virtual hardware. So it follows that if you’d use RAID 10 for that physical server, you should use a RAID 10 LUN with the virtual server. It’s a common sense thing gained from experience really.
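The RAID 10 versus RAID 5 point comes down to write-penalty arithmetic: a front-end write costs two back-end I/Os on RAID 10 (data plus mirror) but four on RAID 5 (read data, read parity, write data, write parity). A rough sketch, with invented example numbers:

```python
def backend_iops(front_end_iops, read_fraction, write_penalty):
    """Translate front-end IOPS into back-end disk IOPS.

    Each read costs one disk I/O; each write costs `write_penalty`
    disk I/Os (2 for RAID 10, 4 for RAID 5, 6 for RAID 6).
    """
    reads = front_end_iops * read_fraction
    writes = front_end_iops * (1 - read_fraction)
    return reads + writes * write_penalty

# A write-heavy workload: 1000 IOPS, 30% reads.
print(backend_iops(1000, 0.3, 2))  # RAID 10 -> 1700.0 back-end IOPS
print(backend_iops(1000, 0.3, 4))  # RAID 5  -> 3100.0 back-end IOPS
```

The same front-end load nearly doubles the back-end work on RAID 5, which is why write-intensive VMs sit better on RAID 10 despite its higher disk count.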
From How NPIV-Based LUN Access Works:
SAN objects, such as switches, HBAs, storage devices, or virtual machines can be assigned World Wide Name (WWN) identifiers. WWNs uniquely identify such objects in the Fibre Channel fabric. When virtual machines have WWN assignments, they use them for all RDM traffic, so the LUNs pointed to by any of the RDMs on the virtual machine must not be masked against its WWNs. When virtual machines do not have WWN assignments, they access storage LUNs with the WWNs of their host’s physical HBAs. By using NPIV, however, a SAN administrator can monitor and route storage access on a per virtual machine basis. The following section describes how this works.
NPIV enables a single FC HBA port to register several unique WWNs with the fabric, each of which can be assigned to an individual virtual machine. When a virtual machine has a WWN assigned to it, the virtual machine’s configuration file (.vmx) is updated to include a WWN pair (consisting of a World Wide Port Name, WWPN, and a World Wide Node Name, WWNN). As that virtual machine is powered on, the VMkernel instantiates a virtual port (VPORT) on the physical HBA which is used to access the LUN. The VPORT is a virtual HBA that appears to the FC fabric as a physical HBA, that is, it has its own unique identifier, the WWN pair that was assigned to the virtual machine. Each VPORT is specific to the virtual machine, and the VPORT is destroyed on the host and it no longer appears to the FC fabric when the virtual machine is powered off.
If NPIV is enabled, four WWN pairs (WWPN & WWNN) are specified for each virtual machine at creation time. When a virtual machine using NPIV is powered on, it uses each of these WWN pairs in sequence to try to discover an access path to the storage. The number of VPORTs that are instantiated equals the number of physical HBAs present on the host, up to a maximum of four. A VPORT is created on each physical HBA on which a physical path is found. Each physical path is used to determine the virtual path that will be used to access the LUN. Note that HBAs that are not NPIV-aware are skipped in this discovery process because VPORTs cannot be instantiated on them.
Note: If a user has four physical HBAs as paths to the storage, all physical paths must be zoned to the virtual machine by the SAN administrator. This is required to support multipathing even though only one path at a time will be active.
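The discovery behaviour described above can be modelled as a toy sketch (purely illustrative; these are not vSphere API objects):

```python
# A VM with NPIV is assigned four WWN pairs at creation time.
wwn_pairs = [(f"wwpn-{i}", f"wwnn-{i}") for i in range(4)]

# Hypothetical host HBAs: one VPORT is instantiated per NPIV-aware
# physical HBA on which a path to the LUN is found, up to four.
hbas = [
    {"name": "vmhba1", "npiv_aware": True,  "path_to_lun": True},
    {"name": "vmhba2", "npiv_aware": True,  "path_to_lun": True},
    {"name": "vmhba3", "npiv_aware": False, "path_to_lun": True},  # skipped
]
vports = [h["name"] for h in hbas if h["npiv_aware"] and h["path_to_lun"]][:4]
print(vports)  # ['vmhba1', 'vmhba2']
```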
That’s NPIV in a nutshell. For more detail and the requirements, read How to Configure NPIV on VMware vSphere 4.0.
DirectPath places some limitations on VMs and so should be used with caution. Generally, any VM that uses DirectPath becomes tied to an ESX host – vMotion and DRS will not work.
DirectPath must first be enabled in the ESX host’s BIOS, so only certain systems support it. A PCI device can only be assigned to one VM at a time, and that device cannot also be used by the host. A VM can have up to two directly connected devices.
The advantage that DirectPath gives is the ability for devices not directly supported by VMware to be attached to VMs. Also, by circumventing the virtualisation layer, greater performance can be achieved by a VM using a directly connected device. Typically DirectPath is used to assign high speed, dedicated NICs to high performance VMs. Other use cases include attaching locally attached USB devices to a VM.
Simon Long explains DirectPath well in VMware DirectPath I/O. An example of using USB devices can be seen in Enable USB Support for ESXi with VMDirectPath.
See also Configuration Examples for DirectPath for more detail and examples.
Unsurprisingly, Wikipedia has a decent overview of the different RAID levels, so there is no point reproducing it here. In practice, levels 0, 1, 5 and 6 are the most commonly used in commercial storage systems. Know these off by heart.
More recently some vendors have developed technologies that do not use traditional RAID but instead pool large numbers of disks together and allow the storage subsystem to place data wherever it will get the best performance. This is often referred to as storage virtualisation.
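As a quick aide-mémoire for the commonly used levels, usable capacity works out like this (a simplified sketch that ignores hot spares and vendor overhead):

```python
def usable_capacity_gb(level, disks, disk_size_gb):
    """Usable capacity for the common RAID levels (simplified)."""
    if level == 0:
        return disks * disk_size_gb        # striping, no redundancy
    if level == 1:
        return disks // 2 * disk_size_gb   # mirrored pairs
    if level == 5:
        return (disks - 1) * disk_size_gb  # one disk's worth of parity
    if level == 6:
        return (disks - 2) * disk_size_gb  # two disks' worth of parity
    raise ValueError(f"unsupported RAID level: {level}")

# Example: eight 600 GB disks under each level.
for lvl in (0, 1, 5, 6):
    print(f"RAID {lvl}: {usable_capacity_gb(lvl, 8, 600)} GB usable")
```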
Supported HBA Types
The storage compatibility guide (document #1) is updated every few weeks to contain the latest compatible storage devices. The document runs to over 1100 pages but the first few contain everything that is needed. In summary, the following are supported types of storage:
Fibre Channel (FC) – the only type that supports Microsoft clustering
Fibre Channel over Ethernet (FCoE) – these are connected using Converged Network Adapters (CNAs)
Hardware iSCSI – These come in two flavours:
- Dependent – dependent on the networking and iSCSI management interfaces provided by vSphere
- Independent – have their own networking and iSCSI management interfaces, for example the QLA4052, and may need to be separately licensed
Software iSCSI – Uses functionality built into the vmkernel and connections are made using supported NICs.
NFS – Like software iSCSI, functionality is built into the vmkernel and connections are made using supported NICs
Virtual Disk Format Types
There are 10 types of virtual disk that VMware’s products use. These are taken from VMware KB 1022242.
- zeroedthick (default) – Space required for the virtual disk is allocated during the creation of the disk file. Any data remaining on the physical device is not erased during creation, but is zeroed out on demand at a later time on first write from the virtual machine. The virtual machine does not read stale data from disk.
- eagerzeroedthick – Space required for the virtual disk is allocated at creation time. In contrast to zeroedthick format, the data remaining on the physical device is zeroed out during creation. It might take much longer to create disks in this format than to create other types of disks. (Required for Fault Tolerance feature!)
- thick – Space required for the virtual disk is allocated during creation. This type of formatting does not zero out any old data that might be present on this allocated space. A non-root user cannot create disks of this format.
- thin – Space required for the virtual disk is not allocated during creation, but is supplied and zeroed out, on demand at a later time.
- rdm – Virtual compatibility mode for raw disk mapping.
- rdmp – Physical compatibility mode (pass-through) for raw disk mapping.
- raw – Raw device.
- 2gbsparse – A sparse disk with 2GB maximum extent size. You can use disks in this format with other VMware products; however, you cannot power on a sparse disk on an ESX host until you re-import it with vmkfstools in a compatible format, such as thick or thin.
- monosparse – A monolithic sparse disk. You can use disks in this format with other VMware products.
- monoflat – A monolithic flat disk. You can use disks in this format with other VMware products.
Conversions between thin and thick disks can be done as part of a Storage vMotion operation. Eager Zeroed Thick disks are a special case. To determine if a disk is zeroedthick or eagerzeroedthick, check out vmkfstools in VMware KB 1011170. There’s also a community script (in perl) that can be used to determine the disk format type (see getRealVMDiskFormat.pl). Converting to eagerzeroedthick can also be done to a powered off VM using vmkfstools or when adding a disk to a VM by ticking the checkbox shown below.
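For revision purposes, the allocation and zeroing behaviour of the four VMFS-native formats condenses to a small lookup table (summarised from KB 1022242; this is just a study aid, not an API):

```python
# When is space allocated, and when is stale data zeroed out?
disk_formats = {
    "zeroedthick":      {"allocated_at_creation": True,  "zeroed": "on first write"},
    "eagerzeroedthick": {"allocated_at_creation": True,  "zeroed": "at creation"},
    "thick":            {"allocated_at_creation": True,  "zeroed": "never"},
    "thin":             {"allocated_at_creation": False, "zeroed": "on demand"},
}

for name, props in disk_formats.items():
    print(f"{name}: allocated up front={props['allocated_at_creation']}, "
          f"zeroed {props['zeroed']}")
```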