Monday, November 29, 2010

Performance Degradation with MPxIO and Symmetric Devices (Sun StorEdge[TM] 99xx/3510 and Sun Fire Servers)

Description
Sequential I/O performance degradation of up to 700% has been seen with Symmetric devices connected to a host running Sun StorEdge[TM] Traffic Manager Software (MPxIO) with round-robin load balancing.

Steps to Follow
Round-robin is the default load balancing method when MPxIO is enabled.
It works well with Asymmetric FC (Fibre Channel) devices such as the Sun
StorEdge[TM] T3, which present LUNs to the host in Active/Passive mode.
It is counterproductive with Symmetric FC devices such as the internal
drives of the Sun Fire[TM] v880, v1280 or v480, and the Sun StorEdge[TM]
A5x00, 3510, or 99xx arrays (the 99xx array has a pre-fetch cache
algorithm built into it).

The root cause is the way the MPxIO round-robin method distributes I/O
across two or more channels. With an internal disk, or any FC device in an
arbitrated loop, each command reaches the disk directly through one
channel. The disk completes that I/O and then receives the next request
from the other channel, so it incurs a rotational delay before it can
service the request. The length of the delay varies with the number of
commands issued on each path and is most visible with sequential I/O. The
problem is even more visible with Veritas Volume Manager or Solaris Volume
Manager striped volumes that use a 16K interlace factor. Such striping
increases the number of commands sent to a specific disk, which results in
high service times for each command.
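
The alternation described above can be illustrated with a small model.
This is only a sketch and not the MPxIO implementation: the function name
and the path names are hypothetical, and it assumes strict turn-taking
between paths for commands issued back to back.

```python
from itertools import cycle


def round_robin_paths(num_commands, paths):
    """Assign each command, in issue order, to the next path in turn."""
    next_path = cycle(paths)
    return [next(next_path) for _ in range(num_commands)]


# Eight back-to-back sequential reads over two paths (hypothetical names).
# Neighbouring commands of the same stream always arrive on different paths.
for number, path in enumerate(round_robin_paths(8, ["path_a", "path_b"])):
    print(number, path)
```

In this model no two consecutive commands of a sequential stream share a
path, which is the pattern the paragraph above identifies as the cause of
the delay.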

The StorEdge 99xx array has its own pre-fetch cache algorithm, which in
some cases may cause performance degradation.

The solution is a new load balancing method, the "logical block" method.
This method uses a region size specified in the conf file (16, 32, ...)
and routes all I/Os that start within a given region (a 64K range, for
example) through the same path. I/Os that start within other 64K ranges on
the disk are routed through a different path. This way, the disk can
handle a sequential I/O stream better, which improves performance.
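
The routing rule can be sketched as follows. This is a minimal
illustration, not the scsi_vhci code: the function name and path names are
hypothetical, the region is given directly in bytes (how the region-size
value in the conf file maps to bytes is not modelled here), and picking
the path as the region index modulo the number of paths is an assumption.

```python
KIB = 1024


def select_path(start_offset, region_bytes, paths):
    """Pick a path from the region that contains the I/O's starting offset."""
    region_index = start_offset // region_bytes
    # Assumption: regions are spread over the paths in simple rotation.
    return paths[region_index % len(paths)]


paths = ["path_a", "path_b"]  # hypothetical path names

# Sequential 16K reads with a 64K region: four consecutive reads share a
# path before the stream moves on to the next region and the next path.
for i in range(8):
    offset = i * 16 * KIB
    print(offset // KIB, "K ->", select_path(offset, 64 * KIB, paths))
```

Unlike round-robin, consecutive commands that fall in the same region stay
on one path, so the device sees them as a single stream.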

The basic changes to scsi_vhci.conf for using the "logical block" method
are shown below. Document 1004918.1 describes the new load balancing
method in detail.

The method is configured through the new property
"device-type-mpxio-options-list".

device-type-mpxio-options-list=
"device-type=SUN     SENA", "load-balance-options=logical-block-options",
"device-type=SEAGATE ST373307FSUN72G",
"load-balance-options=logical-block-options1",
"device-type=HITACHI OPEN-L*4    -SUN",
"load-balance-options=logical-block-options2",
"device-type=SUN     T300", "load-balance-options=logical-block-options1";
logical-block-options="load-balance=logical-block",  "region-size=15";
logical-block-options1="load-balance=logical-block", "region-size=18";
logical-block-options2="load-balance=logical-block", "region-size=32";

Note: The Product ID in the device-type string must start at the 9th
character, so the Vendor ID must be padded with spaces to 8 characters.
Otherwise the entry does not take effect and the device uses the default
round-robin load balancing method.
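
One way to avoid miscounting the padding is to build the string
programmatically. The helper below is a hypothetical sketch, not part of
any Sun tool; it relies on the Vendor ID field of a standard SCSI INQUIRY
response being 8 characters wide.

```python
VENDOR_ID_WIDTH = 8  # width of the Vendor ID field in standard INQUIRY data


def device_type_entry(vendor_id, product_id):
    """Build a device-type string whose Product ID starts at the 9th character."""
    if len(vendor_id) > VENDOR_ID_WIDTH:
        raise ValueError("Vendor ID is longer than 8 characters")
    # Pad the Vendor ID with trailing spaces up to 8 characters.
    return "device-type=" + vendor_id.ljust(VENDOR_ID_WIDTH) + product_id


print('"' + device_type_entry("SUN", "SENA") + '"')
print('"' + device_type_entry("HITACHI", "OPEN-V      -SUN") + '"')
```

The Product ID is passed through unchanged, so any internal spacing it
contains (as in the HITACHI entry) has to be supplied exactly as the
device reports it.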

The logical-block method is available beginning with SAN 4.4.
Add an entry like the one below to /kernel/drv/scsi_vhci.conf to enable
logical-block load balancing (here for a HITACHI OPEN-V device):

device-type-mpxio-options-list =
"device-type=HITACHI OPEN-V      -SUN",
"load-balance-options=logical-block-options";
logical-block-options="load-balance=logical-block", "region-size=18";
