Document revision date: 30 March 2001
There are two types of performance assists: the merge assist and the copy assist. The merge assist improves performance by using information that is maintained in controller-based write logs to merge only the data that is inconsistent across a shadow set. When a merge operation is assisted by the write logs, it is referred to as a minimerge. The copy assist reduces system resource usage and copy times by enabling a direct disk-to-disk transfer of data without going through host node memory.
Assisted merge operations are usually too short to be noticeable. Improved performance is also possible during the assisted copy operation because it consumes fewer CPU and interconnect resources. Although the primary purpose of the performance assists is to reduce the system resources required to perform a copy or merge operation, in some circumstances you may also observe improved read and write I/O performance.
Volume Shadowing for OpenVMS supports both assisted and unassisted shadow sets in the same OpenVMS Cluster configuration. Whenever you create a shadow set, add members to an existing shadow set, or boot a system, the shadowing software reevaluates each device in the changed configuration to determine whether it is capable of supporting either the copy assist or the minimerge. Enhanced performance is possible only as long as all shadow set members are configured on controllers that support performance assist capabilities. If any shadow set member is connected to a controller without these capabilities, the shadowing software disables the performance assist for the shadow set.
When the correct revision levels of software are installed, the copy
assist and minimerge are enabled by default, and are fully managed by
the shadowing software.
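The eligibility rule described above (an assist stays enabled only while every member sits on a capable controller) can be sketched as follows. This is a minimal illustration in Python, not shadowing software code: the `Member` class and its two capability flags are hypothetical names, and tracking the two assists separately is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical description of one shadow set member."""
    name: str
    copy_assist: bool  # assumed flag: controller supports the copy assist
    minimerge: bool    # assumed flag: controller keeps write logs for minimerge


def enabled_assists(members):
    """Return which assists remain enabled for the whole shadow set.

    An assist is reported as enabled only if every member supports it;
    a single member on a controller without the capability disables it.
    """
    members = list(members)
    return {
        "copy_assist": bool(members) and all(m.copy_assist for m in members),
        "minimerge": bool(members) and all(m.minimerge for m in members),
    }
```

For example, a two-member set in which only one controller keeps write logs would keep the copy assist but lose the minimerge.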
9.2.3 Effects on Performance
The copy assist and minimerge are designed to reduce the time needed to do copy and merge operations. In fact, you may notice significant time reductions on systems that have little or no user I/O occurring during the assisted copy or merge operation. Data availability is also improved because copy operations quickly make data consistent across the shadow set.
Minimerge Performance Improvements
The minimerge feature provides a significant reduction in the time needed to perform merge operations. By using controller-based write logs, it is possible to avoid the total volume scan required by earlier merge algorithms and to merge only those areas of the shadow set where write activity was known to be in progress at the time the node or nodes failed.
Unassisted merge operations often take several hours, depending on user I/O rates. Minimerge operations typically complete in a few minutes and are usually undetectable by users.
The exact time taken to complete a minimerge depends on the amount of outstanding write activity to the shadow set when the merge process is initiated, and on the number of shadow set members undergoing a minimerge simultaneously. Even under the heaviest write activity, a minimerge will complete within several minutes. Additionally, minimerge operations consume minimal compute and I/O bandwidth.
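The difference between a total volume scan and a log-driven merge can be illustrated with a small Python model. It is a sketch under simplifying assumptions, not the actual merge algorithm: members are modeled as lists of block values, the first member is arbitrarily treated as the source of correct data, and the write log is a hypothetical list of `(starting LBN, block count)` entries.

```python
def full_merge(members):
    """Unassisted merge: visit every block of the volume (total volume scan)."""
    source = members[0]
    scanned = 0
    for lbn in range(len(source)):
        scanned += 1
        for other in members[1:]:
            if other[lbn] != source[lbn]:
                other[lbn] = source[lbn]
    return scanned


def minimerge(members, write_log):
    """Assisted merge: visit only the LBN ranges recorded in the write log."""
    source = members[0]
    scanned = 0
    for start, count in write_log:
        for lbn in range(start, start + count):
            scanned += 1
            for other in members[1:]:
                other[lbn] = source[lbn]
    return scanned
```

On a 1000-block model volume with one logged write, `minimerge` visits a single block where `full_merge` visits all 1000, which is the reason the assisted operation finishes so much sooner.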
Copy Assist Performance Improvements
Copy times vary according to each configuration and generally take
longer on systems supporting user I/O. Performance benefits are
achieved when the source and target disks are on different HSJ internal
buses.
9.3 Guidelines for Managing Shadow Set Performance
Sections 9.1 and 9.2 describe the performance impacts on a shadow set in steady state and while a copy or merge operation is in progress. In general, performance during steady state compares with that of a nonshadowed disk. Performance is affected when a copy or a merge operation is in progress to a shadow set. In the case of copy operations, you control when the operations are performed.
However, merge operations are not started because of user or program actions. They are started automatically when a system fails, or when a shadow set on a system with outstanding application write I/O enters mount verification and times out. In this case, the shadowing software reduces the utilization of system resources and the effects on user activity by throttling itself dynamically. Minimerge operations consume few resources and complete rapidly with little or no effect on user activity.
The actual resources that are utilized during a copy or merge operation depend on the access path to the member units of a shadow set, which in turn depends on the way the shadow set is configured. By far, the resources that are consumed most during both operations are the adapter and interconnect I/O bandwidths.
You can control resource utilization by setting the SHADOW_MAX_COPY system parameter to an appropriate value on a system based on the type of system and the adapters on the machine. SHADOW_MAX_COPY is a dynamic system parameter that controls the number of concurrent copy or merge threads that can be active on a single system. If the number of copy threads that start up on a particular system is more than the value of the SHADOW_MAX_COPY parameter on that system, only the number of threads specified by SHADOW_MAX_COPY will be allowed to proceed. The other copy threads are stalled until one of the active copy threads completes.
For example, assume that the SHADOW_MAX_COPY parameter is set to 3. If you mount four shadow sets that all need a copy operation, only three of the copy operations can proceed; the fourth copy operation must wait until one of the first three operations completes. Because copy operations use I/O bandwidth, this parameter provides a way to limit the number of concurrent copy operations and avoid saturating interconnects or adapters in the system. The value of SHADOW_MAX_COPY can range from 0 to 200. The default value is OpenVMS version specific.
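The stalling behavior in this example can be modeled with a short Python sketch. It is illustrative only: the function name, the use of arbitrary time units, and the first-come-first-served order are assumptions, as is the treatment of a limit of 0 as "no copy thread starts on this system".

```python
import heapq


def schedule_copies(durations, shadow_max_copy):
    """Return the start time of each copy when at most shadow_max_copy run at once."""
    if shadow_max_copy < 1:
        # Assumption: with a limit of 0, no copy thread starts on this system.
        return [None] * len(durations)
    running = []  # min-heap of finish times for the active copy threads
    starts = []
    for duration in durations:
        if len(running) < shadow_max_copy:
            start = 0  # a thread slot is free, so the copy starts immediately
        else:
            # Stall until the earliest active copy thread completes.
            start = heapq.heappop(running)
        starts.append(start)
        heapq.heappush(running, start + duration)
    return starts
```

With four copies lasting 30, 20, 40, and 10 time units and a limit of 3, the first three start at time 0 and the fourth starts at time 20, when the shortest active copy completes.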
Chapter 3 explains how to set the SHADOW_MAX_COPY parameter. Keep in mind that, once you arrive at a good value for the parameter on a node, you should also edit the MODPARAMS.DAT file so that the changed value takes effect the next time AUTOGEN is invoked.
In addition to setting the SHADOW_MAX_COPY parameter, the following list provides some general guidelines to control resource utilization and the effects on system performance when shadow sets are in transient states.
Compaq's StorageWorks RAID Software for OpenVMS provides ways to configure and use disk drives so that they achieve improved I/O performance. RAID (redundant arrays of independent disks) uses striping technology to chunk data and distribute it across multiple drives. RAID software is available in various levels, one of which is volume shadowing. Table 9-1 describes RAID levels.
RAID Level | Description |
---|---|
Level 0 | Striping with no redundancy. |
Level 1 | Shadowing. |
Levels 0 + 1 | Striping and shadowing together. |
Level 3 | Striped data with dedicated parity drive. Drives are rotationally synchronized. |
Level 5 | Striped data and parity. |
Level 6 | Striped data and parity with two parity drives. |
Shadowing striped drives can increase both performance and availability, because you can achieve faster response time with striping and data redundancy with shadowing. In addition to shadowing striped sets, you can also stripe shadow sets. Each strategy offers different advantages and tradeoffs in terms of availability, performance, and cost.
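To make the idea of striping across shadow sets concrete, the following Python sketch maps a logical block number to the stripe that holds it and then to every member of that stripe's shadow set. It is a generic RAID 0 address calculation under assumed parameters (chunk size, member names), not a description of how the StorageWorks RAID software lays out data.

```python
def stripe_location(lbn, chunk_blocks, n_stripes):
    """Map a logical block to (stripe index, block offset within that stripe)."""
    chunk = lbn // chunk_blocks          # which chunk the block falls in
    stripe = chunk % n_stripes           # chunks rotate across the stripes
    offset = (chunk // n_stripes) * chunk_blocks + lbn % chunk_blocks
    return stripe, offset


def write_targets(lbn, chunk_blocks, shadow_sets):
    """Striping over shadow sets: a write goes to every member of one shadow set."""
    stripe, offset = stripe_location(lbn, chunk_blocks, len(shadow_sets))
    return [(member, offset) for member in shadow_sets[stripe]]
```

With 4-block chunks and two two-member shadow sets, block 5 lands in the second shadow set at offset 1, and the write is issued to both of its members; striping spreads the load while shadowing supplies the redundancy.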
For more information about RAID 0, see the POLYCENTER product documentation set. For more information about RAID 5, see the StorageWorks RAID 5 Software for OpenVMS User's Guide.
This appendix lists volume shadowing status messages that are displayed
on the console device. For other system messages that are related to
volume shadowing, use the Help Message utility. For information about
the HELP/MESSAGE command and qualifiers, see DCL help (type HELP
HELP/MESSAGE at the DCL prompt). Messages that can occur before a
system is fully functional are also included in OpenVMS System Messages: Companion Guide for Help Message Users.
A.1 Mount Verification Messages
The following mount verification messages have approximately the same meaning for shadow sets as they do for regular disks. They are sent to the system console (OPA0) and to any operator terminals that are enabled to receive disk operator messages.
The following OPCOM message is returned in response to shadow set operations. This message results when the shadowing code detects that the boot device is no longer in the system disk shadow set. If the boot device is not added back into the system disk shadow set, the system may not reboot, and the dump may be lost if the system crashes.
virtual-unit: does not contain the member named to VMB. System
may not reboot.
Explanation: This message can occur for the following
reasons:
Shadow server operations can display the following status messages on the system console (OPA0) and on terminals enabled to receive operator messages.
Shadow server messages are always informational messages and include the prefix %SHADOW_SERVER-I-SSRVmessage-abbreviation. The following example includes the OPCOM banner and the shadow server message to illustrate what the messages look like when they are output to the console:
%%%%%%%%%%% OPCOM 24-MAR-1990 15:01:30.99 %%%%%%%%%%%
(from node SYSTMX at 24-MAR-1990 15:01:31.36)
Message from user SYSTEM on SYSTMX
%SHADOW_SERVER-I-SSRVINICOMP, shadow server has completed initialization.
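When console output is captured to a log, the message line can be split into its facility, severity, and abbreviation with a regular expression. The following Python sketch assumes the conventional `%FACILITY-S-IDENT, text` layout shown in the example; the function and pattern names are hypothetical and are not part of any OpenVMS utility.

```python
import re

# Assumed layout: %FACILITY-S-IDENT, text (S is one of S, I, W, E, F).
MESSAGE_RE = re.compile(
    r"^%(?P<facility>[A-Z0-9_$]+)-(?P<severity>[SIWEF])-(?P<ident>[A-Z0-9_]+),"
    r"\s*(?P<text>.*)$"
)


def parse_message(line):
    """Return the message parts as a dict, or None if the line is not a message."""
    match = MESSAGE_RE.match(line.strip())
    return match.groupdict() if match else None
```

Lines that do not start with a message code, such as the OPCOM banner itself, return `None` and can be skipped.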
The following messages are returned by the shadow server in response to shadow set operations. Several of the messages refer to a copy thread number; this is a unique identifier denoting a copy or merge operation. The messages in this section are listed in alphabetical order by message abbreviation. For simplicity, the messages shown here do not include the SHADOW_SERVER-I- prefix.
SSRVCMPFCPY, completing copy operation on device
_virtual-unit: at LBN: LBN-location, ID number:
copy-thread-number
Explanation: The copy operation has completed.
User Action: None.
SSRVCMPMRG, completing merge operation on device
_virtual-unit: at LBN: LBN-location, ID number:
copy-thread-number
Explanation: The merge operation has completed.
User Action: None.
SSRVCOMPLYFAIL, still out of compliance for per-disk license units, new
shadow members may be immediately removed
Explanation: The number of shadow set members on the
node has exceeded the number of VOLSHAD-DISK license units for more
than 60 minutes. Attempts to bring the node into compliance by removing
unlicensed members from their shadow sets have failed. If any new
members are mounted, they might be removed immediately.
User Action: Ensure that the number of VOLSHAD-DISK
license units on each node is equal to the number of shadow set members
mounted on that node. If necessary, dismount shadow set members until
the number of mounted members equals the number of VOLSHAD-DISK license
units loaded on the node. If you need more VOLSHAD-DISK license PAKs,
contact a Digital support representative.
SSRVINICOMP, shadow server has completed initialization
Explanation: The shadow server has been initialized at
boot time.
User Action: None.
SSRVINICPY, initiating copy operation on device _virtual-unit:
at LBN: LBN-location, I/O Size: number-of-blocks
blocks, ID number: copy-thread-number
Explanation: A copy operation is beginning on the
shadow set whose virtual unit number is listed in the message.
User Action: None.
SSRVINIMRG, initiating merge operation on device
_virtual-unit: at LBN logical-block-number, I/O Size:
number-of-blocks blocks, ID number: copy-thread-number
Explanation: A merge operation is beginning on the
shadow set. The merge can occur after a copy operation has completed.
User Action: None.
SSRVINIMMRG, initiating minimerge operation on device
_virtual-unit: at LBN LBN-location, I/O size:
number-of-blocks blocks, ID number: copy-thread-number
Explanation: A shadowing minimerge is beginning on the
device indicated. The message identifies the minimerge with the name of
the shadow set virtual unit, the LBN location of the minimerge, the
size of the I/O request (in blocks), and the ID number of the copy
thread. For example:
%SHADOW_SERVER-I-SSRVINIMMRG, initiating minimerge operation on device _DSA2: at LBN 0, I/O size: 105 blocks, ID number: 33555161
SSRVINSUFPDL, insufficient per-disk license units loaded, shadow set
member(s) will be removed in number minutes
Explanation: The number of shadow set members mounted
exceeds the number of VOLSHAD-DISK license units loaded on the node. If
this condition is not corrected before the number of minutes displayed
in this message has elapsed, Volume Shadowing will remove unlicensed
members from shadow sets in an attempt to make the node compliant with
the number of loaded VOLSHAD-DISK license units.
User Action: Dismount shadow set members until the
number of mounted members is equal to the number of VOLSHAD-DISK
license units on the node.
SSRVNORMAL, successful completion of operation on device
_virtual-unit: at LBN LBN-location, ID number:
copy-thread-number
Explanation: The copy or merge operation has completed.
User Action: None.
SSRVRESCPY, resuming copy operation on device _virtual-unit:
at LBN: logical-block-number I/O size:
number-of-blocks blocks, ID number: copy-thread-number
Explanation: A copy operation is resuming. The message
identifies the copy with a unique sequence number, the name of the
shadow set virtual unit, the LBN location of the copy, and the size of
the I/O request (in blocks). For example:
%SHADOW_SERVER-I-SSRVRESFCPY, resuming Full-Copy copy sequence number 16777837 on device _DSA101:, at LBN 208314 I/O size: 71 blocks
SSRVSPNDCPY, suspending operation on device _virtual-unit: at
LBN: logical-block-number, ID number:
copy-thread-number
Explanation: A copy operation is being interrupted
before it completes. (If a crash occurs during a copy operation, a
minimerge assist can interrupt the copy operation to resolve
inconsistencies. The shadowing software can resume the copy operation
when the minimerge completes.) The following message identifies the
copy operation with the name of the shadow set virtual unit, the LBN
location of the copy, and a unique ID number.
%SHADOW_SERVER-I-SSRVSPNDCPY, suspending operation on device _DSA101:. at LBN: 208314, ID number: 16777837
SSRVSPNDMMRG, suspending minimerge operation on device
_virtual-unit: at LBN: logical-block-number ID
number: copy-thread-number
Explanation: A minimerge is interrupted before it
completes. The message identifies the minimerge with the name of the
shadow set virtual unit, the LBN location of the minimerge, and a
unique ID number. For example:
%SHADOW_SERVER-I-SSRVSPNDMMRG, suspending minimerge operation on device _DSA101:. at LBN: 3907911, ID number: 16777837
SSRVSPNDMRG, suspending merge operation on device
_virtual-unit: at LBN: LBN-location, ID number:
copy-thread-number
Explanation: A merge operation has been suspended
while the shadow set undergoes a copy operation.
User Action: None.
SSRVTRMSTS, reason for termination of operation on device:
_virtual-unit:, abort status
Explanation: This message always accompanies the
SSRVTERM message to provide further information about the copy
termination.
User Action: Possible actions vary depending on the
reason for the error. You might need to check and repair hardware or
restart the copy operation.
SSRVTERMCPY, terminating operation on device: _virtual-unit:,
ID number: copy-thread-number
Explanation: The copy thread is aborting. See the
accompanying SSRVTRMSTS message for more information.
User Action: None.
SSRVTERMMRG, terminating operation on device: _virtual-unit:,
ID number: copy-thread-number
Explanation: The merge thread is aborting. See the
accompanying SSRVTRMSTS message for more information.
User Action: None.
SSRVTERMMMRG, terminating operation on device: _virtual-unit:,
ID number: copy-thread-number
Explanation: The minimerge thread is aborting. See the
accompanying SSRVTRMSTS message for more information.
User Action: None.
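Several of the shadow server messages above carry the same fields: the virtual unit, the LBN, an optional I/O size, and the copy thread ID number. The following Python sketch pulls those fields out of a message text. The patterns are assumptions derived only from the sample messages in this section (which vary in punctuation, for example `at LBN 0` versus `at LBN: 208314`), so other message variants may need additional patterns.

```python
import re

# Patterns inferred from the sample messages in this section (assumed, not official).
FIELDS = {
    "device": re.compile(r"on device:?\s+_?([A-Z0-9$]+):"),
    "lbn": re.compile(r"at LBN:?\s+(\d+)"),
    "io_blocks": re.compile(r"I/O [Ss]ize:\s+(\d+) blocks"),
    "thread_id": re.compile(r"ID number:\s+(\d+)"),
}


def extract_fields(text):
    """Return whichever of device, lbn, io_blocks, thread_id appear in the text."""
    found = {}
    for name, pattern in FIELDS.items():
        match = pattern.search(text)
        if match:
            value = match.group(1)
            # The device name stays a string; every other field is numeric.
            found[name] = value if name == "device" else int(value)
    return found
```

Fields that a particular message does not include, such as the I/O size in a suspend message, are simply absent from the result.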
A.4 VOLPROC Messages
Shadowing operations can display the following status messages on the system console (OPA0) and on terminals enabled to receive disk operator messages.
Shadowing messages always include the prefix %SHADOW-I-VOLPROC and can sometimes be followed by "Volume Processing in Progress." The messages are displayed in the following format:
%SHADOW-I-VOLPROC, message-text
The following example shows a complete volume-processing status message:
%SHADOW-I-VOLPROC, DSA13: shadow set has changed state. Volume processing in progress.
The following VOLPROC messages are returned in response to shadow set operations. The messages in this section are listed in alphabetical order beginning with the first word after the shadow set member name or the virtual unit name. For simplicity, the messages do not include the %SHADOW-I-VOLPROC prefix.
shadow-set-member: contains the wrong volume.
Explanation: The shadowing software discovered a
volume label mismatch after failover.
User Action: Check the disk drives and unit numbers.
shadow-set-member: has aborted volume processing.
Explanation: The shadow set is dissolved. A shadow set
member was not restored to operational status before the MVTIMEOUT
system parameter setting expired; as a result, the mount operation for
the shadow set is aborted.
User Action: Check error logs and the shadow set
membership; the disk or controller might need repair.
shadow-set-member: has been write-locked.
Explanation: The data on the disk is protected against
write I/O operations.
User Action: Remove the write lock on the volume.
shadow-set-member: has completed volume processing.
Explanation: The shadow set state change is complete.
User Action: Check the shadow set membership; the disk
or controller might need repair.
shadow-set-member: is offline.
Explanation: A shadow set member is offline. The
shadowing software attempts to fail over.
User Action: None.
shadow-set-member: shadow copy has been completed.
Explanation: A shadow copy operation has completed.
User Action: None.
shadow-set-member: shadow set has been reduced.
Explanation: The specified shadow set member has been
removed.
User Action: If the member failed out of the set (not
dismounted), look for the cause of the failure and repair it.
virtual-unit: all shadow set copy operations are completed.
Explanation: All pending shadow set copy operations
have completed. The same logical block on any shadow set member
contains the same data.
User Action: None.
virtual-unit: shadow copy has been started.
Explanation: Indicates the start of a shadow copy
operation.
User Action: None.
virtual-unit: shadow master has changed. Dump file will be
written if system crashes. Volume Processing in progress.
Explanation: The shadowing software has determined a
new master disk for the system disk shadow set. You can write a dump
file for this system only if the master is the same disk as the one the
system booted from. This is because the boot drivers are not connected
with the shadow driver, and different boot drivers from the ones that
interact with the booted system disk might be needed to interact with
the new master disk. For example, a system disk could be served and
also locally connected, causing the served path to use different
drivers from the local path.
User Action: None.
virtual-unit: shadow master has changed. Dump file will not be
written if the system crashes. Volume processing in progress.
Explanation: Indicates that the disk from which you
booted is no longer in the shadow set. If a system failure occurs, a
dump file cannot be written to the removed disk.
User Action: Return the disk to the shadow set.
virtual-unit: shadow set has changed state. Volume processing
in progress.
Explanation: The state of the shadow set is in
transition. The membership of the shadow set is changing because of
either the addition or removal of members from the shadow set, or
failover to another device after a hardware error. Further messages
give details if a change occurs.
User Action: None.