Document revision date: 30 March 2001

Volume Shadowing for OpenVMS



9.2.2 Improving Performance for Merge and Copy Operations

There are two types of performance assists: the merge assist and the copy assist. The merge assist improves performance by using information that is maintained in controller-based write logs to merge only the data that is inconsistent across a shadow set. When a merge operation is assisted by the write logs, it is referred to as a minimerge. The copy assist reduces system resource usage and copy times by enabling a direct disk-to-disk transfer of data without going through host node memory.

Assisted merge operations are usually too short to be noticeable. Improved performance is also possible during the assisted copy operation because it consumes fewer CPU and interconnect resources. Although the primary purpose of the performance assists is to reduce the system resources required to perform a copy or merge operation, in some circumstances you may also observe improved read and write I/O performance.

Volume Shadowing for OpenVMS supports both assisted and unassisted shadow sets in the same OpenVMS Cluster configuration. Whenever you create a shadow set, add members to an existing shadow set, or boot a system, the shadowing software reevaluates each device in the changed configuration to determine whether it is capable of supporting either the copy assist or the minimerge. Enhanced performance is possible only as long as all shadow set members are configured on controllers that support performance assist capabilities. If any shadow set member is connected to a controller without these capabilities, the shadowing software disables the performance assist for the shadow set.

When the correct revision levels of software are installed, the copy assist and minimerge are enabled by default, and are fully managed by the shadowing software.

9.2.3 Effects on Performance

The copy assist and minimerge are designed to reduce the time needed to do copy and merge operations. In fact, you may notice significant time reductions on systems that have little or no user I/O occurring during the assisted copy or merge operation. Data availability is also improved because copy operations quickly make data consistent across the shadow set.

Minimerge Performance Improvements

The minimerge feature provides a significant reduction in the time needed to perform merge operations. By using controller-based write logs, it is possible to avoid the total volume scan required by earlier merge algorithms and to merge only those areas of the shadow set where write activity was known to be in progress at the time the node or nodes failed.

Unassisted merge operations often take several hours, depending on user I/O rates. Minimerge operations typically complete in a few minutes and are usually undetectable by users.

The exact time taken to complete a minimerge depends on the amount of outstanding write activity to the shadow set when the merge process is initiated, and on the number of shadow set members undergoing a minimerge simultaneously. Even under the heaviest write activity, a minimerge will complete within several minutes. Additionally, minimerge operations consume minimal compute and I/O bandwidth.

Copy Assist Performance Improvements

Copy times vary according to each configuration and generally take longer on systems supporting user I/O. Performance benefits are achieved when the source and target disks are on different HSJ internal buses.

9.3 Guidelines for Managing Shadow Set Performance

Sections 9.1 and 9.2 describe the performance impacts on a shadow set in steady state and while a copy or merge operation is in progress. In general, performance during steady state compares with that of a nonshadowed disk. Performance is affected when a copy or a merge operation is in progress on a shadow set. In the case of copy operations, you control when the operations are performed.

However, merge operations are not started because of user or program actions. They are started automatically when a system fails, or when a shadow set on a system with outstanding application write I/O enters mount verification and times out. In this case, the shadowing software reduces the utilization of system resources and the effects on user activity by throttling itself dynamically. Minimerge operations consume few resources and complete rapidly with little or no effect on user activity.

The actual resources that are utilized during a copy or merge operation depend on the access path to the member units of a shadow set, which in turn depends on the way the shadow set is configured. The resources consumed most during both operations are, by far, adapter and interconnect I/O bandwidth.

You can control resource utilization by setting the SHADOW_MAX_COPY system parameter to an appropriate value on a system based on the type of system and the adapters on the machine. SHADOW_MAX_COPY is a dynamic system parameter that controls the number of concurrent copy or merge threads that can be active on a single system. If the number of copy threads that start up on a particular system is more than the value of the SHADOW_MAX_COPY parameter on that system, only the number of threads specified by SHADOW_MAX_COPY will be allowed to proceed. The other copy threads are stalled until one of the active copy threads completes.

For example, assume that the SHADOW_MAX_COPY parameter is set to 3. If you mount four shadow sets that all need a copy operation, only three of the copy operations can proceed; the fourth copy operation must wait until one of the first three operations completes. Because copy operations use I/O bandwidth, this parameter provides a way to limit the number of concurrent copy operations and avoid saturating interconnects or adapters in the system. The value of SHADOW_MAX_COPY can range from 0 to 200. The default value is OpenVMS version specific.
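
The admission behavior described above can be illustrated with a small simulation. This is a minimal sketch, not the shadowing software's actual scheduler: the function name schedule_copy_threads, the fixed per-copy durations, and the assumption that waiting copies are admitted in mount order are all hypothetical and chosen only for illustration.

```python
import heapq


def schedule_copy_threads(durations, shadow_max_copy):
    """Return the start time of each copy thread under a concurrency cap.

    durations       -- hypothetical run time of each copy, in arbitrary units
    shadow_max_copy -- maximum number of copy threads active at once
    """
    if shadow_max_copy < 1:
        # A value of 0 is outside the scope of this sketch.
        raise ValueError("this sketch requires shadow_max_copy >= 1")

    active = []  # min-heap holding the finish times of the running copies
    starts = []
    for duration in durations:
        if len(active) < shadow_max_copy:
            start = 0                      # a slot is free immediately
        else:
            start = heapq.heappop(active)  # wait for the earliest finisher
        starts.append(start)
        heapq.heappush(active, start + duration)
    return starts


if __name__ == "__main__":
    # Four shadow sets need a copy; the cap is 3 (assumed durations).
    print(schedule_copy_threads([10, 20, 30, 5], 3))
```

With these assumed durations, the first three copies start at time 0 and the fourth starts at time 10, when the shortest of the first three completes.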

Chapter 3 explains how to set the SHADOW_MAX_COPY parameter. Once you arrive at a good value for the parameter on a node, also record that value in the MODPARAMS.DAT file so that the changed value takes effect when AUTOGEN is invoked.

In addition to setting the SHADOW_MAX_COPY parameter, the following list provides some general guidelines to control resource utilization and the effects on system performance when shadow sets are in transient states.

9.4 Striping (RAID) Implementation

Compaq's StorageWorks RAID Software for OpenVMS provides ways to configure and use disk drives so that they achieve improved I/O performance. RAID (redundant arrays of independent disks) uses striping technology to chunk data and distribute it across multiple drives. RAID software is available in various levels, one of which is volume shadowing. Table 9-1 describes RAID levels.

Table 9-1 RAID Levels

RAID Level      Description
--------------  ------------------------------------------------------------
Level 0         Striping with no redundancy.
Level 1         Shadowing.
Levels 0 + 1    Striping and shadowing together.
Level 3         Striped data with dedicated parity drive. Drives are
                rotationally synchronized.
Level 5         Striped data and parity.
Level 6         Striped data and parity with two parity drives.

Shadowing striped drives can increase both performance and availability, because you can achieve faster response time with striping and data redundancy with shadowing. In addition to shadowing striped sets, you can also stripe shadow sets. Each strategy offers different advantages and tradeoffs in terms of availability, performance, and cost.
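
To make the idea of chunking data across drives concrete, the following sketch maps a logical block number to a drive and an offset using simple round-robin striping (RAID level 0). It is a generic illustration only, not the StorageWorks RAID Software implementation; the function name locate_block and the parameters chunk_blocks and drive_count are hypothetical.

```python
def locate_block(lbn, chunk_blocks, drive_count):
    """Map a logical block number to (drive index, block offset on that drive).

    Assumes round-robin striping: consecutive chunks of chunk_blocks blocks
    are placed on consecutive drives, wrapping around after the last drive.
    """
    if lbn < 0 or chunk_blocks < 1 or drive_count < 1:
        raise ValueError("lbn must be >= 0; chunk_blocks and drive_count >= 1")

    # Which chunk the block falls in, and its position inside that chunk.
    chunk_index, within_chunk = divmod(lbn, chunk_blocks)
    # Which stripe row the chunk belongs to, and which drive holds it.
    stripe_row, drive = divmod(chunk_index, drive_count)
    return drive, stripe_row * chunk_blocks + within_chunk


if __name__ == "__main__":
    # Three drives, four blocks per chunk (assumed values).
    for lbn in (0, 4, 11, 12):
        print(lbn, locate_block(lbn, chunk_blocks=4, drive_count=3))
```

Because neighboring chunks land on different drives, a large sequential transfer is spread over several drives at once, which is where the response-time benefit of striping comes from.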

For more information about RAID 0, see the POLYCENTER product documentation set. For more information about RAID 5, see the StorageWorks RAID 5 Software for OpenVMS User's Guide.


Appendix A
Messages

This appendix lists volume shadowing status messages that are displayed on the console device. For other system messages that are related to volume shadowing, use the Help Message utility. For information about the HELP/MESSAGE command and qualifiers, see DCL help (type HELP HELP/MESSAGE at the DCL prompt). Messages that can occur before a system is fully functional are also included in OpenVMS System Messages: Companion Guide for Help Message Users.

A.1 Mount Verification Messages

The following mount verification messages have approximately the same meaning for shadow sets as they do for regular disks. They are sent to the system console (OPA0) and to any operator terminals that are enabled to receive disk operator messages.

A.2 OPCOM Message

The following OPCOM message is returned in response to shadow set operations. This message results when the shadowing code detects that the boot device is no longer in the system disk shadow set. If the boot device is not added back into the system disk shadow set, the system may not reboot, and the dump may be lost if the system crashes.

virtual-unit: does not contain the member named to VMB. System may not reboot.
Explanation: This message can occur for the following reasons:


User Action: Do one of the following:

A.3 Shadow Server Messages

Shadow server operations can display the following status messages on the system console (OPA0) and on terminals enabled to receive operator messages.

Shadow server messages are always informational messages and include the prefix %SHADOW_SERVER-I-SSRVmessage-abbreviation. The following example includes the OPCOM banner and the shadow server message to illustrate what the messages look like when they are output to the console:


%%%%%%%%%%%   OPCOM 24-MAR-1990 15:01:30.99   %%%%%%%%%%% 
 (from node SYSTMX at 24-MAR-1990 15:01:31.36) 
Message from user SYSTEM on SYSTMX 
%SHADOW_SERVER-I-SSRVINICOMP, shadow server has completed initialization. 
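
If console output is captured to a log, the message line in the example above can be split into its parts with a regular expression. This is a minimal sketch that assumes the %FACILITY-SEVERITY-IDENT, text layout shown in the example; the names MESSAGE_RE and parse_status_line are hypothetical and are not part of OpenVMS.

```python
import re

# Assumed layout: %FACILITY-S-IDENT, message text
MESSAGE_RE = re.compile(
    r"^%(?P<facility>[A-Z0-9_$]+)"
    r"-(?P<severity>[A-Z])"
    r"-(?P<ident>[A-Z0-9_$]+),\s*"
    r"(?P<text>.*)$"
)


def parse_status_line(line):
    """Return the message fields as a dict, or None if the line does not match."""
    match = MESSAGE_RE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = ("%SHADOW_SERVER-I-SSRVINICOMP, "
              "shadow server has completed initialization.")
    print(parse_status_line(sample))
```

Lines that do not follow this layout, such as the OPCOM banner, yield None and can be skipped.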

The following messages are returned by the shadow server in response to shadow set operations. Several of the messages refer to a copy thread number; this is a unique identifier denoting a copy or merge operation. The messages in this section are listed in alphabetical order by message abbreviation. For simplicity, the messages shown here do not include the SHADOW_SERVER-I- prefix.

SSRVCMPFCPY, completing copy operation on device _virtual-unit: at LBN: LBN-location, ID number: copy-thread-number
Explanation: The copy operation has completed.
User Action: None.

SSRVCMPMRG, completing merge operation on device _virtual-unit: at LBN: LBN-location, ID number: copy-thread-number
Explanation: The merge operation has completed.
User Action: None.

SSRVCOMPLYFAIL, still out of compliance for per-disk license units, new shadow members may be immediately removed
Explanation: The number of shadow set members on the node has exceeded the number of VOLSHAD-DISK license units for more than 60 minutes. Attempts to bring the node into compliance by removing unlicensed members from their shadow sets have failed. If any new members are mounted, they might be removed immediately.
User Action: Ensure that the number of VOLSHAD-DISK license units on each node is equal to the number of shadow set members mounted on that node. If necessary, dismount shadow set members until the number of mounted members equals the number of VOLSHAD-DISK license units loaded on the node. If you need more VOLSHAD-DISK license PAKs, contact a Digital support representative.

SSRVINICOMP, shadow server has completed initialization
Explanation: The shadow server has been initialized at boot time.
User Action: None.

SSRVINICPY, initiating copy operation on device _virtual-unit: at LBN: LBN-location, I/O Size: number-of-blocks blocks, ID number: copy-thread-number
Explanation: A copy operation is beginning on the shadow set whose virtual unit number is listed in the message.
User Action: None.

SSRVINIMRG, initiating merge operation on device _virtual-unit: at LBN logical-block-number, I/O Size: number-of-blocks blocks, ID number: copy-thread-number
Explanation: A merge operation is beginning on the shadow set. The merge can occur after a copy operation has completed.
User Action: None.

SSRVINIMMRG, initiating minimerge operation on device _virtual-unit: at LBN LBN-location, I/O size: number-of-blocks blocks, ID number: copy-thread-number
Explanation: A shadowing minimerge is beginning on the device indicated. The message identifies the minimerge with the name of the shadow set virtual unit, the LBN location of the minimerge, the size of the I/O request (in blocks), and the ID number of the copy thread. For example:


%SHADOW_SERVER-I-SSRVINIMMRG, initiating minimerge  operation on 
device _DSA2: at LBN 0, I/O size: 105 blocks, ID number: 33555161 

User Action: None.
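
The fields named in the explanation above (virtual unit, LBN, I/O size, and ID number) can be pulled out of a captured message with a few regular expressions. This is a minimal sketch based only on the example text in this appendix; the function name extract_thread_fields and the returned key names are hypothetical, and the patterns may need adjusting for messages whose wording differs.

```python
import re

# Patterns derived from the example messages in this appendix (assumed layout).
FIELD_PATTERNS = {
    "device": r"device:?\s+(_?[A-Z0-9$]+):",
    "lbn": r"LBN:?\s+(\d+)",
    "io_blocks": r"I/O size:\s+(\d+)\s+blocks",
    "thread_id": r"ID number:\s+(\d+)",
}


def extract_thread_fields(message):
    """Return the fields found in a shadow server message as a dict."""
    # Console output wraps long messages, so collapse all whitespace first.
    text = " ".join(message.split())
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            value = match.group(1)
            fields[name] = value if name == "device" else int(value)
    return fields


if __name__ == "__main__":
    sample = (
        "%SHADOW_SERVER-I-SSRVINIMMRG, initiating minimerge  operation on \n"
        "device _DSA2: at LBN 0, I/O size: 105 blocks, ID number: 33555161 "
    )
    print(extract_thread_fields(sample))
```

Fields that a given message does not carry, such as the I/O size in a suspend message, are simply absent from the result.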

SSRVINSUFPDL, insufficient per-disk license units loaded, shadow set member(s) will be removed in number minutes
Explanation: The number of shadow set members mounted exceeds the number of VOLSHAD-DISK license units loaded on the node. If this condition is not corrected before the number of minutes displayed in this message has elapsed, Volume Shadowing will remove unlicensed members from shadow sets in an attempt to make the node compliant with the number of loaded VOLSHAD-DISK license units.
User Action: Dismount shadow set members until the number of mounted members is equal to the number of VOLSHAD-DISK license units on the node.

SSRVNORMAL, successful completion of operation on device _virtual-unit: at LBN LBN-location, ID number: copy-thread-number
Explanation: The copy or merge operation has completed.
User Action: None.

SSRVRESCPY, resuming copy operation on device _virtual-unit: at LBN: logical-block-number I/O size: number-of-blocks blocks, ID number: copy-thread-number
Explanation: A copy operation is resuming. The message identifies the copy with a unique sequence number, the name of the shadow set virtual unit, the LBN location of the copy, and the size of the I/O request (in blocks). For example:


%SHADOW_SERVER-I-SSRVRESFCPY, resuming Full-Copy copy sequence number 
16777837 on device _DSA101:, at LBN 208314  I/O size: 71 blocks 

User Action: None.

SSRVSPNDCPY, suspending operation on device _virtual-unit: at LBN: logical-block-number, ID number: copy-thread-number
Explanation: A copy operation is being interrupted before it completes. (If a crash occurs during a copy operation, a minimerge assist can interrupt the copy operation to resolve inconsistencies. The shadowing software can resume the copy operation when the minimerge completes.) The following message identifies the copy operation with the name of the shadow set virtual unit, the LBN location of the copy, and a unique ID number.


%SHADOW_SERVER-I-SSRVSPNDCPY, suspending operation on 
device _DSA101:. at LBN: 208314, ID number: 16777837 

User Action: None.

SSRVSPNDMMRG, suspending minimerge operation on device _virtual-unit: at LBN: logical-block-number ID number: copy-thread-number
Explanation: A minimerge is interrupted before it completes. The message identifies the minimerge with the name of the shadow set virtual unit, the LBN location of the minimerge, and a unique ID number. For example:


%SHADOW_SERVER-I-SSRVSPNDMMRG, suspending minimerge operation 
on device _DSA101:. at LBN: 3907911, ID number: 16777837 

User Action: None.

SSRVSPNDMRG, suspending merge operation on device _virtual-unit: at LBN: LBN-location, ID number: copy-thread-number
Explanation: A merge operation has been suspended while the shadow set undergoes a copy operation.
User Action: None.

SSRVTRMSTS, reason for termination of operation on device: _virtual-unit:, abort status
Explanation: This message always accompanies an SSRVTERMCPY, SSRVTERMMRG, or SSRVTERMMMRG message to provide further information about the termination.
User Action: Possible actions vary depending on the reason for the error. You might need to check and repair hardware or restart the copy operation.

SSRVTERMCPY, terminating operation on device: _virtual-unit:, ID number: copy-thread-number
Explanation: The copy thread is aborting. See the accompanying SSRVTRMSTS message for more information.
User Action: None.

SSRVTERMMRG, terminating operation on device: _virtual-unit:, ID number: copy-thread-number
Explanation: The merge thread is aborting. See the accompanying SSRVTRMSTS message for more information.
User Action: None.

SSRVTERMMMRG, terminating operation on device: _virtual-unit:, ID number: copy-thread-number
Explanation: The minimerge thread is aborting. See the accompanying SSRVTRMSTS message for more information.
User Action: None.

A.4 VOLPROC Messages

Shadowing operations can display the following status messages on the system console (OPA0) and on terminals enabled to receive disk operator messages.

Shadowing messages always include the prefix %SHADOW-I-VOLPROC and can sometimes be followed by "Volume Processing in Progress." The messages are displayed in the following format:

%SHADOW-I-VOLPROC, message-text

The following example shows a complete volume-processing status message:


%SHADOW-I-VOLPROC, DSA13: shadow set has changed state. Volume processing 
                        in progress. 
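
A captured volume-processing message such as the one above can be separated into the device name and the status text. This is a minimal sketch assuming the %SHADOW-I-VOLPROC, device: text layout shown in the example; the function name parse_volproc and the returned key names are hypothetical.

```python
VOLPROC_PREFIX = "%SHADOW-I-VOLPROC,"
IN_PROGRESS = "volume processing in progress."


def parse_volproc(message):
    """Split a VOLPROC message into device, status text, and in-progress flag.

    Returns None if the message does not have the assumed layout.
    """
    # Console output wraps long messages, so collapse all whitespace first.
    text = " ".join(message.split())
    if not text.startswith(VOLPROC_PREFIX):
        return None

    body = text[len(VOLPROC_PREFIX):].strip()
    device, separator, status = body.partition(":")
    if not separator:
        return None

    status = status.strip()
    # The trailing phrase appears with varying capitalization in the messages.
    in_progress = status.lower().endswith(IN_PROGRESS)
    if in_progress:
        status = status[: -len(IN_PROGRESS)].strip()
    return {"device": device.strip(), "status": status,
            "in_progress": in_progress}


if __name__ == "__main__":
    sample = (
        "%SHADOW-I-VOLPROC, DSA13: shadow set has changed state. "
        "Volume processing \n                        in progress. "
    )
    print(parse_volproc(sample))
```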

The following messages are returned by VOLPROC in response to shadow set operations. The messages in this section are listed in alphabetical order beginning with the first word after the shadow set member name or the virtual unit name. For simplicity, the messages do not include the %SHADOW-I-VOLPROC prefix.

shadow-set-member: contains the wrong volume.
Explanation: The shadowing software discovered a volume label mismatch after failover.
User Action: Check the disk drives and unit numbers.

shadow-set-member: has aborted volume processing.
Explanation: The shadow set is dissolved. A shadow set member was not restored to operational status before the MVTIMEOUT system parameter setting expired; thus, the mount operation aborts for the shadow set.
User Action: Check error logs and the shadow set membership; the disk or controller might need repair.

shadow-set-member: has been write-locked.
Explanation: The data on the disk is protected against write I/O operations.
User Action: Remove the write lock on the volume.

shadow-set-member: has completed volume processing.
Explanation: The shadow set state change is complete.
User Action: Check the shadow set membership; the disk or controller might need repair.

shadow-set-member: is offline.
Explanation: A shadow set member is offline. The shadowing software attempts to fail over.
User Action: None.

shadow-set-member: shadow copy has been completed.
Explanation: A shadow copy operation has completed.
User Action: None.

shadow-set-member: shadow set has been reduced.
Explanation: The specified shadow set member has been removed.
User Action: If the member failed out of the set (not dismounted), look for the cause of the failure and repair it.

virtual-unit: all shadow set copy operations are completed.
Explanation: All pending shadow set copy operations have completed. The same logical block on any shadow set member contains the same data.
User Action: None.

virtual-unit: shadow copy has been started.
Explanation: Indicates the start of a shadow copy operation.
User Action: None.

virtual-unit: shadow master has changed. Dump file will be written if system crashes. Volume Processing in progress.
Explanation: The shadowing software has determined a new master disk for the system disk shadow set. You can write a dump file for this system only if the master is the same disk as the one the system booted from. This is because the boot drivers are not connected with the shadow driver, and different boot drivers from the ones that interact with the booted system disk might be needed to interact with the new master disk. For example, a system disk could be served and also locally connected, causing the served path to use different drivers from the local path.
User Action: None.

virtual-unit: shadow master has changed. Dump file will not be written if the system crashes. Volume processing in progress.
Explanation: Indicates that the disk from which you booted is no longer in the shadow set. If a system failure occurs, a dump file cannot be written to the removed disk.
User Action: Return the disk to the shadow set.

virtual-unit: shadow set has changed state. Volume processing in progress.
Explanation: The state of the shadow set is in transition. The membership of the shadow set is changing because of either the addition or removal of members from the shadow set, or failover to another device after a hardware error. Further messages give details if a change occurs.
User Action: None.

