OpenVMS I/O User's Reference Manual

Document revision date: 30 March 2001

OpenVMS I/O User's Reference Manual

Contents

Index

2.2.2.1 Port Selection and Access Modes

The port select switches, on each disk drive, select the ports from which the drive can be accessed. A drive can be in one of the following access modes:

Locked on Port A---The drive is in a single-port mode (Port A). It does not respond to any request on Port B.
Locked on Port B---The drive is in a single-port mode (Port B). It does not respond to any request on Port A.
Programmable A/B---The drive is capable of responding to requests on either Port A or Port B. In this mode, the drive is always in one of the following states:
- The drive is connected and responding to a request on Port A. It is closed to requests on Port B.
- The drive is connected and responding to a request on Port B. It is closed to requests on Port A.
- The drive is in a neutral state. It is equally available to requests on either port on a first-come, first-serve basis.

The operational condition of the drive cannot be changed with the port select switches after the drive becomes ready. To change from one mode to another, the drive must be in a nonrotating condition. After the new mode selection has been made, the drive must be restarted.

If a drive is in the neutral state and a disk controller either reads or writes to a drive register, the drive immediately connects a port to the requesting controller. For read operations, the drive remains connected for the duration of the operation. For write operations, the drive remains connected until a release command is issued by the device driver or a 1-second timeout occurs. After the connected port is released from its controller, the drive checks the other port's request flag to determine whether there has been a request on that port. If no request is pending, the drive returns to the neutral state.

2.2.2.2 Disk Use and Restrictions

If the volume is mounted foreign, read/write operations can be performed at both ports provided the user maintains control of where information is stored on the disk.

The Autoconfigure utility currently may not be able to locate the nonactive port. For example, if a dual-ported disk drive is connected and responding at Port A, the CPU attached to Port B might not be able to find Port B with the Autoconfigure utility. If this problem occurs, execute the AUTOCONFIGURE ALL/LOG command after the system is running.

2.2.2.3 Restriction on Dual-Ported Non-DSA Disks in a Cluster

Do not use SYSGEN to AUTOCONFIGURE or CONFIGURE a dual-ported, non-DSA disk that is already available on the system through use of an MSCP server. Establishing a local connection to the disk when a remote path is already known creates two uncoordinated paths to the same disk. Use of these two paths may corrupt files and data on any volume mounted on the drive.

Note

If the disk is not dual-ported or is never served by an MSCP server on the remote host, this restriction does not apply.

In a cluster, dual-ported non-DSA disks (MASSBUS or UNIBUS) can be connected between two nodes of the cluster. These disks can also be made available to the rest of the cluster using the MSCP server on either or both of the hosts to which a disk is connected.

If the local path to the disk is not found during the bootstrap, then the MSCP server path from the other host will be the only available access to the drive. The local path will not be found during a boot if any of the following conditions exist:

The port select switch for the drive is not enabled for this host.
The disk, cable, or adapter hardware for the local path is broken.
There is sufficient activity on the other port to hide the existence of the port.
The system is booted in such a way that the SYSGEN AUTOCONFIGURE ALL command in the SYS$SYSTEM:STARTUP.COM procedure was not executed.

Use of the disk is still possible through the MSCP server path.

After the configuration of the disk has reached this state, it is important not to add the local path back into the system I/O database. Because the operating system does not provide an automatic method for adding this local path, the only possible way that you can add this local path is to use the System Generation utility (SYSGEN) qualifiers AUTOCONFIGURE or CONFIGURE to configure the device. SYSGEN is currently not able to detect the presence of the disk's MSCP path, and will incorrectly build a second set of data structures to describe it. Subsequent events could lead to incompatible and uncoordinated file operations, which might corrupt the volume.

To recover the local path to the disk, it is necessary to reboot the system connected to that local path.

2.2.3 Dual-Pathed DSA Disks

A dual-ported DSA disk can be failed over between the two CPUs that serve it to the cluster under the following conditions: (1) the same disk controller letter and allocation class are specified on both CPUs and (2) both CPUs are running the MSCP server.

Caution

Failure to observe these requirements can endanger data integrity.

However, because a DSA disk can be on line to only one controller at a time, only one of the CPUs can use its local connection to the disk. The second CPU accesses the disk through the MSCP server. If the CPU that is currently serving the disk fails, the other CPU detects the failure and fails the disk over to its local connection. The disk is thereby made available to the cluster once more.

Note

A dual-ported DSA disk may not be used as a system disk.

2.2.4 Dual-Porting HSC Disks

By design, HSC disks are cluster accessible. Therefore, if they are dual-ported, they are automatically dual-pathed. CI-connected CPUs can access a dual-pathed HSC disk by way of a path through either HSC-connected device.

For each dual-ported HSC disk, you can control failover to a specific port using the port select buttons on the front of each drive. By pressing either port select button (A or B) on a particular drive, you can cause the device failover to the specified port.

With the port select button, you can select alternate ports to balance the disk controller workload between two HSC subsystems. For example, you could set half of your disks to use port A and set the other half to use port B.

The port select buttons also allow you to failover all the disks to an alternate port manually when you anticipate the shutdown of one of the HSC subsystems.

2.2.5 Dual-Pathed RF-Series Disks

In a dual-path configuration of MicroVAX 3300/3400 CPUs or MicroVAX 3800/3900 CPUs using RF-series disks, CPUs have concurrent access to any disk on the DSSI bus. A single disk is accessed through two paths and can be served to all satellites by either CPU.

If either CPU fails, satellites can access their disks through the remaining CPU. Note that failover occurs in the following situations: (1) when the DSSI bus is connected between SII integral adapters on both MicroVAX 3300/3400 CPUs or (2) when the DSSI bus is connected between the KFQSA adapters on pairs of MicroVAX 3300/3400s or pairs of MicroVAX 3800/3900s.

Note

The DSSI bus should not be connected between a KFQSA adapter on one CPU and an SII integral adapter on another.

2.2.6 Data Check

A data check is made after successful completion of a read or write operation and, except for the TU58, compares the data in memory with the data on disk to make sure they match.

Disk drivers support data checks at the following levels:

Per request---You can specify the data check function modifier (IO$M_DATACHECK) on a read logical block, write logical block, read virtual block, write virtual block, read physical block, or write physical block operation. IO$M_DATACHECK is not supported for the RX01 and RX01 drivers.
Per volume---You can specify the characteristics "data check all reads" and "data check all writes" when the volume is mounted. The OpenVMS DCL Dictionary describes volume mounting and dismounting. The OpenVMS System Services Reference Manual describes the Mount Volume ($MOUNT) and Dismount Volume ($DISMOU) system services.
Per file---You can specify the file access attributes "data check on read" and "data check on write." File access attributes are specified when the file is accessed. Chapter 1 of this manual and the OpenVMS Record Management Services Reference Manual describe file access.

Offset recovery is performed during a data check but error code correction (ECC) correction is not performed (see Section 2.2.9). For example, if a read operation is performed and an ECC correction is applied, the data check would fail even though the data in memory is correct. In this case, the driver returns a status code indicating that the operation was completed successfully, but the data check could not be performed because of an ECC correction.

Data checks on read operations are extremely rare, and you can either accept the data as is, treat the ECC correction as an error, or accept the data but immediately move it to another area on the disk volume.

A data check operation directed to a TU58 does not compare the data in memory with the data on tape. Instead, either a read check or a write check operation is performed (see Sections 2.4.1 and 2.4.2).

2.2.7 Effects of a Failure During an I/O Write Operation

The operating system ensures that when an I/O write operation returns a successful completion status, the data is available on the disk or tape media. Applications that must guarantee the successful completion of a write operation can verify that the data is on the media by specifying the data check function modifier IO$M_DATACHECK. Note that the IO$M_DATACHECK data check function, which compares the data in memory with the data on disk, affects performance because the function incurs the overhead of an additional read operation to the media.

If a system failure occurs while a multiple-block write operation is in progress, the operating system does not guarantee the successful completion of the write operation. (OpenVMS does guarantee single-block write operations to DSA drives.) When a failure interrupts a write operation, the data may be left in any one of the following conditions:

The new data is written completely to the disk blocks on the media, but a completion status was not returned before the failure.
The new data is partially written to the media so that some of the disk blocks involved in the I/O contain the data from the write operation in progress, and the remainder of the blocks contain the data that was present before the write operation.
The new data was never written to the disk blocks on the media.

To guarantee that a write operation either finishes successfully or (in the event of failure) is redone or rolled back as if it were never started, use additional techniques to ensure data correctness and recovery. For example, using database journaling and recovery techniques allows applications to recover automatically from failures such as the following:

Permanent loss of the path between a CPU data buffer containing the data being written and the disk being written to during a multiple-block I/O operation. Communication path loss can occur due to node or controller failure or a failure of node-to-node communications.
Failure of a CPU (such as a system failure, system halt, power failure, or system shutdown) during a multiple-block write operation.
Mistaken deletion of a file.
Corruption of file system pointers.
File corruption due to a software error or incomplete bucket write operation to an indexed file.
Cancellation of an in-progress multiple-block write operation.

2.2.8 Overlapped Seeks

A seek operation involves moving the disk read/write heads to a specific place on the disk without any transfer of data. All transfer functions, including data checks, are preceded by an implicit seek operation (except when the seek is inhibited by the physical I/O function modifier IO$M_INHSEEK). Seek operations can be overlapped except on RL02, RX01, RX02, TU58 drives, MicroVAX 2000, VAXstation 2000, or on controllers with floppy disks (for example, RQDX3) when the disk is executing I/O requests. That is, when one drive performs a seek operation, any number of other drives can also perform seek operations.

During the seek operation, the controller is free to perform transfers on other units. Therefore, seek operations can also overlap data transfer operations. For example, at any one time, seven seeks and one data transfer could be in progress on a single controller.

This overlapping is possible because, unlike I/O transfers, seek operations do not require the controller once they are initiated. Therefore, seeks are initiated before I/O transfers and other functions that require the controller for extended periods.

All DSA controllers perform extensive seek optimization functions as part of their operation; IO$M_INHSEEK has no effect on these controllers.

2.2.9 Error Recovery

Error recovery in the operating system is aimed at performing all possible operations to complete an I/O operation successfully. Error recovery operations fall into the following categories:

Handling special conditions such as power failure and interrupt timeout.
Retrying nonfatal controller and drive errors. For DSA and SCSI disks, this function is implemented by the controller.
Applying error correction information (not applicable for RB02, RL02, RX01, RX02, and TU58 drives). For DSA and SCSI disks, error correction is implemented by the controller.
Offsetting read heads to try to obtain a stronger recorded signal (not applicable for RB02, RL02, RB80, RM80, RX01, RX02, and TU58 drives). For DSA and SCSI disks, this function is implemented by the controller.

The error recovery algorithm uses a combination of these four types of error recovery operations to complete an I/O operation:

Power failure recovery consists of waiting for mounted drives to spin up and come on line, followed by reexecution of the I/O operation that was in progress at the time of the power failure.
Device timeout is treated as a nonfatal error. The operation that was in progress when the timeout occurred is reexecuted up to eight times before a timeout error is returned.
Nonfatal controller/drive errors are executed up to eight times before a fatal error is returned.
All normal error recovery procedures (nonspecial conditions) can be inhibited by specifying the inhibit retry function modifier (IO$M_INHRETRY). If any error occurs and this modifier is specified, the virtual, logical, or physical I/O operation is immediately terminated, and a failure status is returned. This modifier has no effect on power recovery and timeout recovery.

2.2.9.1 Skip Sectoring

Skip sectoring is a bad block treatment technique implemented on R80 disk drives (the RB80 and RM80 drives). In each track of 32 sectors, one sector is reserved for bad block replacement. Consequently, an R80 drive has available only 31 sectors per track. The Get Device/Volume Information ($GETDVI) system service returns this value.

You can detect bad blocks when a disk is formatted. Most formatters place these blocks in a bad block file. On an R80 drive, the first bad block encountered on a track is designated as a skip sector. This is accomplished by setting a flag in the sector header on the disk and placing the block in the skip sector file.

When a skip sector is encountered during a data transfer, it is skipped over, and all remaining blocks in the track are shifted by one physical block. For example, if block number 10 is a skip sector, and a transfer request was made beginning at block 8 for four blocks, then blocks 8, 9, 11, and 12 will be transferred. Block 10 will be skipped.

Because skip sectors are implemented at the device driver level, they are not visible to you. The device appears to have 31 contiguous sectors per track. Sector 32 is not directly addressable, although it is accessed if a skip sector is present on the track.

2.2.10 Logical-to-Physical Translation (RX01 and RX02)

Logical-block-to-physical-sector translation on RX01 and RX02 drives adheres to the standard format. For each 512-byte logical block selected, the driver reads or writes four 128-byte physical sectors (or two 256-byte physical sectors if an RX02 is in double-density mode). To minimize rotational latency, the physical sectors are interleaved. Interleaving allows the processor time to complete a sector transfer before the next sector in the block reaches the read/write heads. To allow for track-to-track switch time, the next logical sector that falls on a new track is skewed by six sectors. (There is no interleaving or skewing on read physical block and write physical block I/O operations.) Logical blocks are allocated starting at track 1; track 0 is not used.

The translation procedure, in more precise terms, is as follows:

Compute an uncorrected medium address using the following dimensions:
Number of sectors per track = 26
Number of tracks per cylinder = 1
Number of cylinders per disk = 77
Correct the computed address for interleaving and track-to-track skew (in that order) as shown in the following Compaq Fortran for OpenVMS statements. ISECT is the sector address and ICYL is the cylinder address computed in Step 1.
Interleaving: ITEMP = ISECT*2
IF (ISECT .GT. 12) ITEMP = ITEMP-25
ISECT = ITEMP
Skew:
ISECT = ISECT+(6*ICYL)
ISECT = MOD (ISECT, 26)
Set the sector number in the range of 1 through 26 as required by the hardware:

ISECT = ISECT+1
Adjust the cylinder number to cylinder 1 (cylinder 0 is not used):

ICYL = ICYL+1

2.2.11 DIGITAL Storage Architecture (DSA) Devices

The DIGITAL Storage Architecture (DSA) is a collection of specifications that cover all aspects of a mass storage product. The specifications are grouped into the following general categories:

Media format---Describes the structure of sectors on a disk and the algorithms for replacing bad blocks
Drive-to-controller interconnect---Describes the connection between a drive and its controller
Controller-to-host communications---Describes how hosts request controllers to transfer data

Because the operating system supports all DSA disks, it supports all controller-to-host aspects of DSA. Some of these disks, such as the RA60, RA80, and RA81, use the standard drive-to-controller specifications. Other disks, such as the RC25, RD51, RD52, RD53, and RX50, do not. Disk systems that use the standard drive-to-controller specifications employ the same hardware connections and use the HSC50, KDA50, KDB50, and UDA50 interchangeably. Disk systems that do not use the drive-to-controller specifications provide their own internal controller, which conforms to the controller-to-host specifications.

DSA disks differ from MASSBUS and UNIBUS disks in the following ways:

DSA disks contain no bad blocks. The hardware and the disk class driver (DUDRIVER) function to ensure a logically contiguous range of good blocks. If any block in the user area of the disk develops a defective area, all further access to that block is revectored to a spare good block. Consequently, it is never necessary to run the Bad Block Locator utility (BAD) on DSA disks. There is no manufacturer's bad block list and the file BADBLK.SYS is empty. (The Verify utility, which is invoked by the ANALYZE /DISK_STRUCTURE /READ_CHECK command, can be used to check the integrity of newly received disks.) See Section 2.2.11.1 for more information about bad block replacement for DSA disks.
Insert a WAIT statement in your SYSTARTUP_V5.COM file on VAX systems, or your SYSTARTUP_VMS.COM file on Alpha systems, prior to the first MOUNT statement for a DSA disk. The wait period should be about 2 to 4 seconds for the UDA50 and about 30 seconds for the HSC50. The wait time is controller-dependent and can be as much as several minutes if the controller is offline or otherwise inoperative. If this wait is omitted, the MOUNT request may fail with a "no such device" status.
The DUDRIVER and the DSA device controllers allow multiple, concurrently outstanding QIO requests. The order in which these requests complete might not be in the order in which they were issued.
All DSA disks can be dual-ported, but only one HSC/UDA controller can control a disk at a time (see Section 2.2.3).
In many cases, you can attach a DSA disk to its controller on a running system and then use it without manual intervention.
DSA disks and the DUDRIVER do not accept physical QIO data transfers or seek operations.

2.2.11.1 Bad Block Replacement and Forced Errors for DSA Disks

Disks that are built according to the DSA specifications appear to be error free. Some number of logical blocks are always capable of recording data. When a disk is formatted, every user-addressable logical block is mapped to a functioning portion of the actual disk surface, which is known as a physical block. The physical block has the true data storage capacity represented by the logical block.

Additional physical blocks are set aside to replace blocks that fail during normal disk operations. These extra physical blocks are called replacement blocks. Whenever a physical block to which a logical block is mapped begins to fail, the associated logical block is remapped (revectored) to one of the replacement blocks. The process that revectors logical blocks is called a bad block replacement operation. Bad block replacement operations use data stored in a special area of the disk called the Replacement and Caching Table (RCT).

When a drive-dependent error threshold is reached, the need for a bad block replacement operation is declared. Depending on the controller involved, the bad block replacement operation is performed either by the controller itself (as is the case with HSCs) or by the host (as is the case with UDAs). In either case, the same steps are performed. After inspecting and altering the RCT, the failing block is read and its contents are stored in a reserved section of the RCT.

The design goal of DSA disks is that this read operation proceeds without error and that the RCT copy of the data is correct (as it was originally written). The failing block is then tested with one or more data patterns. If no errors are encountered in this test, the original data is copied back to the original block and no further action is taken. If the data-pattern test fails, the logical block is revectored to a replacement block. After the block is revectored, the original data is copied back to the revectored logical block. In all these cases, the original data is preserved and the bad block replacement operation occurs without the user being aware that it happened.

However, if the original data cannot be read from the failing block, a best-attempt copy of the data is stored in the RCT and the bad block replacement operation proceeds. When the time comes to write-back the original data, the best-attempt data (stored in the RCT) is written back with the forced error flag set. The forced error flag is a signal that the data read is questionable. Reading a block that contains a forced error flag causes the status SS$_FORCEDERROR to be returned. This status is displayed by the following message:

%SYSTEM-F-FORCEDERROR, forced error flagged in last sector read

Writing into a block always clears the forced error flag.

Note that most utilities and DCL commands treat the forced error flag as a fatal error and terminate operation when they encounter it. However, the Backup utility (BACKUP) continues to operate in the presence of most errors, including the forced error. BACKUP continues to process the file, and the forced error flag is lost. Thus, data that was formerly marked as questionable may become correct data.

System managers (and other users of BACKUP) should assume that forced errors reported by BACKUP signal possible degradation of the data.

To determine what, if any, blocks on a given disk volume have the forced error flag set, use the ANALYZE /DISK_STRUCTURE /READ_CHECK command, which invokes the Verify utility. The Verify utility reads every logical block allocated to every file on the disk and then reports (but ignores) any forced error blocks encountered.

Contents

Index

privacy and legal statement

6136PRO_006.HTML