Document revision date: 19 July 1999
[Compaq] [Go to the documentation home page] [How to order documentation] [Help on this site] [How to contact us]
[OpenVMS documentation]

OpenVMS Version 7.2 Release Notes


Previous Contents Index


Chapter 7
Using Interlocked Memory Instructions (Alpha Only)

The Alpha Architecture Reference Manual, Third Edition (AARM) describes strict rules for using interlocked memory instructions. The new Alpha 21264 (EV6) processor and all future Alpha processors are more stringent than their predecessors in their requirement that these rules be followed. As a result, code that has worked in the past, despite noncompliance, could fail when executed on systems featuring the new 21264 processor. Occurrences of these noncompliant code sequences are believed to be rare. Note that the 21264 processor is not supported on versions prior to OpenVMS Alpha Version 7.1--2.

Noncompliant code can result in a loss of synchronization between processors when interprocessor locks are used, or can result in an infinite loop when an interlocked sequence always fails. Such behavior has occurred in some code sequences in programs compiled on old versions of the BLISS compiler, some versions of the MACRO--32 compiler and the MACRO--64 assembler, and in some DEC C and C++ programs.

The affected code sequences use LDx_L/STx_C instructions, either directly in assembly language sources or in code generated by a compiler. Applications most likely to use interlocked instructions are complex, multithreaded applications or device drivers using highly optimized, hand-crafted locking and synchronization techniques.

7.1 Required Code Checks

OpenVMS recommends that code that will run on the 21264 processor be checked for these sequences. Particular attention should be paid to any code that does interprocess locking, multithreading, or interprocessor communication.

The SRM_CHECK tool (named after the System Reference Manual, which defines the Alpha architecture) has been developed to analyze Alpha executables for noncompliant code sequences. The tool detects sequences that might fail, reports any errors, and displays the machine code of the failing sequence.

7.2 Using the Code Analysis Tool

The SRM_CHECK tool can be found in the following location on the OpenVMS Alpha Version 7.2 Operating System CD-ROM:


SYS$SYSTEM:SRM_CHECK.EXE 

To run the SRM_CHECK tool, define it as a foreign command (or use the DCL$PATH mechanism) and invoke it with the name of the image to check. If a problem is found, the machine code is displayed and some image information is printed. The following example illustrates how to use the tool to analyze an image called myimage.exe:


$ define DCL$PATH []
$ srm_check myimage.exe

The tool supports wildcard searches. Use the following command line to initiate a wildcard search:


$ srm_check [*...]* -log

Use the -log qualifier to generate a list of images that have been checked. The -output qualifier can be used to write the output to a data file. For example, the following command directs output to a file named CHECK.DAT:


$ srm_check 'file' -output check.dat

The output from the tool can be used to find the module that generated the sequence by looking in the image's MAP file. The addresses shown correspond directly to the addresses that can be found in the MAP file.

The following example illustrates the output from using the analysis tool on an image named SYSTEM_SYNCHRONIZATION.EXE:


 
 ** Potential Alpha Architecture Violation(s) found in file... 
 ** Found an unexpected ldq at 00003618      
 0000360C   AD970130     ldq_l          R12, 0x130(R23) 
 00003610   4596000A     and            R12, R22, R10 
 00003614   F5400006     bne            R10, 00003630 
 00003618   A54B0000     ldq            R10, (R11) 
 Image Name:    SYSTEM_SYNCHRONIZATION 
 Image Ident:   X-3 
 Link Time:      5-NOV-1998 22:55:58.10 
 Build Ident:   X6P7-SSB-0000 
 Header Size:   584 
 Image Section: 0, vbn: 3, va: 0x0, flags: RESIDENT EXE (0x880) 
 

The MAP file for system_synchronization.exe contains the following:


   EXEC$NONPAGED_CODE       00000000 0000B317 0000B318 (      45848.) 2 **  5 
   SMPROUT         00000000 000047BB 000047BC (      18364.) 2 **  5 
   SMPINITIAL      000047C0 000061E7 00001A28 (       6696.) 2 **  5 

The address 360C is in the SMPROUT module, which contains the addresses from 0-47BB. By looking at the machine code output from the module, you can locate the code and use the listing line number to identify the corresponding source code. If SMPROUT had a nonzero base, it would be necessary to subtract the base from the address (360C in this case) to find the relative address in the listing file.

Note that the tool reports potential violations in its output. Although SRM_CHECK can normally identify a code section in an image by the section's attributes, it is possible for OpenVMS images to contain data sections with those same attributes. As a result, SRM_CHECK may scan data as if it were code, and occasionally, a block of data may look like a noncompliant code sequence. This circumstance is rare and can be detected by examining the MAP and listing files.

7.3 Characteristics of Noncompliant Code

The areas of noncompliance detected by the SRM_CHECK tool can be grouped into the following four categories. Most of these can be fixed by recompiling with new compilers. In rare cases, the source code may need to be modified. See Section 7.5 for information about compiler versions.

If the SRM_CHECK tool finds a violation in an image, the image should be recompiled with the appropriate compiler (see Section 7.5). After recompiling, the image should be analyzed again. If violations remain after recompiling, the source code must be examined to determine why the code scheduling violation exists. Modifications should then be made to the source code.

7.4 Coding Requirements

The Alpha Architecture Reference Manual describes how an atomic update of data between processors must be formed. The Third Edition, in particular, has much more information on this topic. In this edition, Section 5.5, "Data Sharing", and Section 4.2.4, which describes the LDx_L instructions, detail the conventions of the interlocked memory sequence.

Exceptions to the following two requirements are the source of all known noncompliant code:

Therefore, the SRM_CHECK tool looks for the following:

To illustrate, the following are examples of code flagged by SRM_CHECK.


        ** Found an unexpected ldq at 0008291C 
        00082914   AC300000     ldq_l          R1, (R16) 
        00082918   2284FFEC     lda            R20, 0xFFEC(R4) 
        0008291C   A6A20038     ldq            R21, 0x38(R2) 

In the above example, an LDQ instruction was found after an LDQ_L before the matching STQ_C. The LDQ must be moved out of the sequence, either by recompiling or by source code changes. (See Section 7.3.)


        ** Backward branch from 000405B0 to a STx_C sequence at 0004059C 
        00040598   C3E00003     br             R31, 000405A8 
        0004059C   47F20400     bis            R31, R18, R0 
        000405A0   B8100000     stl_c          R0, (R16) 
        000405A4   F4000003     bne            R0, 000405B4 
        000405A8   A8300000     ldl_l          R1, (R16) 
        000405AC   40310DA0     cmple          R1, R17, R0 
        000405B0   F41FFFFA     bne            R0, 0004059C 

In the above example, a branch was discovered between the LDL_L and STQ_C. In this case, there is no "fall through" path between the LDx_L and STx_C, which the architecture requires.

Note

This branch backward from the LDx_L to the STx_C is characteristic of the noncompliant code introduced by the "loop rotation" optimization.

The following MACRO--32 source code demonstrates code where there is a "fall through" path, but this case is still noncompliant because of the potential branch and a memory reference in the lock sequence.


        getlck: evax_ldql  r0, lockdata(r8)  ; Get the lock data 
                movl       index, r2         ; and the current index. 
                tstl       r0                ; If the lock is zero, 
                beql       is_clear          ; skip ahead to store. 
                movl       r3, r2            ; Else, set special index. 
        is_clear: 
                incl       r0                ; Increment lock count 
                evax_stqc  r0, lockdata(r8)  ; and store it. 
                tstl       r0                ; Did store succeed? 
                beql       getlck            ; Retry if not. 
 

To correct this code, the memory access to read the value of INDEX must first be moved outside the LDQ_L/STQ_C sequence. Next, the branch between the LDQ_L and STQ_C, to the label IS_CLEAR, must be eliminated. In this case, it could be done using a CMOVEQ instruction. The CMOVxx instructions are frequently useful for eliminating branches around simple value moves. The following example shows the corrected code:


                movl       index, r2         ; Get the current index 
        getlck: evax_ldql  r0, lockdata(r8)  ; and then the lock data. 
                evax_cmoveq r0, r3, r2       ; If zero, use special index. 
                incl       r0                ; Increment lock count 
                evax_stqc  r0, lockdata(r8)  ; and store it. 
                tstl       r0                ; Did write succeed? 
                beql       getlck            ; Retry if not. 
 

7.5 Compiler Versions

This section contains information about versions of compilers that may generate noncompliant code sequences and the recommended versions to use when recompiling.

Table 7-1 contains information for OpenVMS compilers.

Table 7-1 OpenVMS Compilers
Old Version Recommended Minimum Version
BLISS V1.1 BLISS V1.3
DEC C V5.x DEC C V6.0
DEC C++ V5.x DEC C++ V6.0
DEC Pascal V5.0-2 DEC Pascal V5.1-11
MACRO--32 V3.0 V3.1 for OpenVMS Version 7.1--2
V4.1 for OpenVMS Version 7.2
MACRO--64 V1.2 See below.

Current versions of the MACRO--64 assembler may still encounter the loop rotation issue. However, MACRO--64 does not perform code optimization by default, and this problem occurs only when optimization is enabled. If SRM_CHECK indicates a noncompliant sequence in the MACRO--64 code, it should first be recompiled without optimization. If the sequence is still flagged when retested, the source code itself contains a noncompliant sequence that must be corrected.

7.6 Interlocked Memory Sequence Checking for the MACRO--32 Compiler

The MACRO--32 Compiler for OpenVMS Alpha Version 4.1 now performs additional code checking and displays warning messages for noncompliant code sequences. The following warning messages can display under the circumstances described:

BRNDIRLOC, branch directive ignored in locked memory sequence

Explanation: The compiler found a .BRANCH_LIKELY directive within an LDx_L/STx_C sequence.
User Action: None. The compiler will ignore the .BRANCH_LIKELY directive and, unless other coding guidelines are violated, the code will work as written.
BRNTRGLOC, branch target within locked memory sequence in routine 'routine_name'

Explanation: A branch instruction has a target that is within an LDx_L/STx_C sequence.
User Action: To avoid this warning, rewrite the source code to avoid branches within or into LDx_L/STx_C sequences. Branches out of interlocked sequences are valid and are not flagged.
MEMACCLOC, memory access within locked memory sequence in routine 'routine_name'

Explanation: A memory read or write occurs within an LDx_L/STx_C sequence. This can be either an explicit reference in the source code, such as "MOVL data, R0", or an implicit reference to memory. For example, fetching the address of a data label (e.g., "MOVAB label, R0") is accomplished by a read from the linkage section, the data area that is used to resolve external references.
User Action: To avoid this warning, move all memory accesses outside the LDx_L/STx_C sequence.
RETFOLLOC, RET/RSB follows LDx_L instruction

Explanation: The compiler found a RET or RSB instruction after an LDx_L instruction and before finding an STx_C instruction. This indicates an ill-formed lock sequence.
User Action: Change the code so that the RET or RSB instruction does not fall between the LDx_L instruction and the STx_C instruction.
RTNCALLOC, routine call within locked memory sequence in routine 'routine_name'

Explanation: A routine call occurs within an LDx_L/STx_C sequence. This can be either an explicit CALL/JSB in the source code, such as "JSB subroutine", or an implicit call that occurs as a result of another instruction. For example, some instructions such as MOVC and EDIV generate calls to run-time libraries.
User Action: To avoid this warning, move the routine call or the instruction that generates it, as indicated by the compiler, outside the LDx_L/STx_C sequence.
STCMUSFOL, STx_C instruction must follow LDx_L instruction

Explanation: The compiler found an STx_C instruction before finding an LDx_L instruction. This indicates an ill-formed lock sequence.
User Action: Change the code so that the STx_C instruction follows the LDx_L instruction.

7.7 Recompiling Code with ALONONPAGED_INLINE or LAL_REMOVE_FIRST Macros

Any MACRO--32 code on OpenVMS Alpha that invokes either the ALONONPAGED_INLINE or the LAL_REMOVE_FIRST macros from the SYS$LIBRARY:LIB.MLB macro library must be recompiled on OpenVMS Version 7.2 to obtain a correct version of these macros. The change to these macros corrects a potential synchronization problem that is more likely to be encountered on the new Alpha 21264 (EV6) processors.

Note

Source modules that call the EXE$ALONONPAGED routine (or any of its variants) do not need to be recompiled. These modules transparently use the correct version of the routine that is included in this release.


Previous Next Contents Index

  [Go to the documentation home page] [How to order documentation] [Help on this site] [How to contact us]  
  privacy and legal statement  
6534PRO_016.HTML