Document revision date: 19 July 1999
Preserving the granularity of a VAX MACRO memory write instruction to a byte, word, or unaligned longword on Alpha means guaranteeing that the instruction executes successfully on the specified data and preserves the integrity of the surrounding data.
The VAX architecture includes instructions that perform independent access to byte, word, and unaligned longword locations in memory so two processes can write simultaneously to different bytes of the same aligned longword without interfering with each other.
The Alpha architecture defines instructions that can address only aligned longword and quadword operands. On Alpha, code that writes a data field to memory that is less than a longword in length or is not aligned can do so only by using an interruptible instruction sequence that involves a quadword load, an insertion of the modified data into the quadword, and a quadword store. In this case, two processes that intend to write to different bytes in the same quadword will actually load, perform operations on, and store the whole quadword. Depending on the timing of the load and store operations, one of the byte writes could be lost.
The compiler provides the /PRESERVE=GRANULARITY option to guarantee the integrity of byte, word, and unaligned longword writes. The /PRESERVE=GRANULARITY option causes the compiler to generate Alpha instructions that provide granularity preservation for any VAX instructions that write to bytes, words, or unaligned longwords. Alternatively, you can insert the .PRESERVE GRANULARITY and .NOPRESERVE GRANULARITY directives in sections of VAX MACRO source code as required to enable and disable granularity preservation.
For example, the instruction MOVB R1, (R2) generates the following Alpha code sequence:
    LDQ_U   R28,(R2)
    MSKBL   R28,R2,R28
    INSBL   R1,R2,R25
    BIS     R25,R28,R25
    STQ_U   R25,(R2)
If any other code thread modifies part of the data pointed to by (R2) between the LDQ_U and the STQ_U instructions, that data will be overwritten and lost.
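The lost write can be reproduced with a small deterministic model. The following C sketch is not compiler output; the type and function names (`pending_write`, `begin_byte_write`, `finish_byte_write`, `interleaved_byte_writes`) are hypothetical. It splits a byte write into its load half and its store half and interleaves two writers by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: the state a writer holds between its quadword
   load (LDQ_U) and its quadword store (STQ_U). */
typedef struct {
    uint64_t loaded;   /* copy of the quadword taken by the load */
} pending_write;

/* First half: load the whole quadword that contains the target byte. */
static pending_write begin_byte_write(const uint64_t *quad)
{
    pending_write p = { *quad };
    return p;
}

/* Second half: clear the target byte (MSKBL), position the new value
   (INSBL), merge (BIS), and store the whole quadword back (STQ_U). */
static void finish_byte_write(uint64_t *quad, pending_write p,
                              unsigned offset, uint8_t value)
{
    assert(offset < 8);
    unsigned shift = 8 * offset;
    uint64_t cleared = p.loaded & ~((uint64_t)0xFF << shift);
    *quad = cleared | ((uint64_t)value << shift);
}

/* Interleave two writers to different bytes of the same quadword:
   A loads, B loads, A stores, B stores. Returns the final quadword. */
static uint64_t interleaved_byte_writes(void)
{
    uint64_t quad = 0;
    pending_write a = begin_byte_write(&quad);
    pending_write b = begin_byte_write(&quad);
    finish_byte_write(&quad, a, 0, 0xAA);  /* writer A updates byte 0 */
    finish_byte_write(&quad, b, 1, 0xBB);  /* writer B stores its stale copy */
    return quad;
}
```

In this interleaving, writer B stores a quadword built from the copy it loaded before writer A's store, so byte 0 reverts to zero and only B's byte survives.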
If you have specified that granularity be preserved for the same instruction, by either the command qualifier or the directive, the Alpha code sequence becomes the following:
            BIC     R2,#^B0111,R24
    RETRY:  LDQ_L   R28,(R24)
            MSKBL   R28,R2,R28
            INSBL   R1,R2,R25
            BIS     R25,R28,R25
            STQ_C   R25,(R24)
            BEQ     R25, FAIL
            .
            .
            .
    FAIL:   BR      RETRY
In this case, if the data pointed to by (R2) is modified by another code thread, the operation will be retried.
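The same retry pattern can be sketched in portable C. This is only an analogy: C11 has no load-locked/store-conditional pair, so the sketch substitutes a compare-and-exchange loop, and the function name `store_byte_preserving` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Write one byte into a shared quadword without disturbing its neighbours.
   The compare-exchange fails if the quadword changed after it was loaded;
   the merge is then redone on the refreshed value. */
static void store_byte_preserving(_Atomic uint64_t *quad,
                                  unsigned offset, uint8_t value)
{
    assert(offset < 8);
    unsigned shift = 8 * offset;
    uint64_t mask = (uint64_t)0xFF << shift;
    uint64_t old = atomic_load(quad);            /* plays the role of LDQ_L */
    uint64_t merged;
    do {
        /* MSKBL, INSBL, and BIS collapsed into one expression */
        merged = (old & ~mask) | ((uint64_t)value << shift);
        /* plays the role of STQ_C plus BEQ: on failure, old is
           reloaded with the current contents and the loop retries */
    } while (!atomic_compare_exchange_weak(quad, &old, merged));
}
```

Calling `store_byte_preserving(&q, 1, 0xAB)` on a quadword holding `0x1122334455667788` replaces only the byte at offset 1. A concurrent write to any other byte causes a retry instead of being overwritten.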
For a MOVW R1,(R2) instruction, the code generated to preserve granularity depends on whether the register R2 is currently assumed to be aligned by the compiler's register alignment tracking. If R2 is assumed to be aligned, the compiler generates essentially the same code as in the preceding MOVB example, except that it uses INSWL and MSKWL instructions instead of INSBL and MSKBL, and it uses #^B0110 in the BIC of the R2 address. If R2 is assumed to be unaligned, the compiler generates two separate LDQ_L/STQ_C pairs to ensure that the word is correctly written even if it crosses a quadword boundary.
The code generated for an aligned word write, with granularity preservation enabled, will cause a fatal reserved operand fault at run time if the address is not aligned. If the address being written to could ever be unaligned, inform the compiler that it should generate code that can write to an unaligned word by using the compiler directive .SET_REGISTERS UNALIGNED=Rn immediately before the write instruction.
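The effect of the two BIC masks described above can be checked with simple arithmetic. The C sketch below uses hypothetical helper names (`bic`, `quad_aligned`). It suggests one plausible reason why the aligned-word sequence faults on an odd address: the #^B0110 mask leaves bit 0 untouched, so the address passed to LDQ_L is not quadword aligned. This is an inference from the masks, not a statement taken from the compiler documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Model of BIC Rx,#mask,Ry: clear the bits of mask in addr.
   Only the low three address bits matter for quadword alignment. */
static uint64_t bic(uint64_t addr, uint64_t mask)
{
    assert(mask <= 0x7);
    return addr & ~mask;
}

/* LDQ_L and STQ_C require a quadword-aligned address. */
static int quad_aligned(uint64_t addr)
{
    return (addr & 0x7) == 0;
}
```

With the byte mask, `bic(0x1003, 0x7)` yields `0x1000`, which is always quadword aligned. With the word mask, an even address such as `0x1002` also yields `0x1000`, but an odd address such as `0x1003` yields `0x1001`, which is not.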
To preserve the granularity of a MOVL R1,(R2) instruction, the compiler always writes whole longwords with an STL instruction, even if the address to which it is writing is assumed to be unaligned. If the address is unaligned, the STL instruction will cause an unaligned memory reference fault. The PALcode unaligned fault handler will then do the loads, masks, and stores necessary to write the unaligned longword. Because PALcode is noninterruptible, this ensures that the surrounding memory locations are not corrupted.
When porting an application to an Alpha system, you should determine whether the application performs byte, word, or unaligned longword writes to memory that is shared either with processes executing on the local processor, or with processes executing on another processor in the system, or with an AST routine or condition handler. See Migrating to an OpenVMS AXP System: Recompiling and Relinking Applications for a more complete discussion of the programming issues involved in granularity operations in an Alpha system.
INSV instructions do not generate code that correctly preserves granularity when granularity is turned on.
If you enable the preservation of both granularity and atomicity, and the compiler encounters VAX code that requires that both be preserved, atomicity takes precedence over granularity.
For example, the instruction INCW 1(R0), when compiled with granularity preserved (/PRESERVE=GRANULARITY or .PRESERVE GRANULARITY), retries the write of the new word value if it is interrupted. When compiled with atomicity preserved (/PRESERVE=ATOMICITY or .PRESERVE ATOMICITY), it also refetches the initial value and increments it again if interrupted. If both options are specified, it does the latter.
In addition, while the compiler can successfully generate code for
unaligned words and longwords that preserves granularity, it cannot
generate code for unaligned words or longwords that preserves
atomicity. If both options are specified, all memory references must be
to aligned addresses.
2.10.4 Examples When Atomicity Cannot Be Guaranteed
Because compiler atomicity guarantees affect only memory modification operands in VAX instructions, you should take special care in examining VAX MACRO sources for coding problems that /PRESERVE=ATOMICITY cannot resolve. For instance, consider the following VAX instruction:
    ADDL2   (R1),4(R1)
For this instruction, the compiler generates an Alpha code sequence such as the following, when /PRESERVE=ATOMICITY (or .PRESERVE ATOMICITY) is specified:
            LDL     R28,(R1)
    Retry:  LDL_L   R24,4(R1)
            ADDL    R28,R24,R24
            STL_C   R24,4(R1)
            BEQ     fail
            .
            .
            .
    fail:   BR      Retry
Note that, in this Alpha code sequence, when the STL_C fails, only the modify operand is reread before the add. The data (R1) is not reread. This behavior differs slightly from VAX behavior. In a VAX system, the entire instruction would execute without interruption; in an Alpha system, only the modify operand is updated atomically.
As a result, code that requires the read of the data (R1) to be atomic must use another method, such as a lock, to obtain that level of synchronization.
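The shape of that sequence can be modeled in C to make the gap visible. This is a sketch under stated assumptions: the function name `addl2_model` is hypothetical, and a C11 compare-and-exchange loop stands in for the LDL_L/STL_C pair.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Model of ADDL2 (R1),4(R1) compiled with atomicity preserved.
   src stands for the longword at (R1), dst for the longword at 4(R1). */
static void addl2_model(_Atomic uint32_t *src, _Atomic uint32_t *dst)
{
    assert(src && dst);
    /* LDL R28,(R1): the source operand is read once, before the loop. */
    uint32_t addend = atomic_load(src);
    /* LDL_L R24,4(R1): first read of the modify operand. */
    uint32_t old = atomic_load(dst);
    /* ADDL plus STL_C plus BEQ: on failure only old is refreshed.
       addend is never reread, so it may be stale by the time the
       store finally succeeds. */
    while (!atomic_compare_exchange_weak(dst, &old, old + addend)) {
        /* retry with the refreshed value of the modify operand */
    }
}
```

If another thread changes the source longword while the loop is retrying, the final sum still uses the value read before the loop. That is the behavior the text above says must be guarded by a lock when it matters.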
Consider another VAX instruction:
    MOVL    (R1),4(R1)
For this instruction, the compiler generates an Alpha code sequence such as the following:
    LDL     R28,(R1)
    STL     R28,4(R1)
The VAX instruction in this example is atomic on a single VAX CPU, but the Alpha instruction sequence is not atomic on a single Alpha CPU. Because the 4(R1) operand is a write operand and not a modify operand, the operation is not made atomic by the use of the LDL_L and STL_C.
Finally, consider a more complex VAX INCL instruction:
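    INCL    @(R1)
With /PRESERVE=ATOMICITY (or .PRESERVE ATOMICITY) specified, the compiler generates an Alpha code sequence such as the following:
            LDL     R28,(R1)
    Retry:  LDL_L   R24,(R28)
            ADDL    R24,#1,R24
            STL_C   R24,(R28)
            BEQ     fail
            .
            .
            .
    fail:   BR      Retry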
Here, only the update of the modify data is atomic. The fetch required
to obtain the address of the modify data is not part of the atomic
sequence.
2.10.5 Alignment Considerations for Atomicity
When preserving atomicity, the compiler must assume the modify data is aligned. An update of a field spanning a quadword boundary cannot occur atomically since this would require two read-modify-write sequences. Since software cannot handle an unaligned LDx_L or STx_C instruction as it can a normal load or store instruction, a LDx_L or STx_C instruction to an unaligned address will generate a fatal reserved operand fault.
When /PRESERVE=ATOMICITY (or .PRESERVE ATOMICITY) is specified, an INCL (R1) instruction generates LDL_L and STL_C instructions so R1 must be longword aligned.
For an INCW (R1) instruction, the compiler generates an Alpha code sequence such as the following:
            BIC     R1,#^B0110,R28  ; Compute Aligned Address
    Retry:  LDQ_L   R24,(R28)       ; Load the QW with the data
            EXTWL   R24,R1,R23      ; Extract out the Word
            ADDL    R23,#1,R23      ; Increment the Word
            INSWL   R23,R1,R23      ; Correctly position the Word
            MSKWL   R24,R1,R24      ; Zero the spot for the Word
            BIS     R23,R24,R23     ; Combine Original and New word
            STQ_C   R23,(R28)       ; Conditionally store result
            BEQ     fail            ; Branch ahead on failure
            .
            .
            .
    fail:   BR      Retry
An INCB instruction uses #^B0111 to generate the aligned address since
all bytes are aligned.
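The extract, increment, insert, mask, and combine steps of the INCW sequence amount to ordinary shifts and masks on a 64-bit value. The C sketch below models only that arithmetic, not the LDQ_L/STQ_C retry; the function name `incw_in_quad` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Increment the 16-bit word at an even byte offset (0, 2, 4, or 6) inside
   a 64-bit quadword, leaving the other six bytes untouched. */
static uint64_t incw_in_quad(uint64_t quad, unsigned offset)
{
    assert(offset < 8 && (offset & 1) == 0);
    unsigned shift = 8 * offset;
    uint64_t word = (quad >> shift) & 0xFFFF;             /* EXTWL */
    word = (word + 1) & 0xFFFF;                           /* ADDL; INSWL keeps only the low word */
    uint64_t placed = word << shift;                      /* INSWL */
    uint64_t hole = quad & ~((uint64_t)0xFFFF << shift);  /* MSKWL */
    return placed | hole;                                 /* BIS */
}
```

For example, incrementing the word at offset 2 of `0x1111222233334444` changes only the `3333` field. A word holding `FFFF` wraps to zero without carrying into its neighbour.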
2.10.6 Interlocked Instructions and Atomicity
The compiler's methods of preserving atomicity have an interesting side effect in compiled VAX MACRO code. On VAX systems, only the interlocked instructions will work correctly to synchronize access to shared data in multiprocessor systems. On Alpha multiprocessing systems, the code resulting from a compilation of modify instructions (with atomicity preserved) and the code resulting from interlocked instructions both work correctly, because the LDx_L and STx_C instructions that the compiler generates for both sets of instructions operate correctly across multiple processors.
Because this compiler side effect is specific to Alpha systems and does not port back to VAX systems, you should avoid relying on it when porting VAX MACRO code to Alpha if you intend to run the code on both systems.
However, interlocked instructions must still be used if the memory modification is being used as an interlock for other instructions for which atomicity is not preserved. This is because the Alpha architecture does not guarantee strict write ordering. For example, consider the following VAX MACRO code sequence:
    .PRESERVE ATOMICITY
    INCL    (R1)
    .NOPRESERVE ATOMICITY
    MOVL    (R2),R3
This code sequence will generate the following Alpha code sequence:
    Retry:  LDL_L   R28,(R1)
            ADDL    R28,#1,R28
            STL_C   R28,(R1)
            BEQ     R28, fail
            LDL     R3, (R2)
            .
            .
            .
    fail:   BR      Retry
Because of the data prefetching of the Alpha architecture, the data from (R2) may be read before the store to (R1) is processed. If the INCL (R1) instruction is being used as a lock to prevent the data at (R2) from being accessed before the lock is set, the read of (R2) may occur before the increment of (R1) and thus is not protected.
The VAX interlocked instructions generate Alpha MB (memory barrier) instructions before and after the interlocked sequence. This prevents memory loads from being moved across the interlocked instruction.
For example, consider the following code sequence:
    ADAWI   #1,(R1)
    MOVL    (R2),R3
This code sequence will generate the following Alpha code sequence:
            MB
    Retry:  LDL_L   R28,(R1)
            ADDL    R28,#1,R28
            STL_C   R28,(R1)
            BEQ     R28, Fail
            MB
            LDL     R3, (R2)
            .
            .
            .
    Fail:   BR      Retry
The MB instructions cause all memory operations before the MB
instruction to complete before any memory operations after the MB
instruction are allowed to begin.
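A rough C11 analogy for the barrier placement is sketched below. It assumes that a sequentially consistent fence is an acceptable stand-in for the Alpha MB instruction, which is a simplification. The function name `lock_then_read` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Increment a lock longword, then read data that the lock protects.
   The fences play the role of the two MB instructions: the read of *data
   may not be moved ahead of the increment. */
static uint32_t lock_then_read(_Atomic uint32_t *lock, const uint32_t *data)
{
    assert(lock && data);
    atomic_thread_fence(memory_order_seq_cst);                 /* MB */
    atomic_fetch_add_explicit(lock, 1, memory_order_relaxed);  /* LDL_L, ADDL, STL_C retry loop */
    atomic_thread_fence(memory_order_seq_cst);                 /* MB */
    return *data;                                              /* LDL R3,(R2) */
}
```

Without the second fence, a compiler or processor would be free to perform the read of `*data` before the relaxed increment, which is the hazard described for the non-interlocked INCL sequence.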
2.11 Interoperability of Native and Translated Images
DECmigrate for OpenVMS AXP Systems Translating Images describes how to use the VAX Environment Software Translator (VEST), a component of the DECmigrate utility, to translate OpenVMS VAX images into images that can run on an OpenVMS Alpha system. Using VEST, you can translate all the components of an application, such as the main executable image and all the shareable images that it calls. However, you can also create an application that is a mix of translated and native components. For example, you may want to create a native version of a shareable image that is called by your application to take advantage of native performance. You may also choose to use a mixture of native and translated components to allow you to create a native version of your application in stages.
You can use translated OpenVMS VAX images as you would a native OpenVMS
Alpha image. To create native images that can interoperate with
translated images requires some additional considerations, described in
the following sections.
2.11.1 Compiling Native Images That Can Interoperate with Translated Images
To create a native image that can call or be called by a translated image, you must specify the /TIE qualifier when compiling the source files of the native OpenVMS Alpha image. (Note that /NOTIE is the default used by the compiler for this qualifier.) Any source module that contains a procedure that is made available to external callers must be compiled with the /TIE qualifier. When you specify the /TIE qualifier, the compiler creates procedure signature blocks (PSBs) that are needed by the Translated Image Environment (TIE) at execution time in order to properly jacket calls between translated and native images. The TIE is part of the OpenVMS Alpha operating system.
You must also specify the /TIE qualifier when compiling a source module that contains a procedure that performs a callback (or calls out to a specified procedure) that may be in a translated image. In this case, the /TIE qualifier causes the compiler to generate a call to a special OpenVMS Run-Time Library routine, OTS$CALL_PROC, that ensures that the outbound call to a translated procedure is handled properly.
Depending on application-specific semantics, you may also need to
specify other compiler qualifiers to force byte granularity, data
alignment, and AST atomicity.
2.11.2 Linking Native Images That Can Interoperate with Translated Images
To create a native OpenVMS Alpha image that can call a translated OpenVMS VAX image, you must link the native object modules with the /NONATIVE_ONLY linker qualifier. (Note that /NATIVE_ONLY is the default used by the linker for this qualifier.) The /NONATIVE_ONLY qualifier causes the linker to include in the image the PSB information created by the compilers.
Because the /NATIVE_ONLY qualifier affects only outgoing calls from native images to translated images, you do not need to specify it when creating a native OpenVMS Alpha image that will be called by a translated OpenVMS VAX image. The linker's /NATIVE_ONLY qualifier can prevent native images from calling translated images but not from being called by translated images.
Note that the layout of the symbol vector in the native version of the
shareable image must match the layout of the symbol vector in the
translated shareable image it replaces.
2.12 Compiling and Linking
The compiler requires the following files, one for compiling, the other for linking:
File | Description |
---|---|
SYS$LIBRARY:STARLET.MLB | Macro library that defines the compiler directives. |
SYS$LIBRARY:STARLET.OLB | Object library containing emulation routines and other routines used by the compiler. |
When you compile your code, the compiler automatically checks STARLET.MLB for definitions of compiler directives. Similarly, when you link your code, the linker links against STARLET.OLB to resolve undefined symbols.
The following is an example of a command procedure used to compile the MACRO-32 module [SYS]SYSSNDJBC.MAR:
    $ SET DEFAULT WORK1:[PEAK.A.PORT]
    $ MACRO/MIGRATION/LIS=LIS$SYSSNDJBC-ALPHA.LIS -
         ALPHA$LIBRARY:STARLET/LIB+ -
         ALPHA$LIBRARY:LIB/LIB+ -
         ALPHA$LIBRARY:ARCH_DEFS.MAR+ -
         SRC$SYSSNDJBC.MAR
    $ MACRO/NOOBJECT/LIS=LIS$:SYSSNDJBC-VAX -
         VAX$LIBRARY:STARLET/LIB+ -
         VAX$LIBRARY:LIB/LIB+ -
         VAX$LIBRARY:ARCH_DEFS.MAR+ -
         SRC$SYSSNDJBC.MAR
    $ EXIT
Not all modules need both libraries and many modules need component-specific libraries, but this example shows the basic approach to using the compiler.
When you use the latest version of the compiler, you must also use the latest version of SYS$LIBRARY:STARLET.MLB. Make sure that the latest version is installed on your system and that the logical name points to the correct directory.
The macro expansion line numbering scheme in the listing file is Xnn/mmm, where Xnn shows the nesting depth and mmm is the line number relative to the outermost macro, as shown in the following example.
    .MAIN.          Source Listing         9-SEP-1996 11:36:03  AMAC V3.0-20-311D
                                           20-JUL-1992 11:05:38  X6AJ_RESD$:[SYSLIB]ARCH_DEFS.MAR;1

                  00000000     1 ;
                  00000000     2 ; This is the ALPHA (previously called "EVAX") version of ARCH_DEFS.MAR,
                  00000000     3 ; which contains architectural definitions for compiling VMS sources
                  00000000     4 ; for VAX and ALPHA systems.
                  00000000     5 ;
         00000001 00000000     6 EVAX = 1
         00000001 00000000     7 ALPHA = 1
         00000001 00000000     8 BIGPAGE = 1
         00000020 00000000     9 ADDRESSBITS = 32
                  00000000    10 .macro test1
                  00000000    11         clrl r1
                  00000000    12         clrl r2
                  00000000    13         tstl 48(sp)     ; generate uplevel stack error
                  00000000    14         clrl r3
                  00000000    15 .endm
                  00000000    16 .macro test2
                  00000000    17         clrl r4
                  00000000    18         clrl r5
                  00000000    19         test1
                  00000000    20         clrl r6
                  00000000    21 .endm
                  00000000    22
                  00000000    23 foo:    .jsb_entry
                                 .
                                 .
                                 .
                  00000000    44         clrl r0
                  00000011    45         test2
                                         1.......
    %AMAC-E-UPLEVSTK, (1) up-level stack reference in routine FOO
         X01/001  00000002               clrl r4
         X01/002  00000004               clrl r5
         X01/003  00000006               test1
         X02/004  00000006               clrl r1
         X02/005  00000008               clrl r2
         X02/006  0000000A               tstl 48(sp)     ; generate uplevel stack error
         X02/007  0000000D               clrl r3
         X02/008  0000000F
         X01/009  0000000F               clrl r6
         X01/010  00000011
                  00000011    46         rsb
                  00000012    47         .end
To link the object files produced by the compiler, use the following commands as a basis:
    $ @ALPHA$TOOLS:LINK     ! Set up DCL and Logical to EXE
    $ LINK/ALPHA image_name,object1,object2,...
For certain VAX instructions (such as the divide instructions and others described in this manual), the compiler produces object code that issues a call to the OpenVMS General-Purpose Run-Time Library (OTS$ RTL). By default, the linker links against the library that contains these routines.