Document revision date: 19 July 1999
The preserve argument indicates those registers that should be preserved over the routine call. This should include only those registers that are modified and whose full 64-bit contents should be saved and restored.
The preserve argument causes registers to be preserved whether or not they would have been preserved automatically by the compiler's processing of a .CALL_ENTRY or .JSB_ENTRY directive. This is also the only way in a .JSB32_ENTRY routine to save and restore the full 64 bits of a register. Note that because R0 and R1 are scratch registers, by calling standard definition, the compiler never saves and restores them in any routine unless you specify them in the preserve argument at the routine's entry point.
This argument overrides the output and scratch arguments. If you specify a register both in the preserve argument and in the output or scratch arguments, the compiler will preserve the register but will report the following warning:
    %AMAC-W-REGDECCON, register declaration conflict in routine A
The preserve argument has no effect on the compiler's
temporary register usage.
2.5.5 Help for Specifying Register Sets
When you invoke the compiler, specifying /FLAG=HINTS on the command line, the compiler generates messages that can assist you in constructing the register sets for routine entry points. Among the hints the compiler provides are the following:
It is recommended that the .CALL_ENTRY, .JSB_ENTRY, and .JSB32_ENTRY register arguments reflect the routine interface, as described in the routine's text header. Wherever possible, you should declare input, output, scratch, and preserve register arguments for all routines. You only need to provide the argument when there are registers to be declared (for instance, input=<> is not necessary).
2.6 Branching Between Local Routines
The compiler allows a branch from the body of one routine into the body of another routine in the same module and psect. However, because this may result in additional overhead in both routines, the compiler reports an information-level message.
The compiler does not recognize a call to $EXIT as terminating a routine. Add an extra RET or RSB, whichever is applicable, after $EXIT to terminate the routine.
If a CALL routine branches into a code path that executes an RSB, an error message is reported. Such a CALL routine, if not corrected, will fail at run time.
If a JSB routine branches into a code path that executes a RET instruction, and the JSB routine preserves any registers, an informational message is issued. This construct will work at run time, but the registers saved by the JSB routine will not be restored.
If routines that share a code path have different register declarations, the register restores will be done conditionally. That is, the registers written on the stack at routine entry will be the same for both routines, but whether or not the register is restored will depend upon which entry point was invoked.
For example:
    rout1:  .jsb_entry output=r3
            movl    r1, r3          ! R3 is output, not preserved
            movl    r2, r4          ! R4 should be preserved
            blss    lab1
            rsb
    rout2:  .jsb_entry              ! R3 is not output, and
            movl    #4, r3          ! should be auto-preserved
            movl    r0, r4          ! R4 should be preserved
    lab1:   clrl    r0
            rsb
For both routines, R3 will be included in the registers saved on the stack at entry. However, at exit, a mask (also in the stack frame) will be tested before restoring R3. The mask will not be tested before R4 is restored, because R4 should be restored for both entry points.
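The conditional restore can be pictured with a small model. The following Python sketch is only an illustration of the behavior described above: the dictionary-based "frame", the names `SAVED_AT_ENTRY` and `RESTORE_IF_MASKED`, and the function `run` are hypothetical and do not reflect the compiler's actual stack frame layout or generated code.

```python
# Illustrative model only: names and data layout are assumptions,
# not the compiler's real frame format.

SAVED_AT_ENTRY = ("r3", "r4")      # same save set for both entry points
ALWAYS_RESTORED = ("r4",)          # restored without testing the mask
RESTORE_IF_MASKED = {              # per-entry mask stored in the frame
    "rout1": set(),                # R3 is declared output in rout1
    "rout2": {"r3"},               # R3 is auto-preserved in rout2
}

def run(entry, regs):
    """Model one call through `entry`; return the registers seen by the caller."""
    regs = dict(regs)
    frame = {
        "saved": {name: regs[name] for name in SAVED_AT_ENTRY},
        "mask": RESTORE_IF_MASKED[entry],
    }
    if entry == "rout1":
        regs["r3"] = regs["r1"]            # movl r1, r3
        regs["r4"] = regs["r2"]            # movl r2, r4
        reach_lab1 = regs["r4"] < 0        # blss lab1
    else:
        regs["r3"] = 4                     # movl #4, r3
        regs["r4"] = regs["r0"]            # movl r0, r4
        reach_lab1 = True                  # falls through to lab1
    if reach_lab1:
        regs["r0"] = 0                     # clrl r0
    # Exit: R4 is always restored, R3 only if the mask says so.
    for name in ALWAYS_RESTORED:
        regs[name] = frame["saved"][name]
    for name in frame["mask"]:
        regs[name] = frame["saved"][name]
    return regs
```

In this model, a call through `rout1` returns with R3 holding the new output value, while a call through `rout2` returns with R3 restored to the caller's value; R4 is restored in both cases.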
Note that declaring registers that are destroyed in two routines that
share code as scratch in one but not the other is actually
more expensive than letting them be saved and restored. In this case,
declare them as scratch in both, or if one routine requires
that they be preserved, as preserve in both.
2.7 Declaring Exception Entry Points
The .EXCEPTION_ENTRY directive, as described in Appendix B, indicates the entry point of an exception service routine. Use the .EXCEPTION_ENTRY directive to declare the entry points for routines serving interrupts such as the following:
At routine entry, R3 must contain the address of the procedure descriptor. The routine must exit with an REI instruction.
At exception entry points, the interrupt dispatcher pushes onto the stack registers R2 through R7, the PC, and the PSL. To access the contents of these registers, specify the stack_base argument in the .EXCEPTION_ENTRY directive. The compiler generates code that places the value of the SP at routine entry in the register you specify in stack_base, allowing the exception service routine to use this register to locate the contents of registers on the stack.
The compiler automatically saves and restores all other registers used in the routine, plus, if the service routine issues a CALL or a JSB instruction, all scratch registers, including R0 and R1.
Error handling routines, established when their addresses are stored in the frame at 0(FP), are not .EXCEPTION_ENTRY routines. Such error handlers should be declared as .CALL_ENTRY routines and end with RET instructions.
The packed decimal directive .PACKED and all packed decimal
instructions, except EDITPC, are supported for the MACRO-32 compiler by
emulation routines that exist outside the compiled module.
2.8.1 Differences Between the VAX and Alpha Implementations
The differences between the implementations on VAX and Alpha systems are noted in the following list:
    MOVP    R0, @8(AP), @4(AP)

    MOVL    8(AP), R1
    MOVL    4(AP), R2
    MOVP    R0, (R1), (R2)
All floating-point instructions and directives, with the exception of POLYx, EMODx and all H-Floating instructions, are supported.
These instructions are emulated via subroutine calls. This support is provided to allow hands-off compatibility for most existing VAX MACRO modules and is not designed for fast floating-point performance.
Besides the overhead of the emulation routine call, all floating-point operands must be passed through memory because the Alpha architecture does not have instructions to move values directly from the integer registers to the floating-point registers. In addition, on the first floating-point instruction, the FEN (floating-point enable) bit is set for the process, which causes the entire floating-point register set to be saved and restored on every context switch for the life of the image.
2.9.1 Differences Between the VAX and Alpha Implementations
The differences between the implementations on VAX and Alpha systems are noted in the following list:
    MOVP    R0, @8(AP), @4(AP)

    MOVL    8(AP), R1
    MOVL    4(AP), R2
    MOVP    R0, (R1), (R2)
This support does not make the floating-point register set visible to the compiler. It simply allows floating-point operations to be done on the integer registers. This means that routines in other languages that want to interface with a VAX MACRO routine, either calling it or being called by it, must not expect any floating-point values as inputs or outputs. Compilers for other languages will pass these values in the floating-point registers. Floating-point arguments can be passed into or out of a VAX MACRO routine only by pointer.
Calls to run-time library (RTL) routines of other languages fall into
this category. For example, a call to MTH$RANDOM returns a floating
value in floating-point register F0. The compiler cannot directly read
F0. You need to either create a jacket routine (in another language),
which makes the call to MTH$RANDOM and then moves the result to R0, or
write a separate routine that only does the move.
2.10 Preserving the Atomicity and Granularity of VAX MACRO
The VAX architecture includes instructions that perform a read-modify-write memory operation so that it appears to be a single, noninterruptible operation in a uniprocessing system. Atomicity is the term used to describe the ability to modify memory in one operation. Because the complexity of such instructions severely limits performance, read-modify-write operations on an Alpha system can be performed only by nonatomic, interruptible instruction sequences.
Furthermore, VAX instructions can address single aligned or unaligned byte, word, and longword locations in memory without affecting the surrounding memory locations. (A data item is considered aligned if its address is an exact multiple of the item's size in bytes.) Granularity is the term used to describe the ability to independently write to portions of aligned longwords. Because byte, word, and unaligned longword access also severely limits performance, an Alpha system can only access aligned longword or quadword locations. Therefore, a sequence of instructions to write a single byte, word, or unaligned longword causes some of the surrounding bytes to be read and rewritten.
Both architectural differences can cause data to become corrupted under certain conditions. In an Alpha system, atomicity and granularity preservation are not provided by locking out other threads from modifying memory, but by providing a way to determine if a piece of memory may have been modified during the read-modify-write operation. In this case, the read-modify-write operation is retried.
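The granularity problem can be illustrated with a small model. The following Python sketch is an assumption-laden simulation, not compiler output: it treats memory as a `bytearray`, models a byte write as a read and rewrite of the enclosing aligned longword, and hand-interleaves two writers. The helper names (`load_long`, `store_long`, `insert_byte`, `interleaved_byte_writes`) are hypothetical.

```python
import struct

def load_long(memory, addr):
    """Read the aligned little-endian longword at addr."""
    return struct.unpack_from("<I", memory, addr)[0]

def store_long(memory, addr, value):
    """Write the aligned little-endian longword at addr."""
    struct.pack_into("<I", memory, addr, value & 0xFFFFFFFF)

def insert_byte(longword, offset, byte):
    """Return longword with the byte at `offset` (0..3) replaced."""
    shift = 8 * offset
    cleared = longword & ~(0xFF << shift) & 0xFFFFFFFF
    return cleared | ((byte & 0xFF) << shift)

def interleaved_byte_writes():
    """Writer A sets byte 0 and writer B sets byte 1 of the same longword."""
    memory = bytearray(4)
    a_copy = load_long(memory, 0)                         # A reads the longword
    b_copy = load_long(memory, 0)                         # B interrupts and reads
    store_long(memory, 0, insert_byte(b_copy, 1, 0xBB))   # B rewrites the longword
    store_long(memory, 0, insert_byte(a_copy, 0, 0xAA))   # A rewrites a stale copy
    return bytes(memory)
```

In this interleaving, writer A's stale copy overwrites byte 1, so writer B's update is lost even though the two writers never touched the same byte.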
To ensure data integrity, the compiler provides certain qualifiers and
directives to be used for the conditions described in the following
sections.
2.10.1 Preserving Atomicity
On VAX and Alpha multiprocessing systems alike, an application in which multiple, concurrent threads can modify shared data in a writable global section must have some way of synchronizing their access to that data. On a VAX single-processor system, a memory modification instruction is sufficient to provide synchronized access to shared data. However, it is not sufficient on Alpha systems.
The compiler provides the /PRESERVE=ATOMICITY option to guarantee the integrity of read-modify-write operations for VAX instructions that have a memory modify operand. Alternatively, you can insert the .PRESERVE ATOMICITY and .NOPRESERVE ATOMICITY directives in sections of VAX MACRO source code as required to enable and disable atomicity.
For instance, the instruction INCL (R1) requires a read, modify, and write sequence on the data pointed to by R1. In a VAX system, the microcode performs these three operations. Therefore, an interrupt cannot occur until the sequence is fully completed. In an Alpha system, the following three instructions are required to perform the one VAX instruction:
    LDL     R28,(R1)
    ADDL    R28, 1,R28
    STL     R28,(R1)
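Why an interruptible three-step sequence matters can be shown with a small model. The following Python sketch is purely illustrative: it represents each increment as a generator that pauses after every step, and the names `incl` and `run_interleaved` are hypothetical. It assumes a second code thread (for example, an AST routine) runs its entire increment after the mainline has loaded the old value.

```python
def incl(memory, addr):
    """Model the nonatomic increment; yields after each of its three steps."""
    value = memory[addr]       # LDL  R28,(R1)
    yield "loaded"
    value = value + 1          # ADDL R28, 1,R28
    yield "added"
    memory[addr] = value       # STL  R28,(R1)
    yield "stored"

def run_interleaved(memory, addr):
    """Interrupt one increment after its load with a complete second increment."""
    mainline = incl(memory, addr)
    interrupter = incl(memory, addr)
    next(mainline)             # mainline has loaded the old value
    for _ in interrupter:      # the interrupting thread runs to completion
        pass
    for _ in mainline:         # mainline finishes using its stale value
        pass
    return memory[addr]
```

Starting from 0, two increments interleaved this way leave the location at 1 rather than 2, because the mainline stores a value computed from a stale load.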
When an atomic operation is required, and /PRESERVE=ATOMICITY (or .PRESERVE ATOMICITY) is specified, the compiler generates the following Alpha instruction sequence for INCL (R1):
    Retry:  LDL_L   R28,(R1)
            ADDL    R28,#1,R28
            STL_C   R28,(R1)
            BEQ     R28, fail
            .
            .
            .
    fail:   BR      Retry
If (R1) is modified by any other code thread on the current or any other processor during this sequence, the Store Longword Conditional instruction (STL_C) will not update (R1), but will indicate an error by writing 0 into R28. In this case, the code branches back and retries the operation until it completes without interference.
The BEQ Fail and BR Retry are done instead of a BEQ Retry because the branch prediction logic of the Alpha architecture assumes that backward conditional branches will be taken. Since this operation will rarely need to be retried, it is more efficient to make a forward conditional branch which is assumed not to be taken.
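The retry behavior of the load-locked/store-conditional sequence can be sketched as a small model. The following Python code is a simplified illustration, not a description of the hardware: the `Memory` class, its single `lock_addr` flag, and the `interfere` hook are all assumptions introduced for the example.

```python
class Memory:
    """Toy memory with one lock flag, loosely modeling LDL_L and STL_C."""

    def __init__(self):
        self.cells = {}
        self.lock_addr = None          # address currently covered by the lock

    def load_locked(self, addr):       # models LDL_L
        self.lock_addr = addr
        return self.cells.get(addr, 0)

    def store(self, addr, value):      # ordinary store by another thread
        if self.lock_addr == addr:
            self.lock_addr = None      # interference clears the lock
        self.cells[addr] = value

    def store_conditional(self, addr, value):   # models STL_C
        if self.lock_addr != addr:
            return 0                   # failure: memory is left unchanged
        self.cells[addr] = value
        self.lock_addr = None
        return 1

def atomic_incl(mem, addr, interfere=None):
    """Increment mem[addr], retrying until the conditional store succeeds.

    `interfere` is a hypothetical hook, called with the attempt number
    between the load and the store, used to simulate another thread.
    Returns the number of retries that were needed.
    """
    retries = 0
    while True:
        value = mem.load_locked(addr)             # Retry: LDL_L
        if interfere is not None:
            interfere(retries)
        value += 1                                # ADDL
        if mem.store_conditional(addr, value):    # STL_C, then BEQ fail
            return retries
        retries += 1                              # fail: BR Retry
```

If the hook stores to the same address on the first attempt only, the first conditional store fails, the loop reloads the new value, and the second attempt succeeds, so no update is lost.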
Because of the way atomicity is preserved on Alpha systems, this guarantee of atomicity applies to both uniprocessor and multiprocessor systems. This guarantee applies only to the actual modify instruction and does not extend interlocking to subsequent or previous memory accesses (see Section 2.10.6).
You should take special care in porting an application to an Alpha system if it involves multiple processes that modify shared data in a writable global section, even if the application executes only on a single processor. Additionally, you should examine any application in which a mainline process routine modifies data in process space that can also be modified by an asynchronous system trap (AST) routine or condition handler. See Migrating to an OpenVMS AXP System: Recompiling and Relinking Applications for a more complete discussion of the programming issues involved in read-modify-write operations in an Alpha system.
When preserving atomicity, the compiler generates aligned memory instructions that cannot be handled by the PALcode unaligned fault handler. They will cause a fatal reserved operand fault on unaligned addresses. Therefore, all memory references for which .PRESERVE ATOMICITY is specified must be to aligned addresses (see Section 2.10.5).