Document revision date: 19 July 1999
Vector forms of many MTH$ routines are provided to support vectorized compiled applications. Vector versions of key F-floating, D-floating, and G-floating scalar routines use vector hardware while returning results that are bit-for-bit identical to those of their scalar counterparts. Many of the scalar algorithms have been redesigned to ensure identical results and good performance for both the vector and scalar versions of each routine.
You can call the vector MTH$ routines directly if your program is
written in VAX MACRO. If you are a Fortran programmer, specify the
Fortran intrinsic function name only. The Fortran compiler will then
determine whether the vector or scalar version of a routine should be
used.
2.3.1 Exceptions
You should not attempt to recover from an MTH$ vector exception. After
an MTH$ vector exception, the vector routines cannot continue
execution, and nonexceptional values might not have been computed.
2.3.2 Underflow Detection
In general, if a vector instruction results in the detection of both a floating overflow and a floating underflow, only the overflow will be signaled.
Some scalar routines check to see if a user has enabled underflow
detection. For each of those scalar routines, there are two
corresponding vector routines: one that always enables underflow
checking and one that never enables underflow checking. (In the latter
case, underflows produce a result of zero.) The Fortran compiler always
chooses the vector version that does not signal underflows, unless the
user specifies the /CHECK=UNDERFLOW qualifier. This ensures that the
check is performed but does not impair vector performance for those not
interested in underflow detection.
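The difference between the two vector variants can be modelled in a few lines of Python. This is an illustrative sketch only, not MTH$ code: the function name `vexp_model`, the exception class `FloatingUnderflow`, and the use of 2**-128 (about 0.29E-38) as the smallest positive F-floating magnitude are assumptions made for the example.

```python
import math

# Assumed smallest positive F-floating magnitude (about 0.29E-38).
F_FLOATING_MIN = 2.0 ** -128


class FloatingUnderflow(ArithmeticError):
    """Stand-in for a signaled floating underflow (hypothetical name)."""


def vexp_model(values, signal_underflow):
    """Element-wise exp() that either signals underflow or returns zero."""
    results = []
    for x in values:
        # exp() is mathematically positive, so a result below the
        # minimum magnitude can only be an underflow.
        y = math.exp(x)
        if y < F_FLOATING_MIN:
            if signal_underflow:
                raise FloatingUnderflow(f"exp({x}) underflows")
            y = 0.0  # the non-signaling variant produces zero
        results.append(y)
    return results


# Non-signaling variant: the underflowing element is replaced by zero.
quiet = vexp_model([0.0, -100.0], signal_underflow=False)
```

Calling `vexp_model([0.0, -100.0], signal_underflow=True)` raises `FloatingUnderflow` instead, which mirrors the behavior selected by /CHECK=UNDERFLOW.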
2.3.3 Vector Routine Name Format
Use one of the formats in Table 2-3 to call (from VAX MACRO) a vector math routine that enables underflow signaling. (The E in the routine name means enabled underflow signaling.)
Format | Type of Routine |
---|---|
MTH$VxSAMPLE_E_Ry_Vz | Real-valued math routine |
MTH$VCxSAMPLE_E_Ry_Vz | Complex-valued math routine |
OTS$SAMPLEq_E_Ry_Vz | Power routine or complex multiply and divide |
Use one of the formats in Table 2-4 to call (from VAX MACRO) a vector math routine that does not enable underflow signaling.
Format | Type of Routine |
---|---|
MTH$VxSAMPLE_Ry_Vz | Real-valued math routine |
MTH$VCxSAMPLE_Ry_Vz | Complex-valued math routine |
OTS$SAMPLEq_Ry_Vz | Power routine or complex multiply and divide |
In the preceding formats, the following conventions are used:
Symbol | Meaning |
---|---|
x | The letter A (or blank) for F-floating, D for D-floating, G for G-floating. |
y | A number between 0 and 11 (inclusive). Ry means that the scalar registers R0 through Ry will be used by the routine SAMPLE. You must save these registers. |
z | A number between 0 and 15 (inclusive). Vz means that the vector registers V0 through Vz will be used by the routine SAMPLE. You must save these registers. |
q | Two letters denoting the base and power data type (for example, DD in OTS$VPOWDD_R1_V8 denotes a D-floating base raised to a D-floating power). |
You can call the vector MTH$ routines directly if your program is written in VAX MACRO. If you are a Digital Fortran programmer, do not specify the MTH$ vector routines explicitly. Specify the Fortran intrinsic function name only. The Fortran compiler determines whether the vector or scalar version of a routine should be used.
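The register and underflow information encoded in a routine name can be read off mechanically. The following Python sketch decodes only the unambiguous parts of a name (the optional _E and the _Ry and _Vz suffixes). The helper name `decode_vector_routine_name` is hypothetical, and the pattern is an assumption based on the names shown in this section. It does not try to decode the data type letter, because a leading A or C can also belong to the function name itself (as in ATAN or COS).

```python
import re

# Assumed pattern: facility, "$V", routine body, optional "_E",
# then the "_Ry" and "_Vz" register suffixes.
_NAME_RE = re.compile(
    r"^(?:MTH|OTS)\$V[A-Z0-9]+(?P<e>_E)?_R(?P<y>\d+)_V(?P<z>\d+)$"
)


def decode_vector_routine_name(name):
    """Return the underflow mode and the registers a routine uses."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognized vector routine name: {name}")
    y = int(match["y"])
    z = int(match["z"])
    if y > 11 or z > 15:
        raise ValueError("y must be 0 through 11 and z must be 0 through 15")
    return {
        "underflow_signaled": match["e"] is not None,
        # R0 through Ry and V0 through Vz are used by the routine,
        # so the caller must save them if their contents are needed.
        "scalar_registers": [f"R{i}" for i in range(y + 1)],
        "vector_registers": [f"V{i}" for i in range(z + 1)],
    }


info = decode_vector_routine_name("MTH$VEXP_R3_V6")
```

For MTH$VEXP_R3_V6 this reports that underflow is not signaled and that R0 through R3 and V0 through V6 are used.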
In the following examples, keep in mind that vector real arguments are passed in V0, V1, and so on, and vector real results are returned in V0. Vector complex arguments, on the other hand, are passed in register pairs: V0 and V1, V2 and V3, and so on. Vector complex results are returned in V0 and V1.
Argument Type | Registers in Which Arguments Are Passed | Registers in Which Results Are Returned |
---|---|---|
Vector real arguments | V0, V1,... | V0 |
Vector complex arguments | V0 and V1, V2 and V3,... | V0 and V1 |
The following example shows how to call the vector version of MTH$EXP. Assume that you do not want underflows to be signaled, and you need to use the current contents of all vector and scalar registers after the invocation. Before you can call the vector routine from VAX MACRO, perform the following steps.
The following MACRO program fragment shows this example. Assume that:
Note that MTH$VEXP_R3_V6 denotes an F-floating data type because there is no letter between V and E in the routine name. (For further explanation, refer to Section 2.3.3.) The stride (the number of array elements that are skipped) must be a multiple of 4 because each F-floating value requires 4 bytes.
    MTVLR   #60                 ; Load VLR
    MOVL    #4, R5              ; Stride
    VLDL    (R4), R5, V0        ; Load V0 with the actual arguments
    JSB     G^MTH$VEXP_R3_V6    ; JSB to MTH$VEXP
    VSTL    V0, (R6), R5        ; Store the results
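The data movement in the fragment above (load 60 elements with a 4-byte stride, apply the function, store with the same stride) can be modelled in Python. This is a sketch under stated assumptions, not a translation of the MACRO code: the helpers `load_strided` and `store_strided` are hypothetical names, and IEEE single precision stands in for F-floating purely to give each element a 4-byte size.

```python
import math
import struct


def load_strided(memory, base, stride, count):
    """Read `count` 4-byte floats, `stride` bytes apart, starting at `base`."""
    return [struct.unpack_from("<f", memory, base + i * stride)[0]
            for i in range(count)]


def store_strided(values, memory, base, stride):
    """Write 4-byte floats `stride` bytes apart, starting at `base`."""
    for i, value in enumerate(values):
        struct.pack_into("<f", memory, base + i * stride, value)


VECTOR_LENGTH = 60   # corresponds to the value loaded into VLR
STRIDE = 4           # bytes between consecutive elements

# Source array of 60 contiguous 4-byte values: 0.0, 0.1, 0.2, ...
source = bytearray(struct.pack(f"<{VECTOR_LENGTH}f",
                               *[i * 0.1 for i in range(VECTOR_LENGTH)]))
destination = bytearray(VECTOR_LENGTH * STRIDE)

v0 = load_strided(source, 0, STRIDE, VECTOR_LENGTH)   # load the arguments
v0 = [math.exp(x) for x in v0]                        # element-wise exp
store_strided(v0, destination, 0, STRIDE)             # store the results
```

With a stride of 8 instead of 4, the same load would pick up every second element of the source array, which is why the stride is expressed in bytes and must be a multiple of the element size.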
The following example demonstrates how to call the vector version of OTS$POWDD with a vector base raised to a scalar power. Before you can call the vector routine from VAX MACRO, perform the following steps.
The following MACRO program fragment shows how to call OTS$VPOWDD_R1_V8 to compute the result of raising 60 values to the power P. Assume that:
Note that OTS$VPOWDD_R1_V8 raises a D-floating base to a D-floating power, which you determine from the DD in the routine name. (For further explanation, refer to Section 2.3.3.) The stride (the number of array elements that are skipped) must be a multiple of 8 because each D-floating value requires 8 bytes.
                                    ; R0/R1 already contains the power
    MTVLR   #60                     ; Load VLR
    MOVL    #8, R5                  ; Stride
    VLDQ    (R4), R5, V0            ; Load V0 with the actual arguments
    CALLS   #0,G^OTS$VPOWDD_R1_V8   ; CALL OTS$VPOWDD
    VSTQ    V0, (R6), R5            ; Store the results
2.4 Fast-Vector Math Routines
This section describes the fast-vector math routines that offer significantly higher performance at the cost of slightly reduced accuracy when compared with corresponding standard vector math routines. Also note that some fast-vector math routines have restricted argument domains.
When you specify the compile command qualifiers /VECTOR and /MATH_LIBRARY=FAST, the Digital Fortran compiler selects the appropriate fast-vector math routine, if one exists. The default is /MATH_LIBRARY=ACCURATE. You must specify the /G_FLOATING compile qualifier in conjunction with the /MATH_LIBRARY=FAST and /VECTOR qualifiers to access the G_floating routines.
You can call these routines from VAX MACRO using the standard calling method. The math function names, together with corresponding entry points of the fast-vector math routines, are listed in Table 2-5.
Function Name | Data Type | Call or JSB | Vector Input Registers | Vector Output Registers | Vector Name (Underflows Not Signaled) |
---|---|---|---|---|---|
ATAN | F_floating | JSB | V0 | V0 | MTH$VYATAN_R0_V3 |
DATAN | D_floating | JSB | V0 | V0 | MTH$VYDATAN_R0_V5 |
GATAN | G_floating | JSB | V0 | V0 | MTH$VYGATAN_R0_V5 |
ATAN2 | F_floating | JSB | V0, V1 | V0 | MTH$VVYATAN2_R0_V5 |
DATAN2 | D_floating | JSB | V0, V1 | V0 | MTH$VVYDATAN2_R0_V5 |
GATAN2 | G_floating | JSB | V0, V1 | V0 | MTH$VVYGATAN2_R0_V5 |
COS | F_floating | JSB | V0 | V0 | MTH$VYCOS_R0_V3 |
DCOS | D_floating | JSB | V0 | V0 | MTH$VYDCOS_R0_V3 |
GCOS | G_floating | JSB | V0 | V0 | MTH$VYGCOS_R0_V3 |
EXP | F_floating | JSB | V0 | V0 | MTH$VYEXP_R0_V4 |
DEXP | D_floating | JSB | V0 | V0 | MTH$VYDEXP_R0_V6 |
GEXP | G_floating | JSB | V0 | V0 | MTH$VYGEXP_R0_V6 |
LOG | F_floating | JSB | V0 | V0 | MTH$VYALOG_R0_V5 |
DLOG | D_floating | JSB | V0 | V0 | MTH$VYDLOG_R0_V5 |
GLOG | G_floating | JSB | V0 | V0 | MTH$VYGLOG_R0_V5 |
LOG10 | F_floating | JSB | V0 | V0 | MTH$VYALOG10_R0_V5 |
DLOG10 | D_floating | JSB | V0 | V0 | MTH$VYDLOG10_R0_V5 |
GLOG10 | G_floating | JSB | V0 | V0 | MTH$VYGLOG10_R0_V5 |
SIN | F_floating | JSB | V0 | V0 | MTH$VYSIN_R0_V3 |
DSIN | D_floating | JSB | V0 | V0 | MTH$VYDSIN_R0_V3 |
GSIN | G_floating | JSB | V0 | V0 | MTH$VYGSIN_R0_V3 |
SQRT | F_floating | JSB | V0 | V0 | MTH$VYSQRT_R0_V4 |
DSQRT | D_floating | JSB | V0 | V0 | MTH$VYDSQRT_R0_V4 |
GSQRT | G_floating | JSB | V0 | V0 | MTH$VYGSQRT_R0_V4 |
TAN | F_floating | JSB | V0 | V0 | MTH$VYTAN_R0_V3 |
DTAN | D_floating | JSB | V0 | V0 | MTH$VYDTAN_R0_V3 |
GTAN | G_floating | JSB | V0 | V0 | MTH$VYGTAN_R0_V3 |
POWRR(X**Y) | F_floating | CALL | V0, R0 | V0 | OTS$VYPOWRR_R1_V4 |
POWDD(X**Y) | D_floating | CALL | V0, R0 | V0 | OTS$VYPOWDD_R1_V8 |
POWGG(X**Y) | G_floating | CALL | V0, R0 | V0 | OTS$VYPOWGG_R1_V9 |
2.4.1 Exception Handling
The fast-vector math routines signal all errors except
floating underflow. No intermediate calculations result in
exceptions. To optimize performance, the following message signals all
errors:
%SYSTEM-F-VARITH, vector arithmetic fault
2.4.2 Special Restrictions On Input Arguments
The special restrictions listed in Table 2-6 apply only to fast-vector routines SIN, COS, and TAN. The standard vector routines handle the full range of VAX floating-point numbers.
Function Name | Input Argument Domain (in Radians) |
---|---|
SIN | ~( -6746518783.0, 6746518783.0) |
COS | ~( -6746518783.0, 6746518783.0) |
TAN | ~( -3373259391.5, 3373259391.5) |
If the application program passes an argument outside the listed domain, the routine signals the following error:
%SYSTEM-F-VARITH, vector arithmetic fault
If the application requires argument values beyond the listed limits,
use the corresponding standard vector math routine.
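An application that cannot guarantee its argument range in advance could check the domain before choosing a routine. The following Python sketch illustrates that decision using the limits from Table 2-6. The function name `choose_vector_routine` and the returned strings are hypothetical, and the intervals are treated as open, as the table notation suggests.

```python
# Limits from Table 2-6 (radians); each domain is the open interval (-limit, limit).
FAST_DOMAIN_LIMIT = {
    "SIN": 6746518783.0,
    "COS": 6746518783.0,
    "TAN": 3373259391.5,
}


def choose_vector_routine(function, arguments):
    """Return "fast" if every argument is inside the fast-vector domain."""
    limit = FAST_DOMAIN_LIMIT.get(function.upper())
    if limit is None:
        # Table 2-6 lists no restriction for this function.
        return "fast"
    if all(-limit < x < limit for x in arguments):
        return "fast"
    # At least one argument would cause a vector arithmetic fault
    # in the fast routine, so fall back to the standard routine.
    return "standard"


choice = choose_vector_routine("TAN", [0.5, 4.0e9])
```

Here 4.0e9 lies inside the SIN and COS domain but outside the TAN domain, so the sketch selects the standard routine for TAN.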
2.4.3 Accuracy
The fast-vector math routines do not guarantee the
same results as those obtained with the corresponding standard vector
math routines. Calls to the fast-vector routines generally
yield results that are different from the scalar and original vector
MTH$ library routines. The typical maximum error is a 2-LSB (Least
Significant Bit) error for the F_floating routines and a 4-LSB error
for the D_floating and G_floating routines. This generally corresponds
to a difference in the 6th significant decimal digit for the F_floating
routines, the 15th digit for D_floating, and the 14th digit for
G_floating.
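When comparing fast-vector results against standard results, it can help to express the difference in units of the least significant bit. The Python sketch below does this for a pair of values. It is an illustration under stated assumptions: the helper `lsb_difference` is a hypothetical name, the precisions used are 24 bits for F_floating and 53 bits for G_floating (counting the hidden bit), and D_floating (56 bits) is omitted because a Python float carries only 53 bits.

```python
import math

# Assumed fraction precision in bits, including the hidden bit.
PRECISION_BITS = {"F": 24, "G": 53}


def lsb_difference(reference, candidate, data_type):
    """Return |candidate - reference| in units of the reference value's LSB."""
    if reference == 0.0:
        raise ValueError("reference must be nonzero")
    # frexp gives reference = m * 2**exponent with 0.5 <= |m| < 1,
    # so one LSB is 2**(exponent - precision).
    _, exponent = math.frexp(reference)
    lsb = math.ldexp(1.0, exponent - PRECISION_BITS[data_type])
    return abs(candidate - reference) / lsb


# A value that differs from 1.0 by exactly two F_floating LSBs.
difference = lsb_difference(1.0, 1.0 + 2 * 2.0 ** -23, "F")
```

A test harness could apply this to each element of a result vector and flag any difference larger than the typical maximum quoted above.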
2.4.4 Performance
The fast-vector math routines generally provide performance improvements over the standard vector routines ranging from 15 to 300 percent, depending on the routines called and the input arguments to the routines. The overall performance of a typical user application will also improve, but by less than the improvement in the routines themselves. You should test both the performance and the correctness of your application with the fast-vector and the standard vector math routines before deciding which to use.
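The reason an application gains less than the routines themselves is that only part of its run time is spent in the math library. A rough estimate follows from Amdahl's law, sketched here in Python. The function name `overall_speedup` and both parameters are illustrative, and the figures are not measurements of any MTH$ routine.

```python
def overall_speedup(math_fraction, routine_speedup):
    """Estimate whole-application speedup using Amdahl's law.

    math_fraction   -- fraction of run time spent in the math routines (0 to 1)
    routine_speedup -- factor by which those routines become faster
    """
    if not 0.0 <= math_fraction <= 1.0:
        raise ValueError("math_fraction must be between 0 and 1")
    if routine_speedup <= 0.0:
        raise ValueError("routine_speedup must be positive")
    # Time outside the math routines is unchanged; time inside shrinks.
    return 1.0 / ((1.0 - math_fraction) + math_fraction / routine_speedup)


# Half the run time in math routines that become twice as fast.
estimate = overall_speedup(0.5, 2.0)
```

In this example the routines run twice as fast, but the application as a whole speeds up by a factor of only about 1.33.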