Reliable Transaction Router
Release Notes

1.5 Documentation Changes

There are no corrections to existing documentation in this release.

1.6 Limitations

Please see the Software Product Description.

1.7 Known Problems

If an RTR ACP process dies by any means other than the RTR STOP RTR command, Compaq strongly recommends that you immediately issue the RTR STOP RTR command to update RTR's shared memory tables. Similarly, issue the RTR DISCONNECT SERVER command whenever a Command Server dies in an unplanned manner. Failure to do so may cause RTR to try to connect to processes that no longer exist, which may have undesirable results.
14-1-520 Remote commands fail with ERRACCNOD when DECnet/TCP preference mismatched
Remote commands may not work if the RTR_PREF_PROT network protocol preference environment variable differs between the local and remote nodes. Although the name of the remote node can be prefixed with tcp. or dna. to select the protocol with which the local node contacts the remote node, this prefix does not influence the protocol used for the return leg. If the remote node attempts to connect back to the local node using the wrong protocol, the remote command can fail with ERRACCNOD, without a more detailed entry in the log. [A more common cause of ERRACCNOD is a lack of authorization: try simple non-RTR remote commands like rsh host date, or TYPE host::"0=procedure".]
The default for the environment variable RTR_PREF_PROT is RTR_DNA_FIRST for OpenVMS nodes with DECnet, but RTR_TCP_FIRST for other platforms. Other possible values are RTR_DNA_ONLY and RTR_TCP_ONLY.
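The defaults described above can be summarized in a small diagnostic helper. This is only a sketch: the function names are hypothetical and not part of RTR, and the fallback to the platform default for an unrecognized value is an assumption made for illustration, not documented RTR behaviour.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of RTR.
 * env_value: the value of RTR_PREF_PROT, or NULL if it is unset.
 * openvms_with_decnet: nonzero on an OpenVMS node with DECnet.
 * Returns the preference this sketch expects to apply. */
const char *effective_pref_prot(const char *env_value, int openvms_with_decnet)
{
    static const char *const known[] = {
        "RTR_DNA_FIRST", "RTR_TCP_FIRST", "RTR_DNA_ONLY", "RTR_TCP_ONLY"
    };
    size_t i;

    if (env_value != NULL) {
        for (i = 0; i < sizeof known / sizeof known[0]; i++) {
            if (strcmp(env_value, known[i]) == 0)
                return known[i];
        }
    }
    /* Unset (or, by assumption, unrecognized): use the documented default. */
    return openvms_with_decnet ? "RTR_DNA_FIRST" : "RTR_TCP_FIRST";
}

/* Hypothetical check: nonzero when the two nodes' preferences differ. */
int pref_prot_differs(const char *local_pref, const char *remote_pref)
{
    return strcmp(local_pref, remote_pref) != 0;
}
```

When diagnosing ERRACCNOD, the value for each node could be obtained with getenv("RTR_PREF_PROT") on that node and the two results compared.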

1.8 Problem Reporting

For problem reporting:

Send mail to your Compaq Service Representative requesting that it be forwarded to the RTR Quality Manager.
If you have any RTR log files or pertinent output from monitor pictures or RTR SHOW commands, send them to us by e-mail.
Send us as much other information as possible about the conditions that caused the failure, such as pointers to the application programs that caused the problem and the command sequences used.

2 Compaq Tru64 UNIX Specific Information

This chapter gives platform-specific information for the Compaq Tru64 UNIX implementation of Reliable Transaction Router, Version 3.2.

2.1 New Features

RTR supports XA; however, problems have been found when testing with Oracle 7.34 and 8.04. Contact Oracle support for details.
14-5-44 New script rtr_snapshot.sh for gathering RTR diagnostic data
The new command rtr_snapshot.sh calls various SHOW and MONITOR commands to output a snapshot of the state of RTR on a node. This information may be of use for monitoring, tuning, troubleshooting, and reporting problems.

2.2 Known Problems Corrected Since Version 3.1D

14-1-643 Assertion when restarting timed out command server at RTR> prompt
When an idle command server started by the same RTR> prompt process times out after RTR_COMSERV_TIMEOUT seconds (default 300) and is restarted for a new command, the RTR> prompt process could raise an assertion. This problem has been corrected.
14-3-190 Signal handling by RTR shared library in RTR applications
The first RTR api call no longer replaces any existing signal handlers that were installed by the application main program for the three usual termination signals SIGINT, SIGHUP, and SIGTERM.
If no existing termination signal handlers are found (SIG_DFL), RTR installs a simple handler which will cause RTR to call exit() at the next convenient opportunity during an RTR api call, or in the RTR polling thread in a threaded application.
RTR installs an exit() handler with atexit(). This handler is not essential, but is intended to perform a more controlled shutdown of RTR in an application than when the process is terminated abruptly, for example with _exit(), which does not call exit handlers.
The application may choose to leave the RTR termination signal handler in place, or to install its own handlers at any time. The application handlers should notify the mainline program in an async-safe manner that it should call exit() when convenient, and may even be constructed to also call the RTR handler they replaced so that the application can exit in an RTR api call too. Consult the operating system documentation for the usual restrictions on exactly what is permitted in an async-safe signal handler.
If the application does not install its own signal handlers for the usual termination signals and does not continue to make regular RTR api calls, then the application will appear to ignore them.
RTR still installs an empty handler to catch the SIGPIPE signal to avoid the default action of program termination. In unthreaded applications RTR may still install the RTR SIGIO handler which also executes any previous SIGIO handler installed by the main program.

2.3 Known Problems with Workarounds

14-3-217 Unthreaded UNIX applications using rtr_set_wakeup can fail, e.g., in malloc
When an unthreaded UNIX RTR application calls rtr_set_wakeup, the non-reentrant RTR shared library -lrtr with which it is linked installs a signal handler. This signal handler called functions internal to RTR which could occasionally call runtime library functions such as malloc() that are not async-safe, according to the relevant standards. See man (4) signal.
In practice this may appear to work most of the time, but break for no apparent reason when the signal happens to occur while background code is also in a runtime library call such as malloc.
The problem in RTR has been corrected. The small penalty for this is that RTR no longer makes any attempt to try to ensure that messages available are not just housekeeping. Applications must always be prepared for a timeout return status on calling rtr_receive_message with a zero timeout, even after a wakeup suggests that a message ought to be available.
Application writers are reminded that their RTR wakeup handlers are subject to the same restrictions: routines like printf, malloc, and the entire RTR API may not be used directly or indirectly from within a signal handler. A workaround for applications with unsafe wakeup handlers can be to link with the reentrant version of the library -lrtr_r because different rules apply for wakeups in a thread: applications should not call anything that is not thread-safe, or anything that might block indefinitely, such as rtr_send_to_server, rtr_reply_to_client, rtr_broadcast_event, or rtr_receive_message with a non-zero timeout.
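The following sketch shows one way to keep a wakeup handler within these restrictions in an unthreaded application. The handler only sets a flag, and the mainline does the polling. The names are hypothetical, and the wakeup routine is assumed to take no arguments. The poll_fn callback is a stand-in for a call to rtr_receive_message with a zero timeout, and demo_poll exists only to make the sketch self-contained.

```c
#include <signal.h>

/* Set by the wakeup handler, cleared by the mainline. */
static volatile sig_atomic_t app_wakeup_pending = 0;

/* The kind of routine one might pass to rtr_set_wakeup:
 * it sets a flag and calls nothing else. */
void app_wakeup_handler(void)
{
    app_wakeup_pending = 1;
}

/* Stand-in for rtr_receive_message with a zero timeout:
 * returns 1 if a message was obtained, 0 on timeout. */
typedef int (*poll_fn)(void *ctx);

/* Called from the mainline. Returns the number of messages handled,
 * which may be 0 even after a wakeup. */
int app_drain_after_wakeup(poll_fn poll_once, void *ctx)
{
    int handled = 0;

    if (!app_wakeup_pending)
        return 0;
    /* Clear first, so a wakeup that arrives while polling is not lost. */
    app_wakeup_pending = 0;
    while (poll_once(ctx))
        handled++;
    return handled;
}

/* Demo stand-in only: ctx points to an int count of queued messages. */
int demo_poll(void *ctx)
{
    int *queued = ctx;

    if (*queued <= 0)
        return 0;   /* plays the role of a timeout status */
    (*queued)--;
    return 1;
}
```

A return value of 0 from app_drain_after_wakeup after a wakeup corresponds to the case described above, where the wakeup was caused only by housekeeping traffic.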
14-7-952 Treating dumb unknown terminal like a VT100
If you try to run RTR on a terminal or window with unknown (zero) dimensions, RTR exits immediately with a BADROWCOL message.
A workaround is to enter the following UNIX command:
stty rows 24 cols 80
RTR expects the terminal or window to be at least capable of emulating a VT100 terminal. Otherwise, a few control characters are displayed at the beginning of each line, and the output from the MONITOR command contains so many control sequences that it is unreadable.
A workaround is to redirect both standard input and output to files:
rtr monitor calls < /dev/null > monitored_calls.lis

2.4 Restrictions

14-1-420 RTR's use of the TruCluster Distributed Lock Manager
RTR uses the Distributed Lock Manager that comes with TruCluster PS to manage access to certain system resources. Among other uses, the primary reason for locks is to coordinate access to RTR's journal.
To support standby servers in a TruCluster, the RTR journal for each node must be accessible by RTR on any node in the TruCluster in case of failure of any other cluster member nodes. As part of TruCluster support, the ownership of the NFS service may failover from one node to another. RTR exploits this feature when it finds it necessary to recover transactions from another node's journal.
Before RTR opens a journal, it will verify that the local node has assumed ownership of the shared disk service (as determined by the Distributed Lock Manager). This can work only if each RTR journal in a TruCluster is located on its own distinct shared disk service.
14-3-50 Maximum number of application processes limit
An ACP crash that occurred when starting the last of a great many applications has been corrected.
When the process open file limit is reached, the application will now generally report ACPNOTVIA, "RTR ACP is no longer a viable entity, restart RTR". In actual fact the ACP continues to operate with all previously connected processes, and only the new rejected process thinks that the RTR ACP is not alive. This message should be interpreted as "ACPINSRES, The RTR ACP has insufficient resources."
Please ensure that your system is configured with sufficient default per-process resources, or that the acp process is started with increased resource limits. Allow at least one open file for each additional application process, and at least one for each link.
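As a rough check of the sizing rule above, the open file limit can be compared with the expected number of application processes and links. This sketch uses the standard getrlimit call. The function names and the reserved parameter (descriptors set aside for other uses) are hypothetical, and the result is meaningful only when the check runs with the same resource limits as the ACP process.

```c
#include <sys/resource.h>

/* Returns 1 if soft_limit allows one descriptor per application
 * process and one per link, plus a reserve; otherwise returns 0. */
int nofile_limit_sufficient(rlim_t soft_limit, unsigned long apps,
                            unsigned long links, unsigned long reserved)
{
    if (soft_limit == RLIM_INFINITY)
        return 1;
    return soft_limit >= (rlim_t)(apps + links + reserved);
}

/* Applies the check to the current process limits.
 * Returns 1 or 0 as above, or -1 if the limit cannot be read. */
int current_nofile_sufficient(unsigned long apps, unsigned long links,
                              unsigned long reserved)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    return nofile_limit_sufficient(rl.rlim_cur, apps, links, reserved);
}
```

For example, with a soft limit of 64, the sketch accepts 40 applications, 10 links, and a reserve of 10, but rejects 50 applications with the same links and reserve.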

3 OpenVMS Specific Information

This chapter gives platform-specific information for the OpenVMS implementation of Reliable Transaction Router, Version 3.2.

3.1 New Features

There are no new features in this release that are specific to this platform.

3.2 Known Problems Corrected Since Version 3.1D

14-1-170 rtr_api_wakeup_entries/exits not maintained on OpenVMS
The process counters rtr_api_wakeup_entries/exits were not incremented on OpenVMS. This gave an incorrect indication of the number of wakeup calls on the "monitor calls" picture. This behavior has been corrected.
14-1-260 Display key range bounds completely and in appropriate format
Quadword signed and unsigned key ranges are supported on all RTR platforms including OpenVMS Alpha and VAX.
14-1-544 Non-portable VMS journals across VAX/ALPHA
The incompatibility between the VAX and Alpha journal files has been corrected. After installing V3.2, customers must issue the command RTR> CREATE JOURNAL/SUPERSEDE.
14-3-53 Sys$start_txw sometimes returns 0 instead of 1 upon success
ASTlm resource limitations may result in applications receiving an erroneous indication that the ACP is not available. Raising the process ASTlm quota corrected this problem.
14-3-89 V2 field ASTPRM not in RTR$_EVT
The RTR$_EVT structure, part of the v2 compatibility layer, now contains the field RTR$L_EVT_ASTPRM (as with RTR V2). The value of RTR$K_EVTAST_ARGNO has been altered accordingly (from 6 to 7).
14-3-131 $DCL_TX_PRC crashes when the user is underprivileged
Running a V2 application from an account that does not have RTR info privilege no longer causes the application to crash.
14-3-135 RTR V3 does not select all nodes in a VMScluster when using the SET ENVIRONMENT command
SET ENVIRONMENT/CLUSTER now works on OpenVMS and Windows NT.
Previously, all nodes in the cluster had to be listed in a SET ENVIRONMENT /NODE=(...) command in order to issue subsequent commands to all of them. SET ENVIRONMENT /CLUSTER is now available on OpenVMS and Windows NT clusters, as well as on Compaq Tru64 UNIX TruCluster.
14-3-169 Application not notified if ACP dies
Upon the death of the ACP process, RTR V3 would incorrectly terminate any outstanding calls to the V2 API wrapper with the status ACPNOTVIA. V2 behaviour has been restored, and such calls now terminate with NOACP.
14-3-196 Application calling $START_TX at AST level while the ACP died would cause the application to crash inside LIBRTR.
This has been corrected and SYS$START_TX will simply return to the caller a message indicating that the ACP is not available.
14-3-197 ACPNOTVIA error returned if RTR command $DCL_TX_PRC issued
The RTR command $DCL_TX_PRC issued for a non-existent facility caused an ACPNOTVIA error return. This did not happen the first time, only on subsequent attempts if RTR had been stopped in between.
API verbs called from the RTR command line interpreter would fail with the status ACPNOTVIA if RTR was stopped and restarted without restarting the command server. This has been corrected. The problem can be avoided on earlier versions of RTR by issuing the command 'disconnect server' after stopping RTR.
14-3-285 OpenVMS process quotas artificially constrained
Prior versions of RTR would limit the maximum values that could be specified for the ACP process quotas to 64K. This restriction has been removed. Warning messages are generated if the requested (or default) memory quotas conflict with the system wide WSMAX parameter, or if the calculated or specified page file quota is greater than the remaining free page file space.
14-3-286 Synchronous call to accept DECnet connect causes links to get isolated
The ACP could stall on synchronous (sys$qiow) calls when accepting a DECnet connection. This has been fixed by changing to asynchronous (sys$qio) calls, which prevents the link from being disconnected. A completion event is called at the end of a successful asynchronous DECnet accept connection. DECnet connection reject has likewise been changed from synchronous to asynchronous calls.
14-7-640 "exceeded byte count quota" message received if process quota bytlm is less than the specified value
On starting RTR in OpenVMS, if process quota bytlm is less than the specified value (e.g., currently 100000), RTR will return an OpenVMS error message "exceeded byte count quota" and will not start.
Users should change the BYTLM setting to the specified value or higher to eliminate the error message and start RTR.
Application users with less process quota bytlm than the specified value will receive RTR error code RTR_STS_BYTLMNSUFF on starting their application.
14-8-130 ACCVIO and omitted parameters using Inter-Operability Services
This version of the RTR Inter-Operability Services now checks the number of parameters passed. If the consumer of the API omits the trailing optional parameter(s), RTR will detect it and supply the necessary value.
It is better practice to supply a "0" for the optional arguments.

3.3 Known Problems with Workarounds

There are no known problems with workarounds in this release that are specific to this platform.

3.4 Restrictions

14-1-279 RTR V2 compatibility interface is not yet thread-safe
The RTR V2 compatibility interface may only be called from one program thread.
14-3-139 RTR V3 only allows up to 30 bytes for the EVTNAM parameter
The RTR V2 compatibility layer only allows up to 30 bytes for the EVTNAM parameter to $DCL_TX_PRC(W), whereas RTR V2 allows up to 32 bytes.
14-7-625 RTR V3 cannot be run in system-mode on a machine on which RTR V2 is already running
RTR V3 cannot be run in system-mode on a machine on which RTR V2 is already running. If this is attempted the RTR V3 acp process will fail. Please make sure V2 RTR has been stopped before attempting to install and run RTR V3.
14-7-1026 Increased AST Process Quota Usage
It may be necessary to increase process ASTLM quotas after upgrading from RTR V2 to V3. If your application receives a large number of messages in a relatively small time period, and you find that RTR calls are failing to complete, raise the ASTLM substantially. For example, if your process receives several hundred broadcasts in a few seconds, raise ASTLM by several hundred.

4 AIX Specific Information

This chapter gives platform-specific information for the AIX implementation of Reliable Transaction Router, Version 3.2.

4.1 New Features

14-5-44 New script rtr_snapshot.sh for gathering RTR diagnostic data
The new command rtr_snapshot.sh calls various SHOW and MONITOR commands to output a snapshot of the state of RTR on a node. This information may be of use for monitoring, tuning, troubleshooting, and reporting problems.

4.2 Known Problems Corrected Since Version 3.1D

14-1-643 Assertion when restarting timed out command server at RTR> prompt
When an idle command server started by the same RTR> prompt process times out after RTR_COMSERV_TIMEOUT seconds (default 300) and is restarted for a new command, the RTR> prompt process could raise an assertion. This problem has been corrected.
14-3-190 Signal handling by RTR shared library in RTR applications
The first RTR api call no longer replaces any existing signal handlers that were installed by the application main program for the three usual termination signals SIGINT, SIGHUP, and SIGTERM.
If no existing termination signal handlers are found (SIG_DFL), RTR installs a simple handler which will cause RTR to call exit() at the next convenient opportunity during an RTR api call, or in the RTR polling thread in a threaded application.
RTR installs an exit() handler with atexit(). This handler is not essential, but is intended to perform a more controlled shutdown of RTR in an application than when the process is terminated abruptly, for example with _exit(), which does not call exit handlers.
The application may choose to leave the RTR termination signal handler in place, or to install its own handlers at any time. The application handlers should notify the mainline program in an async-safe manner that it should call exit() when convenient, and may even be constructed to also call the RTR handler they replaced so that the application can exit in an RTR api call too. Consult the operating system documentation for the usual restrictions on exactly what is permitted in an async-safe signal handler.
If the application does not install its own signal handlers for the usual termination signals and does not continue to make regular RTR api calls, then the application will appear to ignore them.
RTR still installs an empty handler to catch the SIGPIPE signal to avoid the default action of program termination. In unthreaded applications RTR may still install the RTR SIGIO handler which also executes any previous SIGIO handler installed by the main program.
14-3-275 aio not available makes RTR fail with unresolved errors for kaio_rdrw, etc.
RTR for AIX exploits Asynchronous I/O for increased journal performance. By default, aio is only defined, i.e., disabled, instead of available. Aio can be configured with the system management tool: # smit aio.
The RTR installation procedure post_i script now makes aio available, and ensures that aio will also be available after a restart.

4.3 Known Problems with Workarounds

14-3-217 Unthreaded UNIX applications using rtr_set_wakeup can fail, e.g., in malloc
When an unthreaded UNIX RTR application calls rtr_set_wakeup, the non-reentrant RTR shared library -lrtr with which it is linked installs a signal handler. This signal handler called functions internal to RTR which could occasionally call runtime library functions such as malloc() that are not async-safe, according to the relevant standards. See man (4) signal.
In practice this may appear to work most of the time, but break for no apparent reason when the signal happens to occur while background code is also in a runtime library call such as malloc.
The problem in RTR has been corrected. The small penalty for this is that RTR no longer makes any attempt to try to ensure that messages available are not just housekeeping. Applications must always be prepared for a timeout return status on calling rtr_receive_message with a zero timeout, even after a wakeup suggests that a message ought to be available.
Application writers are reminded that their RTR wakeup handlers are subject to the same restrictions: routines like printf, malloc, and the entire RTR API may not be used directly or indirectly from within a signal handler. A workaround for applications with unsafe wakeup handlers can be to link with the reentrant version of the library -lrtr_r because different rules apply for wakeups in a thread: applications should not call anything that is not thread-safe, or anything that might block indefinitely, such as rtr_send_to_server, rtr_reply_to_client, rtr_broadcast_event, or rtr_receive_message with a non-zero timeout.
14-7-952 Do not treat dumb unknown terminal like a VT100
If you try to run RTR on a terminal or window with unknown (zero) dimensions, RTR exits immediately with a BADROWCOL message.
A workaround is to enter the following UNIX command:
stty rows 24 cols 80
RTR expects the terminal or window to be at least capable of emulating a VT100 terminal. Otherwise, a few control characters are displayed at the beginning of each line, and the output from the MONITOR command contains so many control sequences that it is unreadable.
A workaround is to redirect both standard input and output to files:
rtr monitor calls < /dev/null > monitored_calls.lis

4.4 Restrictions

14-3-50 Maximum number of application processes limit
An ACP crash that occurred when starting the last of a great many applications has been corrected.
When the process open file limit is reached, the application will now generally report ACPNOTVIA, "RTR ACP is no longer a viable entity, restart RTR". In actual fact the ACP continues to operate with all previously connected processes, and only the new rejected process thinks that the RTR ACP is not alive. This message should be interpreted as "ACPINSRES, The RTR ACP has insufficient resources."
Please ensure that your system is configured with sufficient default per-process resources, or that the acp process is started with increased resource limits. Allow at least one open file for each additional application process, and at least one for each link.

5 Sun Solaris Specific Information

This chapter gives platform-specific information for the Sun Solaris implementation of Reliable Transaction Router, Version 3.2.

5.1 New Features

14-5-44 New script rtr_snapshot.sh for gathering RTR diagnostic data
The new command rtr_snapshot.sh calls various SHOW and MONITOR commands to output a snapshot of the state of RTR on a node. This information may be of use for monitoring, tuning, troubleshooting, and reporting problems.

5.2 Known Problems Corrected Since Version 3.1D

14-1-643 Assertion when restarting timed out command server at RTR> prompt
When an idle command server started by the same RTR> prompt process times out after RTR_COMSERV_TIMEOUT seconds (default 300) and is restarted for a new command, the RTR> prompt process could raise an assertion. This problem has been corrected.
14-3-190 Signal handling by RTR shared library in RTR applications
The first RTR api call no longer replaces any existing signal handlers that were installed by the application main program for the three usual termination signals SIGINT, SIGHUP, and SIGTERM.
If no existing termination signal handlers are found (SIG_DFL), RTR installs a simple handler which will cause RTR to call exit() at the next convenient opportunity during an RTR api call, or in the RTR polling thread in a threaded application.
RTR installs an exit() handler with atexit(). This handler is not essential, but is intended to perform a more controlled shutdown of RTR in an application than when the process is terminated abruptly, for example with _exit(), which does not call exit handlers.
The application may choose to leave the RTR termination signal handler in place, or to install its own handlers at any time. The application handlers should notify the mainline program in an async-safe manner that it should call exit() when convenient, and may even be constructed to also call the RTR handler they replaced so that the application can exit in an RTR api call too. Consult the operating system documentation for the usual restrictions on exactly what is permitted in an async-safe signal handler.
If the application does not install its own signal handlers for the usual termination signals and does not continue to make regular RTR api calls, then the application will appear to ignore them.
RTR still installs an empty handler to catch the SIGPIPE signal to avoid the default action of program termination. In unthreaded applications RTR may still install the RTR SIGIO handler which also executes any previous SIGIO handler installed by the main program.
14-3-193 Link loss after Sun Solaris 2.5.1 send (34: Result too large)
Sun has confirmed that the sendmsg() system call on Sun Solaris 2.5.1 can return with an undocumented error number ERANGE "Result too large". RTR now works around this and no longer closes the link.