Reliable Transaction Router
Release Notes
5.3 Known Problems with Workarounds
- 14-3-217 Unthreaded UNIX applications using rtr_set_wakeup
can fail, e.g., in malloc
When an unthreaded UNIX RTR application
calls rtr_set_wakeup, the non-reentrant RTR shared library -lrtr with
which it is linked installs a signal handler. This signal handler
called functions internal to RTR which could occasionally call runtime
library functions such as malloc() that are not async-safe, according
to the relevant standards. See man (4) signal.
In practice this may
appear to work most of the time, but break for no apparent reason when
the signal happens to occur while background code is also in a runtime
library call such as malloc.
The problem in RTR has been corrected.
The small penalty for this is that RTR no longer tries to ensure that
the messages available are not just housekeeping messages.
Applications must always be prepared for a timeout return status on
calling rtr_receive_message with a zero timeout, even after a wakeup
suggests that a message ought to be available.
Application writers
are reminded that their RTR wakeup handlers are subject to the same
restrictions: routines like printf, malloc, and the entire RTR API may
not be used directly or indirectly from within a signal handler. A
workaround for applications with unsafe wakeup handlers can be to link
with the reentrant version of the library -lrtr_r because different
rules apply for wakeups in a thread: applications should not call
anything that is not thread-safe, or anything that might block
indefinitely, such as rtr_send_to_server, rtr_reply_to_client,
rtr_broadcast_event, or rtr_receive_message with a non-zero timeout.
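The restrictions in 14-3-217 suggest keeping the wakeup handler trivial and doing all real work in the mainline. The following C fragment is a minimal sketch of that pattern, not RTR code. The names on_wakeup, drain_after_wakeup, poll_fn, and fake_poll are hypothetical, and fake_poll merely stands in for a call to rtr_receive_message with a zero timeout. The handler only sets a flag, and the mainline treats an immediate timeout as a normal outcome.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set by the wakeup handler, cleared by the mainline. */
static volatile sig_atomic_t wakeup_pending = 0;

/* Hypothetical wakeup handler: it does nothing except set a flag, so it
 * calls no printf, no malloc, and no RTR routine. */
static void on_wakeup(void)
{
    wakeup_pending = 1;
}

/* Hypothetical result of one non-blocking receive attempt. */
typedef enum { POLL_GOT_MESSAGE, POLL_TIMEOUT } poll_status;
typedef poll_status (*poll_fn)(void);

/* Called from the mainline, never from the handler. Returns the number of
 * messages handled; 0 is a normal result, because a wakeup may be followed
 * by an immediate timeout. */
static int drain_after_wakeup(poll_fn poll_once)
{
    int handled = 0;

    assert(poll_once != NULL);
    if (!wakeup_pending)
        return 0;
    wakeup_pending = 0;      /* clear first so that a new wakeup is not lost */
    while (poll_once() == POLL_GOT_MESSAGE)
        handled++;
    return handled;
}

/* Stand-in for a zero-timeout receive: reports fake_available messages,
 * then a timeout. */
static int fake_available = 0;

static poll_status fake_poll(void)
{
    if (fake_available > 0) {
        fake_available--;
        return POLL_GOT_MESSAGE;
    }
    return POLL_TIMEOUT;
}
```

In a real application, on_wakeup would be the routine registered with rtr_set_wakeup, and the loop would process each message it receives instead of only counting it.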
- 14-7-952 Do not treat dumb unknown terminal like a VT100
If you try to run RTR on a terminal or window with unknown (zero)
dimensions, RTR exits immediately with a BADROWCOL message.
A workaround is to enter the following UNIX command:
RTR expects the terminal or window to be at least capable of
emulating a VT100 terminal. Otherwise, a few control characters are
displayed at the beginning of each line, and the output from the
MONITOR command contains so many control sequences that it is
unreadable.
A workaround is to redirect both standard input and
output to files:
rtr monitor calls < /dev/null > monitored_calls.lis
5.4 Restrictions
- 14-1-6 Network Connection status codes incorrect with
SUNLink DNI
RTR will give a reason code when explicitly rejecting a
network connection from another node. The reason text is displayed in
the "monitor connects" screen, and is useful in diagnosing connectivity
and configuration problems. The reject reason is not available when
attempting connections using DECnet (SUNLink DNI) as a network
transport. As a result, this platform incorrectly reports an explicit
rejection by a remote node as having been refused. Use the "monitor
accfail" screen on the target of the connection to obtain a correct
indication of the reason for the rejection.
- 14-3-50 Maximum number of application processes limit
An ACP crash that occurred when starting the last of a great many
applications has been corrected.
When the process open file limit
is reached, the application now generally reports ACPNOTVIA, "RTR
ACP is no longer a viable entity, restart RTR". In fact, the ACP
continues to operate with all previously connected processes, and only
the newly rejected process considers the RTR ACP to be dead. This
message should be interpreted as "ACPINSRES, The RTR ACP has
insufficient resources."
Ensure that your system is
configured with sufficient default per-process resources, or that the
ACP process is started with increased resource limits. Allow at least
one open file for each additional application process, and at least one
for each link.
- 14-8-43 Sun Solaris 256 File Descriptor restriction
Sun Solaris versions up to and including 2.5.1 cannot use file
descriptor numbers larger than 255 for standard I/O. (Sun Solaris 2.6
is believed to address this problem.)
RTR no longer leaks file
descriptors when attempting to use DECnet to reach a non-DECnet node
while it is not currently reachable by TCP/IP, for example because RTR
is stopped on that node.
RTR now conserves low file descriptor
numbers, so that if the per-process limit is configured to be as much
as 1024, they can all be used for links and application processes.
Crashes caused by this leak of a scarce resource should no longer
occur.
6 HP-UX Specific Information
This chapter gives platform-specific information for the HP-UX
implementation of Reliable Transaction Router, Version 3.2.
6.1 New Features
- 14-5-44 New script rtr_snapshot.sh for gathering RTR
diagnostic data
The new command rtr_snapshot.sh calls various SHOW
and MONITOR commands to output a snapshot of the state of RTR on a
node. This information may be of use for monitoring, tuning,
troubleshooting, and reporting problems.
6.2 Known Problems Corrected Since Version 3.1D
- 14-1-643 Assertion when restarting timed out command
server at RTR> prompt
When an idle command server started by the
same RTR> prompt process times out after RTR_COMSERV_TIMEOUT seconds
(default 300) and is restarted for a new command, the RTR> prompt
process could raise an assertion. This problem has been corrected.
- 14-3-190 Signal handling by RTR shared library in RTR
applications
The first RTR API call no longer replaces any existing
signal handlers that the application main program installed for
the three usual termination signals SIGINT, SIGHUP, and SIGTERM.
If no existing termination signal handlers are found (SIG_DFL), RTR
installs a simple handler that causes RTR to call exit() at the
next convenient opportunity during an RTR API call, or in the RTR
polling thread in a threaded application.
RTR installs an exit handler with atexit(). This handler is not
essential, but is intended to perform a more controlled shutdown of
RTR in an application than when the process is terminated abruptly,
for example with _exit(), which does not call exit handlers.
The application may choose to leave the RTR termination signal
handler in place, or to install its own handlers at any time. An
application handler should notify the mainline program in an
async-safe manner that it should call exit() when convenient, and may
also call the RTR handler it replaced so that the application can
exit in an RTR API call too. Consult the operating system
documentation for the usual restrictions on exactly what is permitted
in an async-safe signal handler.
If the application does not install its own signal
handlers for the usual termination signals and does not continue to
make regular RTR API calls, then the application will appear to ignore
these signals.
RTR still installs an empty handler to catch the SIGPIPE
signal to avoid the default action of program termination. In
unthreaded applications RTR may still install the RTR SIGIO handler,
which also executes any previous SIGIO handler installed by the main
program.
- 14-7-386 Better formatting for non-VT100 compatible
windows and terminals
Termcap entries are now parsed more
carefully: :nd= is now respected. If you wish to run RTR in an hpterm
instead of an xterm window, then try:
TERMCAP=hp:al=\EL:am:bs:cd=\EJ:ce=\EK:ch=\E&a%dC:cl=\EH\EJ:co#80:
da:db:dc=\EP:dl=\EM:do=\EB:ei=\ER:kb=^H:kd=\EB:kh=\Eh:kl=\ED:kr=\EC:
ku=\EA:ke=\E&s0A:ks=\E&s1A:li#24:mi:nd=\EC:pt:se=\E&d@:so=\E&dB:
up=\EA:xs:cm=\E&a%dy%dC:cv=\E&a%dY:im=\EQ:ml=\El:mu=\Em:ue=\E&d@:
us=\E&dD:bt=\Ei:
TERM=hp
6.3 Known Problems with Workarounds
- 14-3-217 Unthreaded UNIX applications using rtr_set_wakeup
can fail, e.g., in malloc
When an unthreaded UNIX RTR application
calls rtr_set_wakeup, the non-reentrant RTR shared library -lrtr with
which it is linked installs a signal handler. This signal handler
called functions internal to RTR which could occasionally call runtime
library functions such as malloc() that are not async-safe, according
to the relevant standards. See man (4) signal.
In practice this may
appear to work most of the time, but break for no apparent reason when
the signal happens to occur while background code is also in a runtime
library call such as malloc.
The problem in RTR has been corrected.
The small penalty for this is that RTR no longer tries to ensure that
the messages available are not just housekeeping messages.
Applications must always be prepared for a timeout return status on
calling rtr_receive_message with a zero timeout, even after a wakeup
suggests that a message ought to be available.
Application writers
are reminded that their RTR wakeup handlers are subject to the same
restrictions: routines like printf, malloc, and the entire RTR API may
not be used directly or indirectly from within a signal handler. A
workaround for applications with unsafe wakeup handlers can be to link
with the reentrant version of the library -lrtr_r because different
rules apply for wakeups in a thread: applications should not call
anything that is not thread-safe, or anything that might block
indefinitely, such as rtr_send_to_server, rtr_reply_to_client,
rtr_broadcast_event, or rtr_receive_message with a non-zero timeout.
- 14-7-952 Do not treat dumb unknown terminal like a VT100
If you try to run RTR on a terminal or window with unknown (zero)
dimensions, RTR exits immediately with a BADROWCOL message.
A workaround is to enter the following UNIX command:
RTR expects the terminal or window to be at least capable of
emulating a VT100 terminal. Otherwise, a few control characters are
displayed at the beginning of each line, and the output from the
MONITOR command contains so many control sequences that it is
unreadable.
A workaround is to redirect both standard input and
output to files:
rtr monitor calls < /dev/null > monitored_calls.lis
6.4 Restrictions
- 14-3-50 Maximum number of application processes limit
An ACP crash that occurred when starting the last of a great many
applications has been corrected.
When the process open file limit
is reached, the application now generally reports ACPNOTVIA, "RTR
ACP is no longer a viable entity, restart RTR". In fact, the ACP
continues to operate with all previously connected processes, and only
the newly rejected process considers the RTR ACP to be dead. This
message should be interpreted as "ACPINSRES, The RTR ACP has
insufficient resources."
Ensure that your system is
configured with sufficient default per-process resources, or that the
ACP process is started with increased resource limits. Allow at least
one open file for each additional application process, and at least one
for each link.
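The sizing rule in 14-3-50 (at least one open file per application process and one per link) can be checked against the per-process limit before RTR is started. The C fragment below is a rough sketch only. The function names and the reserve parameter are hypothetical, and the check applies to the process that runs it, so it is meaningful only if the ACP is started from an environment with the same limits, which is an assumption.

```c
#include <assert.h>
#include <sys/resource.h>

/* Descriptors needed: one per application process, one per link, plus a
 * caller-supplied allowance for everything else the process opens. */
static long descriptors_needed(long app_processes, long links, long reserve)
{
    assert(app_processes >= 0 && links >= 0 && reserve >= 0);
    return app_processes + links + reserve;
}

/* Returns 1 if the current soft RLIMIT_NOFILE covers the need, 0 if it
 * does not, and -1 if the limit cannot be read. */
static int open_file_limit_is_sufficient(long needed)
{
    struct rlimit lim;

    assert(needed >= 0);
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return -1;
    if (lim.rlim_cur == RLIM_INFINITY)
        return 1;
    return (rlim_t)needed <= lim.rlim_cur ? 1 : 0;
}
```

For example, 200 application processes and 20 links with a reserve of 16 give a requirement of 236 descriptors, to be compared with the soft limit in effect.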
7 Windows NT Specific Information
This chapter gives platform-specific information for Reliable
Transaction Router, Version 3.2 for Windows NT.
7.1 New Features
- RTR supports XA; however, problems have been found when testing
with Oracle 7.34 and 8.04. Contact Oracle support for details.
- 14-1-236 New RTR demo included with kit
The latest RTR
demo and associated application sources are now shipped on the RTR
CD-ROM. The demo gives an overview of RTR
functionality. The applications are written using MS Visual C++ and are
provided on an unsupported basis for the benefit of the developers of
applications using RTR.
- 14-5-91 Windows NT Service for RTR
Included with this
version of RTR is the RTR\NT Service program. Installation of the
software is described in the RTR Installation Guide. Operation
of the software is described in the RTR System Manager's
Manual.
- 14-7-789 JAM locking in WNT clusters
RTR
configurations are supported in Windows NT cluster environments. The
cluster platforms that are currently supported are Digital Clusters for
Windows NT (V1.0 SP2 on Windows NT V3.51, or V1.1 on Windows NT V4.0),
and Microsoft Cluster Server configurations (formerly known as
Wolfpack). Only two-node NT cluster configurations are supported for
this version of RTR.
RTR supports the use of standby configurations
in this environment. In terms of NT clusters, RTR is an application and
the RTR journals are the database resource which is failed over between
the NT cluster servers.
The following requirements must be observed:
- The RTR journal for both NT servers must be located on the same
disk on the SCSI bus that is shared between the two NT cluster servers.
The RTR registry entry for the journal must be set to the same value on
both server nodes. Furthermore, the registry entry should specify the
journal disk using the path qualified by the cluster name. For example,
if the cluster name is ALPHACLUSTER, and the journal disk has the
cluster share name DISK1, then the RTR journal registry entry should be
entered as:
This can be modified using the Registry Editor. The registry key
for the journal is found under:
\HKEY_LOCAL_MACHINE\SOFTWARE\DigitalEquipmentCorporation\RTR\Journal
The key name is the default (none), and the value should be in the
format given above.
- If the journal file is specified as above on a shared SCSI disk,
then RTR can operate with standby server functionality. If the journal
is not located on a shared disk in a Windows NT cluster configuration,
then RTR behaves as a standalone RTR node and no use is made of cluster
functionality.
- RTR must be configured with both the backend and the router roles on
the Windows NT cluster server nodes if the journal file is located on a
shared SCSI disk.
- In a Windows NT cluster configuration, the RTR directory must not
be located on a shared SCSI disk.
- The failover group containing the disk share on which the journal
files are located must have no failback policy enabled. That is, if the
failover group fails over to the secondary cluster node because of a
primary server outage, then the group must not fail back to the primary
node once the primary node is available again.
- While RTR facilities are defined in a cluster configuration,
the failover group with the journal device must not be manually
failed over to the other cluster server (by the cluster administrator).
Failover should occur only at the discretion of the cluster failover
manager software.
- RTR creates lock files in the RTR directory and the journal
directory during normal operation. These are of the form N*.LCK or
N*.BLK, and C*.LCK or C*.BLK. These files may be left in these
directories after RTR has been stopped, but they will be reused once
RTR is started again. There is no real need for a daemon to purge these
files at system boot time.
7.2 Known Problems Corrected Since Version 3.1D
- 14-1-514 Simultaneous CONNECT/EXCEPTION event generation
causes W32 ACP crash
Unexpected Winsock 1.1 behavior on Windows 95
when a TCP connect attempt failed could result in an RTR failure.
Although this looks like a discrepancy against the documented Winsock
behavior, RTR has been modified to handle the condition and continue
running. The node counters knlnet_tcp1_spurious and
knlnet_tcp2_spurious track the number of times this condition is
detected.
- 14-3-135 RTR V3 does not select all nodes in a VMScluster
when using the SET ENVIRONMENT command
SET ENVIRONMENT/CLUSTER now
works on OpenVMS and Windows NT.
Previously, all nodes in the
cluster had to be listed in a SET ENVIRONMENT /NODE=(...) command in
order to issue subsequent commands to all of them. SET ENVIRONMENT
/CLUSTER is now available on OpenVMS and Windows NT clusters, as well
as on Digital UNIX TruCluster.
- 14-3-218 Microsoft Visual C compiler options /Gz (stdcall)
and /Gr (fastcall) supported
The RTR API functions in <rtr.h>
are now declared with the __cdecl attribute so they can be used in
applications compiled with calling conventions other than the /Gd
(cdecl) default.
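Item 14-3-218 relies on a general property of the Microsoft compiler: an explicit calling-convention keyword in a prototype takes precedence over the /Gd, /Gz, or /Gr default selected on the command line. The C fragment below sketches the idea with hypothetical names (DEMO_CALL, demo_library_add, application_sum); it does not reproduce the contents of <rtr.h>. On compilers other than Microsoft's the macro expands to nothing so that the sketch remains portable.

```c
/* DEMO_CALL is a hypothetical macro. With Microsoft compilers it expands
 * to __cdecl; elsewhere it expands to nothing. */
#if defined(_MSC_VER)
#define DEMO_CALL __cdecl
#else
#define DEMO_CALL
#endif

/* What a library header would declare: the convention is part of the
 * prototype, so the caller's /Gd, /Gz, or /Gr setting does not alter it. */
int DEMO_CALL demo_library_add(int a, int b);

/* What the library itself provides, built with the cdecl convention. */
int DEMO_CALL demo_library_add(int a, int b)
{
    return a + b;
}

/* Application code: this function follows whatever default the compiler
 * option selects, but the call it makes always uses the declared
 * convention of demo_library_add. */
static int application_sum(int a, int b)
{
    return demo_library_add(a, b);
}
```

Without the explicit keyword, an application compiled with /Gz would assume stdcall for the library routines and would not link or run correctly against a library built with cdecl.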
- 14-3-255 Multiple broadcast or data received on wrong
channel
When running W95/NT with Pathworks installed, RTR would not
detect that the client had closed its channel when the client
application was aborted by closing the window. RTR now detects when the
client has aborted the channel and closes the channel.
- 14-5-43 Exception handler report file names changed
For consistency with other supported platforms, the name of the
file used to hold the exception handler report has been changed to
rtr_error.log. Any prior versions of the file are renamed to
rtr_error<n>.log, where n cycles through the range 0 - 9.
7.3 Known Problems with Workarounds
- 14-3-62 Systems configured with Pathworks32 without DECnet
In certain cases, after installing Pathworks32 without DECnet
support (for example, with the LAT protocol only, or to use PowerTerm
or eXcursion where DECnet is not strictly needed), DECnet may be
registered in the Winsock2 protocol stack even though no
DECnet drivers are loaded. (This may come about through errors in
the configuration procedure, for example when removing DECnet after
it has been installed.)
If DECnet is registered in the Winsock2
protocol stack, but no DECnet drivers are loaded, then there is no
problem with running RTR if Pathworks32 V7.0A is the version installed.
If Pathworks32 V7.0 is installed, then RTR will not start correctly.
If Pathworks32 V7.0 is installed and RTR cannot start, this
condition can be verified by running WSAENUM.EXE, found in the
SDK directory tree of the Pathworks32 installation kit. If the address
family DECnet is displayed, then DECnet has been registered in the
Winsock2 protocol stack. If, at the same time, no DECnet
protocol is listed under Protocols in the Control Panel network applet,
then this is the problem.
To allow RTR to start correctly in such a
configuration, use one of the following workarounds:
- run PWS2DNST.EXE (Winsock2 de-register DECnet utility) found in the
Pathworks32 installation kit to de-register DECnet in the Winsock2
protocol stack.
- set RTR_PREF_PROT=RTR_TCP_ONLY
- upgrade to Pathworks32 V7.0A
7.4 Restrictions
- 14-1-155 Confusion between host and node names
To
fully support protocol failover between TCP/IP and DECnet on Windows NT
and Windows 95 systems, the unqualified IP host name of the machine
should be the same as the DECnet node name.
- 14-1-160 RTR for Windows requires TCP/IP
RTR for
Windows NT, Windows 95, and Windows 98 requires the TCP/IP protocol to
be operational, since TCP/IP is used for inter-process communication
between the RTR ACP and the application. If TCP/IP is removed using the
Network applet in the Control Panel (for example, if DECnet has been
installed), then trying to start RTR will result in a Winsock error.
- 14-1-253 V3 WIN32 FE Can't connect to V2 TR
There is a
known deficiency for Windows clients running Pathworks DECnet trying to
connect to a V2 router node. The connection does not always come up
reliably at the first attempt. If this is a problem in your environment
please report it to Compaq.
- 14-1-471 Incorrect handling of failed DECnet connect
attempts on NT
Network connection attempts over DECnet that get
explicitly refused are not handled on Windows platforms until RTR times
them out. This may make failover operations slower than required for
some applications. If this is the case, the timeout period can be
reduced by specifying revised values using the following environment
variables:
RTR_TIMEOUT_CONNECT (default 60 s, minimum 5 s)
RTR_TIMEOUT_CONNECT_RELAX (default 90 s, minimum 1 s)
Failover processing occurs after the sum of these two timeouts
has elapsed.
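With the defaults listed in 14-1-471, the delay before failover processing is 60 + 90 = 150 seconds, and with both variables at their minimums it is 5 + 1 = 6 seconds. The C fragment below sketches that arithmetic. The function names are hypothetical, and the treatment of unset, non-numeric, or too-small values (fall back to the default, or raise to the minimum) is an assumption of this sketch, not documented RTR behavior.

```c
#include <assert.h>
#include <stdlib.h>

/* Reads a timeout in seconds from the environment. Falls back to the
 * default when the variable is unset or not a number, and raises values
 * below the minimum to the minimum (assumed behavior, see above). */
static long timeout_seconds(const char *name, long default_s, long minimum_s)
{
    const char *text;
    char *end;
    long value;

    assert(name != NULL);
    text = getenv(name);
    if (text == NULL || *text == '\0')
        return default_s;
    value = strtol(text, &end, 10);
    if (*end != '\0')
        return default_s;
    return value < minimum_s ? minimum_s : value;
}

/* Delay before failover processing: the sum of the two timeouts. */
static long failover_delay_seconds(void)
{
    return timeout_seconds("RTR_TIMEOUT_CONNECT", 60, 5) +
           timeout_seconds("RTR_TIMEOUT_CONNECT_RELAX", 90, 1);
}
```

Setting RTR_TIMEOUT_CONNECT to 10 and RTR_TIMEOUT_CONNECT_RELAX to 5, for instance, would reduce the delay from 150 seconds to 15 seconds.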
- 14-1-516 Excessive occurrences of ERROR WSASYSNOTREADY
10091 in rtr-log file
Entries of the following type may be written
to the RTR log file as applications exit:
%KNL-W-SYSTEM, ioctlsocket FIONBIO blocking, (10091: *No message text found
for 10091 (317)*), knl_net.c:1780
%KNL-W-SYSTEM, ioctlsocket FIONBIO nonblocking, (10091: *No message text found
for 10091 (317)*), knl_net.c:1708
These are caused by RTR trying to send data from an exit handler
installed by RTR in the application. This currently does not work on
Windows as the network is unavailable from an exit handler. The problem
can be avoided by ensuring that the application closes all its RTR
channels prior to exiting.
- 14-1-594 Incomplete cleanup by COMSERV after CLI exits
When executing commands from the RTR command line (CLI), if the
process that opens a channel goes away, the PID associated with that
channel also goes away, due to the way Windows NT and Windows 95
identify the requester. This can cause invalid channel arguments, but
is normal behavior in a test or experimental environment using the CLI.
- 14-5-10 Enable process dump file creation on Windows NT
Using Dr Watson
In the event that a problem is discovered with
RTR that causes it to crash, a process dump file can be generated by
enabling the Dr Watson post mortem crash analyser. This is done by
entering the MS-DOS command:
The files that are created are %WINDIR%\DRWTSN32.LOG and
%WINDIR%\USER.DMP.
These files should be included with any problem
report submitted to RTR Engineering in the event of an RTR crash, along
with the RTR dump file (RTR_<n>.DMP) and the RTR log file.
8 Windows 95 and Windows 98 Specific Information
This chapter gives platform-specific information for Reliable
Transaction Router, Version 3.2 for Windows 95 and Windows 98.
8.1 New Features
- 14-1-236 New RTR demo included with kit
The latest RTR
demo and associated application sources are now shipped on the RTR
CD-ROM. The demo gives an overview of RTR
functionality. The applications are written using MS Visual C++ and are
provided on an unsupported basis for the benefit of the developers of
applications using RTR.
8.2 Known Problems Corrected Since Version 3.1D
- 14-3-218 Microsoft Visual C compiler options /Gz (stdcall)
and /Gr (fastcall) supported
The RTR API functions in <rtr.h>
are now declared with the __cdecl attribute so they can be used in
applications compiled with calling conventions other than the /Gd
(cdecl) default.
- 14-3-255 Multiple broadcast or data received on wrong
channel
When running W95/NT with Pathworks installed, RTR would not
detect that the client had closed its channel when the client
application was aborted by closing the window. RTR now detects when the
client has aborted the channel and closes the channel.
- 14-5-43 Exception handler report file names changed
For consistency with other supported platforms, the name of the
file used to hold the exception handler report has been changed to
rtr_error.log. Any prior versions of the file are renamed to
rtr_error<n>.log, where n cycles through the range 0 - 9.