Reliable Transaction Router
System Manager's Manual

B.1.2 Examples

To make small style adjustments, you may capture the default styles from your browser using the view source option and then edit them. The following example changes the font family to the browser default serif font, and displays a corporate logo at the top-right corner of the page:

Alternately, rtrhead.html could contain a link to a style sheet stored elsewhere. In this case the content of the file would look something like this:

B.2 Server Security

RTR performs both client and user verification.

B.2.1 User Authentication

Note

To perform user authentication on Windows 95 and Windows 98 systems, you must first enable User-level access control; follow the path Control Panel/Network/Access Control and select User-level access control.

The HTTP server client will request user credentials, displaying a dialog similar to the following:

Enter a username and password for an account for which the RTR HTTP has been enabled. Windows users may enter the username in the format domain-name\user-name.

To reduce the overhead of accessing the host system user authorization facilities, the server caches user credentials for a period of 90 seconds. During this time it will not revalidate user credentials against the operating system. If you change your password, wait 90 seconds before submitting it to the RTR server.
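
The 90-second caching period can be pictured with the following minimal C sketch. It is an illustration only, not the server's code: the type cred_cache_entry_t and the function cred_cache_is_fresh are hypothetical names, and the sketch assumes the period is measured from the last successful validation against the operating system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define CRED_CACHE_TTL_SECS 90  /* caching period stated in the text */

/* Hypothetical cache entry; the real server's internal layout is not documented here. */
typedef struct {
    bool   in_use;        /* false until credentials have been validated once */
    time_t validated_at;  /* when the operating system last validated them */
} cred_cache_entry_t;

/* Returns true if revalidation against the operating system
 * may be skipped at time `now`, i.e. the entry is younger than the TTL. */
bool cred_cache_is_fresh(const cred_cache_entry_t *e, time_t now)
{
    assert(e != NULL);
    return e->in_use && (now - e->validated_at) < CRED_CACHE_TTL_SECS;
}
```

While cred_cache_is_fresh returns true, a changed password would not be checked against the operating system, which is why the text advises waiting 90 seconds.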

In addition to validating the supplied credentials, the server ensures that all HTTP requests are received by a command server running under the validated username. Username/password validation errors are logged to the RTR log file.

B.2.2 User Credentials Caching

The RTR web server usually caches valid client credentials to avoid the overhead of validating each access with the operating system. Since only one set of credentials is cached, users who present different sets of credentials (for example, from different browser sessions using different Windows NT domains) will experience unexpected authorization failures. To turn off client credential caching, set the following environment variable: RTR_PASSWORD_CACHE_DISABLE.

B.2.3 Break-in Detection and Evasion

The server attempts to detect a password probing attempt by monitoring the rate of user authentication errors. This is achieved by counting the errors that occur in a time window. This count is maintained for each connecting client node. If the count exceeds a threshold, the server refuses to accept subsequent connections from the client node concerned for a certain time interval. Errors that remain at the end of the counting window are forgiven, and a new window and count are started. The following table shows the default times and counts and the names of environment variables that may be used to specify customized values.

Description	Environment Variable	Default Value
Counting window period	RTR_LGI_WINDOW	300 seconds
Max. number of user authentication errors tolerated in window	RTR_LGI_BRK_LIM	5
Time during which server refuses connections from evaded client	RTR_LGI_HID_TIM	300 seconds
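
The counting scheme described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not RTR's implementation: the structure lgi_state_t and the functions lgi_record_error and lgi_is_refused are hypothetical names, the constants hard-code the documented defaults instead of reading the environment variables, and the sketch assumes that a counting window opens at the first error and that the count is cleared once a client node is evaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Documented defaults; RTR reads overrides from RTR_LGI_WINDOW,
 * RTR_LGI_BRK_LIM and RTR_LGI_HID_TIM (not done in this sketch). */
#define LGI_WINDOW_SECS   300
#define LGI_BRK_LIM       5
#define LGI_HID_TIM_SECS  300

/* Hypothetical per-client-node state; not an RTR data structure. */
typedef struct {
    time_t window_start;  /* start of the current counting window */
    int    errors;        /* authentication errors counted in this window */
    time_t hidden_until;  /* connections are refused before this time */
} lgi_state_t;

/* Returns true if a connection from this node should be refused at `now`. */
bool lgi_is_refused(const lgi_state_t *s, time_t now)
{
    assert(s != NULL);
    return now < s->hidden_until;
}

/* Records one user authentication error observed at `now`. */
void lgi_record_error(lgi_state_t *s, time_t now)
{
    assert(s != NULL);

    /* No open window, or the window has expired: forgive any
     * remaining errors and start a new window and count. */
    if (s->errors == 0 || now - s->window_start >= LGI_WINDOW_SECS) {
        s->window_start = now;
        s->errors = 0;
    }

    s->errors++;

    /* Count exceeds the tolerated limit: evade the client node. */
    if (s->errors > LGI_BRK_LIM) {
        s->hidden_until = now + LGI_HID_TIM_SECS;
        s->errors = 0;
    }
}
```

With the defaults, five errors inside one window are tolerated; the sixth causes connections from that node to be refused for the following 300 seconds.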

Appendix C
XA Support

This appendix explains how RTR can be used with an X/OPEN Distributed Transaction Processing (DTP) conformant Resource Manager.

C.1 Introduction

The X/OPEN Distributed Transaction Processing (DTP) architecture defines a standard interface that lets application programs share resources provided by resource managers. The XA interface uses the two-phase commit protocol to commit transactions, and is a system-level, bidirectional interface between the transaction manager (TM) and the resource manager (RM). In the RTR environment, RTR is the transaction manager and database software such as ORACLE8 is the resource manager.

Without XA, an RTR application must handle transactions that are replayed after server recovery, which are delivered with rtr_mt_msg1_uncertain; the application has to check whether each such transaction has already been committed to the database. With XA, the application does not need to be concerned with this problem.

The XA library is an external interface that enables a transaction manager to coordinate global transactions. The operations it provides include:

Opening a resource manager
Starting a transaction
Rolling back a transaction
Preparing and committing a transaction
Closing a resource manager

With XA, RTR can connect directly to a resource manager such as ORACLE8.

C.2 Invoking RTR XA Support

Starting with RTR Version 4.0, you can invoke RTR XA support in an application without modifying the RTR API. This section shows how to use and invoke RTR XA support within an ORACLE environment.

C.2.1 Registering a Resource Manager

You must register an instance of an RM with RTR. The RM instance name will be used by RTR to identify the specific database. Refer to the ORACLE administrator's reference manual for the appropriate open_string and xaswitch name.

RTR> REGISTER RM db_name1_rm /library_path="/opt/oracle8/lib/libclntsh.so" /open_string="Oracle_XA+Acc=P/Scott/Tiger+db=db_name1" /xaswitch=xaosw
RTR> REGISTER RM db_name2_rm /library_path="/opt/oracle8/lib/libclntsh.so" /open_string="Oracle_XA+Acc=P/Scott/Tiger+db=db_name2" /xaswitch=xaosw

Note

You can only register an RM on an RTR backend.

C.2.2 Associating a Resource Manager with a Facility

All resource managers that will be accessed by a facility must be specified when the facility is created. During a crash, all doubtful transactions associated with these resource managers will be processed and recovered. Once an RM is associated with a given facility, the same RM cannot be associated with another facility.

RTR> CREATE FACILITY facility_name/router=.../backend=... /resource_manager=(db_name1_rm,db_name2_rm)

C.2.3 Binding a Resource Manager with a Partition

You must bind the specific resource manager with an RTR partition when the partition is created. This allows RTR to manage transactions accessing this partition down to the underlying RM via the XA protocol. The XA-managed attribute for the partition remains until the partition goes away.

An RM can be bound with only one partition. Once an RM is associated with a partition, the RM cannot be associated with another partition.

RTR> CREATE PARTITION db_name1_part/resource_manager=db_name1_rm/...
RTR> CREATE PARTITION db_name2_part/resource_manager=db_name2_rm/...

Note

This feature is supported only in RTR Version 4.0 and later.

C.2.4 Opening an RTR Channel

Starting with RTR Version 4.0, when a server application opens a new channel it does not have to specify the RTR_F_OPE_XA_MANAGED flag and RM name along with the RM's attributes such as open_string in order to invoke RTR XA service. The server application just has to specify the name of a partition that is associated with a specific RM, provided that the user specifies an RM name when creating the partition. All transactions processed through this channel will be managed by the RTR XA service. The following example shows how to open a new channel using RTR V4.0:

srv_key[0].ks_type = rtr_keyseg_partition;
srv_key[0].ks_length = 0;                     /* N/A */
srv_key[0].ks_offset = 0;                     /* N/A */
srv_key[0].ks_lo_bound = &partition_name[0];  /* null terminated */

flag = RTR_F_OPE_SERVER |
       RTR_F_OPE_EXPLICIT_PREPARE |
       RTR_F_OPE_EXPLICIT_ACCEPT;

status = rtr_open_channel(&s_chan,
                          flag,
                          reply_msg.fac_name,
                          NULL,      /* rcpnam */
                          pevtnum,
                          NULL,
                          num_seg,   /* numseg */
                          srv_key);  /* key range */

However, if the server application is running a version of RTR prior to RTR V4.0, the server application must specify the RTR_F_OPE_XA_MANAGED flag, the RM's name, and other RM attributes such as open_string. You must overload the rtr_keyseg_t data structure with the RM-specific information and then pass it when creating an RTR channel.

srv_key[0].ks_type = rtr_keyseg_unsigned;
srv_key[0].ks_length = sizeof(rtr_uns_8_t);
srv_key[0].ks_offset = 0;
srv_key[0].ks_lo_bound = &low;
srv_key[0].ks_hi_bound = &high;

srv_key[1].ks_type = rtr_keyseg_rmname;
srv_key[1].ks_length = 0;                      /* N/A */
srv_key[1].ks_offset = 0;                      /* N/A */
srv_key[1].ks_lo_bound = &rm_name[0];          /* null terminated */
srv_key[1].ks_hi_bound = &xa_open_string[0];   /* null terminated */

flag = RTR_F_OPE_SERVER |
       RTR_F_OPE_EXPLICIT_PREPARE |
       RTR_F_OPE_EXPLICIT_ACCEPT |
       RTR_F_OPE_XA_MANAGED;

status = rtr_open_channel(&s_chan,
                          flag,
                          reply_msg.fac_name,
                          NULL,      /* rcpnam */
                          pevtnum,
                          NULL,
                          num_seg,   /* numseg */
                          srv_key);  /* key range */

C.3 Impact on Server Application

Using an RTR XA service has limited impact on existing server applications. The following examples show some of the impact:

RTR will not present messages of type rtr_mt_msg1_uncertain to server applications. The server application does not have to replay transactions during recovery. All transactions will be recovered by RTR when the facility is created.
The server application does not need to explicitly commit or roll back the transactions with the underlying resource manager because transactions are managed directly by RTR using the XA protocol.

C.4 MONITOR XA

This command monitors the internal status of XA interface activities. It displays counters containing information such as the number of XA calls, call status (success or failure), and the number of read-only transactions. It provides counts for the open, close, start, end, prepare, commit, rollback, and recovery commands.

Command Syntax: MONITOR XA

C.5 Microsoft DTC Support

RTR for Windows NT is interoperable with the Microsoft Distributed Transaction Coordinator (DTC). DTC is supported via the RTR XA software architecture. That is, with the XA protocol, RTR users can develop application programs to update MS SQL Server databases, MSMQ, or other Microsoft resource managers under the control of a true distributed transaction.

This is possible because RTR (as a distributed transaction manager) is able to directly communicate with MS DTC to manage a transaction or perform a recovery via the XA protocol. For each standard XA call received from RTR, MS DTC will translate it into a corresponding OLE transaction call that SQL Server or MSMQ can use to update databases.

Appendix D
Troubleshooting RTR Applications

This appendix contains information useful for analyzing performance aspects of RTR, especially in large configurations.

To manage remote nodes, you must have either proxy accounts or rsh access to them. Use RTR remote commands to manage remote nodes.

You should also add and grant operator privileges to the accounts used to manage the RTR network.

D.1 RTR Monitor Pictures

RTR supplies many monitor pictures to help you troubleshoot your application. To display a monitor picture, use the following command at the RTR prompt:

RTR> MONITOR picture-name

The following table provides suggested monitor pictures to display when you encounter problems:

Type of Failure	Monitor Pictures
Most common problems	SYSTEM
Connection failures	ACCFAIL, CONNECTS, FRONTEND, LINK, NETSTAT, STALLS
Transaction sequence problems	CALLS
Channel problems	CALLS, CHANNEL, PARTIT
Quorum problems	QUORUM, ROLEQUOR
V2 interface API	V2CALLS
Journal problems	JCALLS, JOURNAL
API problems	APP2ACP, ACP2APP, REJECTS, REJHIST, ROUTERS
XA interface problems	XA
Application Problems	APP2ACP, ACP2APP, CALLS, CHANNEL, PARTIT, REJECTS, REJHIST, ROUTERS

Refer to Chapter 6 for descriptions of the monitor pictures, and to the MONITOR command in Chapter 7 for its full syntax.

D.2 Enabling RTR logging

Many problems can be better analyzed when RTR logging has been enabled.

RTR logging output can be directed to a file. For example, on RTR startup:

$ RTR SET LOG /FILE=logfile.dat

You should monitor the size of the log file; archive and purge as necessary.

D.3 Starting a Facility

When a facility is started or restarted and servers are declared, RTR's recovery features require it to search the journal files of the backend nodes in the facility. This allows recovery of any incomplete transactions that were in flight when the facility last existed. However, if some of the facility's recovery information exists on a backend that is not available at startup, RTR waits for access to the journal on that backend and thus appears to "hang".

This situation can be detected by using MONITOR RECOVERY; backend nodes will be waiting for access to recovery journals. If this is the case, you may follow one of these procedures to continue the startup:

Delete the facility and recreate it without the unavailable backend.
Begin the startup by creating a smaller facility and using the EXTEND FACILITY and TRIM FACILITY commands.
Force a partition to abandon recovery with a SET PARTITION command.

D.4 Analyzing RTR Application Performance

This section provides guidance for System Managers who are analyzing an RTR application that is not functioning correctly.

If an application using RTR hangs, use the following checklist to analyze the situation.

Is there a system-level problem on the node concerned, such as a full disk?

Has RTR been started? Is RTR running correctly?

$ RTR SHOW RTR
RTR running on node MYNODE in SYSTEM mode

Are the application programs running? RTR lists the processes using RTR with the following command:
$ RTR SHOW PROCESS
The user application processes should be in this list.
Has the application stopped?
Use MONITOR SYSTEM to check for problems. If it indicates a problem with a subsystem, you can get additional information by monitoring that subsystem.
Network partitioning can also be a problem; this can happen if half or fewer of the configured backends and routers are reachable. To recognize network partitioning, use the MONITOR QUORUM picture. If the number of retries keeps increasing without a corresponding increase in the reason counters (CNF/RCH/QRT), you have a partitioned network.
To check the individual links, use the MONITOR CONNECTS picture. This picture displays the link protocol for connected links, and the reason for a failed connection on any links.
Are the application programs running correctly? Use MONITOR CALLS to examine the state of the participating application processes.
- Does the number of rtr_open_channel calls match the number of rtr_mt_opened messages? If they do not match, use the MONITOR CONNECTS picture to check individual links.
- Use MONITOR CONNECTS to make sure the connection to the router is OK.
- Look in the RTR log file for error messages concerning any unconnected node.
- Look at any unconnected nodes found, and determine:
  - Is RTR running?
  - Has the RTR command CREATE FACILITY been issued?
  - Are there DECnet problems, e.g., executor maximum links too low? Are the router nodes reachable?
Is a server waiting for an rtr_mt_accepted or rtr_mt_rejected message (in other words, has it voted, but not yet received confirmation of the outcome of the transaction)? This is most likely a problem with the application logic. Also check the database for a possible deadlock situation.
Is a client channel declaration not completing? Client channels need to have connectivity via a router node to at least one server channel before they get an rtr_mt_opened message. If the server is up and running, use MONITOR QUORUM and MONITOR CONNECTS to check connectivity.
Has a client channel called rtr_receive_message waiting for an rtr_mt_accepted or rtr_reply_to_client message and not received it within a reasonable time period? Check the application logic and the database for a deadlock.
Has a client channel called rtr_receive_message expecting an rtr_mt_accepted or rtr_mt_rejected message that is not forthcoming? If yes, RTR is awaiting the necessary resources for message transmission to the backend servers. Reasons could be:
- Congestion of a network link, frontend to router or router to backend
- Server application not correctly dequeuing messages
- System-level problem on router or backend node
Use MONITOR TPS to check the transaction processing rate of each process on a system. A system's capacity is generally expressed as the throughput of the servers. If the rates are low or sporadic, contention may be the cause. For systems with throughput less than 10 tps, the MONITOR TPSLO display provides greater granularity in the associated bar graph.
Adding server instances can often decrease application throughput if transactions all access common data elements. Partitioning data so that server instances do not interfere with each other is one way to resolve database contention.
Use the command SHOW PARTITION/FULL to display the backlog of transactions on a server pool (partition). If the number of free servers is continually zero, the arrival rate of transactions is greater than the processing capacity of the existing server pool.
The MONITOR QUEUES picture also shows backlogs, displaying queuing by partition. If the service time and arrival rate of transactions are large, there are not enough servers to process the load. The remedy is to start additional server instances or decrease the processing time of each transaction. A large number of queued transactions or messages can also be caused by contention that limits the efficiency of the servers.
Check the state of links with:
$ RTR SHOW FACILITY /LINK
Check if there are sufficient concurrent application server channels to handle the transaction load; messages may have to be queued for long periods before being processed.
Use MONITOR QUEUES to check the number of outstanding messages for each partition.
Check for congestion by examining the network links with the longest delays by using MONITOR TRAFFIC.
Use the command MONITOR STALLS to determine if the network needs tuning.
If there is no congestion, use MONITOR FLOW to discover if a link has credits for data traffic, or if the application requires more bandwidth than is available.
If the RTRACP dies when adding a facility (which has a backend role on the node), suspect journal file difficulties. Ensure that the journal file is not corrupted, or incompatible with the running RTR version. In the event of journal file corruption, please contact your Compaq support office.
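
Two items in the checklist above reduce to simple arithmetic, sketched here in C. Both functions are hypothetical helpers for the system manager's own reasoning, not part of RTR. The first encodes the "half or fewer reachable" condition named for network partitioning; it is not RTR's quorum algorithm. The second applies the standard offered-load estimate (arrival rate multiplied by service time) to the server pool, and assumes that servers do not contend with one another.

```c
#include <assert.h>
#include <stdbool.h>

/* True when half or fewer of the configured backends and routers
 * are reachable, the condition under which partitioning can occur. */
bool may_be_partitioned(int reachable, int configured)
{
    assert(configured > 0 && reachable >= 0 && reachable <= configured);
    return 2 * reachable <= configured;
}

/* Smallest number of server instances whose combined capacity covers
 * the load: arrival rate (transactions per second) times average
 * service time (milliseconds), divided by 1000 and rounded up. */
long servers_needed(long arrival_tps, long service_ms)
{
    assert(arrival_tps >= 0 && service_ms >= 0);
    return (arrival_tps * service_ms + 999) / 1000;
}
```

For example, 40 transactions per second at 250 ms each need at least 10 server instances; if SHOW PARTITION/FULL shows the number of free servers continually at zero with fewer instances than this estimate, the pool is undersized rather than stalled.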
