Reliable Transaction Router
C Application Programmer's
Reference Manual



2.11.4 Event Troubleshooting

Several RTR MONITOR screens can be helpful in troubleshooting events, as described below. Sample screens are available in the Reliable Transaction Router System Manager's Manual.

Monitoring Events

User Event traffic (broadcasts) may specifically be monitored for each node using the MONITOR BROADCAST screen in RTR. This screen shows the total event throughput, along with a count of any discarded broadcasts.

The MONITOR FACILITY screen in RTR provides a combined summary of all RTR Events and User Events processed for each facility.

The SHOW CLIENT/FULL and SHOW SERVER/FULL commands in RTR are helpful for viewing the current event subscription list for a particular client or server, along with any channel name specified in the rcpnam parameter on the rtr_open_channel call.

Execution of rtr_broadcast_event calls and event message traffic in RTR can be monitored using the MONITOR CALLS screen in RTR. This screen shows the frequency of use of the rtr_broadcast_event call, and the number of RTR Events and User Events processed. If an event is in pending ("pend") status, it indicates that the event is waiting for an rtr_receive_message call to be performed.

The MONITOR ROUTING screen shows the transaction and broadcast throughput on the system. This display shows the number of events and also the rate over time during the monitoring interval.

The MONITOR STALLS screen helps determine whether RTR Flow Control is affecting a particular system. Flow Control stalls that have occurred are categorized by duration. Any stall that lasts more than 60 seconds results in a Link Drop entry. A Stall ("stll") entry in the far-right column indicates that a Flow Control stall is currently in progress on the link indicated. For the purposes of User Event broadcast delivery, any stall indicates that a broadcast message may have been discarded.

Additional details of RTR Flow Control can be monitored using the MONITOR CONGEST, MONITOR FLOW, and MONITOR TRAFFIC screens in RTR.

2.12 Nested Transactions

An RTR transaction can be included in a transaction that is coordinated by a parent transaction manager (TM). RTR, Microsoft DTC, Encina, Tuxedo, or another foreign transaction manager can act as the TM. In such cases the RTR transaction is termed a nested transaction (or subtransaction), and RTR acts as a Resource Manager (RM) for the parent transaction manager.

An RTR transaction can be joined to an existing transaction by a call to rtr_start_tx using the pjointxid argument.

By default, RTR treats a subtransaction as an intrinsic part of the transaction to which it was joined. This is equivalent to a scheme where the client directly involves all participating servers.

The joining transaction may not, however, be an intrinsic part of the joined transaction. That is, the server may have several alternatives for the subtransaction and only when all alternatives are exhausted will the joined transaction be aborted. In this case the server can specify the RTR_F_EXPENDABLE flag on the calls to rtr_send_to_server that are allowed to fail. If this flag is set and the subtransaction fails, RTR aborts the subtransaction as if it had never happened.

Table 2-8, Changes for Nested Transactions, lists changes made to RTR to effect use of nested transactions. One new call has been introduced and several have been changed.

Table 2-8 Changes for Nested Transactions
Change to: Description
rtr_open_channel RTR_F_OPE_FOREIGN_TM must be specified for a client channel to be used for nested transactions.
rtr_prepare_tx Lets the RTR client initiate the first (prepare) phase of the two-phase commit protocol for the nested transaction.
rtr_request_info Used for recovery processing.
rtr_set_info Used for recovery processing.
rtr_start_tx Must be called by the RTR client explicitly for nested transactions. This allows the Foreign Transaction Manager (FTM) transaction ID to be specified.
rtr_send_to_server The RTR_F_SEN_ACCEPT flag causes an implicit prepare of the transaction; that is, no additional call to rtr_prepare_tx is required.
RTR journal An RTR journal is required on a frontend node that is used for nested transactions.

2.13 Recovery for Foreign Transaction Managers

In the event of a failure of the application or the node where an RTR nested transaction is started, the foreign transaction manager must be able to determine which transactions are in an indeterminate state. The method for this is similar to the method traditionally used by RTR server applications to identify and deal with uncertain transactions.

When rtr_open_channel is called with RTR_F_OPE_FOREIGN_TM, a local journal scan is done (if not already done). Any transactions started by this foreign transaction manager that are in an incomplete state but in the correct facility are read into ACP memory. Before the rtr_mt_opened message is delivered to the application, the local journal scan and a local recovery of any transactions found in the journal are completed. That is, RTR tries to determine the final outcome of the transactions it has recovered from the journal.

After the rtr_mt_opened message has been delivered to the application, the application must ask RTR if there are any transactions in an incomplete state and tell RTR how to deal with them. This is known as the Foreign Transaction Manager (FTM) recovery phase. If the foreign transaction manager does not go through the recovery phase, then any incomplete transactions remain in the RTRACP on the frontend, and in the journal.

The application uses the rtr_request_info and rtr_set_info calls for foreign transaction manager recovery. Generally, the application calls rtr_request_info as follows:


status = rtr_request_info(...
          rtr_info_class_t = "ftx", 
          rtr_itemcode_t   = "$name",           /* Select item */ 
          rtr_selval_t     = "facility_name", 
          rtr_itemcode_t   = "kr_id,tx_id"      /* Get items */ 
          ...); 
 

This returns information about the transactions active on the frontend for the specific facility_name. The application then checks the key range ID (kr_id) value to see if it corresponds to the foreign transaction manager ID. If it does, the application can request further information about that transaction. For example:


  status = rtr_request_info(...
           rtr_info_class_t = "ftx", 
           rtr_itemcode_t   = "tx_id",   /* Select item */ 
           rtr_selval_t     = "tx id value from previous call", 
           rtr_itemcode_t   = "xid,state,jnl_state,sr_state,bloblen,blob" 
                                         /* Get items */ 
           ...); 

The journal state (jnl_state) field will have one of three values, and the action for the transaction depends on the value, as shown in Table 2-9:

Table 2-9 Nested Transaction Recovery

Journal State: rtr_tx_jnl_prepare
Description: Transaction has been prepared. Servers have voted on the transaction and are waiting for the final vote from the foreign transaction manager. The foreign transaction manager may have called rtr_accept_tx or rtr_reject_tx in a previous incarnation, but the failure occurred before the vote was received by the router or written to the frontend's journal.
User Action: rtr_set_info. Set the transaction state to either COMMIT or ABORT.

Journal State: rtr_tx_jnl_commit
Description: Transaction has been committed. Failure occurred after the foreign transaction manager called rtr_accept_tx in a previous incarnation, and RTR is unsure whether the foreign transaction manager knows the transaction outcome.
User Action: rtr_set_info. Continue with next operation.

Journal State: rtr_tx_jnl_abort
Description: Transaction has been rejected. Possibilities:
  1. Router has not received VREQ from the frontend (in ENQ state), or the transaction had been rejected by a participant but ABORT has not reached the frontend.
  2. Router in VOTING state. Transaction timed out on router and aborted with COMSTAUNO.
User Action: rtr_set_info. Continue with next operation.
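
The decision in Table 2-9 can be pictured as a simple dispatch on the journal state. The following is a minimal sketch, not RTR code: the enumeration names (demo_jnl_*, demo_act_*) and the function demo_recovery_action are hypothetical stand-ins, and a real application would work with the jnl_state value returned by rtr_request_info and apply its decision through rtr_set_info.

```c
#include <assert.h>

/* Hypothetical stand-ins for the three journal states in Table 2-9. */
typedef enum {
    DEMO_JNL_PREPARE,
    DEMO_JNL_COMMIT,
    DEMO_JNL_ABORT
} demo_jnl_state_t;

/* What the recovery logic has to do for one recovered transaction. */
typedef enum {
    DEMO_ACT_DECIDE_OUTCOME,  /* choose COMMIT or ABORT for the transaction */
    DEMO_ACT_NOTE_COMMITTED,  /* outcome is commit; continue */
    DEMO_ACT_NOTE_ABORTED     /* outcome is abort; continue */
} demo_action_t;

/* Map a journal state to the action listed for it in Table 2-9. */
demo_action_t demo_recovery_action(demo_jnl_state_t state)
{
    switch (state) {
    case DEMO_JNL_PREPARE:
        return DEMO_ACT_DECIDE_OUTCOME;
    case DEMO_JNL_COMMIT:
        return DEMO_ACT_NOTE_COMMITTED;
    case DEMO_JNL_ABORT:
        return DEMO_ACT_NOTE_ABORTED;
    }
    assert(!"unknown journal state");
    return DEMO_ACT_DECIDE_OUTCOME;
}
```

Only the prepared state requires the foreign transaction manager to supply an outcome; in the other two states the outcome is already fixed and the recovery code moves on to the next transaction.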

During the foreign transaction manager recovery phase, if needed, the application can get the user data that was passed in the pmsg parameter in the call to rtr_prepare_tx. The application gets this data by specifying the bloblen and blob item codes in the call to rtr_request_info (see above). The XID specified by the foreign transaction manager (rtr_xid_t) is also available using the same call.

Note that since the foreign transaction manager recovery phase uses rtr_request_info or rtr_set_info, this could be done in a separate thread or process. The only requirement is that the journal has been opened on the frontend (by a call to rtr_open_channel with RTR_F_OPE_FOREIGN_TM set). Foreign transaction manager recovery code could thus be kept in separate logic that is not directly associated with the rtr_open_channel call. Exactly how this is done depends on the foreign transaction manager architecture. An example of the use of rtr_set_info is presented in the section describing the rtr_set_info call.

2.14 Use of XA Support

Users must first register a resource manager to invoke RTR XA support when creating a facility. See the RTR System Manager's Manual for more details about how to register and unregister resource managers. In the server application, specify the flag RTR_F_OPE_XA_MANAGED and the underlying resource manager information when issuing the rtr_open_channel call. Once this flag is specified for a given RTR partition, all transactions running in that RTR partition are committed using the XA interface between RTR and the resource manager. When the partition is deleted or the resource manager is unregistered, RTR commits transactions running in this partition in a conventional manner.

Note

When running RTR Version 4.0 with Oracle8, Version 8.1.5 is required.

2.15 RTR Applications in a Multiplatform Environment

Applications using RTR in a multiplatform (that is, mixed endian) environment with non-string application data have to tell RTR how to marshall the data for the destination architecture. The sender of a message must supply both a description of the application data being sent and the application data itself. This description is supplied as the msgfmt argument to rtr_send_to_server, rtr_reply_to_client, and rtr_broadcast_event.

The default (that is, when no msgfmt is supplied) is to assume the application message is string data.
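
To see why a field description is needed, consider what happens to a 32-bit integer that crosses between architectures of different endianness: its bytes must be reversed, whereas character data must be left untouched. The following sketch only illustrates that kind of per-field conversion. It is not RTR's implementation, and the function name demo_swap_u32 is hypothetical.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value, as is needed when an integer
 * written on a little-endian system is read on a big-endian one. */
uint32_t demo_swap_u32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}
```

Applying the swap twice restores the original value. A conversion like this can only be applied correctly when the position and width of each field are known, which is the information the msgfmt argument conveys.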

2.15.1 Defining a Message Format

The msgfmt string is a null-terminated ASCII string consisting of a number of field-format specifiers:

[field-format-specifier,...]

The field-format specifier is defined as:

%[dimension]field-type

where:
Field        Meaning
%            Indicates that a new field description is starting.
dimension    An optional integer denoting array cardinality (default 1).
field-type   One of the following codes:

  Code   Meaning
  UB     8-bit unsigned byte
  SB     8-bit signed byte
  UW     16-bit unsigned
  SW     16-bit signed
  UL     32-bit unsigned
  SL     32-bit signed
  C      8-bit signed char
  UC     8-bit unsigned char
  B      boolean

For example, consider the following data structure:


    typedef struct {
        rtr_uns_32_t first;
        rtr_sgn_32_t second;
        char         str[12];
    } example_t;

The msgfmt for this structure could be "%ul%sl%12c".

The transparent data type conversion of RTR does not support certain conversions (for example, floating point). Convert these to another format such as character string.
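
A format string can be checked against the structure it describes by computing the number of bytes it covers. The following is a minimal sketch of such a check, not part of the RTR API: the function name demo_msgfmt_size is hypothetical. It assumes that field-type codes are case-insensitive (the table lists uppercase codes while the example uses lowercase), that commas between specifiers are optional, and that a boolean (B) occupies one byte, which this section does not state.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the RTR API.
 * Returns the number of bytes described by a msgfmt string,
 * or -1 if the string cannot be parsed. */
long demo_msgfmt_size(const char *fmt)
{
    long total = 0;

    assert(fmt != NULL);
    while (*fmt != '\0') {
        long dim = 1;
        long width;
        char code[3] = { 0, 0, 0 };
        size_t n = 0;

        if (*fmt == ',') {                  /* assumed optional separator */
            fmt++;
            continue;
        }
        if (*fmt != '%')
            return -1;
        fmt++;

        if (isdigit((unsigned char)*fmt)) { /* optional dimension */
            char *end;
            dim = strtol(fmt, &end, 10);
            if (dim <= 0 || dim > 65535)    /* arbitrary sanity limit */
                return -1;
            fmt = end;
        }

        /* Field-type code: one or two letters, folded to uppercase. */
        while (n < 2 && isalpha((unsigned char)*fmt)) {
            code[n++] = (char)toupper((unsigned char)*fmt);
            fmt++;
        }

        if (!strcmp(code, "UB") || !strcmp(code, "SB") ||
            !strcmp(code, "C")  || !strcmp(code, "UC") ||
            !strcmp(code, "B"))             /* B as one byte is an assumption */
            width = 1;
        else if (!strcmp(code, "UW") || !strcmp(code, "SW"))
            width = 2;
        else if (!strcmp(code, "UL") || !strcmp(code, "SL"))
            width = 4;
        else
            return -1;

        total += dim * width;
    }
    return total;
}
```

For the format "%ul%sl%12c" the helper returns 20, which is 4 + 4 + 12, the combined width of the three fields of example_t. Comparing this value with sizeof(example_t) is one way to catch a format string that has drifted out of step with the structure definition.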

2.16 Application Design and Tuning Issues

This section addresses some considerations for design and tuning.

2.16.1 Transactions that Can Cause Server Failure

It is possible for a "rogue" client transaction, due to a user application bug, to "kill" the server process. If RTR were to re-apply this transaction indefinitely, all available servers would be destroyed. To avoid a transaction killing all server processes, the following mechanism is implemented:

The limitation of this feature to transactions which have not yet been accepted prevents possible transaction inconsistencies which could otherwise arise between client and server(s), and on shadow secondary sites. Thus a server application should complete any necessary validation of client transaction messages before accepting the transaction, to take advantage of this feature.

2.16.2 Transaction Grouping and Database Applications

RTR generates commit sequence numbers (CSN) for each transaction committed on the primary site. Concurrent servers can have several transactions assigned to a single CSN value. Transactions with the same CSN are understood by RTR to be independent, and hence their relative commit ordering to the database does not violate the serialisability requirements of transactions.

For purposes of throughput, RTR attempts to group as many transactions as possible into a single CSN during a given vote cycle. (Grouped transactions are only those that explicitly vote, that is, call rtr_accept_tx on the server.)

The vote cycle completes as soon as RTR is ready to ask a server to commit the next transaction. For this mechanism to work correctly with the application, RTR places the following restriction on the server design:

A server must obtain an exclusive lock on any resource that another concurrent server may be accessing for a different transaction before it issues the call to rtr_accept_tx.

Database applications, in general, comply with this requirement. If the database management software allows "dirty reads," the application should apply this rule explicitly, so that RTR can correctly serialise transactions during shadowing or other recovery. Failure to comply with this rule can cause unsynchronised copies of shadow databases.

2.16.3 Transaction Sequence and Shadow Servers

When using a facility having a shadow site and two or more partitions, the transaction sequence is the same at both shadow sites within a single partition only. Sequences across partitions are not preserved. For example, suppose these transactions are executed on the primary site in the following chronological order:

tx1_for_partition1
tx2_for_partition1
tx3_for_partition1
tx1_for_partition2
tx4_for_partition1

When replayed on the secondary, the order could be:

tx1_for_partition1
tx2_for_partition1
tx3_for_partition1
tx4_for_partition1
tx1_for_partition2

Do not write your application to expect preservation of transaction serialization across partitions.
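
The guarantee described above can be stated as a check: two execution logs are consistent if, for each partition taken on its own, the transactions appear in the same relative order. The following sketch expresses that check. The record layout demo_tx_t and the function name are hypothetical and are used only for illustration.

```c
#include <stddef.h>

/* One executed transaction: the partition it belongs to and its
 * sequence number within that partition (hypothetical record layout). */
typedef struct {
    int partition;
    int seq;
} demo_tx_t;

/* Returns 1 if, for every partition, the transactions of that partition
 * appear in the same relative order in both logs of length n; else 0. */
int demo_same_order_per_partition(const demo_tx_t *a, const demo_tx_t *b,
                                  size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Position of a[i] among the transactions of its own partition. */
        size_t rank = 0;
        for (size_t j = 0; j < i; j++)
            if (a[j].partition == a[i].partition)
                rank++;

        /* Find the transaction with the same position in the other log. */
        size_t seen = 0;
        int matched = 0;
        for (size_t j = 0; j < n; j++) {
            if (b[j].partition != a[i].partition)
                continue;
            if (seen == rank) {
                matched = (b[j].seq == a[i].seq);
                break;
            }
            seen++;
        }
        if (!matched)
            return 0;
    }
    return 1;
}
```

Applied to the two orderings listed above, the check succeeds even though tx1_for_partition2 moves from fourth to fifth place, because the order within partition 1 and within partition 2 is unchanged.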

2.16.4 Transaction Independence

RTR normally assumes that each transaction processed by a given server depends on the transactions that particular server has previously accepted.

To keep the shadowed database identical to that on the primary, RTR controls the order in which the secondary executes transactions. The secondary is constrained to execute transactions in the same order as the primary. Under some circumstances, this can lead to the secondary sitting idle, waiting to be given a transaction to execute.

RTR provides a performance enhancement that may help some applications decrease idle time on the secondary, reducing the corresponding backlog. If the application knows that particular transactions are independent of the transactions previously received, then the application can set one of two flags listed in Table 2-10:

Table 2-10 Independent Transaction Flags
Flag Meaning
RTR_F_ACC_INDEPENDENT Set on an rtr_accept_tx call to indicate this transaction is independent.
RTR_F_REP_INDEPENDENT Set on an rtr_reply_to_client call along with RTR_F_REP_ACCEPT to indicate this transaction is independent.

A transaction accepted with one of these flags can be started on the secondary while other transactions are still running. All transactions flagged with one of these flags must truly be independent of the transactions which have previously executed. They will execute in an arbitrary sequence on the secondary site.

If the server channel has been opened with RTR_F_OPE_EXPLICIT (explicit accept), then the RTR_F_REP_INDEPENDENT flag can only be used together with RTR_F_REP_ACCEPT. If the server channel has been opened with implicit accept, then using RTR_F_REP_INDEPENDENT implies using RTR_F_REP_ACCEPT.
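
The rule for combining the two reply flags can be summarized in a few lines of logic. The following sketch uses hypothetical constants (DEMO_REP_ACCEPT, DEMO_REP_INDEPENDENT) with made-up bit values rather than the real RTR_F_REP_* definitions, and the -1 return for a disallowed combination is this sketch's own convention; how RTR itself reports such a combination is not described here.

```c
/* Hypothetical bit values for illustration only; real programs use the
 * RTR_F_REP_* constants from the RTR header. */
enum {
    DEMO_REP_ACCEPT      = 1 << 0,
    DEMO_REP_INDEPENDENT = 1 << 1
};

/* Returns the effective reply flags for a server channel, or -1 if the
 * combination is not allowed. explicit_accept is nonzero when the channel
 * was opened with explicit accept. */
int demo_effective_reply_flags(int explicit_accept, int flags)
{
    if (flags & DEMO_REP_INDEPENDENT) {
        /* Explicit accept: INDEPENDENT is valid only together with ACCEPT. */
        if (explicit_accept && !(flags & DEMO_REP_ACCEPT))
            return -1;
        /* Implicit accept: INDEPENDENT implies ACCEPT. */
        flags |= DEMO_REP_ACCEPT;
    }
    return flags;
}
```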

An application can be written to create CSN boundaries to ensure independence. A transaction always receives a CSN, and the INDEPENDENT flag could be used to prevent the CSN from being incremented, so an application could be coded to force dependence between sets of transactions. This could be important in certain cases where transactions coming in at a particular time of day are independent of each other, but other transactions executed, say, at the end of the day, need to ensure that the day's transactions have been processed, and the following day's transactions need to ensure that the previous end-of-day processing has completed.

