When a server votes on a transaction, RTR expects the server to commit the transaction to the database when RTR makes the request. If for some reason the server cannot do so, the server has two choices:
EXCEPTION transactions can be inspected with the DUMP JOURNAL command.
The final state of the transaction should say EXCEPTION.
4.2.1 Dealing with EXCEPTION Transactions
The system administrator must decide what to do with transactions that are marked EXCEPTION. There are two choices:
EXCEPTION transactions keep the application available, although they
cause some loss of data integrity. EXCEPTION transactions are
considered committed by the initiator of the transaction, as well as by
the other participants (such as the other shadow member). Therefore,
subsequent transactions, which are dependent on the results of this
transaction, could produce erroneous outcomes. In some applications,
the erroneous outcomes do not matter. In applications where the outcome
does matter, the best approach is to crash the application, and allow
the system administrator to manually intervene.
4.3 Transaction State Changes
There are eight valid state changes allowed for the SET TRANSACTION command. Attempting to change a transaction to a state that is not allowed produces the error message %RTR-E-INVSTATCHANGE, Invalid to change from current state to the specified state. Table 4-2 identifies the valid state changes.
Current State | New State: COMMIT | New State: ABORT | New State: EXCEPTION | New State: DONE |
---|---|---|---|---|
SENDING | | YES | | |
VOTED | YES | YES | | |
COMMIT | | | YES | YES |
EXCEPTION | YES | | | YES |
PRI_DONE | | | | YES |
All transaction states referenced in Table 4-2 are RTR journal states. Use the RTR commands DUMP JOURNAL or SHOW TRANSACTION to determine the journal state for each transaction branch.
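As an illustration, the valid state changes can be captured in a small lookup so that a script can reject an invalid change before a SET TRANSACTION command is issued. This is a minimal sketch, not part of RTR: the names `VALID_CHANGES` and `is_valid_change` are hypothetical, and the transition set is an assumed transcription of Table 4-2. Only SENDING to ABORT, VOTED to ABORT, and PRI_DONE to DONE appear in the command examples in this chapter, so check the remaining entries against the documentation for your RTR release.

```python
# Assumed transcription of Table 4-2: current journal state -> allowed new states.
VALID_CHANGES = {
    "SENDING": {"ABORT"},
    "VOTED": {"COMMIT", "ABORT"},
    "COMMIT": {"EXCEPTION", "DONE"},
    "EXCEPTION": {"COMMIT", "DONE"},
    "PRI_DONE": {"DONE"},
}


def is_valid_change(current: str, new: str) -> bool:
    """Return True if the lookup allows changing `current` to `new`."""
    return new.upper() in VALID_CHANGES.get(current.upper(), set())


def count_valid_changes() -> int:
    """Count every (current, new) pair in the lookup."""
    return sum(len(targets) for targets in VALID_CHANGES.values())


if __name__ == "__main__":
    print(is_valid_change("VOTED", "ABORT"))
    print(is_valid_change("SENDING", "COMMIT"))
    print(count_valid_changes())
```

The lookup holds eight pairs in total, matching the count of valid state changes stated for the SET TRANSACTION command.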
Four typical situations are listed below where transaction state changes by the system administrator are allowed.
This operation could lead to data inconsistency if used injudiciously, and should be used only after careful research.
After the SET TRANSACTION command is executed, use the DUMP JOURNAL
command to verify the result.
4.4 Command Line Examples
The following is an example of the SET TRANSACTION command:
    RTR> start rtr
    RTR> set log/file=settran
    RTR> set transaction/state=PRI_DONE/new_state=DONE/facility=Facility1/-
    _RTR> partition=Partition1 *
This example sets all transactions whose current state is PRI_DONE (remember) to DONE on facility Facility1 and partition Partition1. The log file, settran, records the transaction state changes. The changes can be viewed with the SHOW TRANSACTION command or the DUMP JOURNAL command. In a shadow recovery situation, this clears the journal of remember transactions and provides a quick turnaround of the shadow site.
The following example shows how RTR commands monitor and manipulate three different transaction states. Consider a scenario where a distributed transaction accesses two RTR partitions. The multiple-participant distributed transaction would have two transaction branches accessing different RTR partitions, say part1 and part2, respectively.
The client commits the transaction and calls rtr_accept_tx(), which prompts RTR to start the two-phase commit protocol. RTR sends a prepare message to the two participants. Upon receiving the prepare message (mt_prepare), one of the server applications is ready to commit and casts its vote by calling rtr_accept_tx(). RTR writes a VOTE record in the RTR journal and sends the vote message back to the router. However, due to an unexpected defect in the application software, the second server has not sent its VOTE message back to the RTR router. Thus, the transaction is stalled in the second server.
To examine this situation, an RTR system administrator should first use the SHOW TRANSACTION/BACKEND command on the backend node to analyze the transaction's status. As shown in the following example, the transaction runtime state is RECEIVING, indicating that the distributed transaction is not yet committed. The server states for the transaction branches are VOTED and VREQ respectively: the server for one branch has voted, whereas the other branch is still in the "Vote Request" (VREQ) state. The journal states for the transaction branches are VOTED and SENDING: one branch has voted and its VOTED record has been written to the RTR journal, while the other branch is still processing a message from the client and has not yet advanced to the VOTED state. The journal states recorded in the RTR journal are thus consistent with the server states.
A transaction branch's journal state is persistent and is therefore used by the SET TRANSACTION command to change a transaction's state. The DUMP JOURNAL command is also useful to examine each transaction branch's journal state.
    Backend transactions on node nodea at Mon Mar 13 16:02:42 2000
    Tid: 3ad01f10,0,0,0,0,3ad01f10,a08730b4
    Facility: test
    Frontend: nodea
    FE-User: tu.7006
    State: RECEIVING
    Start time: Mon Mar 13 16:00:08 2000
    Key-Range-Id: 16777216,16777217
    Router: nodea
    Invocation: ORIGINAL,ORIGINAL
    Active-Key-Ranges: 2
    Recovering-Key-Ranges: 0
    Total-Tx-Enqs: 2
    Server-Pid: 7006,7006
    Server-State: VOTED,VREQ
    Journal-Node: nodea.com,nodea.com
    Journal-State: VOTED,SENDING
    First-Enq: 1,2
    Nr-Enqs: 1,1
    Nr-Replies: 0,0
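The per-branch fields in this display are comma-separated lists, one entry per transaction branch. The following sketch shows one way to line them up so that each branch's server state can be compared with its journal state. It assumes that the entries appear in the same branch order in every field, as the description above implies; the function name `branch_states` and the dictionary input are hypothetical and are not part of RTR.

```python
def branch_states(fields: dict[str, str]) -> list[dict[str, str]]:
    """Split comma-separated per-branch values into one record per branch."""
    per_branch = ("Key-Range-Id", "Server-State", "Journal-State")
    columns = [fields[name].split(",") for name in per_branch]
    # zip(*columns) pairs the first entries together, then the second entries.
    return [dict(zip(per_branch, values)) for values in zip(*columns)]


# Field values copied from the SHOW TRANSACTION/BACKEND display above.
display = {
    "Key-Range-Id": "16777216,16777217",
    "Server-State": "VOTED,VREQ",
    "Journal-State": "VOTED,SENDING",
}

for branch in branch_states(display):
    print(branch)
```

Read this way, the second branch (server state VREQ, journal state SENDING) is the one that has not yet voted.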
As previously described in this scenario, the transaction is stalled in one of the servers. To resolve this situation, use the RTR SET TRANSACTION command to abort the transaction. Change the journal state of either transaction branch to ABORT, as shown in the following examples:
    RTR> set transaction/new=abort/state=voted/facility=test/partition=part1
    %RTR-S-SETTRANDONE, 1 transaction(s) updated in partition part1 of facility test

or

    RTR> set transaction/new=abort/state=sending/facility=test/partition=part2
    %RTR-S-SETTRANDONE, 1 transaction(s) updated in partition part2 of facility test
See Chapter 7 for detailed information on these commands.
With RTR shadowing, your system can recover from a site disaster without the need for special coding within your application program.
A database is said to be shadowed when two copies of the same database are deployed on separate nodes at two different locations, typically two different sites. Each location maintains a copy of the database used by the server application, and RTR keeps the database copies synchronized. Shadow site configurations can contain two nodes at separate sites, two nodes in a cluster, or two clusters at separate sites. When setting up a shadow configuration for two nodes in a cluster, the syntax must explicitly state that the nodes are not to be standby nodes.
Concurrent servers handle similar transactions (that is, transactions in
the same key range, but not the same transactions). Standby servers do
not handle transactions at all (for the given key range), and shadow
servers handle the same transactions.
5.1 Primary and Secondary Roles
There is a concept of primary and secondary roles for the shadow server pair, although in most cases this is transparent to the user when the processing is the same on both sites.
The assignment of primary and secondary roles to partitions can be
managed by the partition priority list, or left to RTR. If left to RTR,
initial role assignment is arbitrary, in that the first server of a
shadow pair to start is given the primary role, and the second the
secondary. The assigned roles may change, as servers come and go. Roles
are required, since RTR needs to determine the voting order on the
primary site before the transaction is presented to the secondary site.
5.2 Automatic Features
Shadow sites each have an identical copy of the customer's database.
Transactions are sent by RTR to both sites. RTR ensures that they are processed by the servers in the same order on each site, so that both copies of the customer database remain up to date.
A transaction is sent to the secondary site only after the primary has accepted it, or if the primary fails before being asked to vote.
RTR suppresses replies and broadcasts issued by the secondary shadow
server.
5.2.1 Shadow Events
RTR provides the following shadowing events:
RTR_EVTNUM_SRPRIMARY | Server is in primary mode |
RTR_EVTNUM_SRSTANDBY | Server is in standby mode |
RTR_EVTNUM_SRSECONDARY | Server is in secondary mode |
RTR_EVTNUM_SRSHADOWLOST | Server has lost its shadow partner |
RTR_EVTNUM_SRSHADOWGAIN | Server has gained its shadow partner |
RTR_EVTNUM_SRRECOVERCMPL | Server has completed recovery |
The shadow events are delivered with no special status and no data. They are delivered only to the servers whose state has changed.
A server receives RTR_EVTNUM_SRPRIMARY under the following circumstances:
A server receives RTR_EVTNUM_SRSTANDBY when it starts up and servers already exist for the same key range on another node in the same cluster.
A server receives RTR_EVTNUM_SRSECONDARY when it starts up and a shadow primary set of servers exists elsewhere.
A server receives RTR_EVTNUM_SRSHADOWLOST if it is running as primary and the secondary goes away.
A server receives RTR_EVTNUM_SRSHADOWGAIN if it is running as primary and a secondary node starts up.
A server receives RTR_EVTNUM_SRRECOVERCMPL when it has finished doing
recovery operations and is ready to start processing new transactions.
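One way an application might use these events is to keep a small record of its current shadow status and update it as each event arrives. The following is a minimal sketch under stated assumptions: the event names come from the list above, but strings stand in for the numeric event constants, and the `ShadowStatus` class and its fields are hypothetical. How a server subscribes to and receives events through the RTR API is not shown.

```python
from dataclasses import dataclass
from typing import Optional

# Events that report the server's role (names from the shadow event list).
ROLE_EVENTS = {
    "RTR_EVTNUM_SRPRIMARY": "PRIMARY",
    "RTR_EVTNUM_SRSTANDBY": "STANDBY",
    "RTR_EVTNUM_SRSECONDARY": "SECONDARY",
}


@dataclass
class ShadowStatus:
    """Hypothetical record of what the shadow events have reported so far."""
    role: Optional[str] = None
    shadow_partner_present: Optional[bool] = None
    recovery_complete: bool = False

    def apply(self, event_name: str) -> None:
        """Update the record for one received shadow event."""
        if event_name in ROLE_EVENTS:
            self.role = ROLE_EVENTS[event_name]
        elif event_name == "RTR_EVTNUM_SRSHADOWLOST":
            self.shadow_partner_present = False
        elif event_name == "RTR_EVTNUM_SRSHADOWGAIN":
            self.shadow_partner_present = True
        elif event_name == "RTR_EVTNUM_SRRECOVERCMPL":
            self.recovery_complete = True
        else:
            raise ValueError(f"not a shadow event: {event_name}")


status = ShadowStatus()
for event in ("RTR_EVTNUM_SRPRIMARY", "RTR_EVTNUM_SRSHADOWLOST"):
    status.apply(event)
print(status)
```

Because the events carry no data, the event identity alone determines the update, which keeps the handler to a simple dispatch.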
5.3 RTR Journal System
The RTR journal is used for the following purposes:
The amount of space required for the journal depends upon the:
Thus a journal file is often quite large.
The /MAXIMUM_BLOCKS qualifier on the CREATE JOURNAL command controls how large a journal may become. The /MAXIMUM_BLOCKS qualifier defines the maximum number of blocks which the journal is allowed to occupy on any one disk. RTR does not check if this amount of space is actually available, as the disk space specified by /MAXIMUM_BLOCKS is used only on demand by RTR when insufficient space is available in the space allocated by the /BLOCKS qualifier.
The /BLOCKS qualifier specifies the maximum size of the journal that RTR attempts to use. The actual number of blocks used may vary, depending upon the load on RTR.
The command MODIFY JOURNAL also accepts the /BLOCKS and /MAXIMUM_BLOCKS qualifiers.
Journal file extension occurs on demand when RTR detects that a "write to journal" would otherwise fail due to lack of space. Journal file truncation takes place periodically when enough free blocks are detected.
Refer to MODIFY JOURNAL for the syntax description of the MODIFY JOURNAL command.
    RTR> show journal/files/full
    RTR journal:-
    Disk: /dev/rz3a Blocks: 2500 Allocated: 1253 Maximum: 3500
    File: //rtrjnl/anders/BRONZE.J00
    RTR>
If a shadow site fails, RTR allows transactions to continue to be processed on the remaining site. The intermediate transactions processed by the remaining server or servers are retained by RTR; when the failed site restarts, these transactions are sent to this site as part of a shadow-recovery operation, thus bringing the failed site back up to date.
Because these transactions are stored in the RTR journal, the journal must be created with enough disk space in reserve to store data for the longest expected outage. The required size can be calculated as:
(transaction messages per second × (transaction message length + 70) × seconds of outage) + 5% file overhead
The result in bytes must be divided by 512 to obtain size in blocks.
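The calculation can be sketched in a few lines of code. This is an illustration only: the function and parameter names are hypothetical, the 5% file overhead is interpreted as 5% of the computed byte count, and the result is rounded up to a whole number of blocks, which the text does not specify.

```python
BLOCK_BYTES = 512           # bytes per block, as stated in the text
PER_MESSAGE_OVERHEAD = 70   # bytes added to each message length in the formula


def journal_blocks(messages_per_second: int,
                   message_length: int,
                   outage_seconds: int) -> int:
    """Estimate the journal size in blocks for the longest expected outage."""
    raw_bytes = (messages_per_second
                 * (message_length + PER_MESSAGE_OVERHEAD)
                 * outage_seconds)
    # Add 5% file overhead (assumed to apply to raw_bytes) and round up to
    # whole blocks, using integer arithmetic to avoid floating-point error.
    return -(-raw_bytes * 105 // (100 * BLOCK_BYTES))


if __name__ == "__main__":
    # 100 messages per second, 430-byte messages, one-hour outage.
    print(journal_blocks(100, 430, 3600))  # prints 369141
```

In this example, 100 messages per second of 430 bytes each over a one-hour outage yields 369141 blocks, or roughly 189 MB.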
The overhead required when calculating journal size comes from internal journal data (block stamping) of approximately 3%. In addition, there is internal transaction data per (client to server) transactional message, and some further data per transaction (concerning voting and transaction completion).
Also, RTR prevents further transactional data from being written to the journal when it is nearly full, but continues to allow deletes from the journal (deletes also cause data to be written to the journal). Ten segments are held in reserve for storing information about deleted transactions even when RTR cannot accept further transactions because the journal is full.
If the journal disk becomes full, transactions are aborted until the shadow partner restarts and empties the journal of transactions to be replayed.