Reliable Transaction Router
System Manager's Manual

3.5.2 Programmed Partition Management

Partition commands are programmed using rtr_set_info(). The arguments are used as follows:

pchannel supplies the address of a rtr_channel_t to receive the channel opened in the event of a successful call.
Flags must be RTR_NO_FLAGS.
Verb must be the value verb_set (from the enumeration rtr_verb_t).
Object must be rtr_partition_object .

select_qualifiers should identify the facility and partition, by name:

rtr_qualifier_value_t select_qualifiers[ 3 ];

select_qualifiers[ 0 ].qv_qualifier = rtr_facility_name;
select_qualifiers[ 0 ].qv_value = "your_facility_name_here";
select_qualifiers[ 1 ].qv_qualifier = rtr_partition_name;
select_qualifiers[ 1 ].qv_value = "your_partition_name_here";
select_qualifiers[ 2 ].qv_qualifier = rtr_qualifiers_end;
select_qualifiers[ 2 ].qv_value = NULL;

The set_qualifier list expresses the required change in partition behavior or characteristic.

The rtr_set_info() call completes asynchronously. If the function call is successful, completion is signaled by the delivery of an RTR message of type rtr_mt_closed on the channel whose identifier is returned through the pchannel argument. The programmer should retrieve this message by using rtr_receive_message(). The data accompanying the message is of type rtr_status_data_t. The completion status of the partition command can be accessed as the status field of the message data.
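The completion check described above can be sketched as follows. This is a minimal illustration, not the RTR API itself: the type and constant definitions are local stand-ins so that the fragment compiles on its own. Only the names rtr_mt_closed, rtr_status_data_t, and its status field come from this section; rtr_status_t, rtr_msg_type_t, rtr_mt_other, the numeric values, and the helper partition_cmd_completed() are assumptions made for the example. In a real program the declarations come from the RTR header and the message is obtained with rtr_receive_message().

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins (assumptions) for names normally declared by the RTR
 * header. Values and layouts here are invented for illustration only. */
typedef int rtr_status_t;
typedef enum { rtr_mt_closed = 1, rtr_mt_other = 2 } rtr_msg_type_t;
typedef struct { rtr_status_t status; } rtr_status_data_t;

/* Hypothetical helper: given the type of a message received on the channel
 * returned through pchannel, and the data that came with it, report whether
 * the partition command has completed. On completion, *cmd_status receives
 * the status field of the message data.
 * Returns 1 if the command has completed, 0 otherwise. */
int partition_cmd_completed(rtr_msg_type_t msgtype,
                            const rtr_status_data_t *data,
                            rtr_status_t *cmd_status)
{
    assert(data != NULL && cmd_status != NULL);
    if (msgtype != rtr_mt_closed)
        return 0;                /* not the completion message */
    *cmd_status = data->status;  /* completion status of the command */
    return 1;
}
```

A caller would typically loop on rtr_receive_message() until this helper returns 1, and then test the returned status before assuming that the partition change took effect.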

3.6 Managing Partitions

Partitions are managed with a set of commands or program calls, which are described in this section.

3.6.1 Controlling Shadowing

The state of shadowing for a partition can be enabled or disabled. This can be useful in the following circumstances:

Enabling site disaster protection for an application partition for the first time
Aiding recovery following a prolonged outage of a former shadow site

The following restrictions apply:

Shadowing for a partition can be turned off only in the absence of an active secondary site.
The active member must be running in remember mode.
If entered on either an active primary or secondary, the command fails with a message to this effect.
If entered on a standby of either the primary or secondary, the command is accepted but fails in the RTR router. This failure is recorded with a log file entry at the router.

Once shadowing is disabled, the secondary site servers will be unable to start up in shadow mode until shadowing is enabled again. Shadowing for the partition can be turned on by entering the command at the current active member or on any of its standbys.

RTR> SET PARTITION/FACILITY=Facility1/SHADOW Facility1:Partition1

For further information, see the SET PARTITION command in Chapter 6.

To enable shadowing, program the set_qualifier argument of rtr_set_info() as follows:

rtr_qualifier_value_t set_qualifiers[ 2 ];
rtr_partition_state_t newState = rtr_partition_state_shadow;

set_qualifiers[ 0 ].qv_qualifier = rtr_partition_state;
set_qualifiers[ 0 ].qv_value = &newState;
set_qualifiers[ 1 ].qv_qualifier = rtr_qualifiers_end;
set_qualifiers[ 1 ].qv_value = NULL;

To disable shadowing, specify newState as rtr_partition_state_noshadow.

3.6.2 Controlling Transaction Presentation

Transaction presentation is the process of passing transactions to idle server channels for processing. While transaction presentation is active, new transactions are started on the first free server channel for the appropriate partition.

Use the /SUSPEND qualifier to the SET PARTITION command to halt the presentation of new transactions to servers on the backend where the command is entered. The command completes when the processing of all currently active transactions is complete. The optional /TIMEOUT qualifier specifies, as a number of seconds, the time that the command waits for completion. If the command times out, presentation of new transactions is suspended, but there are still transactions that the servers have yet to finish processing. The operator must decide either to reenter the command and wait a further period of time, or to resume the partition. Note that use of this command does not affect any transaction timeout value specified by RTR clients, so such transactions may encounter a timeout condition if the partition remains suspended.

The /RESUME qualifier restarts presentation of transactions to the server application channels.

The following examples show how to use the qualifiers:

RTR> SET PARTITION/FACILITY=Facility1/SUSPEND/TIMEOUT=5 Facility1:Partition1
RTR>
RTR> SET PARTITION/FACILITY=Facility1/RESUME Facility1:Partition1

For a more complete description, see the SET PARTITION command in Chapter 6.

To suspend transaction presentation on a partition with a timeout of 30 seconds, program the set_qualifier argument of the rtr_set_info() call as follows:

rtr_qualifier_value_t set_qualifiers[ 3 ];
rtr_partition_state_t newState = rtr_partition_state_suspend;
rtr_uns_32_t ulTimeoutSecs = 30;

set_qualifiers[ 0 ].qv_qualifier = rtr_partition_state;
set_qualifiers[ 0 ].qv_value = &newState;
set_qualifiers[ 1 ].qv_qualifier = rtr_partition_cmd_timeout_secs;
set_qualifiers[ 1 ].qv_value = &ulTimeoutSecs;
set_qualifiers[ 2 ].qv_qualifier = rtr_qualifiers_end;
set_qualifiers[ 2 ].qv_value = NULL;

Note that the timeout is an optional element. To resume transaction presentation, specify newState as rtr_partition_state_resume.
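Because the timeout element is optional, a program may want to build the list with or without it. The following is a minimal sketch of one way to do that. The typedefs are local stand-ins, written only so that the fragment compiles without the RTR header: the names match those used in this section, but the enumerator values, the qv_value type, and the helper build_suspend_qualifiers() are assumptions. Remove the stand-ins when compiling against the real header.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins (assumptions) for declarations from the RTR header.
 * Enumerator values are invented; only the names come from this manual. */
typedef unsigned int rtr_uns_32_t;
typedef enum {
    rtr_partition_state_suspend,
    rtr_partition_state_resume
} rtr_partition_state_t;
typedef enum {
    rtr_partition_state,
    rtr_partition_cmd_timeout_secs,
    rtr_qualifiers_end
} rtr_qualifier_t;
typedef struct {
    rtr_qualifier_t qv_qualifier;
    const void *qv_value;
} rtr_qualifier_value_t;

/* Hypothetical helper: fill a set_qualifiers list for a state change.
 * 'timeout' may be NULL, in which case the optional timeout element is
 * omitted. The caller must keep 'state' and 'timeout' alive for as long
 * as the list is in use, because the list stores their addresses.
 * Returns the number of elements written, terminator included. */
size_t build_suspend_qualifiers(rtr_qualifier_value_t out[3],
                                const rtr_partition_state_t *state,
                                const rtr_uns_32_t *timeout)
{
    size_t n = 0;

    assert(out != NULL && state != NULL);

    out[n].qv_qualifier = rtr_partition_state;
    out[n].qv_value = state;
    n++;

    if (timeout != NULL) {
        out[n].qv_qualifier = rtr_partition_cmd_timeout_secs;
        out[n].qv_value = timeout;
        n++;
    }

    out[n].qv_qualifier = rtr_qualifiers_end;  /* list terminator */
    out[n].qv_value = NULL;
    return n + 1;
}
```

Passing NULL for the timeout produces a two-element list with the same shape as the shadowing example in Section 3.6.1.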

3.6.3 Controlling Recovery

The purpose of RTR automated recovery is to ensure the best possible consistency of application databases across a distributed computing environment. To achieve this, RTR relies in part on information stored in the journals of the participating systems. Should one or more of these systems be unavailable at recovery time, automated recovery may stall or fail awaiting availability of these systems and their journals. This is good from the point of view of data consistency, but bad when viewed from an application availability perspective.

If a partition enters a wait state or fails, but has neither a local nor a remote journal, an operator can instruct RTR to skip the current step in the recovery process with the /IGNORE_RECOVERY qualifier. Since this command bypasses parts of the recovery cycle, use it with caution, and only in cases where availability is valued over consistency in application databases.

The recovery cycle can also be manually restarted with the /RESTART_RECOVERY qualifier. This may be useful if the operator previously aborted automated recovery. Since this command can result in recovery of transactions from previously inaccessible journals, do not use this if your applications are sensitive to the order in which transactions are processed by the servers.

The following example shows how to use the qualifiers:

RTR> SET PARTITION/FACILITY=Facility1/IGNORE_RECOVERY Facility1:Partition1
RTR>
RTR> SET PARTITION/FACILITY=Facility1/RESTART_RECOVERY Facility1:Partition1

A complete description of the SET PARTITION command qualifiers can be found in Chapter 6.

To terminate the current recovery state, program the set_qualifier argument of rtr_set_info() as follows:

rtr_qualifier_value_t set_qualifiers[ 2 ];
rtr_partition_state_t newState = rtr_partition_state_exitwait;

set_qualifiers[ 0 ].qv_qualifier = rtr_partition_state;
set_qualifiers[ 0 ].qv_value = &newState;
set_qualifiers[ 1 ].qv_qualifier = rtr_qualifiers_end;
set_qualifiers[ 1 ].qv_value = NULL;

To restart recovery, specify newState as rtr_partition_state_recover.

3.6.4 Controlling the Active Site

RTR lets the system operator deploy a range of shadow and standby partitions in order to provide the desired degree of application resilience to failures. By default, RTR automatically manages the assignment of active and standby roles to the available partition instances. The operator can assign a relative priority to each backend on which a partition instance exists. Enter the priority as a list of backend node names in decreasing order of priority, highest first, as shown in the following example:

RTR> SET PARTITION/PRIORITY_LIST=(BE1, BE2, BE3) Facility1:Partition1

Suspend transaction presentation before entering or changing the priority list.

Chapter 7 provides more information on the SET PARTITION command.

To set the partition backend priority list, program the set_qualifier argument of the rtr_set_info() call as follows:

rtr_qualifier_value_t set_qualifiers[ 2 ];
char *szNodeList = "your,list,of,node,names,here";

set_qualifiers[ 0 ].qv_qualifier = rtr_partition_be_priority_list;
set_qualifiers[ 0 ].qv_value = &szNodeList;
set_qualifiers[ 1 ].qv_qualifier = rtr_qualifiers_end;
set_qualifiers[ 1 ].qv_value = NULL;
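When the node names are held in an array rather than written as a literal, the comma-separated string has to be assembled first. The following is a minimal sketch of such a step in standard C. The helper join_node_names() is hypothetical and not part of RTR, and the sketch assumes that the list takes the comma-separated, no-spaces form shown in the sample string above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: join 'count' node names with commas into 'buf',
 * keeping the caller's order (highest priority first).
 * Returns 0 on success, -1 if 'buf' is too small to hold the result. */
int join_node_names(const char *const names[], size_t count,
                    char *buf, size_t bufsize)
{
    size_t used = 0;
    size_t i;

    assert(names != NULL && buf != NULL);
    if (bufsize == 0)
        return -1;
    buf[0] = '\0';

    for (i = 0; i < count; i++) {
        size_t len = strlen(names[i]);
        size_t sep = (i > 0) ? 1 : 0;   /* comma before all but the first */

        /* room needed: separator + name + terminating NUL */
        if (used + sep + len + 1 > bufsize)
            return -1;
        if (sep)
            buf[used++] = ',';
        memcpy(buf + used, names[i], len);
        used += len;
        buf[used] = '\0';
    }
    return 0;
}
```

For the node names BE1, BE2, and BE3 the buffer ends up holding BE1,BE2,BE3, which can then be referenced from the qualifier list in place of the literal.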

3.6.5 Controlling Failover

In a system configured for maximum fault tolerance employing both shadows and standbys, there is a choice to be made in case the primary site fails. The /FAILOVER_POLICY qualifier to the SET PARTITION command allows the system operator to select one of the following policies that RTR should pursue in selecting the new primary site in the event of a failure:

/FAILOVER_POLICY=STANDBY causes RTR to choose a standby of the failed primary (if any) to become the new primary. If there is more than one standby, the operator may also use the priority list feature (described above) to control which standby is preferred. Depending on the size of the journal of the failed primary, there will be a holdup in the processing of transactions while the journal is recovered. This is the default behavior.
/FAILOVER_POLICY=SHADOW instructs RTR to make the active secondary (if any) the new primary. A standby of the failed primary (if any) will be elected to become the new secondary.
/FAILOVER_POLICY=COMPATIBLE_PRE_V32 is a mode that will operate with configurations that contain RTR routers running versions of the software prior to V3.2. This mode will be automatically adopted if such routers exist in or join the configuration.
This mode assigns the active role to the first instance of a partition to be declared. If the system hosting this partition becomes unreachable, the active role is assigned to the next partition instance to be declared. When the first partition's host becomes reachable again, the original roles are restored.

A factor influencing the choice of failover policy is the time required to effect a failover and the subsequent impact on client response times. The time for standby takeover of a failed node's journal depends on the size of that journal. Failover to a shadow site is effected quickly. However, if the secondary site has accumulated a backlog of transactions, they must be processed before any new transactions can be started. The choice will be determined by the characteristics of your application and configuration.

The following example shows one use of the /FAILOVER_POLICY qualifier:

RTR> SET PARTITION/FAILOVER_POLICY=SHADOW Facility1:Partition1

For more information see the SET PARTITION command in Chapter 6.

To set the partition failover policy, program the set_qualifier argument of the rtr_set_info() call as follows:

rtr_qualifier_value_t set_qualifiers[ 2 ];
rtr_partition_failover_policy_t newPolicy; /* assign one of the legal values listed below */

set_qualifiers[ 0 ].qv_qualifier = rtr_partition_failover_policy;
set_qualifiers[ 0 ].qv_value = &newPolicy;
set_qualifiers[ 1 ].qv_qualifier = rtr_qualifiers_end;
set_qualifiers[ 1 ].qv_value = NULL;

Legal values for newPolicy are:

rtr_partition_fail_to_standby
rtr_partition_fail_to_shadow
rtr_partition_pre32_compatible
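A management program that accepts the same keywords as the /FAILOVER_POLICY qualifier needs to translate them into these values. The following is a minimal sketch under stated assumptions: the enumeration is a local stand-in with invented numeric values, the pairing of keywords with constants is inferred from their names and is not stated explicitly in this section, and the helper failover_policy_from_keyword() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in (assumption) for the enumeration declared by the RTR header.
 * The enumerator names come from this manual; the values are invented. */
typedef enum {
    rtr_partition_fail_to_standby,
    rtr_partition_fail_to_shadow,
    rtr_partition_pre32_compatible
} rtr_partition_failover_policy_t;

/* Hypothetical helper: translate a /FAILOVER_POLICY keyword into the
 * corresponding programmatic value. Matching is exact and case-sensitive.
 * Returns 0 on success; returns -1 and leaves *policy unchanged if the
 * keyword is not recognized. */
int failover_policy_from_keyword(const char *keyword,
                                 rtr_partition_failover_policy_t *policy)
{
    assert(keyword != NULL && policy != NULL);

    if (strcmp(keyword, "STANDBY") == 0)
        *policy = rtr_partition_fail_to_standby;
    else if (strcmp(keyword, "SHADOW") == 0)
        *policy = rtr_partition_fail_to_shadow;
    else if (strcmp(keyword, "COMPATIBLE_PRE_V32") == 0)
        *policy = rtr_partition_pre32_compatible;
    else
        return -1;
    return 0;
}
```

The resulting value would be assigned to newPolicy before the qualifier list is passed to rtr_set_info().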

3.6.6 Controlling Transaction Replay

RTR can control transaction replay in cases where a killer message occurs during replay and prevents recovery from continuing normally. A killer message is a message capable of causing repeated server application failure during recovery, so that server availability is lost. This is typically the result of an improperly handled condition or application programming error within the server itself. Under such circumstances it may be desirable to sidestep a particular transaction, maintain server operation, and manually process the transaction at some later time.

The RTR solution is to establish, for a given partition, the maximum number of retries for any given transaction presented during recovery. Once this limit has been exceeded, the offending transaction is removed from the recovery process and is written to the journal as an exception record. Subsequent processing of this transaction requires manual intervention by someone qualified to evaluate and correct the situation in both the application and in RTR. Once the application status is understood, the SET TRANSACTION command can be used to update the journal, thus ensuring that the final state of any manually transacted exceptions is accurately reflected in future recovery operations.

The recovery retry count is partition-specific, and applies to both local and shadow recovery operations. The default is no limit on the number of retries, which permits a killer message to bring down all available servers servicing a given partition.
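The effect of the retry limit can be pictured with a small model. This is a sketch of the behavior described above, not RTR's implementation. It assumes that the limit counts failed presentations and that the transaction is set aside once the number of failures exceeds the limit. The value 0 standing for "no limit", the callback type, and every name in the block are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NO_RETRY_LIMIT 0u  /* assumption: 0 means "no limit" in this model */

typedef enum { REPLAY_PROCESSED, REPLAY_EXCEPTION } replay_outcome_t;

/* Callback that presents the transaction to a server once.
 * Returns nonzero if the server processed it, 0 if the server failed. */
typedef int (*present_fn)(void *ctx);

/* Present one transaction repeatedly until a server processes it or the
 * failure count exceeds 'limit'. With NO_RETRY_LIMIT the loop only ends
 * when a server succeeds, which is how a killer message can keep taking
 * servers down. 'failures_out' may be NULL. */
replay_outcome_t replay_with_limit(unsigned limit, present_fn present,
                                   void *ctx, unsigned *failures_out)
{
    unsigned failures = 0;
    replay_outcome_t outcome = REPLAY_PROCESSED;

    assert(present != NULL);
    while (!present(ctx)) {
        failures++;
        if (limit != NO_RETRY_LIMIT && failures > limit) {
            outcome = REPLAY_EXCEPTION;  /* set aside as an exception */
            break;
        }
    }
    if (failures_out != NULL)
        *failures_out = failures;
    return outcome;
}

/* Example callback: a killer message, the server fails every time. */
int always_fails(void *ctx)
{
    (void)ctx;
    return 0;
}

/* Example callback: fails *ctx times (an unsigned counter), then succeeds. */
int fails_n_times(void *ctx)
{
    unsigned *remaining = (unsigned *)ctx;
    if (*remaining == 0)
        return 1;
    (*remaining)--;
    return 0;
}
```

With a limit of 3, the always_fails callback is set aside after its fourth failure, while a transaction that fails twice and then succeeds is processed normally.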

The recovery retry count should be set before starting (or restarting) the application servers so that the limit is established prior to the start of recovery operations.

The following example shows how to set the retry count:

RTR> SET PARTITION/RECOVERY_RETRY_COUNT=3 Facility1:Partition1

See Chapter 7 for more information on the SET PARTITION command.

To set the partition transaction recovery limit, program the set_qualifier argument of rtr_set_info() as follows:

rtr_qualifier_value_t set_qualifiers[ 2 ];
rtr_uns_32_t newLimit = . . .;

set_qualifiers[ 0 ].qv_qualifier = rtr_partition_rcvy_retry_count;
set_qualifiers[ 0 ].qv_value = &newLimit;
set_qualifiers[ 1 ].qv_qualifier = rtr_qualifiers_end;
set_qualifiers[ 1 ].qv_value = NULL;

3.6.7 Partition Persistence

Partitions in RTR are designed to be persistent, remaining until explicitly removed during normal RTR processing. However, under certain conditions, relics of partitions can remain in the RTR journal. RTR automatically performs some cleanup of such records, but depends on the creation of the relevant facility to initiate this process.

For example, in a test environment, many facilities are created for temporary use, with no intention of retaining those facilities. Because the creation of each facility may cause the creation of associated records for a partition in the RTR journal, creating many ad hoc facilities can cause the RTR journal to become filled. In such a case, when trying to create a new partition (or open a new server channel), the error message NOMOREPRT may appear.

To correct this problem, the journal must be purged of these ad hoc entries. To purge the RTR journal of such unwanted transactions, use the DUMP JOURNAL command to verify the partition name and transaction ID of the unwanted transactions, and use the SET TRANSACTION command with the partition name and transaction ID to set the state to DONE. Recreate the facility with the CREATE FACILITY command.

3.7 Displaying Partition Information

Information on the definition and state of a partition is displayed with the SHOW PARTITION command, as seen in the following example. The information of interest in the context of partition management relates to the backend instance of the partition. See Chapter 7 for more information on the SHOW PARTITION command.

RTR> show partition/backend
Backend partitions on node BE1 at Wed Feb 24 15:07:50 1999

Partition name                   Facility               State
RTR$DEFAULT_PARTITION_16777217   RTR$DEFAULT_FACILITY   active
RTR$DEFAULT_PARTITION_16777218   RTR$DEFAULT_FACILITY   active

Chapter 4
Transaction Management

4.1 Overview

This section describes the concepts of RTR's transaction management capability.

The RTR transaction is the center of an RTR application, and transaction state is the property that characterizes a transaction's current condition. Whenever a transaction progresses from one stage to another, the transaction state is updated to reflect a transaction transition. Transaction states are maintained in memory. Transaction states are also stored in the RTR Journal for recovery purposes.

Three different states are used internally by RTR to keep track of transaction status.

Transaction Runtime State
Transaction Journal State
Transaction Server State

These three state types are very closely related. The Transaction Runtime State, also known as Transaction State, describes how a transaction progresses from an RTR role (FE, TR, BE) point of view. For example, a transaction can enter a stage in which its transaction state from an RTR frontend viewpoint is different from its transaction state from the viewpoint of an RTR router.

The Transaction Journal State describes how a transaction branch running on an RTR backend progresses from the RTR journal perspective. The Transaction Journal State and the Transaction Server State belong to each separate branch (participating partition) of the transaction. When a transaction branch changes state, its corresponding Transaction Journal State is updated and the new state, along with other information pertaining to this transaction, is stored in the RTR journal. The Transaction Journal State is primarily used by RTR to perform the recovery replay of a transaction after a failure, if necessary. An RTR frontend and router will not see this state. Note that because the Transaction Runtime State is not always stored immediately in the journal, the state in the journal may not always reflect the actual state of the transaction. The following table describes the Transaction Journal States.

Table 4-1 Transaction Journal States
Transaction Journal State (by Branch)	Explanation of State
SENDING	Initial state of the transaction branch.
VOTED	The server has voted and the vote has been written to disk.
COMMIT	RTR has asked the servers to commit the transaction.
ABORT	RTR has asked the servers to roll back the transaction.
DONE	The servers have informed RTR that the transaction has been committed to the database. It is safe to FORGET the transaction.
PRI_DONE	The primary server has committed the transaction; the secondary may not have done so. This is the typical case of a REMEMBER transaction.
EXCEPTION	RTR asked the server to commit the transaction, but the server failed to commit it to the database. The transaction needs manual reconciliation.
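A tool that post-processes journal state names can encode Table 4-1 directly. The following is a minimal sketch: the TJS_ identifiers and the three functions are local to this example and are not RTR API names, and the two predicates restate only what the table says about DONE and EXCEPTION.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative enumeration of the journal states in Table 4-1.
 * These identifiers are local to this sketch, not RTR API names. */
typedef enum {
    TJS_SENDING, TJS_VOTED, TJS_COMMIT, TJS_ABORT,
    TJS_DONE, TJS_PRI_DONE, TJS_EXCEPTION,
    TJS_UNKNOWN   /* returned for any name not listed in the table */
} journal_state_t;

/* Map a state name as written in Table 4-1 to the enumeration.
 * The order of 'names' must match the order of the enumerators. */
journal_state_t journal_state_from_name(const char *name)
{
    static const char *const names[] = {
        "SENDING", "VOTED", "COMMIT", "ABORT",
        "DONE", "PRI_DONE", "EXCEPTION"
    };
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof names / sizeof names[0]; i++)
        if (strcmp(name, names[i]) == 0)
            return (journal_state_t)i;
    return TJS_UNKNOWN;
}

/* Per Table 4-1, DONE is the state in which it is safe to forget
 * the transaction. */
int journal_state_safe_to_forget(journal_state_t s)
{
    return s == TJS_DONE;
}

/* Per Table 4-1, EXCEPTION is the state that needs manual reconciliation. */
int journal_state_needs_manual_action(journal_state_t s)
{
    return s == TJS_EXCEPTION;
}
```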

The Transaction Server State describes transaction state as seen by a specific server, serving that branch of the transaction. RTR uses this state to determine if a server is available to process a new transaction or if a server has voted on a particular transaction. As with the Transaction Journal State, the Transaction Server State is only relevant at the backend.

RTR provides a set of comprehensive management utilities to help users closely monitor the flow of a transaction and all three types of states associated with that transaction. These utilities help users understand how a transaction migrates from one stage to another and help diagnose problems.

Use the SHOW TRANSACTION command to examine a transaction's up-to-date status on frontend, router or backend roles. With this command, users can see all three types of transaction states of a particular transaction and also understand how the RTR journal and server applications perceive this transaction. When a transaction commits or aborts, all status associated with this transaction is removed from memory and can no longer be monitored by the command.

The DUMP JOURNAL command can be used to trace and review the flow of a transaction. The RTR journal saves all of the information about a transaction. This includes its transaction journal state and the transaction messages (records) received from the RTR client and sent to the server. The information is kept until a transaction is committed or aborted and all participants have been notified.

Use the SET TRANSACTION command to modify the current state of a transaction to a new state. This command can be used to circumvent an unexpected situation. For example, in a situation where two shadowed servers are configured, the system administrator might decide not to replay (recover) all remembered transactions in an RTR journal after a failure. In that case, the SET TRANSACTION command could set specified transactions from the PRI_DONE or REMEMBER state to the DONE state, avoiding the delay of replaying remembered transactions from the journal and so speeding recovery. The SET TRANSACTION command should only be used by experienced RTR system administrators, as the command introduces the risk of corrupting or losing transactions if used incorrectly. It can be used on the backend only, and the RTR log file must be turned on for this command.

Log file entries are made for all transaction state changes for debugging and auditing purposes.
