Reliable Transaction Router
System Manager's Manual

Chapter 3
Partition Management

3.1 Overview

This section describes the concepts and operations of RTR's partitions.

3.1.1 What is a Partition?

Partitions are subdivisions of the range of values of a routing key. They are used with a partitioned data model and RTR data content routing. A partition exists for each distinct range of values in the routing key for which a server is available to process transactions. RTR provides failure tolerance by allowing system operators to start redundant instances of partitions in a distributed network and by automatically managing the state and flow of transactions to the partition instances.
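The relationship between routing key values and partitions can be pictured with a small lookup routine. The following is a minimal sketch only: the structure key_partition and the function find_partition() are hypothetical names invented for illustration, and the sketch assumes a single unsigned integer key segment with inclusive bounds. It does not show how RTR itself routes transactions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a partition: a named, inclusive range of
 * routing key values. These are illustrative names, not RTR types. */
struct key_partition {
    const char  *name;
    unsigned int lo;   /* lowest key value served (inclusive)  */
    unsigned int hi;   /* highest key value served (inclusive) */
};

/* Return the partition whose range contains key, or NULL when no
 * partition (and therefore no server) covers that value. */
const struct key_partition *
find_partition(const struct key_partition *parts, size_t count,
               unsigned int key)
{
    assert(parts != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (key >= parts[i].lo && key <= parts[i].hi)
            return &parts[i];
    }
    return NULL;
}
```

With two partitions covering 0 to 499 and 500 to 999, a key of 42 selects the first partition, and a key of 1000 selects none.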

Partition instances support the following relationships:

Concurrency - this attribute permits multiple server channels to be connected to an instance of a partition.
Standbys - multiple instances of a partition distributed over the nodes of a cluster. A standby set may have as many members as a cluster has nodes or, with some restrictions, you may place a standby on any network node. At any one time, one member of the set is active while the others wait in standby mode to take over in the event of failure of the active member.
Shadows - shadow instances provide site disaster protection by allowing replication of transaction processing at a remote site. A pair of partition instances (or standby sets thereof) cooperate to provide this replication, with provision for automatic recovery of a shadow member restarting after a failure.

Prior to RTR V3.2, the creation and behavior of a partition were tied to the declaration of server application channels. Partitions and their characteristics can now be defined by the system operator. This has the following advantages:

It allows a further de-coupling of the application from its operating environment, thus reducing application programming requirements.
It allows system operators to make choices concerning the runtime behavior of the system.

3.1.2 What is Partition Management?

Before RTR V3.2, the management of the state of a partition was an entirely automatic function of the distributed RTR system. Starting with RTR V3.2, the system operator can issue commands to control certain partition characteristics, and to set preferences concerning partition behavior.

3.2 Partition Naming

A prerequisite for partition management is the ability to identify a partition in the system that is to be the subject of management commands. For this purpose, partitions have been given names, which may be drawn from a number of sources described below.

3.2.1 Default Partition Names

Unless a name is supplied by one of the methods described below, a partition receives an automatically generated default name. Default names give system operators access to the partition command set without the need to change existing application programs or site configuration procedures.

3.2.2 Programmer Supplied Names

An extension to the rtr_open_channel() call allows the application programmer to supply a name when opening a server channel. The pkeyseg argument specifies an additional item of type rtr_keyseg_t, assigning the following values:

ks_type = rtr_keyseg_partition, indicating that a partition name is being passed.
ks_lo_bound should point to the null-terminated string to use for the partition name.
ks_hi_bound must be NULL.

Using this model, the partition segments and key ranges served by the server are still specified by the server when the channel is opened.

3.2.3 System Manager Supplied Partition Names

Partitions can be defined by the system manager through the use of the CREATE PARTITION system management command, or through use of rtr_open_channel() flag arguments. The system manager can set partition characteristics with this command, and applications can open channels to the partition by name. See Section 3.4 for an example of passing a partition name with rtr_open_channel().

3.2.4 Name Format and Scope

A valid partition name must be no more than 63 characters long and can combine alphanumeric characters (abc123), the plus sign (+), the underscore (_), and the dollar sign ($). Partition names must be unique within a facility and should be referenced on the command line together with the facility name when using partition commands. Partition names exist only on the backend where the partition resides; they are not visible at the RTR routers.
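The naming rules above can be checked before a name is passed to RTR. The following is a minimal sketch: is_valid_partition_name() and PARTITION_NAME_MAX are hypothetical names, not part of the RTR API, and the rejection of an empty name is an assumption, since the text does not state a minimum length.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

#define PARTITION_NAME_MAX 63   /* maximum length stated in the text */

/* Return true if name obeys the documented rules: at most 63
 * characters, each alphanumeric or one of '+', '_' and '$'.
 * Assumption: an empty name is treated as invalid. */
bool is_valid_partition_name(const char *name)
{
    size_t len = 0;

    assert(name != NULL);
    for (; name[len] != '\0'; len++) {
        unsigned char c = (unsigned char)name[len];

        if (len >= PARTITION_NAME_MAX)
            return false;               /* a 64th character exists */
        if (!isalnum(c) && c != '+' && c != '_' && c != '$')
            return false;               /* character not permitted */
    }
    return len > 0;
}
```

For example, Partition1 and par_one pass this check, while a name containing a hyphen or a space does not.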

3.3 Life Cycle of a Partition

3.3.1 Implicit Partition Creation

Partitions are created implicitly when an application program calls rtr_open_channel() to create a server channel, specifying the key segments and value ranges for the segments with the pkeyseg argument. Other partition attributes are established with the flags argument. Before RTR V3.2, this was the only way in which partitions could be created. Partitions created in this way are automatically deleted when the last server channel to the partition is closed.

3.3.2 Explicit Partition Creation

Partitions can also be created by the system operator, using system management commands, before the server application program starts up. This gives the operator more control over partition characteristics. Partitions created in this way remain in the system until they are explicitly deleted by the operator or RTR is stopped.

3.3.3 Persistence of Partition Definitions

RTR stores partition definitions in the journal, and records for each transaction the partition in which it was processed. This is convenient when viewing or editing the contents of the journal, where the partition name can be used to select a subset of the transactions in the journal. RTR will not permit a change in the partition name or definition as long as transactions remain in the journal that were processed under the current name or definition for the partition. If transactions remain in the journal and you need to change the partition name or definition, you can take the following actions:

Start appropriate servers to complete processing of the transactions.
Remove the transactions from the journal with the SET TRANSACTION command.
Replace the RTR journal with the CREATE JOURNAL/SUPERSEDE command. Note that this will destroy any transactions remaining in the journal and should be done with caution.

3.4 Binding Server Channels to Named Partitions

To open a channel to an explicitly created partition, a server application passes the name of the partition through the pkeyseg argument of the rtr_open_channel() call. It is not necessary to pass key segment descriptors, but if the application does so, they must be compatible with the existing partition definition. You may pass partition characteristics through the flags argument, but these will be superseded by those of the existing partition.

Example:

    RTR> create partition/KEY1=(type. . .) par_one
    . . .
    rtr_keyseg_t partition_name;
    partition_name.ks_type = rtr_keyseg_partition;
    partition_name.ks_lo_bound = "par_one";
    status = rtr_open_channel( . . ., RTR_F_OPE_SERVER, . . ., 1, &partition_name);

Summarizing, to fully de-couple server applications from the definition of the partitions to be processed, write applications that open server channels where only the required partition name is passed. Leave the management of the partition characteristics to the system managers and operators.

3.5 Entering Partition Commands

After a partition has been created, it can be managed by issuing partition commands directed at it. Partition commands can be entered in one of two ways:

A command line processed by the RTR command line interface, for example RTR> SET PARTITION
Programmed using rtr_set_info()

Enter partition commands on the backend where the partition is located. Note that commands that affect a partition state only take effect once the first server joins a partition. Errors encountered at that time will appear as log file entries. Using partition commands to change the state of the system causes a log file entry.

3.5.1 Command Line Usage

Partition management in the RTR command language is implemented with the following command set:

RTR> CREATE PARTITION
RTR> SET PARTITION
RTR> DELETE PARTITION

The name of the facility in which the partition resides may be specified with the /FACILITY command line qualifier, or as a colon-separated prefix to the partition name (for example Facility1:Partition1). Detailed descriptions of the command syntax are given in the Command Line Reference section of this manual, and are summarized in the discussions below. Examples in the following sections use a partition name of Partition1 in the facility name of Facility1.
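A program that accepts the colon-separated form from an operator may need to separate the facility name from the partition name, for instance before filling in select qualifiers. The following is a minimal sketch: split_partition_spec() is a hypothetical helper, not an RTR routine, and it assumes that a specification without a colon means that no facility prefix was given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Split "Facility1:Partition1" into its facility and partition parts.
 * When spec has no colon, facility is set to the empty string.
 * Returns false if a part does not fit in its buffer or if the
 * partition part is empty. */
bool split_partition_spec(const char *spec,
                          char *facility, size_t facility_size,
                          char *partition, size_t partition_size)
{
    assert(spec != NULL && facility != NULL && partition != NULL);

    const char *colon    = strchr(spec, ':');
    size_t      fac_len  = colon ? (size_t)(colon - spec) : 0;
    const char *part     = colon ? colon + 1 : spec;
    size_t      part_len = strlen(part);

    if (fac_len >= facility_size || part_len >= partition_size ||
        part_len == 0)
        return false;

    memcpy(facility, spec, fac_len);
    facility[fac_len] = '\0';
    memcpy(partition, part, part_len + 1);   /* copies the NUL too */
    return true;
}
```

Given Facility1:Partition1, the helper yields Facility1 and Partition1; given Partition1 alone, it yields an empty facility name, in which case the caller would supply the facility by other means.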

3.5.2 Programmed Partition Management

Partition commands are programmed using rtr_set_info(). The arguments are used as follows:

pchannel - supplies the address of an rtr_channel_t to receive the channel opened in the event of a successful call.
flags - must be RTR_NO_FLAGS.
verb - must be the value verb_set (from the enumeration rtr_verb_t).
object - must be rtr_partition_object.

select_qualifiers should identify the facility and partition, by name, for example:

    rtr_qualifier_value_t select_qualifiers[ 3 ];
    select_qualifiers[ 0 ].qv_qualifier = rtr_facility_name;
    select_qualifiers[ 0 ].qv_value = "your_facility_name_here";
    select_qualifiers[ 1 ].qv_qualifier = rtr_partition_name;
    select_qualifiers[ 1 ].qv_value = "your_partition_name_here";
    select_qualifiers[ 2 ].qv_qualifier = rtr_qualifiers_end;
    select_qualifiers[ 2 ].qv_value = NULL;

The set_qualifier list expresses the required change in partition behavior or characteristic.

The rtr_set_info() call completes asynchronously. If the function call is successful, completion will be signaled by the delivery of an RTR message of type rtr_mt_closed on the channel whose identifier is returned through the pchannel argument. The programmer should retrieve this message by using rtr_receive_message(). The data accompanying the message is of type rtr_status_data_t. The completion status of the partition command can be accessed as the status field of the message data.

3.6 Managing Partitions

Partitions are managed with a set of commands or the equivalent program calls. This section describes the available management operations.

3.6.1 Controlling Shadowing

The state of shadowing for a partition can be enabled or disabled. This may be useful in the following circumstances:

To enable site disaster protection for an application partition for the first time.
As a recovery aid following a prolonged outage of a former shadow site.

The following restrictions apply. Shadowing for a partition can be turned off only in the absence of an active secondary site. The active member must be running in remember mode. The command will fail if entered on either an active primary or secondary, with a message to this effect. If entered on a standby of either the primary or secondary, the command is accepted but fails in the RTR router. This failure is recorded with a log file entry at the router. Once shadowing is disabled, the secondary site servers will be unable to start up in shadow mode until shadowing is enabled again. Shadowing for the partition can be turned on by entering the command at the current active member or on any of its standbys.

3.6.1.1 Command Line Example

RTR> SET PARTITION/FACILITY=Facility1/SHADOW Facility1:Partition1

For further information see the SET PARTITION command in Chapter 6.

3.6.1.2 Programming Information

To enable shadowing, program the set_qualifier argument of rtr_set_info() as follows:

    rtr_qualifier_value_t set_qualifiers[ 2 ];
    rtr_partition_state_t newState = rtr_partition_state_shadow;
    set_qualifiers[ 0 ].qv_qualifier = rtr_partition_state;
    set_qualifiers[ 0 ].qv_value = &newState;
    set_qualifiers[ 1 ].qv_qualifier = rtr_qualifiers_end;
    set_qualifiers[ 1 ].qv_value = NULL;

To disable shadowing, specify newState as rtr_partition_state_noshadow.

3.6.2 Controlling Transaction Presentation

Transaction presentation is the process of passing transactions to idle server channels for processing. While transaction presentation is active, new transactions are started on the first free server channel for the appropriate partition.

Use the /SUSPEND qualifier to the SET PARTITION command to halt the presentation of new transactions to servers on the backend where the command is entered. The command completes when the processing of all currently active transactions is complete. The optional /TIMEOUT qualifier specifies, as a number of seconds, the time that the command waits for completion. If the command times out, presentation of new transactions is suspended, but transactions remain that the servers have not yet finished processing. The operator must decide either to reenter the command and wait a further period of time, or to resume the partition. Note that use of this command does not affect any transaction timeout value specified by RTR clients, so such transactions may encounter a timeout condition if the partition remains suspended.

The /RESUME qualifier restarts presentation of transactions to the server application channels.

3.6.2.1 Command Line Example

Example usage of the qualifiers:

    RTR> SET PARTITION/FACILITY=Facility1/SUSPEND/TIMEOUT=5 Facility1:Partition1
    RTR>
    RTR> SET PARTITION/FACILITY=Facility1/RESUME Facility1:Partition1

For a more complete description see the SET PARTITION command in Chapter 6.

3.6.2.2 Programming Information

To suspend transaction presentation on a partition with a timeout of 30 seconds, program the set_qualifier argument of the rtr_set_info() call as follows:

    rtr_qualifier_value_t set_qualifiers[ 3 ];
    rtr_partition_state_t newState = rtr_partition_state_suspend;
    rtr_uns_32_t ulTimeoutSecs = 30;
    set_qualifiers[ 0 ].qv_qualifier = rtr_partition_state;
    set_qualifiers[ 0 ].qv_value = &newState;
    set_qualifiers[ 1 ].qv_qualifier = rtr_partition_cmd_timeout_secs;
    set_qualifiers[ 1 ].qv_value = &ulTimeoutSecs;
    set_qualifiers[ 2 ].qv_qualifier = rtr_qualifiers_end;
    set_qualifiers[ 2 ].qv_value = NULL;

Note that the timeout is an optional element. To resume transaction presentation, specify newState as rtr_partition_state_resume.

3.6.3 Controlling Recovery

The purpose of RTR automated recovery is to ensure the best possible consistency of application databases across a distributed computing environment. To achieve this, RTR relies in part on information stored in the journals of the participating systems. Should one or more of these systems be unavailable at recovery time, automated recovery may stall or fail while awaiting availability of these systems and their journals. This behavior protects data consistency, but it reduces application availability.

If a partition enters a wait state or fails but has neither a local nor a remote journal, an operator can instruct RTR to skip the current step in the recovery process with the /IGNORE_RECOVERY qualifier. Since this command bypasses parts of the recovery cycle, use it with caution, and only in cases where availability is more important than consistency in application databases.

The recovery cycle can also be manually restarted with the /RESTART_RECOVERY qualifier. This may be useful if the operator previously aborted automated recovery. Since this command can result in recovery of transactions from previously inaccessible journals, do not use this if your applications are sensitive to the order in which transactions are processed by the servers.

3.6.3.1 Command Line Example

Example of the qualifiers:

    RTR> SET PARTITION/FACILITY=Facility1/IGNORE_RECOVERY Facility1:Partition1
    RTR>
    RTR> SET PARTITION/FACILITY=Facility1/RESTART_RECOVERY Facility1:Partition1

A complete description of the qualifiers to the SET PARTITION command can be found in Chapter 6.

3.6.3.2 Programming Information

To terminate the current recovery state, program the set_qualifier argument of rtr_set_info() as follows:

    rtr_qualifier_value_t set_qualifiers[ 2 ];
    rtr_partition_state_t newState = rtr_partition_state_exitwait;
    set_qualifiers[ 0 ].qv_qualifier = rtr_partition_state;
    set_qualifiers[ 0 ].qv_value = &newState;
    set_qualifiers[ 1 ].qv_qualifier = rtr_qualifiers_end;
    set_qualifiers[ 1 ].qv_value = NULL;

To restart recovery, specify newState as rtr_partition_state_recover.

3.6.4 Controlling the Active Site

RTR lets the system operator deploy a range of shadow and standby partitions in order to provide the desired degree of application resilience to failures. By default, RTR automatically manages the assignment of active and standby roles to the available partition instances. The operator can assign a relative priority to each backend on which a partition instance exists. Enter the priorities as a list of backend node names in decreasing order of priority, with the highest priority first. See the command example in Section 3.6.4.1.

Suspend transaction presentation before entering or changing the priority list.

3.6.4.1 Command Line Example

RTR> SET PARTITION/PRIORITY_LIST=(BE1, BE2, BE3) Facility1:Partition1

For more information on the SET PARTITION command see Chapter 6.

3.6.4.2 Programming Information

To set the partition backend priority list, program the set_qualifier argument of the rtr_set_info() call as follows:

    rtr_qualifier_value_t set_qualifiers[ 2 ];
    char *szNodeList = "your,list,of,node,names,here";
    set_qualifiers[ 0 ].qv_qualifier = rtr_partition_be_priority_list;
    set_qualifiers[ 0 ].qv_value = &szNodeList;
    set_qualifiers[ 1 ].qv_qualifier = rtr_qualifiers_end;
    set_qualifiers[ 1 ].qv_value = NULL;
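When the backend names are held in an array, the comma-separated node list can be assembled before it is assigned to the qualifier value. The following is a minimal sketch: build_priority_list() is a hypothetical helper, not an RTR routine. It assumes that names are joined with a single comma and no spaces, as the placeholder string above suggests, and that the first array element has the highest priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Join count node names into out as "name1,name2,...".
 * nodes[0] is taken to be the highest-priority backend.
 * Returns false if the result, including its terminating NUL,
 * does not fit in out_size bytes. */
bool build_priority_list(const char *const nodes[], size_t count,
                         char *out, size_t out_size)
{
    size_t used = 0;

    assert(nodes != NULL && out != NULL);
    if (out_size == 0)
        return false;
    out[0] = '\0';

    for (size_t i = 0; i < count; i++) {
        size_t len  = strlen(nodes[i]);
        size_t need = len + (i > 0 ? 1 : 0);   /* name plus separator */

        if (need >= out_size - used)
            return false;                      /* no room left for NUL */
        if (i > 0)
            out[used++] = ',';
        memcpy(out + used, nodes[i], len);
        used += len;
        out[used] = '\0';
    }
    return true;
}
```

For the backends BE1, BE2 and BE3 in that order, the helper produces the string BE1,BE2,BE3.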

3.6.5 Controlling Failover

In a system configured for maximum fault tolerance, employing both shadows and standbys, there is a choice to be made when the primary site fails. The /FAILOVER_POLICY qualifier to the SET PARTITION command allows the system operator to select one of the following policies for RTR to follow when selecting the new primary site after a failure:

/FAILOVER_POLICY=STANDBY causes RTR to choose a standby of the failed primary (if any) to become the new primary. If there is more than one standby, the operator may additionally use the priority list feature (described above) to control which standby is preferred. Depending on the size of the journal of the failed primary, there will be a delay in the processing of transactions while the journal is recovered. This is the default behavior.
/FAILOVER_POLICY=SHADOW instructs RTR to make the active secondary (if any) the new primary. A standby of the failed primary (if any) will be elected to become the new secondary. This option gives the shortest failover time, but will move the primary to a different cluster that you may have located at a different site.
/FAILOVER_POLICY=COMPATIBLE_PRE_V32 is a mode that will operate with configurations that contain RTR routers running versions of the software prior to V3.2. This mode will be automatically adopted if such routers exist in or join the configuration.
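The first two policies can be summarized as a small decision function. The following is a sketch for illustration only: the enumerations and choose_new_primary() are hypothetical names, not RTR definitions. Because the text does not say what RTR does when the preferred candidate is missing (the "if any" cases), the sketch reports that outcome as unspecified rather than guessing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; these are not the RTR enumerations. */
enum failover_policy { POLICY_STANDBY, POLICY_SHADOW };

enum new_primary {
    PRIMARY_FROM_STANDBY,   /* a standby of the failed primary */
    PRIMARY_FROM_SHADOW,    /* the active secondary            */
    PRIMARY_UNSPECIFIED     /* outcome not described in text   */
};

/* Return which partition instance the stated policy prefers as the
 * new primary after the primary site fails. */
enum new_primary
choose_new_primary(enum failover_policy policy,
                   bool standby_available, bool secondary_active)
{
    switch (policy) {
    case POLICY_STANDBY:
        return standby_available ? PRIMARY_FROM_STANDBY
                                 : PRIMARY_UNSPECIFIED;
    case POLICY_SHADOW:
        return secondary_active ? PRIMARY_FROM_SHADOW
                                : PRIMARY_UNSPECIFIED;
    }
    assert(!"unknown failover policy");
    return PRIMARY_UNSPECIFIED;
}
```

With both a standby and an active secondary available, the STANDBY policy keeps the primary within the original cluster, while the SHADOW policy moves it to the secondary site.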

3.6.5.1 Command Line Example

An example use of the /FAILOVER_POLICY qualifier:

RTR> SET PARTITION/FAILOVER_POLICY=SHADOW Facility1:Partition1

For more information see the SET PARTITION command in Chapter 6.

3.6.5.2 Programming Information

To set the partition failover policy, program the set_qualifier argument of the rtr_set_info() call as follows:

    rtr_qualifier_value_t set_qualifiers[ 2 ];
    rtr_partition_failover_policy_t newPolicy;
    set_qualifiers[ 0 ].qv_qualifier = rtr_partition_failover_policy;
    set_qualifiers[ 0 ].qv_value = &newPolicy;
    set_qualifiers[ 1 ].qv_qualifier = rtr_qualifiers_end;
    set_qualifiers[ 1 ].qv_value = NULL;

Legal values for newPolicy are:

rtr_partition_fail_to_standby
rtr_partition_fail_to_shadow
rtr_partition_pre32_compatible
