Reliable Transaction Router

Application Design Guide

Part Number: AA–REPMA–TE

June 1999

This guide explains how to design application programs for use with Reliable Transaction Router, and provides pointers on writing such applications in C.

Revision/Update Information: This is a new manual.

Operating System and Version:

OpenVMS Versions 6.2, 7.1, 7.2
Windows NT Version 4.0
Compaq Tru64 UNIX Version 4.0 to 4.0D
Sun Solaris Version 2.5, 2.6x
IBM AIX Version 4.2 to 4.3.x
Hewlett-Packard HP-UX Version 10.20

Software Version: Reliable Transaction Router Version 3.2

June 1999

COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL. THIS INFORMATION IS PROVIDED "AS IS" AND COMPAQ COMPUTER CORPORATION DISCLAIMS ANY WARRANTIES, EXPRESS, IMPLIED OR STATUTORY AND EXPRESSLY DISCLAIMS THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR PARTICULAR PURPOSE, GOOD TITLE AND AGAINST INFRINGEMENT.

This publication contains information protected by copyright. No part of this publication may be photocopied or reproduced in any form without prior written consent from Compaq Computer Corporation.

© 1999 Digital Equipment Corporation.
All rights reserved.

The software described in this guide is furnished under a license agreement or nondisclosure agreement. The software may be used or copied only in accordance with the terms of the agreement.

Compaq and the Compaq logo are registered in the United States Patent and Trademark Office.

The following are trademarks of Compaq Computer Corporation:
AlphaServer, DEC, DECnet, DIGITAL, OpenVMS, PATHWORKS, Reliable Transaction Router, TruCluster, VAX, VMScluster, and the DIGITAL Logo.

The following are third-party trademarks:

AIX and IBM are registered trademarks of International Business Machines Corp.
Hewlett-Packard and HP-UX are registered trademarks of Hewlett-Packard Corp.
Informix is a registered trademark of Informix Software, Inc.
Intel is a registered trademark of Intel Corporation.
Memory Channel is a trademark of Encore Computer Corporation.
Microsoft, Microsoft Access, Microsoft SQL Server, Internet Explorer, and Windows NT are trademarks or registered trademarks of Microsoft Corporation.
Netscape and Netscape Navigator are registered trademarks of Netscape Communications Corporation.
Oracle is a registered trademark of Oracle Corporation.
Sun and Solaris are registered trademarks of Sun Microsystems, Inc.
Sybase is a registered trademark of Sybase, Inc.
Tuxedo is a registered trademark of Novell, Inc.
UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company, Ltd.

All other trademarks and registered trademarks are the property of their respective holders.

This document was prepared using Microsoft Word 8.


Table of Contents

Preface
Reader's Comments
1.1 Conventions
Introduction
1.2 Reliable Transaction Router
1.2.1 System Design
1.3 RTR Concepts
1.4 RTR Terminology
1.4.1 RTR Server Types
1.5 The RTR Environment
1.5.1 The RTR Runtime Environment
1.5.2 The RTR System Management Environment
1.6 RTR Performance
1.6.1 Conclusions
2 Design for Tolerating Process Failure
2.1 Use of Concurrent Servers
3 Design for Tolerating Storage Device Failure
4 Design for Tolerating Node Failure
4.1 Use of Standby Servers
4.2 Use of Shadow Servers
4.3 Router Failover
4.4 Server Failover
5 Design for Tolerating Site Disaster
5.2 The Role of Quorum
5.3 Surviving on Two Nodes
5.4 Transaction Serialization and Partitioning
5.5 Serialization
5.5.2 Serialization Issues
5.6 Batch Processing Considerations
5.7 Recovery After a Failure
5.8 Journal Accessibility
5.8.1 Journal Sizing
5.8.2 Replay Anomalies
6 Design for Performance
6.1 Concurrent Servers
6.2 Partitions and Performance
6.3 Facilities and Performance
6.4 Router Placement
6.5 Broadcast Messaging
6.5.1 Making Broadcasts Reliable
6.6 Large Configurations
6.7 Using Read-Only Transactions
6.8 Making Transactions Independent
7 Design for Operability
7.1 Firewalls and RTR
7.2 Avoiding DNS Server Failures
7.3 Batch Procedures
8 RTR Implementation Considerations
8.1 RTR Requirements on Applications
8.1.1 Be Transaction Aware
8.1.2 Avoid Server-Specific Data
8.1.3 Be Independent of Time of Processing
8.1.4 Two Identical Databases for Shadow Servers
8.1.5 Make Transactions Self-Contained
8.1.6 Lock Shared Resources
8.2 ACID Compliance
8.2.1 Atomicity Rules
8.2.2 Consistency Rules
8.2.3 Isolation Rules
8.2.4 Durability Rule
8.3 The RTR Application Programming Interface
8.3.1 The RTR.H Header File
8.4 RTR Command Line Interface
8.5 Design of an RTR Client/Server Application
8.5.1 The RTR Journal
8.5.2 RTR Messaging
8.5.2.1 Transactional Messages
8.5.2.2 Broadcast Messages
8.5.2.3 Location Transparency
8.5.2.4 Data Content Routing with Partitions or Key Ranges
8.5.2.5 Partitions or Key Ranges
8.5.2.6 Multithreading
8.5.2.7 RTR Call Sequence
8.5.2.8 Transaction States
8.5.2.9 RTR Message Types
8.5.2.10 Message Format Definitions
8.5.3 Using XA
8.5.3.1 XA Oracle Example
8.5.3.2 Using XA with MS DTC
8.5.3.3 XA DTC Example
8.5.3.4 XA DTC Example
8.5.4 Using DECdtm
8.5.5 Nested Transactions
8.6 RTR Transaction Processing
8.6.1 Message Reception Styles
8.6.2 Starting a Transaction
8.6.3 Identifying a Transaction
8.6.4 Committing a Transaction
8.6.5 Server-Side Transaction Timeouts
8.6.6 Two-Phase Commit
8.6.7 Transaction Recovery
8.6.8 Broadcast Messaging Processes
8.6.8.1 User Events
8.6.8.2 RTR Events
8.7 Handling Error Conditions
8.7.1 Authentication Using Callout Servers
8.7.2 Distributed Deadlock Considerations
8.7.3 Parallelism
8.7.4 ODBC Applications
8.7.5 Replication
8.7.6 Idempotency Issues
8.7.7 Partition Locks
8.7.8 Designing for a Heterogeneous Environment
8.7.9 The Multivendor Environment
8.7.10 RTR V2 to V3 Considerations
8.8 Compiling and Linking your Application
9 Appendices
9.1 Appendix A: RTR Design Examples
9.1.1 A Transportation Example
9.1.1.1 Brief History
9.1.1.2 New Implementation
9.1.2 A Stock Exchange Example
9.1.2.2 Brief History
9.1.2.3 New Implementation
9.1.3 A Banking Example
9.1.3.1 Brief History
9.1.3.2 New Implementation
9.2 Appendix B: RTR Cluster Configurations
9.2.1.1 OpenVMS Cluster
9.2.1.2 Compaq Tru64 UNIX TruCluster
9.2.1.3 Windows NT Cluster
9.3 Appendix C: RTR Sample Applications
9.3.1 Client Application
9.3.2 Server Application
9.3.3 Active-X Case Study
9.4 Appendix D: Evaluating Application Resource Requirements
Glossary
Index

Preface

As an application programmer, you should be familiar with the following concepts:

If you are not familiar with these software concepts, you will need to augment your knowledge by reading, taking courses, or through discussion with colleagues. You should also become familiar with the other books in the RTR documentation kit, listed below.

The goal of this document is to assist an experienced application programmer to understand the Reliable Transaction Router (RTR) application programming interface (API), and to create applications that work with RTR. This document is intended to be read from start to finish; later you can use it for reference.

Additional resources in the RTR documentation kit include:

Reliable Transaction Router Application Programmer's Reference Manual
Explains how to design and code RTR applications; contains full descriptions of the RTR API calls.

Reliable Transaction Router System Manager's Manual
Describes how to configure, manage, and monitor RTR.

Reliable Transaction Router Migration Guide
Explains how to migrate from RTR Version 2 to RTR Version 3 (OpenVMS only).

Reliable Transaction Router Installation Guide
Describes how to install RTR.

Reliable Transaction Router Release Notes
Describes new features, changes, and known restrictions for RTR.

 

Other books that can be helpful in developing your transaction processing application include:

Philip A. Bernstein and Eric Newcomer, Principles of Transaction Processing, Morgan Kaufmann, 1997

Jim Gray and Andreas Reuter, Transaction Processing: Concepts and Techniques, Morgan Kaufmann, 1992

You will find additional information on RTR and existing implementations on the RTR web site http://www.software.digital.com/rtr/.

Reader's Comments

Compaq welcomes your comments on this guide. Please send your comments and suggestions by email to rtrdoc@compaq.com. Please include the document title, the date from the title page, the order number, and the section and page numbers in your message.

 

Conventions

This manual adopts the following conventions:

New term
New terms are shown in bold when introduced and defined. All RTR terms are defined in the glossary at the end of this document.

User input
User input and programming examples are shown in a monospaced font.

Parameter
Parameters you can change are shown in italics. Terms defined only in the glossary are also shown in italics when presented for the first time. Italics are also used for titles of manuals and books, and for emphasis.

FE
RTR frontend

TR
RTR transaction router or router

BE
RTR backend


Introduction

This document is for the application programmer who is developing an application that works with Reliable Transaction Router (RTR). It provides information on using RTR in the design and development of an application. The major emphasis is on providing design suggestions and considerations for writing the RTR application. Example designs describing existing applications that use RTR show implementations exploiting RTR features, and provide real examples where RTR is in use.

In developing your application design:

A design flaw is almost impossible to correct once your application is in operation, so a thorough design, fully discussed and understood by your team, is essential to the ultimate success of your application.

Reliable Transaction Router

Reliable Transaction Router (RTR) is failure-tolerant transactional messaging middleware used to implement large, distributed applications with client/server technologies. RTR helps ensure business continuity across multivendor systems and helps maximize uptime. You use the architecture of RTR to ensure high availability and transaction completion.

RTR supports applications that run on different hardware and different operating systems. RTR also works with several database products including Oracle, Microsoft Access, Microsoft SQL Server, Sybase, and Informix. For specifics on operating systems, operating system versions, and supported hardware, see the Reliable Transaction Router Software Product Description for each supported operating system.

System Design

With RTR you can design your systems for:

To design your application for high availability, you will take advantage of RTR failover capabilities and system availability solutions such as hardware clusters. Transactional shadowing and single input (no need to log on to another node after a failure) with input logging are additional RTR features that provide high availability. You can create application designs that tolerate process failure, node failure, network failure, and site failure.

To design your application for high security, you can use RTR authentication servers or callout servers, operating system security features, and firewalls.

To design your application to ensure against loss of data, you will use RTR transactional shadowing. Transactional shadowing can be at a single site or at geographically separate sites. For example, you may need to locate sites in different cities or on different power grids.

To design your application for high transaction performance, you will use a partitioned database with RTR data-content routing. You will also consider hardware performance in designing an application for high performance in processing transactions.

These designs are further described in this document.

RTR Concepts

RTR provides a continuous computing environment that is particularly valuable in financial transactions, for example in banking, stock trading, or passenger reservations systems. RTR satisfies many requirements of a continuous computing environment:

RTR also ensures that transactions have the ACID properties. A transaction with the ACID properties has the following attributes:

Atomic: the transaction executes completely or not at all.
Consistent: the transaction leaves the database in a consistent state.
Isolated: the transaction executes as if it were the only transaction running.
Durable: once committed, the results of the transaction survive failures.

For more information on transactional ACID properties, see the Reliable Transaction Router Application Programmer's Reference Manual and the discussion later in this document.

RTR Terminology

The following terms are either unique to RTR or redefined when used in the RTR context. If you have learned any of these terms in other contexts, take the time to assimilate their meaning in the RTR environment. The terms are described in the following order:

An RTR application is user-written software that executes within the confines of several distributed processes. The RTR application may perform user interface, business, and server logic tasks and is written in response to some business need. An RTR application can be written in any language, commonly C or C++, and includes calls to RTR. RTR applications are composed of two kinds of actors, client applications and server applications.

A client is always a client application, one that initiates and demarcates a piece of work. In the context of RTR, a client must run on a node defined to have the frontend role. Clients typically deal with presentation services, handling forms input, screens, and so on. A client could connect to a browser running a browser applet or be a webserver acting as a gateway. In other contexts, a client can be a physical system, but in RTR and in this document, physical clients are called frontends or nodes. You can have more than one instance of a client on a node.

A server is always a server application, one that reacts to a client’s units of work and carries them through to completion. This may involve updating persistent storage such as a database file, toggling a switch on a device, or performing another predefined task. In the context of RTR, a server must run on a node defined to have the backend role. In other contexts, a server can be a physical system, but in RTR and in this document, physical servers are called backends or nodes. You can have more than one instance of a server on a node.

Servers can have partition states such as primary, standby, or shadow.

RTR expects client and server applications to identify themselves before they request RTR services. During the identification process, RTR provides a tag or handle that is used for subsequent interactions. This tag or handle is called an RTR channel. A channel is used by client and server applications to exchange units of work with the help of RTR. An application process can have one or more client or server channels.
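
For example, a client application might open its channel as in the following sketch. This is a minimal fragment, assuming a facility named "ACCOUNTS" for illustration and using the no-value constants from rtr.h as they appear in the RTR sample programs; see the Reliable Transaction Router Application Programmer's Reference Manual for the authoritative signatures.

#include <stdio.h>
#include "rtr.h"

rtr_channel_t channel;
rtr_status_t  status;

/* Open a client channel on the (hypothetical) ACCOUNTS facility. */
status = rtr_open_channel(&channel,
                          RTR_F_OPE_CLIENT,  /* this is a client channel */
                          "ACCOUNTS",        /* facility name (example) */
                          NULL,              /* no recipient name */
                          RTR_NO_PEVTNUM,    /* no event subscriptions */
                          RTR_NO_ACCESS,     /* no access password */
                          RTR_NO_NUMSEG,     /* clients declare no key segments */
                          RTR_NO_PKEYSEG);
if (status != RTR_STS_OK)
    printf("%s\n", rtr_error_text(status)); /* report the failure */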

An RTR configuration consists of nodes that run RTR client and server applications. An RTR configuration can run on several operating systems including OpenVMS, DIGITAL UNIX, and Windows NT among others (for the full set of supported operating systems, see the title page of this document, and the appropriate SPD). Nodes are connected by network links.

A node that runs client applications is called a frontend (FE), or is said to have the frontend role. A node that runs server applications is called a backend (BE). Additionally, the transaction router (TR) contains no application software but acts as a traffic cop between frontends and backends, routing transactions to the appropriate destinations. The router also eliminates any need for frontends and backends to know about each other in advance. This relieves the application programmer from the need to be concerned about network configuration details.

The mapping between nodes and roles is done using a facility. An RTR facility is the user-defined name for a particular configuration whose definition provides the role-to-node map for a given application. Nodes can share several facilities. The role of a node is defined within the scope of a particular facility. The router is the only role that knows about all three roles. A router can run on the same physical node as the frontend or backend, if that is required by configuration constraints, but such a setup would not take full advantage of failover characteristics.

A facility name is mapped to specific physical nodes and their roles using the CREATE FACILITY command.

Figure 1, Components in the RTR Environment, shows the logical relationship between client application, server application, frontends (FEs), routers (TRs), and backends (BEs) in the RTR environment. The database is represented by the cylinder. Two facilities are shown (indicated by the large double-headed arrows), the user accounts facility and the general ledger facility. The user accounts facility uses three nodes, FE, TR, and BE, while the general ledger facility uses only two, TR and BE.

Figure 1 Components in the RTR Environment

Clients send messages to servers to ask that a piece of work be done. Such requests may be bundled together into transactions. An RTR transaction consists of one or more messages that have been grouped together by a client application, so that the work done as a result of each message can be undone completely, if some part of that work cannot be done. If the system fails or is disconnected before all parts of the transaction are done, then the transaction remains incomplete.

A transaction is a piece of work or group of operations that must be executed together to perform a consistent transformation of data. This group of operations can be distributed across many nodes serving multiple databases. Applications use services that RTR provides.

RTR provides transactional messaging, in which transactions are enclosed in messages controlled by RTR.

Transactional messaging ensures that each transaction is complete, and not partially recorded. For example, a transaction or business exchange in a bank account might be to move money from a checking account to a savings account. The complete transaction is to remove the money from the checking account and add it to the savings account.

A transaction that transfers funds from one account to another consists of two individual updates: one to debit the first account, and one to credit the second account. The transaction is not complete until both actions are done. If a system performing this work goes down after the money has been debited from the checking account but before it has been credited to the savings account, the transaction is incomplete. With transactional messaging, RTR ensures that a transaction is "all or nothing"—either fully completed or discarded; either both the checking account debit and the savings account credit are done, or the checking account debit is backed out and not recorded in the database. RTR transactions have the ACID properties (see the section ACID Compliance, page *, for more detail on these properties).
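
As an illustration, a client might group the debit and the credit into one RTR transaction as sketched below. This is a simplified fragment: the channel is assumed to be opened as shown earlier, and debit_msg and credit_msg are hypothetical application structures. RTR starts a transaction implicitly with the first rtr_send_to_server call on an idle channel.

rtr_msgsb_t msgsb;      /* message status block */
char        reply[64];  /* hypothetical reply buffer */

/* First message of the transaction: debit the checking account. */
status = rtr_send_to_server(channel, RTR_NO_FLAGS,
                            &debit_msg, sizeof(debit_msg), RTR_NO_MSGFMT);

/* Last message: credit the savings account and accept the transaction. */
status = rtr_send_to_server(channel, RTR_F_SEN_ACCEPT,
                            &credit_msg, sizeof(credit_msg), RTR_NO_MSGFMT);

/* Wait for the outcome; msgsb.msgtype is rtr_mt_accepted or rtr_mt_rejected. */
status = rtr_receive_message(&channel, RTR_NO_FLAGS, RTR_ANYCHAN,
                             &reply, sizeof(reply), RTR_NO_TIMOUTMS, &msgsb);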

An application will also contain nontransactional tasks such as writing diagnostic trace messages or sending a broadcast message about a change in a stock price after a transaction has been completed.

Every transaction is identified on initiation with a transaction identifier or transaction ID, with which it can be logged and tracked.

To reinforce the use of these terms in the RTR context, this section briefly reviews other uses of configuration terminology.

A traditional two-tier client/server environment is based on hardware that separates application presentation and business logic (the clients) from database server activities. The client hardware runs presentation and business logic software, and server hardware runs database or data manager (DM) software, also called resource managers (RM). This type of configuration is illustrated in Figure 2, Two-Tier Client/Server Environment. (In all diagrams, all lines are bidirectional.)

Figure 2 Two-Tier Client/Server Environment

Further separation into three tiers is achieved by separating presentation software from business logic on two systems, and retaining a third physical system for interaction with the database. This is illustrated in Figure 3, Three-Tier Client/Server Environment.

Figure 3 Three-Tier Client/Server Environment

RTR extends the three-tier model based on hardware to a multitier, multilayer, or multicomponent software model.

RTR provides a multicomponent software model where clients running on frontends, routers, and servers running on backends cooperate to provide reliable service and transactional integrity. Application users interact with the client (presentation layer) on the frontend node that forwards messages to the current router. The router in turn routes the messages to the current, appropriate backend, where server applications reside, for processing. The connection to the current router is maintained until the current router fails or connections to it are lost.

All components can reside on a single node but are typically deployed on different nodes to achieve modularity, scalability, and redundancy for availability. With different systems, if one physical node goes down or off line, another router and backend node takes over.

In a slightly different configuration, you could have an application that uses an external applet running on a browser that connects to a client running on the RTR frontend. Such a configuration is shown in Figure 4, Browser Applet Configuration.

Figure 4 Browser Applet Configuration

The RTR client application could be an ASP (Active Server Page) script or a process interfacing to the webserver through a standard interface such as CGI (Common Gateway Interface).

RTR provides automatic software failure tolerance and failure recovery in multinode environments by sustaining transaction integrity in spite of hardware, communications, application, or site failures. Automatic failover and recovery of service can exploit redundant or underutilized hardware and network links.

As you modularize your application and distribute its components on frontends and backends, you can add new nodes, identify usage bottlenecks, and provide redundancy to increase availability. Adding backend nodes can help divide the transactional load and distribute it more evenly. For example, you could have a single node configuration as shown in Figure 5, RTR with Browser, Single Node and Database. A single node configuration can be useful during development, but would not normally be used when your application is deployed.

Figure 5 RTR with Browser, Single Node, and Database

When creating the configuration used by an application and defining the nodes where a facility has its frontends, routers, and backends, the setup must also define which nodes will have journal files. Each backend in an RTR configuration must have a journal file to capture transactions when other nodes are unavailable. When applications are deployed, often the backend is separated from the frontend and router, as shown in Figure 6, RTR Deployed on Two Nodes.

Figure 6 RTR Deployed on Two Nodes

In this example, the frontend with the client and the router reside on one node, and the server resides on the backend. Frequently, routers are placed on backends rather than on frontends. A further separation of workload onto three nodes is shown in Figure 7, RTR Deployed on Three Nodes.

Figure 7 RTR Deployed on Three Nodes

This three-node configuration separates transaction load onto three nodes, but does not provide for continuing work if one of the nodes fails or becomes disconnected from the others. In many applications, there is a need to ensure that there is a server always available to access the database.

In this case, a standby server will do the job. A standby server (see Figure 8, Standby Server Configuration) is a process that can take over when the primary server is not available. Both the primary and the standby server access the same database, but the primary processes all transactions unless it is unavailable. The standby processes transactions only when the primary is unavailable. At other times, the standby can do other work. The standby server is often placed on a node other than the node where the primary server runs.

Figure 8 Standby Server Configuration

To increase transaction availability, transactions can be shadowed with a shadow server. This is called transactional shadowing and is accomplished by having a second location, often at a different site, where transactions are also recorded. This is illustrated in Figure 9, Transactional Shadowing Configuration. Data are recorded in two separate data stores or databases. The router knows about both backends and sends all transactions to both backends. RTR provides the server application with the necessary information to keep the two databases synchronized.

Figure 9 Transactional Shadowing Configuration

In the RTR environment, one data store (database or data file) is elected the primary, and a second data store is made the shadow. The shadow data store is a copy of the data store kept on the primary. If either data store becomes unavailable, all transactions continue to be processed and stored on the surviving data store. At the same time, the shadow server records in the RTR journal all transactions stored only on the shadow data store. When the primary server and data store become available again, RTR replays the transactions in the journal to the primary data store through the primary server. This brings the data store back into synchronization.

With transactional shadowing, there is no requirement that hardware, the data store, or the operating system at different sites be the same. You could, for example, have one site running OpenVMS and another running Windows NT; the RTR transactional commit process would be the same at each site.

Note: Transactional shadowing shadows only transactions controlled by RTR.

For full redundancy to assure maximum availability, a configuration could employ disk shadowing in clusters at separate sites, coupled with transactional shadowing across sites and standby servers at each site. This configuration is shown in Figure 10, Two Sites: Transactional Shadowing and Disk Shadowing with Standby Servers. For clarity, not all possible connections are shown. In the figure, backends running standby servers are shaded and connected to routers by dashed lines. Only one site (the upper site) does full disk shadowing; the lower site is the shadow for transactions, shadowing all transactions being done at the upper site.

Figure 10 Two Sites: Transactional and Disk Shadowing with Standby Servers

RTR Server Types

In the RTR environment, in addition to the placement of frontends, routers, and servers, the application designer must determine what server capabilities to use. RTR provides four types of software servers for application use:

Standby servers
Transactional shadow servers
Concurrent servers
Callout servers

These are described in the next few paragraphs. You specify server types to your application in RTR API calls.

RTR server types help to provide continuous availability and a secure transactional environment.

The standby server remains idle while the RTR primary backend server performs its work, accepting transactions and updating the database. When the primary server fails, the standby server takes over, recovers any in-progress transactions, updates the database, and communicates with clients until the primary server returns. There can be many instances of a standby server. Activation of the standby server is transparent to the user.

A typical standby configuration is shown in Figure 8, Standby Server Configuration. Both physical servers running the RTR backend software are assumed by RTR to connect to the same database. The primary server is typically in use, and the standby server can be either idle or used for other applications, or data partitions, or facilities. When the primary server becomes unavailable, the standby server takes over and completes transactions as shown by the dashed line. Primary server failure could be caused by server process failure or backend (node) failure.

The intended and most common use of a standby server is in a cluster environment. In a non-cluster environment, seamless failover of standbys is not guaranteed. There can be several standby servers in an RTR configuration.

The transactional shadow server places all transactions recorded on the primary server on a second database. The transactional shadow server can be at the same site or at a different site, and must exist in a networked environment.

A transactional shadow server can also have standby servers for greater reliability. When one member of a shadow set fails, RTR remembers the transactions executed at the surviving site in a journal, and replays them when the failed site returns. Only after all journaled transactions are recovered does the recovering site receive new online transactions. Transactional shadowing is done by partition. A transactional shadow configuration can have only two members of the shadow set.

The concurrent server is an additional instance of a server application running on the same node. RTR delivers transactions to a free server from the pool of concurrent servers. If one server fails, the transaction in process is replayed to another server in the concurrent pool. Concurrent servers are designed primarily to increase throughput and can exploit Symmetric Multiprocessing (SMP) systems. Figure 11 illustrates the use of concurrent servers sending transactions to the same partition on a backend, the partition A-N.

Figure 11 Concurrent Servers

The callout server provides message authentication on transaction requests made in a given facility, and could be used, for example, to provide audit trail logging. A callout server can run on either backend or router nodes. A callout server receives a copy of all messages in a facility. Because the callout server votes on the outcome of each transaction it receives, it can veto any transaction that does not pass its security checks.

A callout server is facility based, not partition based; any message arriving at the facility is routed to both the server and the callout. A callout server is enabled when the facility is defined. Figure 12, A Callout Server, illustrates the use of a callout server that authenticates every transaction (txn) in a facility.

Figure 12 A Callout Server

To authenticate any part of a transaction, the callout server must vote on the transaction, but does not write to the database. RTR does not replay a transaction that is only authenticated.
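
The processing loop of a callout server can be sketched as follows. This is a simplified outline: the txn_msg_t structure and the check_credentials routine are hypothetical, the channel is assumed to be opened with the callout role for the facility, and a production server would vote once per transaction rather than on each message; see the Reliable Transaction Router Application Programmer's Reference Manual for the exact open flags and message types.

rtr_status_t status;
rtr_msgsb_t  msgsb;   /* message status block: type, length, tid */
txn_msg_t    msg;     /* hypothetical application message type */

for (;;) {
    status = rtr_receive_message(&channel, RTR_NO_FLAGS, RTR_ANYCHAN,
                                 &msg, sizeof(msg), RTR_NO_TIMOUTMS, &msgsb);
    if (msgsb.msgtype == rtr_mt_msg1 || msgsb.msgtype == rtr_mt_msgn) {
        if (check_credentials(&msg))
            rtr_accept_tx(channel, RTR_NO_FLAGS, RTR_NO_REASON); /* vote to accept */
        else
            rtr_reject_tx(channel, RTR_NO_FLAGS, RTR_NO_REASON); /* veto the transaction */
    }
}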

When working with database systems, partitioning the database can be essential to ensuring smooth performance with a minimum of bottlenecks. When you partition your database, you locate different parts of it on different disk drives, both to spread the physical storage across different physical media and to balance access traffic across different disk controllers and drives.

For example, in a banking environment, you could partition your database by account number, as shown in Figure 13, Bank Partitioning Example. A partition is a segment of your database.

Figure 13 Bank Partitioning Example

Once you have decided to partition your database, you use key ranges in your application to specify how to route transactions to the appropriate database partition. A key range is the range of data held in each partition. For example, the key range for the first partition in the bank partitioning example goes from 00001 to 19999. You can assign a partition name in your application program or have it set by the system manager. Note that sometimes the terms key range and partition are used as synonyms in code examples and samples with RTR, but strictly speaking, the key range defines the partition. A partition has both a name, its partition name, and an identifier generated by RTR—the partition ID. The properties of a partition (callout, standby, shadow, concurrent, key segment range) can be defined by the system manager with a CREATE PARTITION command. For details of the command syntax, see the RTR System Manager’s Manual.
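
For example, a server for the first partition in Figure 13 might describe its key range with a key segment such as the following sketch. Field and constant names follow the rtr.h conventions used in the RTR sample programs, and the account number is assumed to be an unsigned 4-byte value at offset 0 of each message; treat the details as illustrative.

rtr_keyseg_t key_segment;
rtr_uns_32_t low  = 1;      /* lowest account number served: 00001 */
rtr_uns_32_t high = 19999;  /* highest account number served */

key_segment.ks_type     = rtr_keyseg_unsigned;   /* unsigned integer key */
key_segment.ks_length   = sizeof(rtr_uns_32_t);  /* 4-byte key field */
key_segment.ks_offset   = 0;                     /* key at start of message */
key_segment.ks_lo_bound = &low;                  /* low bound of key range */
key_segment.ks_hi_bound = &high;                 /* high bound of key range */

The server then passes numseg = 1 and p_keyseg = &key_segment in its rtr_open_channel call, and RTR routes every transaction whose key falls in this range to the partition served by this channel.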

A significant advantage of the partitioning shown in the bank example is that you can add more account numbers without making changes to your application; you need only add another server and disk drive for the new account numbers. For example, say you need to add account numbers from 90,000 to 99,999 to the basic configuration of Figure 13, Bank Partitioning Example. You can add these accounts and bring them on line easily. The system manager can change the key range with a command, for example, in an overnight operation, or you can plan to do this during scheduled maintenance.

A partition can also have multiple standby servers.

A node can be configured as a primary server for one key range and as a standby server for another key range. This helps to distribute the work of the standby servers. Figure 14, Standby with Partitioning, illustrates this use of standbys with distributed partitioning. As shown, Application Server A is the primary server for accounts 1 to 19,999 and Application Server B is the standby for these same accounts. Application Server B is the primary for accounts 20,000 to 39,999 and Application Server A can be the standby for these same accounts (not shown in the figure). For clarity, account numbers are shown only for primary servers and one standby server.

Figure 14 Standby with Partitioning

 

The RTR Environment

The RTR environment has two parts:

The RTR runtime environment
The RTR system management environment

The RTR Runtime Environment

When all RTR and application components are running, the RTR runtime environment contains:

Figure 15, RTR Runtime Environment, shows these components and their placement on frontend, router, and backend nodes. The frontend, router, and backend can be on the same or different nodes. If these are all on the same node, there is only one RTRACP process.

Figure 15 RTR Runtime Environment

 

The RTR System Management Environment

The RTR system management environment contains four processes:

The RTR Control Process, RTRACP, is the master program. It resides on every node where RTR has been installed and is running. RTRACP performs the following functions:

RTRACP handles interprocess communication traffic, network traffic, and is the main repository of runtime information. ACP processes operate across all RTR roles and execute certain commands both locally and at remote nodes. These commands include:

RTR CLI is the Command Line Interface that:

Commands executed directly by the CLI include:

RTRCOMSERV is the Command Server Process that:

The Command Server Process executes commands both locally and across nodes. Commands that can be executed at the RTR COMSERV include:

The RTR system management environment is illustrated in Figure 16, RTR System Management Environment.


Figure 16 RTR System Management Environment

 

RTR Performance

An important part of your application design will concern performance considerations: how will your application perform when it is running with RTR on your systems and network? Providing a methodology for evaluating the performance of your network and systems is beyond the scope of this document, but to assist your understanding of the impact of running RTR on your systems and network, this section provides information on two major performance parameters:

This information is roughly scalable to other CPUs and networks. The material is based on empirical tests run on a relatively busy Ethernet network operating at 700 to 800 KB/s (kilobytes per second). This baseline for the network was established with FTP tests (file transfers using a File Transfer Protocol tool) because, in a given configuration, network bandwidth is often a limiting factor in performance. For a typical CPU (for example, a Compaq AlphaServer 4100 5/466 with 4 MB cache) opening 80 to 100 channels with a small (100-byte) message size, a TPS (transactions per second) rate of 1400 to 1600 is usual.

Tests were performed using simple application programs (TPSREQ, the client, and TPSSRV, the server) that use RTR Version 3 application programming interface calls to generate and accept transactions. (TPSREQ and TPSSRV are supplied on the RTR software kit.) The transactions consisted of a single message from client to server. The tests were conducted on OpenVMS Version 7.1 running on AlphaServer 4100 5/466 machines (4 MB cache). Two hardware configurations were used:

  1. A single node, both client and server running on the same machine
  2. Two nodes, one configured as a frontend, the other as a router and backend

In each configuration, transactions per second (TPS) and the CPU load (CPU%) created by the application (app-cpu) and the RTR ACP process (acp-cpu) were measured as a function of:

The number of client channels opened by the TPSREQ test program

The size of the message sent

Figure 17 Single-Node TPS and CPU Load by Number of Channels

The transactions used in these tests were regular read/write transactions; there was no use of optimizations such as ‘READONLY’ or ‘ACCEPT_FORGET’. The results for a single node with an incrementing number of channels are shown in Figure 17, Single-Node TPS and CPU Load by Number of Channels.

This test using 100-byte messages suggests the following:

CPU saturation limited the maximum TPS at about 2500.

CPU resource cost per transaction goes down rapidly as offered load increases (probably due to more effective use of RTR optimizations to ‘batch’ I/Os for disk and interprocess communication (IPC) as more transactions are being processed concurrently).

In an SMP environment, the RTRACP will likely limit the maximum TPS per system to about 3000, regardless of the number of CPUs added.

The results for a single node with a changing message size are shown in Figure 18, Single-Node TPS and CPU Load by Message Size.

Figure 18 Single-Node TPS and CPU Load by Message Size

This test using 80 client and server channels suggests that:

The results for the two-node configuration are shown in Figure 19, Two-Node TPS and CPU Load by Number of Channels.

Figure 19 Two-Node TPS and CPU Load by Number of Channels

This two-node test using 100-byte messages reports CPU usage as totals for the frontend and backend combined (hence a maximum of 200 percent). The test suggests that the constraint in this case is network bandwidth: the TPS rate flattens out at a network traffic level consistent with that measured on the same LAN by other independent tests (for example, using FTP to transfer data across the same network links).
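
A rough calculation (our estimate, not part of the original tests) shows why: at 1500 TPS with 100-byte messages, the application payload is only 1500 x 100 = 150,000 bytes per second, or about 150 KB/s, yet the link saturates near its 700 to 800 KB/s FTP baseline. The difference is per-transaction protocol traffic (acknowledgments, voting, and packet overhead), so the bytes on the wire per transaction can be several times the application payload.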

Conclusions

Determining the factors limiting performance in a particular configuration can be complex. While the above performance data can be used as a rough guide to what can be achieved in particular configurations, they should be applied with caution. Performance will certainly vary depending on the capabilities of the hardware, operating system, and RTR version in use, as well as the work performed by the user application (the above tests employ a dummy application that does no real end-user work).

In general, performance in a particular case is constrained by contention for a required resource. Typical resource constraints are:

Additionally, achieving a high TPS rate can be limited by:

For suggestions on examining your RTR environment for performance, see Appendix D in this document, Evaluating Application Resource Requirements.

Design for Tolerating Process Failure

To design an application to tolerate process failure, the application designer can use concurrent servers with RTR.

Use of Concurrent Servers

Concurrent servers can be implemented as many channels in one process or as one or many channels in many processes. By default, a server channel is declared as concurrent.

RTR delivers transactions to any open channels, so each application thread must be ready to receive and process transactions. The main constraint in using concurrent servers is the limit of available resources on the machine where the concurrent servers run.

When an application opens a channel with the rtr_open_channel call, it specifies whether the server is to be concurrent or not, as follows:

For example, the following code fragment establishes a server with concurrency:

if (rtr_open_channel(&Channel,
                     RTR_F_OPE_SERVER,
                     FACILITY_NAME,
                     NULL,
                     RTR_NO_PEVTNUM,
                     NULL,
                     Key.GetKeySize(),
                     Key.GetKey()) != RTR_STS_OK)
    // handle the error here

If an application starts up a second server for a partition on the same node, the second server is a concurrent server by default.

The following example establishes a server with no concurrency:

if (rtr_open_channel(&Channel,
                     RTR_F_OPE_SERVER | RTR_F_OPE_NOCONCURRENT,
                     FACILITY_NAME,
                     NULL,
                     RTR_NO_PEVTNUM,
                     NULL,
                     Key.GetKeySize(),
                     Key.GetKey()) != RTR_STS_OK)
    // handle the error here

If a concurrent server fails, processing can fail over to another running concurrent server, if one exists.

Concurrent servers are useful both to improve throughput using multiple channels on a single node, and to make process failover possible. Concurrent servers can also help to minimize timeout problems in certain server applications. For more information on this topic, see the section later in this manual on Server-side Transaction Timeouts, page *.

 

For more information on the rtr_open_channel call, see the Reliable Transaction Router Application Programmer’s Reference Manual and the discussion later in this document.

Design for Tolerating Storage Device Failure

To design a system that tolerates storage device failure well, consider incorporating the following in your configuration and software designs:

Further discussion of these devices is outside the scope of this document.

 

 

Design for Tolerating Node Failure

RTR failover employs concurrent servers, standby servers, shadow servers, and journaling, or some combination of these. To survive node failure, you can use standby and shadow servers in several configurations.

The application specifies the server type in the rtr_open_channel call as follows:

rtr_status_t
rtr_open_channel (
    ...
    rtr_ope_flag_t flags,   // server-type flags
    ...
    );

To add a transactional shadow server, include the following flags:

flags = RTR_F_OPE_SERVER | RTR_F_OPE_SHADOW;

Or to disallow concurrent and standby servers, use the following flags:

flags = RTR_F_OPE_SERVER | RTR_F_OPE_NOCONCURRENT | RTR_F_OPE_NOSTANDBY;

Use of Standby Servers

RTR manages the activation of standby servers at runtime.

When an application opens a channel, it specifies whether the server is to be standby or not, as follows:

If the application starts up a second server for the partition, the server is a standby server by default.

Consider using a standby server to improve data availability, so that if your backend node fails or becomes unavailable, you can continue to process your transactions on the standby server. You can have multiple standby servers in your RTR configuration.
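
For example, because standby capability is the default, a server that permits a standby partner on another node but disallows concurrent instances could open its channel with the following flags (a fragment using only the flags shown earlier in this chapter):

/* Standby is allowed by default; disallow only concurrent instances. */
flags = RTR_F_OPE_SERVER | RTR_F_OPE_NOCONCURRENT;

/* To forbid a standby partner as well, add RTR_F_OPE_NOSTANDBY. */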

Use of Shadow Servers

When an application opens a channel, it specifies whether the server is to have the capability to be a transactional shadow server or not, as follows:

Only one primary and one secondary shadow server can be established. Shadow servers can have both standby and concurrent servers.

When partition state is important to an application, the application can determine if a shadow server is in the primary or secondary partition state after server restart and recovery following a server failure. The application does this using RTR events in the rtr_open_channel call, specifying the events RTR_EVTNUM_SRPRIMARY and RTR_EVTNUM_SRSECONDARY. For example, the following is the usual rtr_open_channel declaration:

rtr_status_t
rtr_open_channel (
    rtr_channel_t   *p_channel,  // Channel
    rtr_ope_flag_t  flags,       // Flags
    rtr_facnam_t    facnam,      // Facility
    rtr_rcpnam_t    rcpnam,      // Name of the channel
    rtr_evtnum_t    *p_evtnum,   // Event number list
                                 // (for partition states)
    rtr_access_t    access,      // Access password
    rtr_numseg_t    numseg,      // Number of key segments
    rtr_keyseg_t    *p_keyseg    // Pointer to key-segment data
    );

If the application needs to include logic that depends on partition state, it enables receipt of the RTR events that show shadow state. The declaration includes the events as follows:

rtr_evtnum_t evtnum = {
    RTR_EVTNUM_RTRDEF,
    RTR_EVTNUM_SRPRIMARY,
    RTR_EVTNUM_SRSECONDARY,
    RTR_EVTNUM_ENDLIST
};
rtr_evtnum_t *p_evtnum = &evtnum;

Broadcasts are delivered using the recipient name and the subscription name. For details, see the descriptions of rtr_open_channel and rtr_broadcast_event in the Reliable Transaction Router Application Programmer's Reference Manual.

Router Failover

RTR deals with router failures automatically and transparently to the application. In the event of a router failure, neither client nor server applications need to do anything, and do not see an interruption in service. Consider router configuration when defining your RTR facility to minimize the impact of failure of the node where a router resides. If possible, place your routers on independent nodes, not on either the frontend or backend nodes of your configuration. If you do not have enough nodes to place routers on separate machines, configure routers with backends. This assumes a typical situation with many client applications on multiple frontends connecting to a few routers. For tradeoffs, see the sections on Design for Performance and Design for Operability.

Provide multiple routers for redundancy. For configurations with a large number of frontends, the failure of a router causes many frontends to seek an alternate router. Therefore, configure sufficient routers to handle reconnection activity. When you configure multiple routers, one becomes the current router. If it fails, RTR automatically fails over to another.

For read-only applications, routers can be effective for establishing multiple sites for failover without using shadowing. To achieve this, define multiple non-overlapping facilities with the same facility name in your network. Then provide client applications in the facility with the list of routers. When the router for the active facility fails, client applications are automatically connected to an alternate site. Read-only transactions can alternatively be handled by a second partition running on a standby server. This can help reduce network traffic.

When a router fails, in-progress transactions are routed to another router if one is available in that facility.

Server Failover

Server failover in the RTR environment can be caused by failure of concurrent servers, standby servers, transactional shadow servers, or a combination of these; servers in a cluster have additional failover attributes. You enable RTR failover behavior with flags set when your application executes the rtr_open_channel statement or command.

Conceptually, server process failures can be contrasted as follows:

Note: A standby server can be configured over nodes that are not in the same cluster, but recovery of a failed node's journal is not possible until a server is restarted on the failed node. You may wish to use a standby server in another cluster to increase site-disaster tolerance (see the next chapter on Design for Tolerating Site Disaster for more details on this configuration).

Failover of any server is either event-driven or timer-based. For example, server loss due to process failure is event-driven and routinely handled by RTR. Server loss due to network link failure is timer-based, with timeout set by the SET LINK/INACTIVITY timer (default: 60 seconds). For more information on setting the inactivity timer, see the SET LINK command in the RTR System Manager's Manual.

The server type of a specific server depends on whether it is in a cluster environment and what other servers are declared for the same key range. In a cluster environment, a server declared as a standby and as a shadow becomes a standby server if there is another server for the same key range on the cluster. In a non-cluster environment, the server becomes a shadow server. For example, Figure 20, Transaction Flow with Concurrent Servers, illustrates use of concurrent servers to process transactions for Partition A.

Figure 20 Transaction Flow with Concurrent Servers

When one of the concurrent servers cannot service transactions going to partition A, another concurrent server (shown by the dashed line) processes the transaction. Failover to the concurrent server is transparent to the application and the user.

Concurrent servers are useful in environments where more than one transaction can be usefully carried out on a database partition at one time to increase throughput.

Standby servers provide additional availability and node-failure tolerance. Figure 21, Transaction Flow on Standby Servers, illustrates the processing of transactions for two partitions using standby servers.

Figure 21 Transaction Flow on Standby Servers

When the configuration is operating normally, the primary servers send transactions to the designated partition (solid lines); transactions "A" proceed through primary server A to database partition A and transactions "B" proceed through primary server B to database partition B. However, when the primary server fails, the router reroutes transactions "A" through the standby server A’ to partition A, and transactions "B" through the standby server B’ to database partition B. Note that standby servers for different partitions can be on different nodes to improve throughput and availability. For example, the bottom node could be the primary server for partition B, with the top node the standby. The normal route is shown with a solid line, the standby route with a dashed line.

When the primary path for transactions intended for a specific partition fails, the standby server can still process the transactions. Standby servers automatically take over from the primary server if it fails, transparently to the application. Standby servers recover all in-progress transactions and replay them to complete the transactions. As shown in Figure 21, Transaction Flow on Standby Servers, there can be multiple standby servers for a partition.

A transactional shadow server handles the same transactions as the primary server, and maintains an identical copy of the database on the shadow. Both the primary and the shadow server receive every transaction for their key range or partition. If the primary server fails, the shadow server continues to operate and completes the transaction. This helps to protect transactions against site failure. For greater reliability, a shadow server can have one or more standby servers. Figure 22, Transaction Flow on Shadow Servers, shows two primary servers, A and B, and their shadow servers, As and Bs.

 

Figure 22 Transaction Flow on Shadow Servers

Design for Tolerating Site Disaster

To prevent database loss at an entire site, you can use either transactional shadowing or standby servers or both. For example, for the highest level of fault tolerance, the configuration should contain two shadowed databases, each supported by a remote journal, with each server backed up by a separate standby server.

With such a configuration, you can use RTR shadowing to capture client transactions at two different physically separated sites. Then if one site becomes unavailable, the second site continues to record and process the transactions. This feature protects against site disaster. Figure 23, Two Sites with Shadowing and Standby Servers, illustrates such a configuration. The journal at each site is accessed by whichever backend is in use.

 

Figure 23 Two Sites with Shadowing and Standby Servers

To understand and plan for smooth internode communication you must understand quorum.

The Role of Quorum

Quorum is used by RTR to ensure facility consistency and deal with potential network partitioning. A facility achieves quorum if the right number of routers and backends in a facility (referred to in RTR as the quorum threshold), usually a majority, are active and connected.

In an OpenVMS cluster, for example, nodes communicate with each other to ensure that they have quorum, which is used to determine the state of the cluster; for cluster nodes to achieve quorum, a majority of possible voting member nodes must be available. In an OpenVMS cluster, quorum is node-based. In the RTR environment, quorum is role-based and facility-specific. Nodes/roles in a facility that has quorum are quorate; a node that cannot participate in transactions becomes inquorate.

RTR computes a quorum threshold based on the distributed view of connected roles. The minimum value can be two. This means that a minimum of one router and one backend is required to achieve quorum. If the computed value of quorum is less than two, quorum cannot be achieved. In exceptional circumstances, the system manager can reset the quorum threshold below its computed value to continue operations, even when only a minimum number of nodes, less than a majority, is available. Note, however, that RTR uses other heuristics, not based on simple computation of available roles, to determine quorum viability. For instance, if a missing (but configured) backend’s journal were accessible, then that journal is used to count for the missing backend.

A facility without quorum cannot complete transactions. Only a facility that has quorum, whose nodes/roles are quorate, can complete transactions. A node/role that becomes inquorate cannot participate in transactions.

Your facility definition also has an impact on the quorum negotiation undertaken for each transaction. To ensure that your configuration can survive a variety of failure scenarios (for example, loss of one or several nodes), you may need to define a node that does not process transactions. The sole use of this node in your RTR facility is to make quorum negotiation possible, even when you are left with only two nodes in your configuration. This quorum node prevents a network partition from occurring, which could cause major update synchronization problems.

Quorum is used:

Surviving on Two Nodes

If your configuration is reduced to two server nodes out of a larger population, or if you are limited to two servers only, you may need to make some adjustments in how to manage quorum to ensure that transactions are processed. Use a quorum node as a tie-breaker to ensure achieving quorum.

 

Figure 24 Configuration with Quorum Node

For example, with a five-node configuration (see Figure 24, Configuration with Quorum Node) in which one node acts as a quorum node, processing still continues even if one entire site fails (only two nodes left). When an RTR configuration is reduced to two nodes, the system manager can manually override the calculated quorum threshold. For details on this practice, see the Reliable Transaction Router System Manager's Manual.

Transaction Serialization and Partitioning

Transactions are serialized by committing them in chronological order within a partition. Do not share data records between partitions because they cannot be serialized correctly on the shadow site.

Dependent transactions operate on the same record and must be executed in the same order on the primary and the secondary servers. Independent transactions do not update the same data records and can be processed in any order.

RTR relies on database locking during its accept phase to determine if transactions executing on concurrent servers within a partition are dependent. A server that holds a lock on a data record during its vote call (rtr_accept_tx) blocks other servers from updating the same record. Therefore only independent transactions can vote at the same time.
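
A server therefore votes while its database locks are still held, as in the following outline. This is a sketch: db_update_record is a hypothetical database routine, and the channel, message buffer, and status block are assumed to be set up as in the earlier fragments; the actual call sequence is described in the Reliable Transaction Router Application Programmer's Reference Manual.

/* Receive one transactional message on the server channel. */
status = rtr_receive_message(&channel, RTR_NO_FLAGS, RTR_ANYCHAN,
                             &msg, sizeof(msg), RTR_NO_TIMOUTMS, &msgsb);

/* Update the database; the data manager now holds a lock on the record. */
db_update_record(&msg);

/* Vote while the lock is held, blocking dependent transactions. */
status = rtr_accept_tx(channel, RTR_NO_FLAGS, RTR_NO_REASON);

/* Commit or roll back in the database only after RTR delivers the outcome
   (rtr_mt_accepted or rtr_mt_rejected) in a later rtr_receive_message. */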

RTR tracks time in cycles using windows; a vote window is the time between the close of one commit cycle and the start of the next commit cycle.

RTR commit grouping enables independent transactions to be scheduled together on the shadow secondary. A group of transactions that execute an rtr_accept_tx call within a vote window form an RTR commit group, identified by a unique commit sequence number (CSN). For example, given a router (TR), backend (BE), and database (DB), each transaction sent by the backend to the database server is represented by a vote. When the database receives each vote, it locks the database and responds as ‘voted.’ The backend responds to the router in a time interval called the vote window, during which all votes that have locked the database receive the same commit sequence number. This is illustrated in Figure 25, Commit Sequence Number.

 

 

 

Figure 25 Commit Sequence Number

To improve performance on the secondary server, RTR lets this commit group of transactions execute in any order on the secondary.

RTR reuses the current CSN if it determines that the current transaction is independent of previous transactions. This way, transactions can be sent to the shadow as a group.

In a little more detail, RTR assumes that transactions within the vote window are independent. For example, given a router and a backend processing transactions as shown in Figure 26, CSN Vote Window, transactions processed between execution of the rtr_accept_tx call and the final rtr_receive_msg call that occurs after the SQL commit or rollback will have the same commit sequence number.

Figure 26 CSN Vote Window

A specific example using the independent transaction flag is shown in the section Making Transactions Independent, page *.

Not all database managers require locking before the SQL commit operation. For example, some insert calls create a record only during the commit operation. For such calls, the application must ensure that the table or some token is locked so that other transactions are not incorrectly placed by RTR in the same commit group.

All database systems do locking at some level: at the database, file, page, record, field, or token level, depending on the database software. The application designer must determine the capabilities of the database software with which the application will interface, and consider these in developing the application. For full use of RTR, the database your application works with must at minimum be capable of being locked at the record level.

When a transaction is specified as independent with the INDEPENDENT flag, the current commit sequence number is assigned to it. Thus the transaction can be scheduled simultaneously with other transactions having the same CSN, but only after all transactions with lower CSNs have been processed.

Examples of independent transactions include zero-hour ledger posting (posting of interest on all accounts at midnight) and selling bets (assuming that the order in which bets are received has no bearing on their value).

RTR examines the vote sequence of transactions executing on the primary server, and determines dependencies between these transactions. The assumption is that if two or more transactions vote within a vote window, then these transactions could be processed in any order and still produce the same result in the database. Such a group of transactions is considered independent of each other. Groups of transactions that are mutually independent may still be dependent on an earlier group of independent transactions.

RTR tracks these groups through CSN ordering. A transaction belonging to a group with a higher CSN is considered to be dependent on all transactions in a group with a lower CSN. Because RTR infers CSNs from the run-time behavior of servers, there is scope for improvement if the application can provide hints regarding actual dependence. If the application knows that the order in which a transaction is committed relative to other transactions is not significant, use of the independent transaction flag is recommended. If this flag is not used, RTR determines the CSN grouping based on its observation of the timing of the vote.

To force RTR to provide a CSN boundary, the application must:

(The CSN boundary is between the end of one CSN and the start of the next, as represented by the last transaction in one commit group and the first transaction in the subsequent commit group.)

In practice, for the transaction to be voted on after its dependent transactions, it is enough for the dependent transaction to access a common database resource, so that the database manager can serialize the transaction correctly.

Transactions that do not have the independent flag set do not automatically have a higher CSN. To ensure a higher CSN, the transaction also needs to access a record that is locked by a previous transaction. This ensures that the dependent transaction does not vote in the same vote cycle as the transaction on which it depends. Similarly, transactions that have the independent flag do not automatically all have the same CSN. In particular, if they are separated by a transaction that does not have the independent flag, that transaction creates a CSN boundary.

Serialization

In a transactional shadow configuration using the same facility, the same partition, and the same key-range, RTR ensures that data in both databases are correctly serialized, provided that the application follows a few rules (see RTR Implementation Considerations, page *, for a description of these rules).

The application runs on the backends, processes transactions based on the business and database logic required, and hands off transactions to the database engine that updates the database. The application can take advantage of multiple CPUs on the backends.

 

 

Serialization Issues

 

Given a series of transactions, numbered 1 through 6, where odd-numbered transactions are processed on Frontend A (FE A) and even-numbered transactions are processed on Frontend B (FE B), RTR ensures that transactions are passed to the database engine on the shadow backend in the same order as presented to the primary server. The following table represents the processing order of transactions on the frontends.

Transaction Ordering on Front Ends

FE A    FE B
 1       2
 3       4
 5       6


The order in which transactions are committed on the backends, however, may not be the same as the order of initial presentation. For example, the order in which transactions are committed on the primary server may be 2, 1, 4, 3, 5, 6, as shown in the following table:

Transaction Ordering on Backend

Primary BE A
     2
     1
     4
     3
     5
     6

 

The secondary shadowed database must commit these transactions in the same order, and RTR ensures that this happens, transparently to the application.

However, if the application cannot take advantage of partitioning, there can be situations where delays occur while the application waits, say, for transaction 2 to be committed on the secondary. The best way to minimize this type of serialization issue is to use a partitioned database.

To achieve strict serialization, however, where every transaction is accepted in the same order on the primary and on the shadow, the application must use a single partition only. Transaction serialization is not guaranteed across partitions.

Batch Processing Considerations

Some of your applications may rely on batch processing for periodic activity. RTR application facilities can work with batch processing. (The process for creating batch jobs is operating-system specific and is thus outside the scope of this document.) Be careful in your design when using batch transactions. For example, accepting data in batch from legacy systems can affect application results or performance. If such batch transactions update the same database as online transactions, major database inconsistencies or long transaction processing delays can occur.

Recovery After a Failure

An example of a typical failure scenario follows. The basic configuration setup is RTR with a database manager, such as Sybase, that does not take advantage of Memory Channel in the Compaq Tru64 UNIX TruCluster. There are four data servers: A and B at Site 1, and C and D at Site 2, with two partitions, 1 and 2, as shown in Figure 27, Recovery After a Failure. The database is shadowed.

Site 1:  A runs Primary Partition P1
         B runs Primary Partition P2 and is a standby to A for P1

Site 2:  C runs Shadow Partition S1
         D runs Shadow Partition S2

 

Figure 27 Recovery After a Failure

The goal for this environment is to be able to survive a "double hit" without any loss of performance. While A is down, there is a window during which there is a single point of failure in the system. To meet this need, a standby server can be launched on machine B as a new P1, and the transactions journalled in [P1] on C can be played across to Site 1. This can be done without any downtime, and P1 on C can continue to accept new transactions. When the playing across is finished, recovery is complete because all new transactions are sent to both [P1] on C and P1 on B.

In more detail, the following sequence of events occurs:

  1. Node A, which runs primary partition P1, fails.
  2. A standby server on B is started and takes over for P1 on A.
  3. Node C assumes the primary role for P1 and starts remembering transactions.
  4. RTR starts its local recovery processing. To do so, it tries to access any nodes (defined as backend nodes in the RTR configuration) in its own cluster to locate journals that may have recovery information on them. Because A and B are not in the same cluster, it does not look for A's journal.
  5. After completing local recovery processing (with zero transactions found in its own journal), it proceeds to shadow catchup recovery. For this it seeks a backend node outside its own cluster (that is, any of A, C, or D is a suitable candidate) and checks whether that node's journal has any remembered transactions for this partition. Only node C responds positively to this search. Node B then proceeds to do shadow recovery from node C's journal.

The fact that node A is not accessible does not prevent B from being able to shadow P1 on node C. In this configuration, the absence of node A is unlikely to cause a quorum outage.

Journal Accessibility

The RTR journal on each node must be accessible to be used to replay transactions. When setting up your system, consider both journal sizing and how to deal with replay anomalies.

Journal Sizing

To size a journal, use the following rough estimates as guidelines:

Allow 140 bytes per inbound message (rtr_send_to_server call).
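
For example, purely as an illustration of this guideline, a journal that must be able to hold 500,000 inbound messages would need roughly 500,000 x 140 bytes, or about 70 MB, of space. Base your own sizing on measured message volumes and retention requirements.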

Use of large transactions generally causes poor performance, not only for initial processing and recording in the database, but also during recovery. Large transactions fill up the RTR journals more quickly than small ones.

Replay Anomalies

You can use the RTR_STS_REPLYDIFF status message to determine if a transaction has been recorded differently during replay. For details on this and other status messages, see the RTR Application Programmer’s Reference Manual.

You should also consider how the application is to handle secondary or shadow server errors and aborts, and write your application accordingly.

Design for Performance

In designing for performance, take the following into account:

Consider the amount of data being transferred.

Keep transaction message sizes small.

Don't tie up the database longer than necessary.

Use the independent transaction flag (a Version 3.2 feature).

Use multichannel applications, which are more efficient than multiple single channel applications.

Use the READ_ONLY flag to reduce RTR journaling.

Use single calls with accept flags to minimize transaction activity; for example, send/accept or reply/forget.

When using transactional shadowing to two sites, have high speed links between sites.

Evaluate your hardware, in particular:

Concurrent Servers

Use concurrent servers in database applications to optimize performance and continue processing when a concurrent server fails.

Partitions and Performance

Partitioning data enables the application to balance traffic to different parts of the database on different disk drives. This achieves parallelism and provides better throughput than using a single partition. Using partitions may also enable your application to survive single-drive failure in a multi-drive environment more gracefully. Transactions for the failed drive are logged by RTR while other drives continue to record data.

 

 

Facilities and Performance

To achieve performance goals, establish facilities spread across the nodes in your physical configuration, using the most powerful nodes for the backends that will carry the most traffic.

In some applications with several different types of transactions, you may need to ensure that certain transactions go only to certain nodes. For example, a common type of transaction is for a client application to receive a stock sale transaction, which then proceeds through the router to the current server application. The server may then respond with a broadcast transaction to only certain client applications. This exchange of messages between frontends and backends and back again can be dictated by your facility definition of frontends, routers, and backends.

Router Placement

Placement of routers can have a significant effect on your system performance. Although connectivity over a wide-area network is possible, avoid placing your routers at long distances from your backends, and make the links between your routers and backends as fast as possible. Recognize, however, that site failover may send transactions across slower links. For example, Figure 28, Two-Site Configuration, shows high-speed links to local backends, and lower-speed links that come into use only for failover.

 

Figure 28 Two-Site Configuration

Additionally, placing routers on separate nodes from backends provides better failover capabilities than placing them on the same node as the backend.

Broadcast Messaging

When a server or client application sends out a broadcast message, the message passes through the router and is sent to the client or server application as appropriate. A client application sending a broadcast message to a small number of server applications will probably have little impact on performance, but a server application sending a broadcast message to many, potentially hundreds, of clients can have a significant impact. Therefore, consider the impact of frequent use of large messages broadcast to many destinations. If your application requires frequent broadcasts, keep the broadcast messages as small as possible. Broadcasts could be used, for example, to inform all clients of a change in the database that affects all clients.

Figure 29, Message Fan-Out, illustrates message fan-out from client to server, and from server to client.

 

Figure 29 Message Fan-Out

You can also improve performance by creating separate facilities for sending broadcasts.

Making Broadcasts Reliable

To help ensure that broadcasts are received at every intended destination, the application can number them with an incrementing sequence number and have the receiving application check that all numbers are received. When a message is missing, have a retransmit server re-send it.
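
A minimal sketch of the receiving side's gap check follows; the sequence-number field and the request_retransmit helper are hypothetical application constructs, not part of the RTR API:

static rtr_uns_32_t expected_seqno = 0;

void check_broadcast_seqno(rtr_uns_32_t seqno)
{
    /* A gap means one or more broadcasts were lost; ask the
       (hypothetical) retransmit server to re-send them. */
    if (seqno != expected_seqno)
        request_retransmit(expected_seqno, seqno - 1);
    expected_seqno = seqno + 1;
}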

Large Configurations

Very large configurations with unstable or slow network links can reduce performance significantly. In addition to ensuring that your network links are the fastest you can afford and put in place, examine the volume of inter-node traffic created by other uses and applications. RTR need not be isolated from other network and application traffic, but can be slowed down by them.

Using Read-Only Transactions

Read-only transactions can significantly improve throughput because they do not need to be journaled. In addition, a read-only database can sometimes be updated only periodically, for example, once a week rather than continuously, which again reduces application and network traffic.

Making Transactions Independent

When using transactional shadowing, it can enhance performance to process certain transactions as independent. When transactions are declared as independent, processing on the shadow server proceeds without enforced serialization. Your application analysis must establish which transactions can be considered independent, and you must then write your application accordingly. For example, bets placed at a racetrack for a specific race are typically independent of each other. In another example, transactions within one customer's bank account are typically independent of transactions within another customer's account.

Within your application server code, you identify those transactions that can be considered independent, and process them with the independent transaction flags on rtr_accept_tx or rtr_reply_to_client calls, as appropriate. For example, the following code snippet illustrates use of the independent transaction flag on the rtr_accept_tx call:

case rtr_mt_prepare:
    /* if (txn is independent) ... */
    status = rtr_accept_tx(channel,
                           RTR_F_ACC_INDEPENDENT,
                           RTR_NO_REASON);
    if (status != RTR_STS_OK)

Design for Operability

To help make your RTR system as manageable and operable as possible, you can consider several tradeoffs in establishing your RTR configuration. Consider these before creating your RTR facilities and deploying an application. Make these considerations part of your design and validation process.

Define your facilities with an eye to the number and placement of frontends, routers, and backends.

To avoid problems with quorum resolution, design your configuration with an odd number of routers to ensure that quorum can be achieved.

Place your routers separate from your backends to improve failover, so that failure of one node does not take out both the router and the backend.

If your application requires frontend failover when a router fails, frontends must be on separate nodes from the routers, but frontends and routers must of course be in the same facility. For frontend failover, a frontend must be in a facility with multiple routers.

To identify a node used only for quorum resolution, define the node as a router or as a router and frontend. Define all backends in the facility, but no other frontends.

With a widely dispersed set of nodes, for example, nodes distributed across an entire country, use local routers to deal with local frontends. This can be more efficient than having many dispersed frontends connecting to a small number of distant routers.

In many configurations, it may be more effective to place routers near backends.

Firewalls and RTR

For security purposes, your application transactions may need to pass through firewalls in the path from the client to the server application. RTR provides this capability within the CREATE FACILITY syntax. (See the Reliable Transaction Router System Manager's Manual, Network Transports, for specifics on how to specify a node to be used as a firewall, and how to set up your application to tunnel through the firewall.)

Avoiding DNS Server Failures

Nodes in your configuration are often specified with names and IP or DECnet addresses fielded by a name server. When the name server goes down or becomes unreachable, name lookups fail and requests that depend on them may fail as well. To minimize such outages, declare the referenced node name entries in a local host names file that is available even when the name server is not. Using a host names file can also improve performance for name lookups.
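
On UNIX systems, for example, such entries might look like the following; the node names and addresses are purely illustrative, and OpenVMS and Windows NT provide equivalent local host files:

# /etc/hosts
10.10.1.1    nodea
10.10.1.2    nodeb
10.10.1.3    quorum1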

Batch Procedures

Operations staff often create batch or command procedures to take snapshots of system status to assist in monitoring applications. The character-cell (ASCII) displays of RTR can provide input to such procedures. Be aware that system responses from RTR can change with each release, which can cause such command procedures to fail. If possible, plan for such changes when bringing up new versions of the product.

RTR Implementation Considerations

In addition to understanding the RTR run-time and system management environments, you must understand the RTR applications environment and its implications for your implementation. This section describes requirements that transaction processing applications must deal with effectively and gracefully, and rules to follow to ensure that your transactions remain ACID compliant. The requirements and rules complement each other and sometimes express a similar concept; your application must take both into account.

RTR Requirements on Applications

Applications written to operate in the RTR environment should adhere to the following rules:

Be Transaction Aware

RTR expects server applications to be transaction aware; an application must be able to roll back an appropriate amount of work when asked. Furthermore, to preserve transaction integrity, rollback must be all or nothing. Each transaction incurs some overhead, and the application must be prepared to deal with failures and concomitant rollback gracefully.

When designing your client and server applications, note the outcome of transactions. Transactional applications often store data in variables outside the control of RTR that pertain to the operation taking place. Depending on the outcome of the RTR transaction, the values of these variables may need to be adjusted. RTR guarantees delivery of messages (usually to a database), but RTR does not know about any data not passed through RTR.

The rule is: Code your application to preserve transaction integrity through failures.

Avoid Server-Specific Data

The client and server applications must not exchange any data that makes sense on only one node in the configuration. Such data can include, for example, a memory reference pointer, whose purpose is to allow the client to reference this context in a later transaction, indexes into files, node names, or database record numbers. These values would only make sense on the machine on which they were generated. If your application sends data to another machine, that machine will not be able to interpret the data correctly. Furthermore, data cannot be shared across servers or across channels.

The rule is: The means you adopt to track state must be meaningful on all nodes where your application runs.

Be Independent of Time of Processing

Transactions are assumed to contain all the context information required to be successfully executed. An RTR transaction is assumed to be independent of time of processing. For example, in a shadow environment, if the secondary server cannot credit an account because it is past midnight, but the transaction has already been successfully committed on the primary server, this would cause an inconsistency between the primary and secondary databases. Or, in another example, Transaction B cannot rely on the fact that Transaction A performed some operation before it.

Make no assumptions about the amount of time that will occur between transactions, and avoid using a transaction to establish a session with a server application that can time out. Such a timeout might occur in a client application that logs into a server application that sets a timer to determine when to log the client off. If a crash occurs after a successful logon, subsequent transactions may fail because the logged-on session is no longer valid.

The rule is: If you have operations that must not be shadowed, identify them and exclude them from your application. Furthermore, do not keep a state that can become stale over time.

In your application, you can define transactions as independent, using the independent transaction flag in your rtr_accept_tx or rtr_reply_to_client calls. For more information on the independent transaction flag and the different uses of these calls, see the RTR Application Programmer’s Reference Manual.

Two Identical Databases for Shadow Servers

Shadow server use is aimed at keeping two identical copies of the database synchronized. For example, Figure 30, Transactional Shadow Servers, illustrates a configuration with a router serving two backends to two shadow databases. The second router is for router failover.

Figure 30 Transactional Shadow Servers

If an update of a copy triggers the update of a third common database, then the application must determine whether it is running as a primary or a secondary, and only carry out an update if it is the primary. Otherwise, there can be complex failure scenarios where duplication can occur.

For example, RTR has no way to determine if a transaction being shadowed is a one-time only transaction, such as a bookstore debiting your credit card for the purchase of a book. If this transaction is processed on the primary node and the processed data fed to a third common database, and the transaction is later processed on the secondary node, your account would incorrectly be double charged. The application must handle this situation correctly.

The rule is: Design your application to deal correctly with transactions, such as debiting a credit card or bank account, that must never be done more than once.

Figure 31, Shadow Servers and Third Common Database (not recommended) shows a configuration with two shadow servers and a third independent server for a third common database. This is not a configuration recommended for use with RTR without application software that deals with the kind of failure situation described above. Another method is to decouple the shadow message from the other branch.

Figure 31 Shadow Servers and Third Common Database (not recommended)

The recommended method to use when updating a single resource through multiple paths is to use the RTR standby functionality.

Make Transactions Self-Contained

All information required to process a transaction from the perspective of the server application should be contained within the transaction message. For example, if the application required a user-id established earlier to successfully execute the transaction, then the user-id should be included in the transaction message.

The rule is: Construct complete transaction messages within your application.

Lock Shared Resources

While a server application is processing a transaction, and particularly before it "accepts" the transaction, it must ensure that all shared resources accessed by that transaction are locked. Failure to do so can cause unpredictable results in shadowing or recovery.

The rule is: Lock shared resources while processing each transaction.

ACID Compliance

To ensure that your application deals with transactions correctly, its transactions must be atomic, consistent, isolated, and durable, as described in the rules below.

Atomicity Rules

For the atomic attribute, the result of a transaction is all or nothing, that is, either totally committed or totally rolled back. To ensure atomicity, do not use a data manager that cannot roll back its updates on request. All standard data managers or database management systems have the atomicity attribute, but in some cases when interfacing to an external legacy system, a flat-file system, or an in-memory database, a transaction may not be atomic.

For example, a client application may believe that a transaction has been rejected, but the database server does not. With a database manager that can make this mistake, the application itself must be able to generate a compensating transaction to roll back the update.

Data managers that do not integrate with RTR through XA, Microsoft DTC, or DECdtm must be programmed to handle rtr_mt_msg1_uncertain messages.

For example, to illustrate the atomicity rules, Figure 32, Uncertain Interval for Transactions, shows the uncertain interval in a transaction sequence that the application program must be aware of and take into account by performing appropriate rollback.

Figure 32 Uncertain Interval for Transactions

If there is a crash before the rtr_accept statement is executed, on recovery, the transaction is replayed as rtr_mt_msg1 because the database will have rolled back the prior transaction instance. However, if there is a crash after the rtr_accept statement is executed, on recovery, the transaction is replayed as rtr_mt_msg1_uncertain because RTR does not know the status of the prior transaction instance. Your application must understand the implications of such failures and deal with them appropriately.
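One common approach, sketched below, is for the server to record the RTR transaction identifier in the database as part of each update; on an uncertain replay it can then check whether the prior instance was committed. This is a sketch under stated assumptions, not a prescribed implementation: txn_already_committed is a hypothetical application helper, and the tid field of the message status block should be verified against rtr.h.

case rtr_mt_msg1_uncertain:
    /* Replay after a failure: the prior instance may or may not have
       been committed. txn_already_committed() is a hypothetical
       application helper that checks the database for the tid stored
       with the original update. */
    if (txn_already_committed(&msgsb.tid))
        break;    /* already applied; do not apply it twice */
    /* otherwise fall through and handle like a first message */
case rtr_mt_msg1:
    /* apply the update, storing msgsb.tid with it */
    break;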

Consistency Rules

Several rules must be considered to ensure consistency. They are:

Isolation Rules

The changes to shared resources that a transaction effects do not become visible outside the transaction until the transaction commits. This makes transactions serializable. To ensure isolation:

Hold record locks throughout the RTR commit cycle. If the server crashes, the transaction could be recovered after changes to the data by a dependent transaction, generating results that are different from those already sent to the client. Also, new transactions can overtake the completion of a previous transaction from the same client if shared records are not locked.

Do not use RTR concurrent servers if your data manager does not support record locking. This can be important, for example, with in-memory databases. Concurrency relies on the independence of two operations that may affect common data records. Record locking ensures that a concurrent transaction cannot affect the consistency of data being operated on by another transaction.

Avoid certain site-dependent actions when running RTR shadow servers. For example, using transaction sequence numbers or time and date comparisons can introduce problems. Shadowed transactions are serialized based on commit groups. If your application requires absolute transaction serialization, you cannot run concurrent servers. For example, Figure 33, Concurrent Server Commit Grouping, illustrates the serialization of commit groups. The first commit group is A1, processed on the primary backend; it is followed by commit group A2, followed by A3. Commit to the database, however, is in the order A3, A1, A2, as shown in the diagram. On the shadow site, commit to the database will be in the order A3, A2, A1, due to the use of concurrent servers.

Figure 33 Concurrent Server Commit Grouping

RTR commit grouping allows independent transactions to be scheduled together on the shadow secondary.

Durability Rule

For a transaction to be durable, the changes that result from transaction commitment must survive subsequent system and media failure. Thus transactions are both persistent and stable.

For example, your bank deposit is durable if, once the transaction is complete, your account balance reflects what you have deposited.

The durability rule is:

The RTR Application Programming Interface

RTR provides an application programming interface (API) that features transaction semantics and intelligent routing in the client/server environment. It provides software fault tolerance using shadow servers, standby servers, and concurrent server processing. It can provide authentication with callout servers. RTR makes single-point failures transparent to the application, guaranteeing that, within the limits of reliability of the basic infrastructure and the physical hardware used, a transaction will arrive at its destination.

The RTR API contains 16 calls that address four groups of activity:

The initialization call signals RTR that a client or server application wants to use RTR services and the termination call releases the connection once the requested work is done.

Messaging calls enable client and server applications to send and receive messages and broadcasts.

Transactional calls collect groups of messages as transactions (tx).

Informational calls enable an application to set RTR options or interrogate RTR data structures.

Initiation/           Messaging             Transactional    Informational
Termination Calls     Calls                 Calls            Calls

rtr_open_channel      rtr_broadcast_event   rtr_start_tx     rtr_request_info
rtr_close_channel     rtr_reply_to_client   rtr_accept_tx    rtr_get_tid (tid is the
                      rtr_send_to_server    rtr_prepare_tx     transaction identifier)
                      rtr_receive_message   rtr_reject_tx    rtr_error_text
                                                             rtr_set_info
                                                             rtr_set_user_handle
                                                             rtr_set_wakeup

To execute these calls using the RTR CLI, precede each with the keyword CALL. For example,

RTR> CALL RTR_OPEN_CHANNEL

 

The following table provides additional information on RTR API calls, which are listed in alphabetical order.

Routine Name            Description                                Client and  Client  Server
                                                                   Server      Only    Only

rtr_accept_tx()         Accepts a transaction.                     Yes
rtr_broadcast_event()   Broadcasts an event message.               Yes
rtr_close_channel()     Closes a previously opened channel.        Yes
rtr_error_text()        Converts RTR message numbers to            Yes
                        message text.
rtr_get_tid()           Gets the current transaction ID.           Yes
rtr_open_channel()      Opens a channel for sending and            Yes
                        receiving messages.
rtr_prepare_tx()        Prepares a nested transaction to be                    Yes     Yes
                        committed.
rtr_receive_message()   Receives the next message (transaction     Yes
                        message, event, or completion status);
                        returns a message and a message status
                        block.
rtr_reject_tx()         Rejects a transaction.                     Yes
rtr_reply_to_client()   Sends a response from a server to a                            Yes
                        client.
rtr_request_info()      Requests information from RTR.             Yes
rtr_send_to_server()    Sends a message from a client to the                   Yes
                        server(s).
rtr_set_info()          Sets an RTR parameter.                     Yes
rtr_set_user_handle()   Associates a user value with a             Yes
                        transaction.
rtr_set_wakeup()        Sets a function to be called on            Yes
                        message arrival.
rtr_start_tx()          Explicitly starts a transaction and                    Yes
                        specifies its characteristics.

The RTR.H Header File

The rtr.h file included with the product defines all RTR data, status, and message types, including text that can be returned in error messages from an application. You must include it when compiling your application.

To support the multioperating system environment, error codes are processed by RTR using data values in rtr.h and translated into text messages. Status codes are described in the Reliable Transaction Router Application Programmer's Reference Manual.

RTR Command Line Interface

The command line interface (CLI) to the RTR API enables the programmer to write short RTR applications from the RTR command line. This can be useful for testing short program segments and exploring how RTR works. For example, the following sequence of commands starts RTR and exchanges a message between a client and a server. To use these examples, you execute RTR commands simulating your RTR client application on the frontend and commands simulating your server application on the backend.

Note: The channel identifier identifies the application process to the ACP. The client and server process must each have a unique channel identifier. In this example, the channel identifier for the client is C and for the server is S. Both use the facility called DESIGN.

 

The following example shows communication between a client and a server created by entering commands at a terminal keyboard. The client application is executing on the frontend and the server on the backend.

The user is called user, the facility being defined is called DESIGN, a client and a server are established, and a test message containing the words "Kathy's text today" is sent from the client to the server. After the server receives this text, the user on the server enters the words "And this is my response." System responses begin with the characters %RTR-. Notes on the procedure are enclosed in square brackets [ ]. For clarity, commands you enter are shown in bold. You can view the status of a transaction with the SHOW TRANSACTION command.

$ RTR
Copyright Digital Equipment Corporation 1994,1997. All rights reserved.
RTR> set mode/group
%RTR-I-STACOMSRV, starting command server on node NODEA in group user
RTR> CREATE JOURNAL
%RTR-F-JOUALREXI, journal already created
RTR> START RTR
%RTR-S-RTRSTART, RTR started on node NODEA in group "user"
RTR> CREATE FACILITY DESIGN/ALL_ROLES=(NodeA,NodeB) [- or /all=NodeA]
%RTR-S-FACCREATED, facility DESIGN created
RTR> SHOW FACILITY
Facilities:

Facility Frontend Router Backend
DESIGN yes yes yes
RTR> RTR_OPEN_CHANNEL/CHANNEL=C/CLIENT/fac=DESIGN
%RTR-S-OK, normal successful completion

RTR> RTR_RECEIVE_MESSAGE/CHANNEL=C/tim [to get mt_opened or mt_closed]
%RTR-S-OK, normal successful completion
Channel name: C
msgsb
msgtype: rtr_mt_opened
msglen: 8
message
status: normal successful completion
reason: 0x00000000
RTR> RTR_START_TX/CHAN=C
%RTR-S-OK, normal successful completion
RTR> RTR_SEND_TO_SERVER/CHAN=C "Kathy's text today." [text sent to the server]
%RTR-S-OK, normal successful completion
RTR> show transaction
Frontend transactions on node NodeA in group "USER" at Thu Jan 28 10:49:43 1999

Tid Facility FE-User State
63b01d10,0,0,0,0,2e59,43ea2002 DESIGN USER. SENDING

Router transactions on node NodeA in group "USER" at Thu Jan 28 10:49:43 1999:
63b01d10,0,0,0,0,2e59,43ea2002 DESIGN USER. SENDING

Backend transactions on node NodeA in group "USER" at Thu Jan 28 10:49:43 1999:
63b01d10,0,0,0,0,2e59,43ea2002 DESIGN USER. SENDING

RTR> RTR_START_TX
RTR> RTR_SEND_TO_SERVER
RTR> RTR_RECEIVE_MESSAGE/TIME=0/CHAN=C
[done by RTR after user does commands at server]
%RTR-S-OK, normal successful completion
Channel name: C
msgsb
msgtype: rtr_mt_reply
msglen: 25
usrhdl: 0
Tid: 63b01d10,0,0,0,0,2e59,43ea2002
message
offset bytes text
000000 41 6E 64 20 74 68 69 73 20 69 73 20 6D 79 20 72 And this is my r
000010 65 73 70 6F 6E 73 65 2E 00 esponse..
reason: 0x00000000

RTR> RTR_ACCEPT_TX/CHANNEL=C
%RTR-S-OK, normal successful completion
RTR> show transaction
Frontend transactions on node NodeA in group "USER" at Thu Jan 28 10:49:43 1999

Tid Facility FE-User State
63b01d10,0,0,0,0,2e59,43ea2002 DESIGN USER. VOTING

Router transactions on node NodeA in group "USER" at Thu Jan 28 10:49:43 1999:
63b01d10,0,0,0,0,2e59,43ea2002 DESIGN USER.

Backend transactions on node NodeA in group "USER" at Thu Jan 28 10:49:43 1999:
63b01d10,0,0,0,0,2e59,43ea2002 DESIGN USER. COMMIT


RTR> RTR_RECEIVE_MESSAGE

Commands the user issues on the server application where RTR is running on the backend:

$ RTR
RTR> SHOW JOURNAL
RTR journal:-

Disk: DSA12: Blocks: 1000
RTR> start rtr
%RTR-S-RTRSTART, RTR started on node NodeA in group "username"
RTR> rtr_open/server/accept_explicit/prepare_explicit/chan=s/fac=DESIGN
%RTR-S-OK, normal successful completion
RTR> RTR_RECEIVE_MESSAGE/CHAN=S
%RTR-S-OK, normal successful completion
Channel name: S
msgsb
msgtype: rtr_mt_msg1
msglen: 19
usrhdl: 0
Tid: 63b01d10,0,0,0,0,2e59,43ea2002
message
offset bytes text
000000 4B 61 74 68 79 27 73 20 74 65 78 74 20 74 6F 64 Kathy's text tod
000010 61 79 00 ay.
reason: 0x00000000

RTR> RTR_RECEIVE_MESSAGE
RTR> RTR_REPLY_TO_CLIENT/CHAN=S "And this is my response."
%RTR-S-OK, normal successful completion
RTR> show transaction
Frontend transactions on node NodeA in group "USER" at Thu Jan 28 10:59:43 1999

Tid Facility FE-User State
63b01d10,0,0,0,0,2e59,43ea2002 DESIGN USER. SENDING

Router transactions on node NodeA in group "USER" at Thu Jan 28 10:59:43 1999:
63b01d10,0,0,0,0,2e59,43ea2002 DESIGN USER. SENDING

Backend transactions on node NodeA in group "USER" at Thu Jan 28 10:59:43 1999:
63b01d10,0,0,0,0,2e59,43ea2002 DESIGN USER. RECEIVING

RTR> RTR_RECEIVE_MESSAGE/CHAN=S
if OK, use: RTR_ACCEPT_TX
else: RTR_REJECT_TX
RTR> RTR_RECEIVE_MESSAGE
RTR> STOP RTR [to end example test]

The exchange of messages you observe in executing these commands illustrates RTR activity. You will need to retain a similar sequence in your own designs for starting up RTR and initiating your own application.

You can use RTR SHOW and MONITOR commands to display status and examine system state at any time from the CLI. For more information on RTR commands, see the Reliable Transaction Router System Manager’s Manual.

Note: The rtr_receive_message command waits (blocks) if no message is currently available. When using the rtr_receive_message command in the RTR CLI, use the /TIME=0 qualifier (a timeout of zero) to poll for a message if you do not want the command to block.
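
For example, the following form polls once and returns immediately if nothing is pending (syntax as in the transcripts above):

RTR> RTR_RECEIVE_MESSAGE/CHANNEL=C/TIME=0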


Snippets from similar client and server programs using the RTR API follow and are more fully shown in the appendix to this document.

Example of an open channel call in an RTR client program:

status = rtr_open_channel(&Channel,
                          Flags,
                          Facility,
                          Recipient,
                          RTR_NO_PEVTNUM,
                          Access,
                          RTR_NO_NUMSEG,
                          RTR_NO_PKEYSEG);

if (status != RTR_STS_OK)

Example of a receive message call in an RTR server program:

status = rtr_receive_message(&Channel, RTR_NO_FLAGS, RTR_ANYCHAN,
                             MsgBuffer, DataLen, RTR_NO_TIMOUTMS,
                             &MsgStatusBlock);

if (status != RTR_STS_OK)

 

A client can have one or multiple channels, and a server can have one or multiple channels. A server can use concurrent servers, each with one channel. How you create your design depends on whether you have a single CPU or a multiple CPU machine, and on your overall design goals and implementation requirements.

Design of an RTR Client/Server Application

The design of an RTR client/server application should capitalize on the RTR multilayer model, separating client activity in the application from server and database activity. The RTR client software layer passes messages transparently to the router layer, and the router layer sends messages and transactions reliably and transparently, based on message content, to the appropriate server processes. Server application software typically interacts with database software to update and query the database, and responds back to the client.

Most RTR calls complete asynchronously. Subsequent completion events are returned as messages to the application. However, the following calls can complete synchronously:

The client/server environment has both plusses and minuses. Performing processing on the client that does not need to be handled by the server is a plus, as it enables the client to perform tasks that the server need have no knowledge of, and need expend no resources to support. And with RTR as the medium for moving transactions from client to server, the application need not be concerned in detail about how the transactions are sent, or what to do in the event of node or site failures. RTR takes care of all required journaling and recovery without direct intervention by the application. The application needs no code to deal with server and link failures.

However, in a client/server environment, application design becomes more complex because the designer must consider what tasks the clients are to perform, and what the servers must do. Typically the client application will capture information entered by the user, while the server interacts with the database to update the database with transactions passed to it by the router.

The RTR Journal

The RTR journal is always in use, recording transaction activity to persistent storage. The journal thus provides the capability of recovery from any single hardware failure. When a server no longer provides service, for example, when it goes offline, goes down, or is taken out of service temporarily, RTR continues to record transaction information in a journal. When the offline server returns to service, the appropriate journal entries are resent to the server and recorded in the server's database. Similarly, when transactional shadowing is used, transactions that must be recorded in two shadowed databases are kept in synchronization. With RTR Version 3.2, journaling on frontends is required to support nested transactions.

If transactions do not update the database, specify them as read-only by using the RTR_F_SEN_READONLY flag on the rtr_send_to_server call. This will minimize journaling activity in a shadowed environment. RTR journaling assumes that writes are atomic at the 512-byte block level.
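
As a minimal sketch (assuming a client channel already opened, and a query message in msg of length msglen), a read-only request could be sent as follows:

status = rtr_send_to_server(channel,
                            RTR_F_SEN_READONLY, /* no journaling needed */
                            &msg,
                            msglen,
                            RTR_NO_MSGFMT);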

RTR Messaging

With RTR, client/server messaging enables the application to send:

Transactional Messages

With RTR, client and server applications communicate by exchanging messages in a transaction dialog. Transactional messages are grouped in a unit of work called a transaction. RTR takes ownership of a message when called by the application.

A transaction is a group of logically connected messages exchanged in a transaction dialog. Each dialog forms a transaction in which all participants have the opportunity to accept or reject the transaction. A transaction either succeeds or fails. When the transaction is complete, all participants are informed of the transaction’s completion status. The transaction succeeds if all participants accept it, but fails if even one participant rejects it.

In the context of a transaction, an RTR client application sends one or more messages to the server application, which responds with zero or more replies to the client application. Client messages can be grouped to form a transaction. All work within a transaction is either fully completed or all work is undone. This ensures transaction integrity from client initiation to database commit.

For example, say you want to take $20 from your checking account and add it to your savings account. With an application using RTR you are assured that this entire transaction is completed; you will not be left at the point where you have taken $20 from your checking account but it has not yet been deposited in your savings account. This feature of RTR is transactional integrity, illustrated in Figure 34, Transactional Messaging.

Figure 34 Transactional Messaging

The transactional message is either all or nothing for each bracket (enclosed in brackets [ ] in the figure).
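
A minimal client-side sketch of such a grouping follows; the message names and layouts are illustrative, and error handling and the reply loop are omitted:

/* Both updates travel in one RTR transaction; either both are
   applied or neither is. */
rtr_send_to_server(channel, RTR_NO_FLAGS,
                   &debit_checking_msg, sizeof(debit_checking_msg),
                   RTR_NO_MSGFMT);
rtr_send_to_server(channel, RTR_NO_FLAGS,
                   &credit_savings_msg, sizeof(credit_savings_msg),
                   RTR_NO_MSGFMT);
/* ... receive the server replies ... */
rtr_accept_tx(channel, RTR_NO_FLAGS, RTR_NO_REASON);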

An RTR client application sends one or more messages to one or more server applications and receives zero or more responses from one or more server applications. For example:

RTR generates a unique number, the transaction ID or tid, for each transaction.

Figure 35, Transactional Messaging Interaction, illustrates frontend/backend interaction with pseudo-code for transactions and shows transaction brackets. The transaction brackets show the steps in completing all parts of a transaction, working from left to right and top to bottom. The transaction (txn) is initiated at "Start txn" at the frontend, and completed after the "Commit txn" step on the backend. The transaction ID is encoded to ensure its uniqueness throughout the entire distributed system. In the prepare phase on the server, the application should lock the relevant database (db) records. The commit of a transaction hardens the commit to the database. The rtr_start_tx message specifies the characteristics of the transaction.

Figure 35 Transactional Messaging Interaction

Figure 36, RTR API Calls for Transactional Messaging, shows the RTR API calls you use to achieve transactional messaging in your application.

Figure 36 RTR API Calls for Transactional Messaging

Broadcast Messages

Broadcast messaging lets client and server applications send non-transactional information to recipients. Recipients must be declared with the opposite role; that is, a client can send a broadcast message to servers, and a server can send a broadcast message to clients. Broadcasts are delivered on a "best try" basis; they are not guaranteed to reach every potential recipient. A broadcast message can be directed to a specific client or server, or be sent to all reachable recipients.

Use this point-to-point messaging with broadcast semantics, rather than transactions, when the information being sent is not recorded as a transaction in the database and when you need to send information to several clients (or servers) simultaneously. For example, in a stock trading environment, when a trade has been completed and the stock price has changed, the application can use a broadcast message to send the new stock price to all trading stations.

Location Transparency

With location transparency, applications do not need to be modified when the hardware configuration is altered, whether changes are made to systems running RTR services or to the network topology. Client and server applications do not know the location of one another so services can be started anywhere in the network. Actual configuration binding is a system management operation at run time.

Because RTR automatically takes care of failover, applications need not be concerned with specifying the location of server resources.

Data Content Routing with Partitions or Key Ranges

With RTR, applications can use data content routing to partition separate transactions to different database segments. An application achieves data content or data partition routing using key ranges.

Partitions or Key Ranges

RTR enables an application to route transactions by partition or key range, rather than sending all transactions processed by an application to a single database disk.

To plan for future expansion, consider using compound keys rather than single-field keys. For example, for a bank with multiple branches, an application that routes data to each branch can use a BankID key field or partition. Over time, the application may need to further subdivide transactions, not only by bank but also by customer ID. If the application is initially written with a compound key providing both a BankID and a CustomerID, such a change is simpler to make in the future.
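
For illustration only, the following sketch defines such a compound key using the rtr_keyseg_t structure shown in the XA examples later in this chapter; the field layout, range bounds, and flag value are assumptions, not a prescribed configuration:

rtr_keyseg_t server_key[2];
rtr_uns_32_t bank_low = 1, bank_high = 999;        /* BankID range (illustrative) */
rtr_uns_32_t cust_low = 0, cust_high = 0xFFFFFFFF; /* CustomerID range (illustrative) */

/* First key segment: BankID at the start of the message. */
server_key[0].ks_type = rtr_keyseg_unsigned;
server_key[0].ks_length = sizeof(rtr_uns_32_t);
server_key[0].ks_offset = 0;
server_key[0].ks_lo_bound = &bank_low;
server_key[0].ks_hi_bound = &bank_high;

/* Second key segment: CustomerID immediately after BankID. */
server_key[1].ks_type = rtr_keyseg_unsigned;
server_key[1].ks_length = sizeof(rtr_uns_32_t);
server_key[1].ks_offset = sizeof(rtr_uns_32_t);
server_key[1].ks_lo_bound = &cust_low;
server_key[1].ks_hi_bound = &cust_high;

status = rtr_open_channel(&server_channel, RTR_F_OPE_SERVER, fac_name,
                          NULL, RTR_NO_PEVTNUM, NULL,
                          2, server_key);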

Multithreading

An application can be multithreaded. Check the Reliable Transaction Router Release Notes and SPD for the current extent of support for multithreaded programming.

RTR Call Sequence

For a client, an application typically uses the following calls in this order:

 

For a server, an application typically uses the following calls in this order:

The rtr_receive_message call returns a message and a message status block (MsgStatusBlock). For example,

… status = rtr_receive_message(&Channel,
RTR_NO_FLAGS,
RTR_ANYCHAN,
MsgBuffer,
DataLen,
RTR_NO_TIMOUTMS,
&MsgStatusBlock);

The message status block contains the message type, the user handle, if any, the message length, the tid, and the event number, if the message type is rtr_mt_rtr_event or rtr_mt_user_event. For more information on the message status block, see the descriptions of rtr_receive_message and rtr_set_user_handle in the Reliable Transaction Router Application Programmer’s Reference Manual.

Transaction States

With the RTR SET TRANSACTION command and the rtr_set_info API call, users can change a transaction state in their application servers.

Consider the following scenario: upon receiving an mt_accepted message indicating that a transaction branch is committed, the server normally performs an SQL commit to write all changes to the underlying database. However, the server might crash before it can perform the SQL commit, or the underlying database might be temporarily unavailable.

In Version 3.2, users can change a transaction's state to EXCEPTION using rtr_set_info, temporarily placing the transaction in the exception queue. When the underlying problem is resolved, users can call rtr_set_info again to change the transaction state back to COMMIT or DONE.

The following example shows how a transaction state can be changed using rtr_set_info. Also refer to rtr_set_info documentation in the Reliable Transaction Router Application Programmer's Reference Manual for more details.

....

rtr_tid_t tid;
rtr_channel_t *pchannel;
rtr_qualifier_value_t select_qualifiers[4];
rtr_qualifier_value_t set_qualifiers[3];
int select_idx = 0;
int set_idx = 0;

....

rtr_get_tid(pchannel, RTR_F_TID_RTR, &tid);

 

Normally, at the RTR command level, users must provide at least the facility name and partition name qualifiers to help RTR select the desired set of transactions to be processed. Because a transaction's tid (of type rtr_tid_t) is unambiguously unique, a user need only specify the transaction's current state and its tid.

Note that if a transaction has multiple branches running simultaneously on different partitions on this node, RTR rejects the set-transaction request with an error. RTR SET TRANSACTION can change the state only for a transaction running in one partition.

 

/* transaction id, tid */
select_qualifiers[select_idx].qv_qualifier = rtr_txn_tid;
select_qualifiers[select_idx].qv_value = &tid;
select_idx++;

/* transaction branch current state */
select_qualifiers[select_idx].qv_qualifier = rtr_txn_state;
select_qualifiers[select_idx].qv_value = &rtr_txn_state_commit;
select_idx++;

/* transaction branch new state to be set */
set_qualifiers[set_idx].qv_qualifier = rtr_txn_state;
set_qualifiers[set_idx].qv_value = &rtr_txn_state_exception;
set_idx++;

/* Allocate channel memory */
pchannel = rdm_get_block(sizeof(rtr_channel_t) * 2);
pchannel[1] = RTR_CHAN_ENDLIST;

sts = rtr_set_info(pchannel,
                   (rtr_flags_t) 0,
                   (rtr_verb_t) verb_set,
                   rtr_transaction_object,
                   select_qualifiers,
                   set_qualifiers);
if (sts != RTR_STS_OK)
    /* write an error */ ;

sts = rtr_receive_message(
          /* channel  */ &channel_id,
          /* flags    */ RTR_NO_FLAGS,
          /* prcvchan */ pchannel,
          /* pmsg     */ msg,
          /* maxlen   */ RTR_MAX_MSGLEN,
          /* timoutms */ 0,
          /* pmsgsb   */ &msgsb);

if (sts == RTR_STS_OK) {
    const rtr_status_data_t *pstatus = (rtr_status_data_t *) msg;
    rtr_uns_32_t num;

    switch (pstatus->status)
    {
    case RTR_STS_SETTRANDONE:   /* Set Tran done successfully */
        memcpy(&num, (char *) msg + sizeof(rtr_status_data_t),
               sizeof(rtr_uns_32_t));
        printf(" %d transaction(s) have been processed\n", num);
        break;
    default:
        /* write an error */
        break;
    }
}
....

 

RTR Message Types

RTR calls and responses to them contain RTR message types (mt) such as rtr_mt_reply or rtr_mt_rejected. There are four groups of message types:

The following table lists all RTR message types.

Transactional           Status            Event-related       Informational

rtr_mt_msg1             rtr_mt_accepted   rtr_mt_user_event   rtr_mt_opened
rtr_mt_msg1_uncertain   rtr_mt_rejected   rtr_mt_rtr_event    rtr_mt_closed
rtr_mt_msgn             rtr_mt_prepare                        rtr_mt_request_info
rtr_mt_reply            rtr_mt_prepared
rtr_mt_rettosend


Applications should include code for all expected RTR return message types. Message types are returned to the application in the message status block. For more detail on message types, see the Reliable Transaction Router Application Programmer's Reference Manual.
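
As an illustrative skeleton only (the loop structure and the placeholder comments are assumptions; substitute your own processing at each case), a server might dispatch on the message types listed above as follows:

for (;;) {
    status = rtr_receive_message(&channel, RTR_NO_FLAGS, RTR_ANYCHAN,
                                 MsgBuffer, sizeof(MsgBuffer),
                                 RTR_NO_TIMOUTMS, &msgsb);
    if (status != RTR_STS_OK)
        break;

    switch (msgsb.msgtype) {
    case rtr_mt_msg1:
    case rtr_mt_msgn:           /* transactional work */
        break;
    case rtr_mt_msg1_uncertain: /* possible replay; see Atomicity Rules */
        break;
    case rtr_mt_prepare:        /* vote with rtr_accept_tx or rtr_reject_tx */
        break;
    case rtr_mt_accepted:       /* harden the commit */
        break;
    case rtr_mt_rejected:       /* roll back */
        break;
    case rtr_mt_opened:
    case rtr_mt_closed:         /* channel status */
        break;
    default:
        break;
    }
}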

Message Format Definitions

To work in a mixed-operating system environment, an application can specify a message format definition on the following calls:

RTR does data marshalling, evaluating and converting data appropriately as directed by the message format descriptions provided in the application.

The following example shows an RTR application using a message format declaration; first the RTR call specifying that TXN_MSG_FMT contains the actual format declaration, then the definition used by the call.

RTR application call:

status = rtr_send_to_server(
             p_channel,
             RTR_NO_FLAGS,
             &txn_msg,
             msg_size,
             TXN_MSG_FMT );

Data definition:

#define TXN_MSG_FMT "%1C%UL%7UL%31C"

This data structure accommodates an 8-bit signed character field (C) for the key, a 32-bit unsigned field (UL) for the server number, a 224-bit (7 x 32) field (7UL) for the transaction ID, and a 31-byte (248-bit) field (31C) for character text. For details of defining message formats for a mixed-endian environment, see the Reliable Transaction Router Application Programmer's Reference Manual.

Using XA

You use the XA protocol supported by RTR to communicate with database managers. This eliminates the need for an application program to process rtr_mt_msg1_uncertain messages.

XA Oracle Example

void main( int argc, char *argv[] )
{
server_key[0].ks_type = rtr_keyseg_unsigned;
server_key[0].ks_length = sizeof(rtr_uns_8_t);
server_key[0].ks_offset = 0;
server_key[0].ks_lo_bound = &low;
server_key[0].ks_hi_bound = &high;
server_key[1].ks_type = rtr_keyseg_rmname;  /* RM name */
server_key[1].ks_length = 0;                /* not applicable */
server_key[1].ks_offset = 0;
server_key[1].ks_lo_bound = rm_name;
server_key[1].ks_hi_bound = xa_open_string;


flag = RTR_F_OPE_SERVER |
       RTR_F_OPE_NOSTANDBY |
       RTR_F_OPE_XA_MANAGED |          /* XA flag */
       RTR_F_OPE_EXPLICIT_PREPARE |
       RTR_F_OPE_EXPLICIT_ACCEPT;

rtr_open_channel(&server_channel, flag, fac_name,
NULL, RTR_NO_PEVTNUM, NULL, 2, server_key);
...
rtr_receive_message(&server_channel, RTR_NO_FLAGS, RTR_ANYCHAN,
&receive_msg,sizeof(receive_msg), RTR_NO_TIMOUTMS, &msgsb);
...
msg = receive_msg.receive_data_msg;

switch(msgsb.msgtype)
{
case rtr_mt_msg1:
case rtr_mt_msgn:

switch(msg.txn_type)
{
case ...
EXEC SQL ...
}
...
rtr_reply_to_client(server_channel, RTR_NO_FLAGS,
&reply_msg, sizeof(reply_msg), RTR_NO_MSGFMT);
...
case rtr_mt_prepare:
...
rtr_accept_tx(s_chan,RTR_NO_FLAGS,RTR_NO_REASON);
...
case rtr_mt_accepted:
/* EXEC SQL COMMIT; Comment out SQL Commits */

case rtr_mt_rejected:
/* EXEC SQL ROLLBACK; Comment out SQL rollbacks */

/*
case rtr_mt_msg1_uncertain:
...
*/
...
}

} while(...)

EXEC SQL COMMIT WORK RELEASE;
exit(0);
}

Using XA with MS DTC

The XA software architecture of RTR provides interoperability with the Distributed Transaction Controller of Microsoft, MS DTC. Thus RTR users can develop application programs that update MS SQL databases, MS MQ, or other Microsoft resource managers under the control of a true distributed transaction. RTR as a distributed transaction manager communicates directly with MS DTC to manage a transaction or perform recovery using the XA protocol. For each standard XA call received from RTR, MS DTC translates it into a corresponding OLE transaction call that SQL Server or MS MQ expects to perform database updates. This is shown in Figure 40, MS DTC and RTR.

Figure 40 MS DTC and RTR

For example, using XA and DTC (Compaq Tru64 UNIX and Microsoft Windows NT only) eliminates the need to process uncertain messages (rtr_mt_msg1_uncertain). To use the XA protocol with RTR, you do the following:

Both the resource manager instance name and the database (RM) name in [OPEN-STRING] must be identical to those in the previously executed REGISTER RM command. The information is stored in the RTR key segment structure, and the RTR_F_OPE_XA_MANAGED flag associates the channel with the XA interface.

Only one transaction at a time is processed on an RTR channel; thus a server process or thread of control can only open one channel to handle a single XA request. Better throughput may be achieved by using a multithreaded application.

The preceding Oracle example and the following DTC example show the use of the RM key, the XA flag, and the commenting out of SQL commits and rollbacks.

XA DTC Example

The following XA/DTC server application example is for a Windows NT environment only.

void main( int argc, char *argv[] )
{
server_key[0].ks_type = rtr_keyseg_unsigned;
server_key[0].ks_length = sizeof(rtr_uns_8_t);
server_key[0].ks_offset = 0;
server_key[0].ks_lo_bound = &low;
server_key[0].ks_hi_bound = &high;
server_key[1].ks_type = rtr_keyseg_rmname; /* RM name */
server_key[1].ks_length = sizeof(String32)+sizeof(String128);
server_key[1].ks_offset = 0;
server_key[1].ks_lo_bound = rm_name;
server_key[1].ks_hi_bound = xa_open_string;

flag = RTR_F_OPE_SERVER |
RTR_F_OPE_XA_MANAGED | /* XA flag */
RTR_F_OPE_NOSTANDBY |
RTR_F_OPE_EXPLICIT_PREPARE |
RTR_F_OPE_EXPLICIT_ACCEPT;

/* Connect SQL server thru DB-Library */

dbinit();
login = dblogin();
DBSETLUSER(login, sql_username);
DBSETLPWD(login, sql_password);
dbproc = dbopen(login, sql_servername);
dbfreelogin(login);
dbuse(dbproc, sql_dbname);

rtr_open_channel(&server_channel, flag, fac_name,
NULL, RTR_NO_PEVTNUM, NULL,2, server_key);

...

rtr_receive_message(&server_channel, RTR_NO_FLAGS, RTR_ANYCHAN,
&receive_msg,sizeof(receive_msg), RTR_NO_TIMOUTMS, &msgsb);

...
do
{
rtr_receive_message(&server_channel, RTR_NO_FLAGS, RTR_ANYCHAN,
&receive_msg, sizeof(receive_msg), RTR_NO_TIMOUTMS, &msgsb);

...
msg = receive_msg.receive_data_msg;
switch(msgsb.msgtype)
{
case rtr_mt_msg1:
dbenlistxatrans(dbproc, RTR_TRUE);

/* Remove uncertain processing

case rtr_mt_msg1_uncertain:
...

*/
case rtr_mt_msgn:
switch(msg.txn_type)
{
case
...
dbfcmd(dbproc, "...");
dbsqlexec(dbproc);
while(1) {
dbresults(dbproc);
...
break;
}

...

rtr_reply_to_client(server_channel, RTR_NO_FLAGS,
&reply_msg, sizeof(reply_msg), RTR_NO_MSGFMT);

...

case rtr_mt_prepare:

...

rtr_accept_tx(s_chan,RTR_NO_FLAGS,RTR_NO_REASON);

...

case rtr_mt_accepted:
/* EXEC SQL COMMIT; Comment out SQL commits */
case rtr_mt_rejected:
/* EXEC SQL ROLLBACK; Comment out SQL rollbacks */

...

}
while(...);
exit(0);
}

Using DECdtm

You can use the DECdtm protocol to communicate with OpenVMS Rdb. This provides a two-phase commit capability. For additional information on using this protocol, see the OpenVMS documentation, for example, Managing DECdtm Services in the OpenVMS System Manager's Manual, and the Oracle Rdb Guide to Distributed Transactions available from Oracle.

Nested Transactions

An RTR transaction can be part of a transaction that is coordinated by a parent transaction manager such as RTR itself, Tuxedo, or MS DTC. To implement nested transactions, the application uses the following RTR API calls:

rtr_start_tx on the client, specifying the transaction to join; the jointxid is the tid of the parent transaction that includes the RTR transaction as one of its parts

rtr_open_channel using the RTR_F_OPE_FOREIGN_TM flag on the client

For additional details on the syntax and use of these calls, see the Reliable Transaction Router Application Programmer’s Reference Manual.

When using nested transactions, a journal is required on the frontend node if one is not already present.

RTR lets transactions be embedded within other transactions; such transactions are called nested transactions or subtransactions. A nested transaction is considered indivisible within its enclosing transaction, typically coordinated by a parent transaction manager.

A transaction that is not nested is called a top-level transaction. A nested transaction is a child of its parent (enclosing) transaction. A parent may have several children, who are siblings; ancestor and descendant relationships apply. A top-level transaction and its descendants are collectively known as a transaction family or a family.

A nested transaction must be strictly nested within its enclosing transaction; it must be completed (committed or aborted) before the enclosing transaction can complete. If the enclosing transaction aborts, all effects of the nested transaction are also undone.

A transaction can create several child transactions; the parent transaction performs no work until all child transactions are complete. A transaction cannot, however, observe the effects of a sibling transaction until that sibling completes.

Nested transactions isolate the effect of failures from the enclosing transaction and from other concurrent transactions. A nested transaction that has not completed can abort without causing its parent transaction to abort.

Committed nested transactions are durable (permanent) only with respect to certain other transactions: a committed child is permanent with respect to its parent, and is said to be committed with respect to its parent or its siblings. Every transaction is committed with respect to itself and its descendants. To abort a committed child, the parent transaction must be aborted; in general, to abort a committed nested transaction, all of its committed-with-respect-to transactions must be aborted.

 

RTR Transaction Processing

To pass transactions from client to server, RTR uses channels as identifiers. Each application communicates with RTR on a particular channel. In a multithreaded application, when multiple transactions are outstanding, the channel is the means through which the application informs RTR which transaction a command is for.

With RTR, the client or server application can:

To open a channel, the application uses the rtr_open_channel call. This opens a channel for communication with a client or server application on a specific facility. Each application process can open up to 255 channels.

For example, the rtr_open_channel call in this client application opens a single channel for the facility called DESIGN:

status = rtr_open_channel(&Channel,
    RTR_F_OPE_CLIENT,
    "DESIGN",          /* Facility name */
    client_name,
    rtrEvents,
    NULL,              /* Access key */
    RTR_NO_NUMSEG,
    RTR_NO_PKEYSEG     /* Key range */
    );

The application uses parameters on the rtr_open_channel call to define the application environment. Typically, the application defines the following:

For a server application, the rtr_open_channel call additionally supplies the number of key segments, numseg, and the partition name, in pkeyseg.

The syntax of the rtr_open_channel call is as follows:

status = rtr_open_channel (pchannel, flags, facnam, rcpnam, pevtnum, access, numseg, pkeyseg)

You can set up a variable section in your client program to define the required parameters and then set up your rtr_open_channel call to pass those parameters. For example, the variables definition would contain something like the following:

/*
** ---------------- Variables -------------------
*/
rtr_status_t Status;
rtr_channel_t Channel;
rtr_ope_flag_t Flags = RTR_F_OPE_CLIENT;
rtr_facnam_t Facility = "DESIGN";
rtr_rcpnam_t Recipient = RTR_NO_RCPNAM;
rtr_evtnum_t *Evtnum = RTR_NO_PEVTNUM;  /* no event subscription */
rtr_access_t Access = RTR_NO_ACCESS;

The rtr_open_channel call would contain:

Status = rtr_open_channel(&Channel,
    Flags,
    Facility,
    Recipient,
    Evtnum,
    Access,
    RTR_NO_NUMSEG,
    RTR_NO_PKEYSEG);
if (Status != RTR_STS_OK)
{
    /* Provide for error return */
}

You will find more complete samples of client and server code in the appendix of this document and on the RTR software kit in the [EXAMPLES] directory.

To specify the location to return the channel identifier, use the channel argument in the rtr_open_channel call. For example,

rtr_channel_t channel;

or

rtr_channel_t *p_channel = &channel;

This parameter points to a valid channel identifier when the application receives an rtr_mt_opened message.

To define the application role type (client or server), use the flags parameter. For example,

rtr_ope_flag_t
flags = RTR_F_OPE_CLIENT;

or

flags = RTR_F_OPE_SERVER;

The facility name is a required string supplied by the application. It identifies the RTR facility used by the application. The default facility name for the RTR CLI only is RTR$DEFAULT_FACILITY; there is no default facility name for an RTR application. You must supply one.

To define the facility name, use the facnam parameter. For example,

rtr_facnam_t
facnam = "DESIGN";

To specify a recipient name, use the rcpnam parameter, which is case sensitive. For example,
rtr_rcpnam_t
rcpnam = "* Rogers";

To specify user event numbers, use the evtnum parameter. For example,

rtr_evtnum_t all_user_events[] = {
RTR_EVTNUM_USERDEF,
RTR_EVTNUM_USERBASE,
RTR_EVTNUM_UP_TO,
RTR_EVTNUM_USERMAX,
RTR_EVTNUM_ENDLIST
};
There are both RTR events and user events. For more information on employing events, see Broadcast Messaging Processes, page *, and the section on RTR Events in the Reliable Transaction Router Application Programmer's Reference Manual.

You can use the facility access key to restrict client or server access to a facility. The key acts as a password to restrict access to the specific facility for which it is declared.

To define the facility access key, use the access parameter. For example,

rtr_access_t
access = "amalasuntha";

The facility access key is a string supplied by the application. The first server channel in an RTR facility defines the access key; all subsequent server and client open channel requests must specify the same access value. To use no access key, use RTR_NO_ACCESS or NULL for the access argument.

To specify the number of key segments defined for a server application, use the numseg parameter. For example,

rtr_numseg_t
numseg = 2;

To specify the key range for a partition to do data content routing, the server application defines the routing key when it opens a channel on a facility with the rtr_open_channel call. All servers in a facility must specify the same offset, length, and data type for the key segments in the rtr_open_channel call; only high and low bounds (*ks_lo_bound, *ks_hi_bound) can be unique to each server key segment. By application convention, the client places key data in the message at the offset, length, and data type defined by the server.
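
For example, following the pattern of the server examples earlier in this guide, and assuming the rtr_keyseg_t structure described in the reference manual, a single-segment routing key might be defined as in this sketch (low and high are application variables bounding this server's partition):

rtr_keyseg_t server_key[1];
rtr_uns_8_t  low  = 0;     /* lowest key value served by this server  */
rtr_uns_8_t  high = 99;    /* highest key value served by this server */

server_key[0].ks_type     = rtr_keyseg_unsigned;   /* data type          */
server_key[0].ks_length   = sizeof(rtr_uns_8_t);   /* key length         */
server_key[0].ks_offset   = 0;                     /* offset in message  */
server_key[0].ks_lo_bound = &low;
server_key[0].ks_hi_bound = &high;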

The channel-open operation completes asynchronously. Call completion status is returned in a subsequent message. RTR sends a message to the application indicating successful or unsuccessful completion; the application receives the status message using an rtr_receive_message call. If status is rtr_mt_opened, the open operation is successful. If status is rtr_mt_closed, the open operation is unsuccessful, and the application must examine the failure and respond accordingly. The channel is closed.

Data returned in the user buffer with rtr_mt_opened and rtr_mt_closed include both the status and a reason. For example:

case rtr_mt_opened:
printf(" Channel %d opened\n", channel);
status = RTR_STS_OK;
break;
case rtr_mt_closed:
p_status_data = (rtr_status_data_t *)txn_msg;
printf(" cannot open channel because %s\n",
rtr_error_text(p_status_data->status));
exit(-1);

Use the call rtr_error_text to find the meaning of returned status. A client channel will receive no message at all if a facility is configured but no server is available. Once a server becomes available, RTR sends the rtr_mt_accepted message.

To close a channel, the application uses the rtr_close_channel call. A channel can be closed at any time after it has been opened. Once closed, no further operations can be performed on a channel, and no further messages for the channel are received.

To receive on a channel and obtain status information from RTR, use the rtr_receive_message call. To receive on any open channel, use the RTR_ANYCHAN value for the p_rcvchan parameter in the rtr_receive_message call. To receive from a list of channels, use the p_rcvchan parameter as a pointer to a list of channels, ending the list with RTR_CHAN_ENDLIST, as in the sketch below. An application can receive on one or more opened channels; RTR returns the identifier of the channel on which the message arrived. A pointer to a channel is supplied on the rtr_open_channel call, and RTR returns the channel identification (ID) by filling in the contents of that pointer.
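
For example, a sketch of receiving on an explicit list of two channels; chan_a and chan_b are assumed to have been returned by earlier rtr_open_channel calls:

rtr_channel_t chan_list[3];

chan_list[0] = chan_a;             /* previously opened channels */
chan_list[1] = chan_b;
chan_list[2] = RTR_CHAN_ENDLIST;   /* terminates the list */

status = rtr_receive_message(
             chan_list,            /* receive on any channel in the list */
             RTR_NO_FLAGS,
             RTR_ANYCHAN,
             MsgBuffer,
             DataLen,
             RTR_NO_TIMOUTMS,
             &MsgStatusBlock);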

To simplify matching an RTR channel ID with an application thread, an application can associate a user handle with a channel. The handle is returned in the message status block with the rtr_receive_message call. The application can use the message status block (MsgStatusBlock) to identify the message type of a received message. For example,

rtr_receive_message (&channel, RTR_NO_FLAGS, RTR_ANYCHAN,
    txn_msg, maxlen, RTR_NO_TIMOUTMS, &MsgStatusBlock);
. . .
typedef struct {
rtr_msg_type_t msgtype;
rtr_usrhdl_t usrhdl;
rtr_msglen_t msglen;
rtr_tid_t tid;
rtr_evtnum_t evtnum;
} rtr_msgsb_t;

RTR delivers both RTR and application messages when the rtr_receive_message call completes. The application can use the message type indicator in the message status block to determine relevant buffer format. For further details on using message types and interpreting the contents of the user buffer, see the Reliable Transaction Router Application Programmer's Reference Manual.

Message Reception Styles

An application can specify one of three reception styles for the rtr_receive_message call. These are:

An application can use a blocking receive to wait until a message arrives. To use a blocking receive, include RTR_NO_TIMOUTMS in the rtr_receive_message call. The call completes only when a message is available on one of the specified channels. For example,

rtr_receive_message (&channel, RTR_NO_FLAGS, RTR_ANYCHAN,
    MsgBuffer, DataLen, RTR_NO_TIMOUTMS, &MsgStatusBlock);

An application can use a polled receive to poll RTR with a specified timeout. To use a polled receive, the application can set a value in milliseconds on the timoutms parameter.

The timeout can be:

The call completes after the specified timeout or when a message is available on one of the specified channels.

For example, the following declaration sets polling at 1 second (1000 milliseconds).

rtr_receive_message (&channel, RTR_NO_FLAGS, RTR_ANYCHAN,
    MsgBuffer, DataLen, 1000, &MsgStatusBlock);

Note: The rtr_receive_message timeout is not the same as a transaction timeout.

 

An application can use a signaled receive to be alerted by RTR when a message is received. The application establishes a signal handler using the rtr_set_wakeup call, informing RTR where to call it back when the message is ready.

To use a signaled receive, the application uses the rtr_set_wakeup call and provides the address of a routine to be called by RTR when a message is available. When the wakeup routine is called, the application can use the rtr_receive_message call to get the message. For example,

rtr_status_t
rtr_set_wakeup (
    procedure
);

void
wakeup_handler(void)
{
    rtr_receive_message( ... );
}

main()
{
    rtr_set_wakeup(wakeup_handler);
    sleep( ... );
}

Note: To disable wakeup, call rtr_set_wakeup with a null routine address.

 

When using rtr_set_wakeup in a multithreaded application, be careful not to call any non-re-entrant functions or tie up system resources unnecessarily inside the callback routine.

The rtr_open_channel parameters are further described in the Reliable Transaction Router Application Programmer's Reference Manual.

Starting a Transaction

There are two ways to start a transaction:

Use the rtr_start_tx call when the application must set a client-side transaction timeout to ensure that both client and server do not wait too long for a message. When a transaction is started with rtr_send_to_server, no timeout is specified.

For example:

rtr_start_tx(&Channel,
    RTR_NO_FLAGS,
    RTR_NO_TIMOUTMS,
    RTR_NO_JOINCHAN);   /* or NULL */

The rtr_send_to_server() call sends a message as part of a transaction from a client. If there is no transaction currently active on the channel, a new one is started. The transaction accept can be bundled with the last message. A client has two options for message delivery after a failure:

Use the RTR_F_SEN_RETURN_TO_SENDER flag to tell RTR to return the message with a message type of rtr_mt_rettosend if delivery fails. This lets a client determine which message failed in a multiple message stream.

Use the RTR_F_SEN_EXPENDABLE flag to tell RTR not to reject the transaction associated with the message if the message cannot be delivered. This lets other non-expendable messages be delivered without creating a dependency on the flagged message. RTR does not abort the transaction if delivery fails. To specify a read-only server operation for which neither shadowing nor journaling are used, use the RTR_F_SEN_READONLY flag.
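
For example, a minimal sketch of sending with one of these flags, following the rtr_send_to_server call pattern shown earlier (txn_msg is the application's message buffer):

status = rtr_send_to_server(
             p_channel,
             RTR_F_SEN_RETURN_TO_SENDER,  /* undeliverable messages come
                                             back as rtr_mt_rettosend */
             &txn_msg,
             sizeof(txn_msg),
             RTR_NO_MSGFMT);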

The rtr_reply_to_client() call sends a reply message from a server to the client. The reply message is part of the transaction initiated by the client. For example:

status = rtr_reply_to_client (&Channel,
RTR_NO_FLAGS,
MsgBuffer,
DataLen,
RTR_NO_MSGFMT);

The reply message format can be of any form as designed by the application. For example:

struct acct_inq_msg_t {
char reply_text[80];
} acct_reply_msg;

Identifying a Transaction

When an application receives a message with the rtr_receive_message() call, the message status block (MsgStatusBlock) contains the transaction identifier. For example:

status = rtr_receive_message (&Channel,
RTR_NO_FLAGS,
RTR_ANYCHAN,
MsgBuffer,
DataLen,
RTR_NO_TIMOUTMS,
&MsgStatusBlock);

The pointer &MsgStatusBlock points to the message status block that describes the received message. For example:

typedef struct {
    rtr_msg_type_t msgtype;
    rtr_usrhdl_t   usrhdl;
    rtr_msglen_t   msglen;
    rtr_tid_t      tid;   /* If a transactional message, the transaction ID (tid), msgsb.tid */
    rtr_evtnum_t   evtnum;
} rtr_msgsb_t;

Use the rtr_get_tid call to obtain the RTR transaction identifier (tid) for the current transaction. The tid is a unique number generated by RTR for each transaction. A client can use this call when it needs to know the tid in order to take some action before receiving a response.

Use the rtr_set_user_handle call to set a user handle on a transaction rather than on a channel. A client application with multiple transactions outstanding can match a reply or completion status with the appropriate transaction by establishing a new user handle each time a transaction is started.
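
For example, assuming the two-argument form (channel, handle) described in the reference manual, a client might tag each new transaction as in this sketch; txn_context is a hypothetical application value:

/* Associate an application context value with the channel so that
   later messages for this transaction can be matched up again
   through msgsb.usrhdl. */
status = rtr_set_user_handle(Channel, (rtr_usrhdl_t)txn_context);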

Committing a Transaction

A server application ends a transaction by accepting or rejecting it. A transaction is accepted explicitly with the rtr_accept_tx call, and rejected explicitly with the rtr_reject_tx call. RTR can reject a transaction at any time once the transaction is started but before it is committed. If RTR cannot deliver a transaction to its destination, it rejects the transaction explicitly and delivers the reject completion status to all participants.

A transaction participant can specify a reason for an accept or reject on the rtr_accept_tx and rtr_reject_tx call. If more than one transaction participant specifies a reason, RTR uses the OR operator to combine the reason values together. For example, with two servers A and B, each providing a reason code of 1 and 2, respectively, the client receives the result of the OR operation, reason code 3, in its message buffer.

Server A                    Server B
rtr_reason_t                rtr_reason_t
reason = 1;                 reason = 2;
rtr_reject_tx (             rtr_reject_tx (
    channel,                    channel,
    flags,                      flags,
    reason );                   reason );

typedef struct {
rtr_status_t status;
rtr_reason_t reason;
} rtr_status_data_t;

The client receives the results of the OR operation in its message buffer.

rtr_status_data_t
msgbuf;
msgbuf.reason = 3;

A transaction is done once a client or server application receives a completion message, either an rtr_mt_closed, rtr_mt_accepted, or rtr_mt_rejected message from RTR. An application no longer receives messages related to a transaction after receiving a completion message or if the application calls rtr_reject_tx. A client or server can also specify RTR_F_ACC_FORGET on the rtr_accept_tx call to signal its acceptance and end its involvement in a transaction early. RTR returns no more messages (including completion messages) associated with the transaction; any such messages received will be returned to the caller.

When issuing the rtr_accept_tx call with RTR_NO_FLAGS on the call, the caller expresses its request for successful completion of the transaction, and may give an accept "reason" that is passed on to all participants in the transaction. The accept is final: the caller cannot reject the transaction later. The caller cannot send any more messages for this transaction.

A client can accept a transaction in one of two ways: with the rtr_accept_tx call or by using the RTR_F_SEN_ACCEPT flag on the rtr_send_to_server call.

When the client sets RTR_F_SEN_ACCEPT on the rtr_send_to_server call, this removes the need to issue an rtr_accept_tx call and can help optimization of client traffic. Merging the data and accept messages in one call puts them in a single network packet. This can make better use of network resources and improve throughput.
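
For example, the final message of a transaction might be sent as in the following sketch, using the same call pattern as the earlier rtr_send_to_server example:

status = rtr_send_to_server(
             p_channel,
             RTR_F_SEN_ACCEPT,   /* bundle the accept with this last message */
             &txn_msg,
             sizeof(txn_msg),
             RTR_NO_MSGFMT);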

The rtr_reject_tx call rejects a transaction. Any participant in a transaction can call rtr_reject_tx. The reject is final; the caller cannot accept the transaction later. The caller can specify a reject "reason" that is passed to all accepting participants of the transaction. Once the transaction has been rejected, the caller receives no more messages for this transaction.

The server can set the retry flag RTR_F_REJ_RETRY to have RTR redeliver the transaction beginning with msg1 without aborting the transaction for other participants. Issuing an rtr_reject_tx call with this flag can let another transaction proceed if locks held by this transaction cause a database deadlock.
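
For example, on the server's channel:

/* Abort this presentation; RTR redelivers the transaction,
   beginning again with its first message. */
status = rtr_reject_tx(channel, RTR_F_REJ_RETRY, RTR_NO_REASON);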

Server-Side Transaction Timeouts

RTR provides client applications the option to specify a transaction timeout, but has no provision for server applications to specify a timeout on transaction duration. If a long-running transaction occupies a server and server application processes are scarce, all other client transactions remain queued. If these transactions have also specified timeouts, they are aborted by RTR (assuming that the timeout value is less than two minutes).

To get around this problem, the application designer has two options:

The first (and easier) option is to use concurrent server processes. This allows transaction requests to be serviced by free servers even if one server is occupied by a transaction that takes a long time to complete. The second option is to design the server application so that it can abort the transaction independently.

There are three cases where the use of concurrent servers is not ideal. First, there is an implicit assumption about how many such lingering transactions might remain on the system; in the worst case, this could equal or exceed the number of client processes, and having that many concurrent server processes to cater to this contingency is wasteful of system resources. Second, use of concurrent servers is beneficial only when the servers do not need to access a common resource; for instance, if all these servers needed to update the same record in the database, they would simply be waiting on a lock taken by the first server, and additional servers would not resolve this issue. Third, it must make business sense to have additional servers. For example, if transactions must be executed in the exact order in which they entered the system, concurrent servers may introduce sequencing problems.

Take the example of the order matcher in a stock trading application. Business rules may dictate that orders be matched on a first-come first-matched basis; using concurrent servers would negate this rule.

The second option is to let the server application process administer its own timeout and abort the transaction when it sees no activity on its input stream. RTR provides a relatively simple way to do this in the server: timeout values on the rtr_receive_message function let a server application specify how long it is prepared to wait for the next message. (Of course, the server should be prepared to wait forever for a new transaction or for the result of an already-voted transaction.)

One way to achieve this is to have a channel-specific global variable, say SERVER_INACTIVITY_TIMEOUT, set to the desired value in milliseconds (for example, a value of 5000 sets a 5-second timeout). This timeout value should be used only after receiving the first message of a transaction, and should be reset to RTR_NO_TIMOUTMS after receiving the rtr_mt_prepare message. Whenever rtr_receive_message completes with RTR_STS_TIMOUT, the server calls the rtr_reject_tx function on that channel to abort the partially processed transaction. This prevents transactions from occupying the server process beyond a reasonable time.
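
The following sketch shows one way to structure such a loop. SERVER_INACTIVITY_TIMEOUT and the message variables are illustrative names, not part of the RTR API:

rtr_timout_t timeout = RTR_NO_TIMOUTMS;  /* wait forever for new work */

for (;;)
{
    status = rtr_receive_message(&channel, RTR_NO_FLAGS, RTR_ANYCHAN,
                                 &msg, sizeof(msg), timeout, &msgsb);

    if (status == RTR_STS_TIMOUT)
    {
        /* No activity within the window: abort the partially
           processed transaction and go back to waiting forever. */
        rtr_reject_tx(channel, RTR_NO_FLAGS, RTR_NO_REASON);
        timeout = RTR_NO_TIMOUTMS;
        continue;
    }

    switch (msgsb.msgtype)
    {
    case rtr_mt_msg1:
        timeout = SERVER_INACTIVITY_TIMEOUT;  /* e.g., 5000 for 5 seconds */
        /* ... process the first message ... */
        break;

    case rtr_mt_prepare:
        timeout = RTR_NO_TIMOUTMS;  /* voted; wait for the outcome */
        /* ... vote with rtr_accept_tx or rtr_reject_tx ... */
        break;

    /* ... other message types ... */
    }
}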

Two-Phase Commit

RTR uses the two-phase commit process for committing a transaction, with a prepare phase and a commit phase. Transactions must reach the commit phase before they are hardened in the database.

Transactions are prepared before being committed by accept processing. Transaction preparation consists of several steps represented by transaction states as seen by an RTR backend:

Phase      State       Meaning
Phase 0    WAITING     Waiting for a server to become free
           RECEIVING   Processing client messages
Phase 1    VREQ        Vote of server requested
           VOTED       Server has voted and awaits final transaction status
Phase 2    COMMIT      Final status of a committed transaction delivered to server
           ABORT       Final status of an aborted transaction delivered to server

The RTR frontend sees several transaction states during accept processing:

State      Meaning
SENDING    Processing, not ready to accept.
VOTING     Accept processing in process; the rtr_accept_tx call has been
           issued, but the transaction has not been acknowledged.
DONE       Transaction is complete, either accepted or rejected.

To initiate the prepare phase, the application specifies the RTR_F_OPE_EXPLICIT_PREPARE flag when opening the channel, and can use the message rtr_mt_prepare to check commit status. The message indicates to the server application that it is time to prepare any updates for a later commit or rollback operation. RTR lets the server application explicitly accept a transaction using the RTR_F_OPE_EXPLICIT_ACCEPT flag on the rtr_open_channel call. Alternatively, RTR implicitly accepts the transaction after the rtr_mt_accepted message is received when the server issues its next rtr_receive_message call.

The two-phase commit process is initiated by the client application when it issues a call to RTR indicating that the client "accepts" the transaction. This does not mean that the transaction is fully accepted, only that the client is prepared to accept it. RTR then asks the server applications participating in the transaction whether they are prepared to accept the transaction. A server application that is prepared to accept the transaction votes its intention by issuing the rtr_accept_tx call, an "accept" vote. A server application that is not prepared to accept the transaction issues the rtr_reject_tx call, a "not accept" vote. Issuing all votes concludes the prepare phase.

When RTR has collected all votes from all participating server applications, it determines if the transaction is to be committed. If all collected votes are "accept," the transaction is committed; RTR informs all participating channels. If any vote is "not accept," the transaction is not committed. A server application can expose the prepare phase of two-phase commit by using the rtr_mt_prepare message type with the RTR_F_OPE_EXPLICIT_PREPARE flag. If the application’s rtr_open_channel call sets neither RTR_F_OPE_EXPLICIT_ACCEPT nor RTR_F_OPE_EXPLICIT_PREPARE flags, then both prepare and accept processing are implicit.

The server application can participate in the two-phase commit process fully, somewhat, a little, or not at all. To participate fully, the server does an explicit prepare and an explicit accept of the transaction; to participate somewhat, the server does an explicit prepare and an implicit accept; to participate a little, the server does an explicit accept only; to participate not at all, the server does an implicit accept. The following table summarizes the levels of server participation:

Commit Phase       Full    Somewhat    Little    Not at all
Explicit prepare   yes     yes
Explicit accept    yes                 yes
Implicit accept            yes                   yes


Your application can use the level of participation that makes the most sense for your business and operations needs.

To request an explicit accept and explicit prepare of transactions, the server channel is opened with the RTR_F_OPE_EXPLICIT_PREPARE and RTR_F_OPE_EXPLICIT_ACCEPT flags. These specify that the channel will receive both prepare and accept messages. The server then explicitly accepts or rejects a transaction when it receives the prepare message. The transaction sequence for an explicit prepare and explicit accept is as follows:

Client                   RTR                    Server
rtr_start_tx
rtr_send_to_server
                   -->   rtr_mt_msg1      -->   rtr_receive_message
rtr_accept_tx
                   -->   rtr_mt_prepare   -->   rtr_receive_message
                                          <--   rtr_accept_tx
rtr_receive_message
                   <--   rtr_mt_accepted  <--   rtr_receive_message

With explicit transaction handling, the following steps occur:

The server application waits for a message from the client application.

The server application receives the rtr_mt_prepare request message from RTR.

The server application issues the accept or reject.

A participant can reject the transaction at any time up to the point where it issues its rtr_accept_tx call; for a server, this is when it responds to the rtr_mt_prepare message. Once the rtr_accept_tx call has been executed, the result cannot be changed.

The sequence for an implicit prepare and explicit accept is as follows:

Client                   RTR                    Server
rtr_start_tx
rtr_send_to_server
                   -->   rtr_mt_msg1      -->   rtr_receive_message
rtr_accept_tx      -->                    <--   rtr_accept_tx
rtr_receive_message
                   <--   rtr_mt_accepted  <--   rtr_receive_message

In certain database applications, where the database manager does not let an application explicitly prepare the database, transactions can simply be accepted or rejected. Server applications that do not specify the RTR_F_OPE_EXPLICIT_ACCEPT flag in their rtr_open_channel call implicitly accept the in-progress transaction when an rtr_receive_message call is issued after the last message has been received for the transaction. This call returns the final status for the transaction, rtr_mt_accepted or rtr_mt_rejected. If neither the RTR_F_OPE_EXPLICIT_ACCEPT nor the RTR_F_OPE_EXPLICIT_PREPARE flag is set in the rtr_open_channel call, then both prepare and accept processing are implicit.

For server optimization, the server can signal its acceptance of a transaction with rtr_reply_to_client, using the RTR_F_REP_ACCEPT flag; likewise, the client can do so with the rtr_send_to_server call, using the RTR_F_SEN_ACCEPT flag. This helps to minimize network traffic for transactions by increasing the likelihood that the data message and the RTR accept message will be sent in the same network packet.
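
For example, a server's final reply might be sent as in the following sketch, mirroring the rtr_reply_to_client call shown earlier:

status = rtr_reply_to_client(
             server_channel,
             RTR_F_REP_ACCEPT,   /* bundle the reply with the accept vote */
             &reply_msg,
             sizeof(reply_msg),
             RTR_NO_MSGFMT);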

Transaction Recovery

When a transaction fails in progress, RTR provides recovery support using RTR replay technology. RTR, as a distributed transaction manager, communicates with a database resource manager in directing the two-phase commit process. When the XA protocol is used, the application does not need to process rtr_mt_uncertain messages (see the section Using XA, page *, for more details).

The typical server application transaction sequence for committing a transaction to the database is as follows:

rtr_receive_message (rtr_mt_msg1)
SQL update
rtr_accept_tx
rtr_receive_message (rtr_mt_accepted)
SQL commit
rtr_receive_message [wait for next transaction]

This sequence is also illustrated in Figure 26, CSN Vote Window.

A failure can occur at any step in this sequence; the impact of a failure depends on when (at which step) it occurs, and on the server configuration.

If a failure occurs before the rtr_accept_tx call is issued, RTR causes the following to occur:

If a failure occurs after the rtr_accept_tx call is issued but before the rtr_receive_message, the transaction is replayed. When the server is restarted, the type of the first message is rtr_mt_msg1_uncertain. Servers should check to see if the transaction has already been executed in a previous presentation. If not, it is safe to re-execute the transaction, because the database operation never occurred. After the failure, the following occurs:

If a failure occurs after the SQL commit but before receipt of a message starting the next transaction, RTR cannot tell whether the commit took place.

If a failure occurs after an rtr_receive_message call is made to begin a new transaction, RTR assumes a successful commit: once a server calls rtr_receive_message after receiving the rtr_mt_accepted message, RTR forgets the transaction. There is no replay following these events.
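
Where the application (rather than XA) handles uncertainty, a server's message loop might treat the uncertain replay as in the following sketch. txn_already_done is a hypothetical application routine that consults a transaction log keyed on the tid:

case rtr_mt_msg1_uncertain:
    /* Replay after a failure: check whether this transaction was
       already executed and committed in a previous presentation. */
    if (txn_already_done(&msgsb.tid))
    {
        /* Already committed: skip the database update and proceed
           directly to the reply and vote. */
    }
    else
    {
        /* Safe to re-execute: the database operation never occurred. */
        /* ... perform the update as for rtr_mt_msg1 ... */
    }
    break;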

RTR keeps track of how many times a transaction is presented to a server application before it is VOTED. The rule is: three strikes and you’re out! After the third strike, RTR rejects the transaction with the reason RTR_STS_SRVDIED. Even though the server application may have committed the transaction, and the client may believe that the transaction is committed, the transaction is flagged as an exception and the database is not committed. Such an exception transaction can be manually committed if necessary. This process eliminates the possibility that a single rogue transaction can crash all available copies of a server application at both primary and secondary sites.

Application design can change this behavior. The application can specify the retry count to use during recovery with the /recovery_retry_count qualifier in the rtr_set_info call, or the system administrator can set the retry count from the RTR CLI with the SET PARTITION command. If no recovery retry count is specified, RTR retries replay three times; for recovery, retries are infinite. For more information on the SET PARTITION command, see the Reliable Transaction Router System Manager's Manual; for more information on the rtr_set_info call, see the Reliable Transaction Router Application Programmer's Reference Manual.

When a node is down, the operator can select a different backend to use for the server restart. To complete any incomplete transactions, RTR searches the journals of all backend nodes of a facility for any transactions for the key range specified in the server’s rtr_open_channel call.

Broadcast Messaging Processes

A client or server application may need to send unsolicited messages to one or more participants. Applications tell RTR which broadcast classes they want to receive.

The sender sends one message received by several recipients. Recipients subscribe to a specific type of message. Delivery is not guaranteed.

Broadcast messages can be:

Client channels cannot broadcast to other client channels, and server channels cannot broadcast to other server channels. To enable communication between two applications of the same type, open a second channel of the other type. Messaging destination names can include wildcards, enabling flexible definition of the subset of recipients for a particular broadcast.

Use the rtr_broadcast_event call to broadcast a user event message.

Broadcast types include user events and RTR events; both are numbered.

User Events

Event numbers are provided as a list beginning with RTR_EVTNUM_USERDEF and ending with RTR_EVTNUM_ENDLIST. To subscribe to all user events, an application can use the range indicators RTR_EVTNUM_USERBASE and RTR_EVTNUM_USERMAX, separated by RTR_EVTNUM_UP_TO, to specify all possible user event numbers.

A user broadcast is named or unnamed. An unnamed broadcast does a match on user event number; the event number completely describes the event. A named broadcast does a match on user event number and recipient name. The recipient name is a user-defined string. Named broadcasts provide greater control over who receives a particular broadcast.

Named events specify an event number and a textual recipient name. The name can include wildcards (% and *).

For all unnamed events, specify the evtnum field and RTR_NO_RCPSPC as the recipient name.

For example, the following snippet specifies named events for all subscribers:

rtr_status_t
rtr_open_channel (
...
rtr_rcpnam_t rcpnam = "*";
rtr_evtnum_t evtnum[] = {
    RTR_EVTNUM_USERDEF,
    RTR_EVTNUM_USERBASE,
    RTR_EVTNUM_UP_TO,
    RTR_EVTNUM_USERMAX,
    RTR_EVTNUM_ENDLIST
};
rtr_evtnum_t *p_evtnum = evtnum;

For a broadcast to be received by an application, the recipient name specified by the subscribing application on its rtr_open_channel call must match the recipient specifier used by the broadcast sender on the rtr_broadcast_event call.

Note: RTR_NO_RCPSPC is not the same as "*".
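
On the sending side, a broadcast might be issued as in the following sketch; the event number, the bcast_msg payload, and the recipient specifier "TICKER.*" are illustrative, and the argument list follows the form given in the reference manual:

rtr_evtnum_t evtnum = RTR_EVTNUM_USERBASE + 1;   /* application-defined event */

status = rtr_broadcast_event(
             channel,
             RTR_NO_FLAGS,
             &bcast_msg,          /* application broadcast payload      */
             sizeof(bcast_msg),
             evtnum,
             "TICKER.*",          /* recipient name; wildcards allowed  */
             RTR_NO_MSGFMT);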

An application receives broadcasts with the rtr_receive_message call. A message type returned in the message status block informs the application of the type of broadcast received. For example:

rtr_receive_message( ... , pmsg, maxlen, ... , pmsgsb);

A user event is indicated by msgsb.msgtype == rtr_mt_user_event. User broadcasts can also contain a broadcast message. This message is returned in the message buffer provided by the application. The size of the user’s buffer is determined by the maxlen field. The number of bytes actually received is returned by RTR in the msglen field of the message status block.

RTR Events

RTR delivers status information to which client and server applications can subscribe. Status information is delivered as messages, where the type of each message is an RTR event.

RTR events are numbered. The base value for RTR events is defined by the symbol RTR_EVTNUM_RTRBASE; its maximum value is defined by the symbol RTR_EVTNUM_RTRMAX. RTR events and event numbers are listed in the Reliable Transaction Router Application Programmer's Reference Manual and in the RTR header file rtr.h.

An application can subscribe to RTR events to receive notification of external events that are of interest to the application. For example, a shadow server may need to know if it is a primary or a secondary server to perform certain work, such as uploading information to a central database, that is done at only one site. An application subscribes to an RTR event with the rtr_open_channel call. For example:

rtr_status_t
rtr_open_channel (
...
rtr_rcpnam_t rcpnam = RTR_NO_RCPNAM;
rtr_evtnum_t evtnum[] = {
    RTR_EVTNUM_RTRDEF,
    RTR_EVTNUM_SRPRIMARY,
    RTR_EVTNUM_ENDLIST
};
rtr_evtnum_t *p_evtnum = evtnum;

To subscribe to all RTR events, use the range indicators RTR_EVTNUM_RTRBASE and RTR_EVTNUM_RTRMAX.

RTR events are delivered as messages of type rtr_mt_rtr_event. You read the message type to determine what RTR has delivered. For example:

rtr_status_t
rtr_receive_message (
...
    rtr_msgsb_t *p_msgsb
);

Use a data structure of the following form to receive the message:

typedef struct {
rtr_msg_type_t msgtype;
rtr_usrhdl_t usrhdl;
rtr_msglen_t msglen;
rtr_tid_t tid;
rtr_evtnum_t evtnum; /*Event Number*/
} rtr_msgsb_t;

The event number is returned in the message status block in the evtnum field. The following RTR events return key range data back to the client application:

These data are appended to the end of the rtr_msgsb_t structure; the size of the appended data is msglen - sizeof(rtr_msgsb_t). Other events do not have additional data.

In application design, you may wish to consider creating separate facilities for sending broadcasts. By separating broadcast notification from transactional traffic, performance improvements can be substantial. Facilities can further be reconfigured to place the RTR routers strategically to minimize wide area traffic.

Handling Error Conditions

RTR can provide information to an application with the rtr_error_text call. The rtr_error_text call translates RTR internal error message values to informational text that is more meaningful to the reader.

If an application encounters an error, it should log the error message received. Error messages are more fully described in rtr.h, where each error code is explained.

For example, the following short program uses the standard C library output function to output the text of an error status code.

Program "prog":

#include "rtr.h"    /* or #include <rtr.h> */

main()
{
    printf("%s\n", rtr_error_text(RTR_STS_NOLICENSE));
}

When this program is run, it produces the following output:

$run prog
No license installed

The several hundred error or status codes reside in the rtr.h header file; status codes can come from any RTR subsystem. A few codes that an application is likely to encounter are described in the following table.

Status Code          Meaning
RTR_STS_COMSTAUNO    Commitment status unobtainable. The fate of the
                     transaction currently being committed is unobtainable;
                     this may be due to a hardware failure.
RTR_STS_DLKTXRES     The transaction being processed was aborted due to
                     deadlock with other transactions using the same
                     servers. RTR will replay the transaction after the
                     deadlock has been resolved and cleared.
RTR_STS_FELINLOS     Frontend link lost, probably due to a network failure.
RTR_STS_INVFLAGS     Invalid flags.
RTR_STS_NODSTFND     No destination found; no server had declared itself
                     to handle the key value specified in the sent message.
                     Probably a server down or disconnected.
RTR_STS_OK           All is well; normal successful completion.
RTR_STS_REPLYDIFF    Two servers responded with different information
                     during a replay; transaction aborted.
RTR_STS_TIMOUT       Timeout expired; transaction aborted.
RTR_STS_SRVDIED      A server image probably exited, for example because
                     a node is down.
RTR_STS_SRVDIEDVOT   A server exited before committing a transaction.
RTR_STS_SRVDIEDCOM   A server exited after being told to commit a
                     transaction.

 

RTR can abort a transaction at any time, so the application must be prepared to deal with such aborted transactions. Server applications are expected to roll back transactions as the need arises, and must be built to take the correct action, and subsequently carry on to deal with new transactions that are received.

A client application can also get a reject and must also be built to deal with the likely cases it will encounter. The application must be built to decide on the correct course of action in the event of a transaction abort.

Authentication Using Callout Servers

RTR callout servers enable security checks to be carried out on all requests using a facility. Callout servers can run on backend or router nodes. They receive a copy of every transaction delivered to or passing through the node, and they vote on every transaction. To enable a callout server, use the /CALLOUT qualifier when issuing the RTR CREATE FACILITY command. Callout servers are facility based, not key-range or partition based.

An application enables a callout server by setting a flag on the rtr_open_channel call.

For a router callout server, the application sets the following flag on the rtr_open_channel call:

rtr_ope_flag_t
flags = RTR_F_OPE_SERVER | RTR_F_OPE_TR_CALL_OUT;

For a backend callout server, the application sets the following flag on the rtr_open_channel call:

rtr_ope_flag_t
flags = RTR_F_OPE_SERVER | RTR_F_OPE_BE_CALL_OUT;

Distributed Deadlock Considerations

A deadlock or deadly embrace can occur when transactions lock data items in a database. The typical scenario is with two transactions txn1 and txn2 executing concurrently, with the following sequence of events:

  1. txn1 write-locks data item A
  2. txn2 write-locks data item B
  3. txn1 requests a lock on data item B but must wait because txn2 still has a lock on data item B
  4. txn2 requests a lock on data item A but must wait because txn1 still has a lock on data item A

Neither txn1 nor txn2 can proceed; they are in a deadly embrace. Figure 36 illustrates a deadly embrace.

Figure 36 Deadly Embrace

With RTR, to avoid such deadlocks, follow these guidelines:

  1. Always engage servers in the same order, and wait for the reply before each send.
  2. Provide several concurrent servers to minimize contention. Estimate the number of concurrent servers needed by determining the volume of transactions the servers must support, considering periods of maximum activity, and allowing for growth. The larger the volume on your servers, the more likely it is that your application will benefit from using concurrent servers.

RTR attempts to resolve deadlocks by aborting one deadlocked transaction path with error code RTR_STS_DLKTXRES and replaying the transaction. Other paths are not affected. Server applications should be written to handle this status appropriately.

Parallelism

One method for improving response time is to send multiple messages from clients without waiting for a reply. The messages can be sent to different partitions to provide parallel processing of transactions.

ODBC Applications

For a standard GUI development tool that has no knowledge of RTR calls, RTR becomes the transport for ODBC calls, transparently to the application. The backend must be running OpenVMS with Oracle7.

Replication

For certain read-only applications, RTR can be used without shadowing to establish alternate sites that protect against site failure. The method is to define multiple non-overlapping facilities with the same facility name across a network. In each facility, define a failover list of routers; a client is then automatically reconnected to an alternate site when the local router fails.

Another method is to define a partition on a standby server for read-only transactions. This minimizes network traffic to the standby.

A read-only partition on a standby server can reduce node-to-node transaction locking.

Idempotency Issues

Generally, databases and applications built to work with them are required to be idempotent. That is, given a specific state of the database, the same transaction applied many times would always produce the same result. Because RTR relies on replays and rollbacks, if there is a server failure before a transaction is committed, then RTR assumes the database will automatically roll back to the previous state, and the replayed transaction will produce results identical to the previous presentation of the transaction. RTR assumes that the database manager and server application provide idempotency.

For example, consider an internet transaction where you log into your bank account and transfer money from one account to another, perhaps from savings to checking. If you interrupt the transfer, and replay it two hours later, the transfer may not succeed because it would be required to have been done within a certain time interval after the login. Such a transaction is not idempotent.

Partition Locks

Under some circumstances, client applications need to be blocked from accessing a server. For example, multiple client applications may receive a common feed, but the transactional update to the database should be done by only one. The application can establish a partition lock or "deadman" using a second key range that only one client can access at a time. With the deadman, if two clients attempt to start a transaction on their shared key range, only one is started. The first client can then use the data channel and begin processing the feed, updating the database. The second client takes over only if the first client’s deadman is terminated.

Designing for a Heterogeneous Environment

In a heterogeneous environment, you can use RTR with several hardware architectures, both little endian and big endian. RTR provides the capability to do data marshalling in your application so that you can take advantage of such a mixed environment.

If you are constructing an application for a heterogeneous environment:

The Multivendor Environment

With RTR, applications can run on systems from more than one vendor. You can mix operating systems with RTR, and all supported operating systems and hardware architectures can interoperate in the RTR environment. For example, you can have some nodes in your RTR configuration running OpenVMS and others running Windows NT.

To develop your applications in a multivendor environment:

Develop your application on one system, for example, on Windows NT using Microsoft Visual C++, following strict ANSI C implementation.

When both the server and client code are debugged, move them to the other system.

Build and debug the application on the other system, that is, the system that is not running Windows NT.

 

RTR V2 to V3 Considerations

If you have an existing application written using RTR Version 2 with OpenVMS, it will still operate with RTR Version 3. See the Reliable Transaction Router Migration Guide for pointers on using RTR Version 2 applications with RTR Version 3, and moving RTR Version 2 applications to RTR Version 3.

Compiling and Linking your Application

All client and server application programs must be written in C, C++, or another language that can use RTR API calls. Include the RTR data types and error messages file rtr.h in your compilation so that it is appropriately referenced by your application. For each client and server application, the compile and link process is as follows:

  1. Write your application code using RTR calls.
  2. Use RTR data and status types for cross-platform interoperability.
  3. Compile your application code, including rtr.h using ANSI C include rules. For example, if rtr.h is in the same directory as your C code, use the following statement: #include "rtr.h".
  4. Link your object code with the RTR library to produce your application executable.

This process is illustrated in Figure 38, RTR Compile Sequence.

 

Figure 38 RTR Compile Sequence

 

Appendices

Appendix A: RTR Design Examples

To provide information for the design of new applications, this section contains scenarios or descriptions of existing applications that use RTR for a variety of reasons. They include:

A transportation example. This example shows a nation-wide use of partitioned, distributed databases and surrogate clients.

A stock exchange example. This example shows use of reliable broadcasts, database partitioning, standby and concurrent servers.

A banking example. This example shows use of application multithreading and an FDDI cluster.

Customer names are not used, but these designs reflect successfully implemented, working applications.

A Transportation Example

Brief History

In the 1980’s, a large railway system implemented a monolithic application in FORTRAN for local reservations with local databases separated into five administrative domains or regions: Site A, Site B, Site C, Site D, and Site E. By policy, rail travel for each region was controlled at the central site for each region, and each central site owned all trains leaving from that site. For example, all trains leaving from Site B were owned by Site B. The railway system supported reservations for about 1000 trains.

One result of this architecture was that for a passenger to book a round-trip journey, from, say, Site B to Site A and return, the passenger had to stand in two lines, one to book the journey from Site B to Site A, and the second to book the journey from Site A to Site B.

The implementation was on a DIGITAL OpenVMS cluster at each site, with a database engine built on RMS, using flat files. The application displayed a form for filling out the relevant journey and passenger information: train, date, route, class, and passenger name, age, sex, and concessions. The structure of the database was the same for each site, though the content was different. RTR was not used. Additionally, the architecture was not scalable; it was not possible to add more terminals for client access or add more nodes to the existing clusters without suffering performance degradation.

New Implementation

 

New requirements from the railway system for a national passenger reservations system included the goal that a journey could be booked for any train from anywhere to anywhere within the system. Meeting this goal would also enable distributed processing and networking among all five independent sites. In addition to this new functionality, the new architecture had to be more scalable and adaptable for PCs to replace the current terminals, part of the long-term plan. With these objectives, the development team rewrote their application in C, revamped their database structure, adopted RTR as their underlying middleware, and was able to improve their overall application significantly. The application became scalable, and additional features could be introduced. Key objectives of the new design were improved performance, high reliability in a moderately unstable network, and high availability, even during network link loss.

The structure of the database at all sites was the same, but the data were for each local region only. The database was partitioned by trainID (which included departure time), date, and class of service. RTR data content routing was used to route a reservation to the correct domain, and to bind reservation transactions as complete transactions across the distributed sites to ensure booking without conflicts. This neatly avoided booking two passengers in the same seat, for example. Performance was not compromised, and data partitioning provided efficiency in database access, enabling the application to scale horizontally as load increased. This system currently deals with approximately 3,000,000 transactions per day. One passenger reservation represents a single business transaction, but may be multiple RTR transactions. An inquiry is a single transaction.

An important new feature was the use of surrogate clients at each local site that act as clients of the remote sites using a store and forward mechanism. The implementation of these surrogate clients made it possible to book round-trip tickets to any of the regions from a single terminal. This design addressed the problem of frequent RTR quorum negotiations caused by network link drops, and ensured that these would not affect local transactions.

The local facility defined in one location (say, Site B) includes a gateway server acting as a surrogate client that communicates with the reservation server at the remote site (say, Site C). For example, to make a round trip reservation in a single client request from Site B to Site C and return, the reservation agent enters the request with passenger ID, destination, and date. For the Site B to Site C leg, the destination is Site C, and for the Site C to Site B leg, the destination is Site B. This information is entered only at Site B. The reservation transaction is made for the Site-B-to-Site-C leg locally, and the transaction for the return trip goes first to the surrogate client for Site C.

The surrogate forwards the transaction to the real Site C server that makes the reservation in the local Site C database. The response for the successful transaction is then sent back to the surrogate client at Site B, which passes the confirmation back to the real client, completing the reservation. There are extensive recovery mechanisms at the surrogate client for transaction binding and transaction integrity. When transaction recovery fails, a locally developed store-and-forward mechanism ensures smooth functioning at each site. The system configuration is illustrated in Figure 38, Transportation Example Configuration. For clarity, only three sites are shown, with a single set of connections. All other connections are in use, but not shown in the figure. Local connections are shown with single-headed arrows, though all are duplex; connections to other sites by network links are shown with double-headed arrows. Connections to the local databases are shown with solid lines. Reservations agents connect to frontends.
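The sketch below illustrates the surrogate-client pattern under stated assumptions: the gateway process holds a server channel on the local facility and a client channel on a facility that reaches the remote site, and reservation_msg_t is the illustrative layout sketched earlier. Error handling, transaction recovery, and the store-and-forward queue are omitted.

#include "rtr.h"

/* Forward the return-leg booking to the real remote server and relay the
 * outcome to the local client. local_chan is this process's server channel
 * (say, at Site B); remote_chan is its client channel to Site C. */
void forward_return_leg( rtr_channel_t local_chan,
                         rtr_channel_t remote_chan,
                         reservation_msg_t *req )
{
    reservation_msg_t reply;
    rtr_msgsb_t msgsb;

    /* Send the booking request to the remote reservation server. */
    rtr_send_to_server( remote_chan, RTR_NO_FLAGS,
                        req, sizeof(*req), RTR_NO_MSGFMT );

    /* Wait for the remote server's reply or an rtr_mt_rejected. */
    rtr_receive_message( &remote_chan, RTR_NO_FLAGS, RTR_ANYCHAN,
                         &reply, sizeof(reply), RTR_NO_TIMOUTMS, &msgsb );

    if ( msgsb.msgtype == rtr_mt_reply )
    {
        /* Accept the remote transaction, then pass the confirmation back
         * to the real client on the local facility. */
        rtr_accept_tx( remote_chan, RTR_NO_FLAGS, RTR_NO_REASON );
        rtr_reply_to_client( local_chan, RTR_NO_FLAGS,
                             &reply, sizeof(reply), RTR_NO_MSGFMT );
    }
    else
    {
        /* On rejection, the request would be queued in the locally
         * developed store-and-forward mechanism for later retry. */
    }
}

Because the local and remote legs are separate RTR transactions (see below), the gateway's recovery logic must compensate if one leg commits and the other does not.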

 

 

 

Figure 38 Transportation Example Configuration

Currently the two transactions (the local and the remote) are not related to each other; because RTR does not know that the second transaction is a child of the first, the application must make compensations in case of failure. (In RTR V3.2, nested transactions can be used to specify this relationship.) These compensations ensure that reservations are booked without conflicts.

The emphasis in configurations is on availability: local sites keep running even when network links to other sites are not up. The disaster tolerant capabilities of RTR and the system architecture made it easy to introduce site-disaster tolerance, when needed, virtually without redesign.

A Stock Exchange Example

Brief History

For a number of years, a group of banks relied on traditional open-outcry stock exchanges in several cities for their trades in stocks and other financial scrip (paper). These were three separate markets, with three floor-trading operations and three order books. Financial institutions in the country manage a major portion of international assets, and this traditional form of stock trading inhibited growth. When the unified stock exchange opened, the goal was to integrate these diverse market operations into a robust, standards-compliant system and to make electronic trading possible between financial institutions throughout the country.

The stock exchange already had an implementation based on OpenVMS, but this system could not easily be extended to deal with other trading environments and different financial certificates.

New Implementation

For their implementation using RTR, the stock exchange introduced a wholly electronic exchange that is a single market for all securities listed in the country, including equities, options, bonds, and futures. The hardware superstructure is a cluster of 64-bit DIGITAL AlphaServer systems, with high-speed network links through up to 120 gateway nodes connecting to over 1000 nodes at financial institutions throughout the country.

The stock exchange platform is based on DIGITAL's OpenVMS cluster technology, which achieves high performance and extraordinary availability by combining multiple systems into a fault-tolerant configuration with redundancy to avoid any single point of failure. The standard trading configuration uses either high-performance AlphaStations or Sun workstations, and members with multiseat operations such as banks use AlphaServer 4100s as local servers. Because trading requirements are strictly time-dependent, shadowing is not used: it would not be acceptable, for example, for a trade to be recorded on the primary server at exactly 5:00:00 PM but at 5:00:01 PM on the shadow.

From their desks, traders enter orders with a few keystrokes on customized UNIX-based trading workstation software that displays a graphical user interface. The stock exchange processes trades in order of entry, within seconds.

Traders further have access to current and complete market data and can therefore monitor and manage risks more effectively. The implementation ensures that all members receive the same information at the same time, regardless of location, making fairness a major benefit of this electronic exchange. (In RTR itself, fairness is achieved using randomization, so that no trader consistently receives information first; with RTR alone, no trader is favored.)

The stock exchange applications work with RTR to match, execute, and confirm buy/sell orders, then dispatch confirmed trades to the portfolio management system of the securities clearing organization and to the international settlement system run by participating banks.

The stock exchange designed its client/server frontend to interface with the administrative systems of most banks; one result is that members can automate back-room processing of trades and greatly reduce per-order handling expenses. DIGITAL server reliability, DIGITAL clustering capability, and cross-platform connectivity are critical to the success of this implementation. RTR client application software resides on the frontend gateways, which connect to routers on access nodes. The access nodes connect to a 12-node DIGITAL OpenVMS cluster where the RTR server application resides. The configuration is illustrated in Figure 40, Stock Exchange Example. Only nine trader workstations are shown at each site, but many more are in the actual configuration. The frontends are gateways, and the routers are access points to the main system.

 

Figure 40 Stock Exchange Example

A further advantage of the RTR implementation is that the multivendor, multiprotocol 1000-node environment can be managed by a staff of only five people. This core staff manages the network, the operating systems, and the applications with its own software, which detects anomalies and alerts staff members by pagers and mobile computers. RTR also employs standard two-phase-commit processing, providing complete transaction integrity across the distributed systems and making the databases easier to manage and control than before. With this implementation, RTR swiftly became the underpinning of the nationwide stock exchange.

The implementation using RTR also enables the stock exchange to provide innovative services and tools based on industry and technology standards, cooperate with other exchanges, and introduce new services without re-engineering existing systems. For example, with RTR as the foundation of their systems, the exchange plans an Oracle 7 data warehouse of statistical data drawn from a central Oracle Rdb database, with DIGITAL Object Broker tools to offer users rapid and rich ad-hoc query capabilities. Part of a new implementation includes the disaster-tolerant DIGITAL Business Recovery Server solution and replication of its OpenVMS cluster configuration across two data centers, connected with DIGITAL DEChub 900 GIGAswitch/ATM networking technology.

The unique cross-platform scalability of these systems further enables the stock exchange to select the right operating system for each purpose. Systems range from the central OpenVMS cluster to frontends based on UNIX or Microsoft Windows NT. To support trader desktops with spreadsheets, an implementation currently in progress uses Windows NT with Microsoft Office to report trading results to the trader workstation.

A Banking Example

Brief History

Several years ago, a large bank recognized the need to devise and deliver ever more convenient and efficient banking services to its customers. The bank understood early the expense of face-to-face transactions at a bank office and wished to explore new ways not only to reduce these expenses but also to improve customer convenience with 24-hour service, a level of service not available at a bank office or teller window.

New Implementation

The bank had confidence in the technology and, with RTR, was able to implement the world’s first secure internet banking service. This enabled it to lower costs by as much as 80 percent and provide 24 x 365 convenience to its customers. The bank was additionally able to implement a global messaging backbone that links 20,000 users on a broad range of popular mail systems to a common directory service.

With the bank’s electronic banking service, treasurers and CEOs manage corporate finances, and individuals manage their own personal finances, from the convenience of their office or home. Private customers use a PC-based software package to access their account information, pay bills, download or print statements, and initiate transactions to any bank in the country, and to some foreign banks.

For commercial customers, the bank developed software interfaces that provide import and export links between popular business financial packages and the electronic banking system. Customers can use their own accounting system software and establish a seamless flow of data from their bank account to their company’s financial system and back again.

The bank developed its customized internet applications based on Microsoft Internet Information Server (IIS) and RTR, using DIGITAL Prioris servers running Windows NT as frontend web servers. The client application runs on a secure HTTP system using 128-bit encryption and employs CGI scripts in conjunction with RTR client code. All web transactions are routed by RTR through firewalls to the electronic banking cluster running OpenVMS. The IIS environment enabled rapid initial deployment and contains a full set of management tools that help ensure simple, low-cost operation. The service handles 8,000 to 12,000 users per day and is growing rapidly. Figure 41, Banking Example Configuration, illustrates the deployment of this banking system.

Figure 41 Banking Example Configuration

The RTR failure-tolerant transaction-messaging middleware is the heart of the internet banking service. Data is shadowed at the transactional level, not at the disk level, so that even with a network failure, in-progress transactions are completed with integrity in the transactional shadow environment.

The banking application takes full advantage of the multi-platform support provided by RTR; it achieves seamless transaction-processing flow across the backend OpenVMS clusters and secure web servers based on Windows NT frontends. With RTR scalability, backends can be added as volume increases, load can be balanced across systems, and maintenance can be done during full operation.
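As a sketch of this flow, the fragment below shows how a web frontend process might submit one banking transaction through an RTR client channel. The facility name BANKING, the payment_msg_t layout, and the fixed field values are assumptions for illustration; form parsing, status checking, and the wait for the final rtr_mt_accepted message are reduced to comments.

#include <stdio.h>
#include <string.h>
#include "rtr.h"

typedef struct {             /* illustrative payment message */
    char account[16];
    char payee[16];
    long amount_cents;
} payment_msg_t;

int main( void )
{
    rtr_channel_t channel;
    rtr_msgsb_t msgsb;
    payment_msg_t pay = {0};
    char buf[256];

    /* A real CGI script would take these values from the web request. */
    strcpy( pay.account, "12345678" );
    strcpy( pay.payee, "87654321" );
    pay.amount_cents = 9900;

    /* Open a client channel on the banking facility and wait for the
     * rtr_mt_opened confirmation message. */
    rtr_open_channel( &channel, RTR_F_OPE_CLIENT, "BANKING",
                      NULL, RTR_NO_PEVTNUM, NULL,
                      RTR_NO_NUMSEG, RTR_NO_PKEYSEG );
    rtr_receive_message( &channel, RTR_NO_FLAGS, RTR_ANYCHAN,
                         buf, sizeof(buf), RTR_NO_TIMOUTMS, &msgsb );

    /* Submit the payment and wait for the server's reply. */
    rtr_send_to_server( channel, RTR_NO_FLAGS,
                        &pay, sizeof(pay), RTR_NO_MSGFMT );
    rtr_receive_message( &channel, RTR_NO_FLAGS, RTR_ANYCHAN,
                         buf, sizeof(buf), RTR_NO_TIMOUTMS, &msgsb );

    if ( msgsb.msgtype == rtr_mt_reply )
        rtr_accept_tx( channel, RTR_NO_FLAGS, RTR_NO_REASON );
        /* A complete client would now wait for rtr_mt_accepted. */

    /* Emit the CGI response. */
    printf( "Content-type: text/html\n\n" );
    printf( "<html><body>Payment %s.</body></html>\n",
            msgsb.msgtype == rtr_mt_reply ? "submitted" : "rejected" );
    return 0;
}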

For the electronic banking application, the bank used RTR in conjunction with an Oracle Rdb database. The security and high availability of RTR and OpenVMS clusters provided what was needed for this sensitive financial application, which supports more than a quarter million customer accounts, and up to 38 million transactions a month with a total value of U.S. $300 to $400 million.

The bank’s electronic banking cluster is distributed across two data centers located five miles apart and uses DIGITAL GIGAswitch/FDDI systems for ultra-fast throughput and instant failover across sites without data loss. The application also builds redundancy into many elements of the cluster. For example, each data center has two or more computer systems linked by dual GIGAswitch systems to multiple FDDI rings, and the cluster is also connected by an Ethernet link to the LAN at bank headquarters.

The cluster additionally contains 64-bit Very Large Memory (VLM) capabilities for its Oracle database; this has increased database performance by storing frequently used files and data in system memory rather than on disk. All systems in the electronic banking cluster share access to 350 gigabytes of SCSI-based disks. Storage is not directly connected to the cluster CPUs, but connected to the network through the FDDI backbone. Thus if a CPU goes down, storage survives, and is accessible to other systems in the cluster.

The multioperating system cluster is very economical to run, supported by a small staff of four system managers who handle all the electronic banking systems. Using clusters and RTR enables the bank to provide very high levels of service with a very lean staff.

 

Appendix B: RTR Cluster Configurations

The cluster environment can be important to the smooth failover characteristics of RTR. This environment is slightly different on each operating system. The essential features of clusters are availability and the ability to access a common disk or disks. Basic cluster configurations are illustrated below for the different operating systems on which RTR can run.

OpenVMS Cluster

An OpenVMS cluster provides disk shadowing capabilities and can be based on any of several interconnects.

Figure 42 shows a CI-based OpenVMS cluster configuration. Client applications run on the frontends; routers and backends are established on cluster nodes, with backend nodes having access to the storage subsystems. The LAN is the Local Area Network, and the CI is the Computer Interconnect joined by a Star Coupler to the nodes and storage subsystems. Network connections can include DIGITAL GIGAswitch subsystems.

Figure 42 OpenVMS CI-based Cluster

For other OpenVMS cluster configurations, see the web site http://www.digital.com/software/OpenVMS.

 

Compaq Tru64 UNIX TruCluster

The Compaq Tru64 UNIX TruCluster is typically a SCSI-based system, but can also use Memory Channels for greater throughput. Considered placement of frontends, routers, and backends can ensure transaction completion and database synchronization. The usual configuration with a Compaq Tru64 UNIX TruCluster contains PCs as frontends, establishes cluster nodes as backends, and can make one node the primary server for transactional shadowing with a second as standby server. Because this cluster normally contains only two nodes, a third non-cluster node on the network can be set up as a tie-breaker to ensure that the cluster can attain quorum. (For more information on quorum, see The Role of Quorum, page *.) Figure 43 illustrates a Compaq Tru64 UNIX TruCluster configuration.

Figure 43 Compaq Tru64 UNIX TruCluster Configuration

When using standby servers in the Compaq Tru64 UNIX TruCluster environment, the RTR journal needs to be on a shared device.

Windows NT Cluster

In the Windows NT environment, two servers managed and accessed as a single node comprise an NT cluster. You can use RAID storage for cluster disks with dual redundant controllers, and you can use either the Intel or the Alpha architecture, but must use the same architecture in the cluster. A typical configuration would place the RTR frontend, router, and backend on the cluster nodes as shown in Figure 44, Windows NT Cluster, and would include an additional tie-breaker node on the network to ensure that quorum can be achieved.

Figure 44 Windows NT Cluster

The cluster environment makes possible the use of standby servers in a shadow environment.

Appendix C: RTR Sample Applications

The software kit contains a short sample application that is unsupported and not part of the RTR product. Code for the sample application is in the [EXAMPLES] directory on the software kit. This sample application contains four components:

ADG_HEADER.h;1
ADG_SHARED.c;1
ADG_CLIENT.c;1
ADG_SERVER.c;1

A README.TXT file describes the application. The client and server code are shown on the next few pages.

 

Client Application

 

/**************************************************************************************
* Copyright Compaq Computer Corporation 1998. All rights reserved.
* Restricted Rights: Use, duplication, or disclosure by the U.S. Government
* is subject to restrictions as set forth in subparagraph (c) (1) (ii) of
* DFARS 252.227-7013, or in FAR 52.227-19, or in FAR 52.227-14 Alt. III, as
* applicable.
* This software is proprietary to and embodies the confidential technology of
* Compaq Computer Corporation. Possession, use, or copying of this software
* and media is authorized only pursuant to a valid written license from Compaq,
* Digital or an authorized sublicensor.
***************************************************************************************/
/**************************************************************************************
* APPLICATION: RTR Sample Client Application
* MODULE NAME: ADG_CLIENT.C
* AUTHOR : Compaq Computer Corporation
* DESCRIPTION: This client application initiates transactions and requests
* transaction status asynchronously. It is to be used with ADG_SERVER.C,
* ADG_HEADER.H, and ADG_SHARED.C.
* DATE : Oct 22, 1998
***************************************************************************************/
/*
* To build on Unix:
* cc -o client ADG_CLIENT.c ADG_SHARED.c /usr/shlib/librtr.so -DUNIX
*/

#include "ADG_header.h"
#include "rtr.h"
#include "sys\types.h"
#include "sys\timeb.h"
void declare_client ( rtr_channel_t *pchannel );
FILE *fpLog;

int main ( int argc, char *argv[] )
{

/*
* This program expects 3 parameters :
* 1: messages to send
* 2: client number (1 or 2)
* 3: partition range
*/

rtr_status_t status;
rtr_channel_t channel ;
time_t time_val = { 0 };

message_data_t send_msg = {0};
receive_msg_t receive_msg = {0};
int txn_cnt;
long receive_time_out = RTR_NO_TIMOUTMS;
rtr_msgsb_t msgsb;
char CliLog[80];

send_msg.sequence_number = 1 ;
strcpy( send_msg.text , "from Client");

get_client_parameters( argc , argv, &send_msg, &txn_cnt);

sprintf( CliLog, "CLIENT_%c_%d.LOG", send_msg.routing_key,
send_msg.client_number );
fpLog = fopen( CliLog, "w");

if ( fpLog == NULL )
{
printf( " Error opening client log %s\n", CliLog );
exit((int)errno);
}

printf( "\n Client log = %s\n", CliLog );
fprintf(fpLog, " txn count = %d\n", txn_cnt );
fprintf(fpLog, " client number = %d\n", send_msg.client_number );
fprintf(fpLog, " routing key = %c\n\n", send_msg.routing_key);

declare_client ( &channel );
/* Send the requested number of txns */
for ( ; txn_cnt > 0; txn_cnt--, send_msg.sequence_number++ )
{
status = rtr_send_to_server(
channel,
RTR_NO_FLAGS ,
&send_msg,
sizeof(send_msg),
RTR_NO_MSGFMT );

check_status( "rtr_send_to_server", status );

fprintf(fpLog, "\n ************* sequence %10d *************\n",
send_msg.sequence_number);
time(&time_val);
fprintf(fpLog, " send_to_server at: %s",
ctime( &time_val));
fflush(fpLog);

/*
* Get the server’s reply OR
* an rtr_mt_rejected
*/

status = rtr_receive_message(
&channel,
RTR_NO_FLAGS,
RTR_ANYCHAN,
&receive_msg,
sizeof(receive_msg),
receive_time_out,
&msgsb);

check_status( "rtr_receive_message", status );

time(&time_val);
switch (msgsb.msgtype)
{
case rtr_mt_reply:
fprintf(fpLog, " reply from server at: %s",
ctime( &time_val));

fprintf(fpLog, " sequence %10d from server %d\n",
receive_msg.receive_data_msg.sequence_number,
receive_msg.receive_data_msg.server_number);
fflush(fpLog);
break;

case rtr_mt_rejected:
fprintf(fpLog, " txn rejected at: %s",
ctime( &time_val));
fprint_tid(fpLog, &msgsb.tid );
fprintf(fpLog, " status is : %d\n", status);
fprintf(fpLog, " %s\n", rtr_error_text(status));
fflush(fpLog);

/* Resend same sequence_number after reject */
send_msg.sequence_number--;
txn_cnt++;
break;

default:
fprintf(fpLog, " unexpected msg at: %s",ctime( &time_val));
fprint_tid(fpLog, &msgsb.tid );
fflush(fpLog);
exit((int)-1);
}

if (msgsb.msgtype == rtr_mt_reply)
{
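/* The server's reply ends phase one of the exchange: vote to accept
 * the transaction, then wait for RTR to report the overall outcome
 * of the two-phase commit (rtr_mt_accepted or rtr_mt_rejected). */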
status = rtr_accept_tx(
channel,
RTR_NO_FLAGS,
RTR_NO_REASON );

check_status( "rtr_accept_tx", status );
status = rtr_receive_message(
&channel,
RTR_NO_FLAGS,
RTR_ANYCHAN,
&receive_msg,
sizeof(receive_msg),
receive_time_out,
&msgsb);

check_status( "rtr_receive_message", status );
time(&time_val);

switch ( msgsb.msgtype )
{
case rtr_mt_accepted:
fprintf(fpLog, " txn accepted at : %s",ctime( &time_val));
fprint_tid(fpLog, &msgsb.tid );
fflush(fpLog);
break;

case rtr_mt_rejected:
fprintf(fpLog, " txn rejected at : %s",ctime( &time_val));
fprint_tid(fpLog, &msgsb.tid );
fprintf(fpLog, " status is : %d\n",
receive_msg.receive_status_msg.status);

fprintf(fpLog, "
%s\n",rtr_error_text(receive_msg.receive_status_msg.status));
fflush(fpLog);

/* Resend same sequence_number after reject */

send_msg.sequence_number--;
txn_cnt++;
break;

default:
fprintf(fpLog, " unexpected status on rtr_mt_accepted message\n");
fprint_tid(fpLog, &msgsb.tid );
fprintf(fpLog, " status is : %d\n",
receive_msg.receive_status_msg.status);
fprintf(fpLog, "
%s\n",rtr_error_text(receive_msg.receive_status_msg.status));
fflush(fpLog);
break;
}
}

}

close_channel ( channel );
return 0;
}

void
declare_client ( rtr_channel_t *pchannel )
{
rtr_status_t status;
receive_msg_t receive_msg;
long receive_time_out = RTR_NO_TIMOUTMS; /* wait indefinitely */
rtr_msgsb_t msgsb; /* Structure into which receive puts msgtype */

status = rtr_open_channel(
pchannel,
RTR_F_OPE_CLIENT ,
FACILITY_NAME,
NULL, /* rpcnam */
RTR_NO_PEVTNUM,
NULL,
RTR_NO_NUMSEG ,
RTR_NO_PKEYSEG );

check_status( "rtr_open_channel", status );

status = rtr_receive_message(
pchannel,
RTR_NO_FLAGS,
RTR_ANYCHAN,
&receive_msg,
sizeof(receive_msg),
receive_time_out,
&msgsb);

 

check_status( "rtr_receive_message", status );

if ( msgsb.msgtype != rtr_mt_opened )
{
fprintf(fpLog, " Error opening rtr channel : \n");

fprintf(fpLog,"%s",rtr_error_text(receive_msg.receive_status_msg.status));
exit((int)-1);
}
fprintf(fpLog, " Client channel successfully opened\n");
return;
}

 

 

Server Application

/**************************************************************************************
* Copyright Compaq Computer Corporation 1998. All rights reserved.
* Restricted Rights: Use, duplication, or disclosure by the U.S. Government
* is subject to restrictions as set forth in subparagraph (c) (1) (ii) of
* DFARS 252.227-7013, or in FAR 52.227-19, or in FAR 52.227-14 Alt. III, as
* applicable.
* This software is proprietary to and embodies the confidential technology of
* Compaq Computer Corporation. Possession, use, or copying of this software
* and media is authorized only pursuant to a valid written license from Compaq,
* Digital or an authorized sublicensor.
***************************************************************************************/
/**************************************************************************************
* APPLICATION: RTR Sample Server Application
* MODULE NAME: ADG_SERVER.C
* AUTHOR : Compaq Computer Corporation
* DESCRIPTION: This server application receives transactions and receives
* transaction status. It is to be used with ADG_CLIENT.C,
* ADG_HEADER.H, and ADG_SHARED.C.
* DATE : Oct 22, 1998
***************************************************************************************/
#define TRUE 1
#define FALSE 0
#define LOCKING FALSE
/*
* To build on Unix:
* cc -o server ADG_SERVER.c ADG_SHARED.c /usr/shlib/librtr.so -DUNIX
*/
#include "ADG_header.h"
void declare_server (rtr_channel_t *channel, message_data_t *outmsg);
FILE *fpLog;
int main ( int argc, char *argv[] )
{
/*
* This program expects 2 parameters :
* 1: server number (1 or 2)
* 2: partition range
*/

rtr_msgsb_t msgsb;
receive_msg_t receive_msg;
message_data_t reply_msg;
long receive_time_out = RTR_NO_TIMOUTMS;
char SvrLog[80];
time_t time_val = { 0 };
rtr_channel_t channel;
rtr_status_t status = (rtr_status_t)0;
rtr_bool_t replay;


strcpy( reply_msg.text , "from Server");

get_server_parameters ( argc, argv, &reply_msg );
sprintf( SvrLog, "SERVER_%c_%d.LOG", reply_msg.routing_key,
reply_msg.server_number );
fpLog = fopen( SvrLog, "w");
if ( fpLog == NULL )
{
printf( " Error opening server log %s\n", SvrLog );
exit((int)errno);
}

printf( " Server log = %s\n", SvrLog );
fprintf(fpLog, " server number = %d\n", reply_msg.server_number );
fprintf(fpLog, " routing key = %c\n", reply_msg.routing_key);

declare_server(&channel, &reply_msg);
while ( RTR_TRUE )
{
status = rtr_receive_message(
&channel,
RTR_NO_FLAGS,
RTR_ANYCHAN,
&receive_msg,
sizeof(receive_msg),
receive_time_out,
&msgsb);
check_status( "rtr_receive_message", status);

time(&time_val);

switch (msgsb.msgtype)
{
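/* rtr_mt_msg1 starts a new transaction; rtr_mt_msg1_uncertain is a
 * replay after a failure, when RTR cannot tell whether this server
 * already processed the message. A production server would check
 * its own records before re-applying the work. */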
case rtr_mt_msg1_uncertain:
case rtr_mt_msg1:
if (msgsb.msgtype == rtr_mt_msg1_uncertain)
replay = RTR_TRUE;
else
replay = RTR_FALSE;

fprintf(fpLog, "\n ************* sequence %10d *************\n",
receive_msg.receive_data_msg.sequence_number);

if ( replay == RTR_TRUE )
fprintf(fpLog, " uncertain txn started at :%s",
ctime( &time_val));
else
fprintf(fpLog, " normal txn started at :%s",
ctime( &time_val));

fprintf(fpLog, " sequence %10d from client %d\n",
receive_msg.receive_data_msg.sequence_number,
receive_msg.receive_data_msg.client_number);
fflush(fpLog);

reply_msg.sequence_number =receive_msg.receive_data_msg.sequence_number;

status = rtr_reply_to_client (
channel,
RTR_NO_FLAGS,
&reply_msg,
sizeof(reply_msg),
RTR_NO_MSGFMT);

check_status( "rtr_reply_to_client", status);
break;

case rtr_mt_prepare:
fprintf(fpLog, " txn prepared at : %s", ctime( &time_val));
fflush(fpLog);

status = rtr_accept_tx (
channel,
RTR_NO_FLAGS,
RTR_NO_REASON);
check_status( "rtr_accept_tx", status);
break;

case rtr_mt_rejected:
fprintf(fpLog, " txn rejected at : %s", time( &time_val));
fprint_tid(fpLog, &msgsb.tid );
fprintf(fpLog, " status is : %d\n", status);
fprintf(fpLog, " %s\n", rtr_error_text(status));
fflush(fpLog);
break;

case rtr_mt_accepted:
fprintf(fpLog, " txn accepted at : %s", ctime( &time_val));
fprint_tid(fpLog, &msgsb.tid );
fflush(fpLog);
break;
} /* End of switch */
} /* While loop */
}
void
declare_server (rtr_channel_t *channel, message_data_t *outmsg)
{
rtr_uns_32_t status;
rtr_uns_32_t numseg = 1;
rtr_keyseg_t p_keyseg[1];
receive_msg_t receive_msg;
long receive_time_out = RTR_NO_TIMOUTMS; /* wait indefinitely */
rtr_msgsb_t msgsb; /* Structure into which receive puts msgtype */
char *facility = FACILITY_NAME;
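
/* Register a one-byte string key whose low and high bounds are the
 * same character: this server instance services exactly one
 * routing-key value, that is, one partition. */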

p_keyseg[0].ks_type = rtr_keyseg_string;
p_keyseg[0].ks_length = 1;
p_keyseg[0].ks_offset = 0;
p_keyseg[0].ks_lo_bound = &outmsg->routing_key;
p_keyseg[0].ks_hi_bound = &outmsg->routing_key;

status = rtr_open_channel(
channel,
RTR_F_OPE_SERVER, /* | RTR_F_OPE_EXPLICIT_ACCEPT |
RTR_F_OPE_EXPLICIT_PREPARE */
facility,
NULL, /* rpcnam */
RTR_NO_PEVTNUM,
NULL,
numseg,
p_keyseg);

check_status( "rtr_open_channel", status);

status = rtr_receive_message(
channel,
RTR_NO_FLAGS,
RTR_ANYCHAN,
&receive_msg,
sizeof(receive_msg),
receive_time_out,
&msgsb);

check_status( "rtr_receive_message", status);

if ( msgsb.msgtype != rtr_mt_opened )
{
fprintf(fpLog, " Error opening rtr channel : \n");

fprintf(fpLog,"%s",rtr_error_text(receive_msg.receive_status_msg.status));
fclose (fpLog);
exit(-1);
}

fprintf(fpLog, " Server channel successfully opened \n");
return;
}

Active-X Case Study

The Active-X Case Study uses RTR deployed on three laptops to demonstrate the capabilities of RTR with one client and two servers. It displays a continuously updated graph of the number of transactions processed. Periodically, one of the servers goes off line, and the display shows RTR failing over to the other server while continuing to process transactions. The failure can be simulated or can be effected by disconnecting the cable from one of the laptops.

The code for the Active-X Case Study is available on the RTR website at http://www.software.digital.com/rtr. The website also contains a short case study describing the demo in more detail.

 

Appendix D: Evaluating Application Resource Requirements

The following checklist, though somewhat simplistic, can help diagnose a particular performance problem:

  1. Check the CPU load on the machines involved. A machine loaded over 60% is generally suspect if reasonable response times are desired.

Possible fixes:

  2. Measure the I/O rates on your disks and controllers:

These should be comfortably below the rated capacity of the controller and drives. If not, you may be on the trail of a performance constraint. Try:

  3. Measure RTR network traffic generated by the application (use RTR MONITOR TRAFFIC for this while the application is running under load). Add the total bytes/second sent and received, and subtract the bytes/second sent and received from the local node to itself (intra-node data does not use the network). This total should be substantially lower than the measured capacity of the network.

(A rough-and-ready way to measure available network capacity is to do a file-transfer of a large file using FTP or some other program between the nodes, and divide the file size by the time taken. Note that multiple network connections may share the same hardware infrastructure, so you may need to try multiple simultaneous measurements between different node-pairs.)

If the RTR network traffic measured is not substantially less than the measured capacity of the network, then this may be the cause of the performance constraint for which you are looking. Try:

Often, networks are tuned for high performance when transferring large files, but perform badly for bursty traffic. Buffering of either side of the transfer and of intermediate hops ensures smooth data flow. Check each hop to see if packets are being retransmitted due to excessive loss, and tune your network accordingly.

  4. Measure delays in transmission through the network. Use "ping" to measure delay times between nodes while the system is under load. If reported round-trip delays are not in the low-millisecond range, you may be on to something. Additionally, use RTR MONITOR STALLS to measure whether delays are taking place in the acceptance of outgoing data by the network.

If MONITOR STALLS shows a large number of stalls, especially in the columns for stalls longer than three seconds, then you very likely have a packet-loss problem in the network. Try:

(If MONITOR STALLS reports lots of long stalls, but standard network analysis indicates that the network is operating as expected, check network utilization more closely. Packet losses which cause these glitches are usually caused by overload peaks in network traffic. You may still see disturbing long delays or link-losses when the system gets busy, even if average traffic is well below the capacity of the hardware.)

Network monitors generally look at overall performance, measured over a period of time. It is often possible to show 20 percent utilization of network bandwidth plotted at 5-minute intervals, yet miss the peaks that last for 5 seconds and lose 50 packets. It is those 50 packets that account for the odd transaction getting a response time of 45 seconds instead of the usual 200 msec.

  5. Check whether the throughput on your backend machines is being limited by all the servers being busy. Measure this by issuing the command RTR SHOW PARTITION/BACKEND/FULL on the backend machines. To observe this information with automatic updating of the display, use the MONITOR QUEUE or MONITOR GROUP command.

Note: Excessive use of a MONITOR command can be disruptive to the system. For example, running several MONITOR commands simultaneously steals cycles that RTR needs to do real work. To minimize the impact of using MONITOR commands, increase the sampling interval using /INTERVAL=<no-of-seconds>.


If the SHOW PARTITION command consistently shows the number of "Free Servers" as zero and the number of "Txns Active" larger than the number of servers, then a performance problem may be caused by queues building up because an inadequate number of server applications are ready to process incoming transactions. Try the following:

  6. If none of the above results in the TPS rate you would like to see, are you sure that you are generating enough work for the servers to do? To check this, try increasing the number of clients accessing the system.

 

Glossary


ACID

Transaction properties supported by RTR: atomicity, consistency, isolation, durability.

ACP

The RTR Application Control Process.

API

Application Programming Interface.

applet

A small application designed to run within a browser.

application

User-written software that employs RTR.

backend

BE, the node where the server application runs.

bank

An establishment for the custody of money, which it pays out on a customer’s request.

branch

A subdivision of a bank; perhaps in another town.

broadcast

A nontransactional message.

callout server

A server process used for transactional authentication.

channel

A logical port opened by an application with an identifier to exchange messages with RTR.

client

A client is always a client application, one that initiates and demarcates a piece of work. In the context of RTR, a client must run on a node defined to have the frontend role. Clients typically deal with presentation services, handling forms input, screens, and so on. A browser, perhaps running an applet, could connect to a web application that acts as an RTR client, sending data to a server through RTR.

In other contexts, a client can be a physical system, but in the context of RTR and in this document, such a system is always called a frontend or a node.

commit process

The transactional process by which a transaction is prepared, accepted, committed, and hardened in the database.

commit sequence number (CSN)

A sequence number assigned to an RTR commit group, established by the vote window, the time interval during which transaction responses are returned from the backend to the router. All transactions in the commit group have the same CSN and lock the database.

concurrent server

A server process identical to other server processes running on the same node.

CPU

Central processing unit.

data marshalling

The capability of using systems of different architectures (big endian, little endian) within one application.

deadlock

Deadly embrace, a situation that occurs when two transactions or parts of transactions conflict with each other, which could violate the consistency property (the C in ACID) when they are committed to the database.

disk shadowing

A process by which identical data are written to multiple disks to increase data availability in the event of a disk failure. Used in a cluster environment to replicate entire disks or disk volumes. See also transactional shadowing.

DTC

Microsoft Distributed Transaction Coordinator.

endian

The byte-ordering of multibyte values. Big endian: high-order byte at starting address; little endian: low-order byte at starting address.

facility

The mapping between nodes and roles used by RTR and established when the facility is created.

failover

The ability to continue operation on a second system when the first has failed or become disconnected.

failure tolerant

Software that enables an application to continue when failures such as node or site outages occur. Failover is automatic.

fault tolerant

Hardware built with redundant components to ensure that processing survives component failure.

frontend

FE, the physical node in an RTR facility where the client application runs.

FTP

File transfer protocol.

inquorate

Nodes/roles that cannot participate in a facility's transactions are inquorate.

journal

A file containing transactional messages used for recovery.

key range

The defining range for an RTR partition. RTR partitions are defined to have key ranges with a high and a low bound.

LAN

Local area network.

link

A communications path between two nodes in a network.

message

A logical grouping of information transmitted between software components, typically over network links.

multichannel

An application that uses more than one channel. A server is usually multichannel.

multithreaded

An application that uses more than one thread of execution in a single process.

MS DTC

Microsoft DTC; see DTC.

node

A physical system.

nontransactional message

A message, such as a broadcast or diagnostic message, that does not contain any part of a transaction. See transactional message.

partition

RTR transactions can be sent to a specific database segment or partition. This is data content routing, which RTR performs when so programmed in the application and specified by the system administrator. A partition can be in one of three states: primary, standby, and shadow.

primary

The state of the partition servicing the original data store or database. A primary has a secondary or shadow counterpart.

process

The basic software entity, including address space, scheduled by system software, that provides the context in which an image executes.

quorate

Nodes/roles in a facility that has quorum are quorate.

quorum

The minimum number of routers and backends in a facility, usually a majority, that must be active and connected for the valid completion of processing.

quorum node

A node, specified in a facility as a router, whose purpose is not to process transactions but to ensure that quorum negotiations are possible.

quorum threshold

The minimum number of routers and backends in a facility required to achieve quorum.

roles

Roles are defined for each node in an RTR configuration based on the requirements of a specific facility. Roles are frontend, router, or backend.

rollback

When a transaction has been committed on the primary database but cannot be committed on its shadow, the committed transaction must be removed or rolled back to restore the database to its pre-transaction state.

router

The RTR role that manages traffic between RTR clients and servers.

RTR configuration

The set of nodes, disk drives, and connections between them used by RTR.

RTR environment

The RTR run-time and system management areas.

secondary

See shadow.

server

A server is always a server application or process, one that reacts to a client application's units of work and carries them through to completion. This may involve updating persistent storage such as a database file, toggling the switch on a device, or performing another pre-defined task. In the context of RTR, a server must run on a node defined to have the backend role.

In other contexts, a server may be a physical node, but in RTR and in this document, physical servers are called backends or nodes.

shadow

The state of the server process that services a copy of the data store or primary database. In the context of RTR, the shadow method is transactional shadowing, not disk shadowing. Its counterpart is primary.

SMP

Symmetric MultiProcessing.

standby

The state of the partition that can take over if the process for which it is on standby is unavailable. It is held in reserve, ready for use.

TPS

Transactions per second.

transaction

An operation performed on a database, typically causing an update to the database. Analogous in many cases to a business transaction such as executing a stock trade or purchasing an item in a store. A business transaction may consist of one or more than one RTR transaction.

transactional message

A message containing transactional data.

transactional shadowing

A process by which identical transactional data are written to separate disks often at separate sites to increase data availability in the event of site failure. See also disk shadowing.

two-phase commit

A database commit/rollback concept that works in two steps: 1) The coordinator asks each local recovery manager if it is able to commit the transaction. 2) If and only if all local recovery managers agree that they can commit the transaction, the coordinator commits the transaction. If one or more recovery managers cannot commit the transaction, then all are told to roll back the transaction. Two-phase commit is an all-or-nothing process: either all of a transaction is committed, or none of it is.

WAN

Wide area network.

 

 
