hp Reliable Transaction Router
System Manager's Manual




Chapter 7
RTR Monitoring

This chapter describes the RTR monitor and provides examples of its use with the command line interface (CLI). Displays with the browser interface contain the same information in a comparable format.

The RTR monitor lets you view the activities of RTR and your applications. Many different aspects of RTR's behavior can be viewed, allowing the activities and performance of RTR to be analyzed.

7.1 Introduction

The RTR monitor provides a means to continuously display the status of RTR and the applications using it.

It can be used to check the correct operation of an RTR network, showing information useful for tuning, capacity planning, and locating configuration and application errors.

The information displayed is composed of named data items which are continuously updated by RTR. These data items can be displayed in various formats and combined using simple arithmetic operators and constants.

The monitor is invoked with the MONITOR command. MONITOR displays a monitor picture that is periodically updated. See Section 8.2 for the full syntax of the MONITOR command.

A monitor picture contains elements that are either text (such as labels and titles) or variables derived from data items. Monitor pictures can be defined either interactively at the RTR> prompt or in a file called a monitor file.

You can use monitor files provided with RTR and you can create your own. See Appendix A for information about creating monitor files.

7.2 Standard Monitor Pictures

A number of standard monitor pictures are supplied with RTR. These cover most of the usual monitoring requirements. You may define your own monitor pictures or alter the standard ones to suit your particular needs. Table 7-1 contains a list of the standard monitor pictures. To display one of these pictures, use the following command at the RTR> prompt:


 
 RTR> MONITOR picture-name
 

The files for standard monitor pictures are installed on your system when RTR is installed. The location of these files is platform specific. Each file name is the picture name with .mon appended. Omit the .mon extension when starting the display.
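The naming rule above can be sketched in a few lines of Python. This is only an illustration: the helper names `picture_name` and `monitor_command` are hypothetical and are not part of RTR. The sketch handles bare file names only, because the installation directory is platform specific.

```python
MONITOR_EXTENSION = ".mon"  # extension named in the text above


def picture_name(file_name: str) -> str:
    """Return the picture name for a monitor file name such as 'accfail.mon'."""
    if file_name.lower().endswith(MONITOR_EXTENSION):
        return file_name[: -len(MONITOR_EXTENSION)]
    return file_name  # already a bare picture name


def monitor_command(file_name: str) -> str:
    """Build the text typed at the RTR> prompt (hypothetical helper)."""
    return f"MONITOR {picture_name(file_name)}"


print(monitor_command("accfail.mon"))
```

Passing a name that already lacks the extension, such as `tps`, leaves it unchanged.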

See Chapter 8, RTR Commands, for more information on the MONITOR command.

Table 7-1 Standard Monitor Pictures
Picture name  Description
accfail       Shows link transport name for links on which a connection attempt was declined, with a reason for failure. The most recent entry is highlighted.
acp2app       Displays counts of messages and number of bytes from RTRACP to the application, as viewed from a specific node.
active        Displays a list of RTR processes and, for each process, the number of transactions it has started, the number it has completed, and the number still active.
app2acp       Displays counts of messages and number of bytes from the application to RTRACP, as viewed from a specific node.
appdelay      Displays counts of flow-control-induced traffic stalls by application process.
broadcast     Displays information about RTR user events by process, including number of user events enqueued, received, and discarded.
calls         Displays the total number of RTR API calls and their success or failure for the processes on all the nodes being monitored. All RTR messages are also shown by message type. (Pending messages are those that an application has not yet received.) Use the /IDENTIFICATION=process-id qualifier to display the values for one specific process; otherwise, the total values for all processes are displayed.
channel       Displays the roles of the channels declared by an application. This can be useful as a debugging tool in the early stages of application development.
connects      Displays connection status summary, including the number of links up and down, and a list of links with state (up or down), architecture, network transport, and fail-reason, if any.
ctccalls      Displays the history of calls for a specific client transaction controller.
downstream    Displays counts of downstream flow-control-induced traffic stalls.
event         Displays event routing data by facility. Information includes events in transit and destination information showing number of events enqueued, processed, and discarded.
fastrecovery  Displays the number of transactions recovered during optimized shadow recovery.
flow          Displays the flow control counters.
flostalls     Displays counts of flow-control-induced traffic stalls.
frontend      Displays frontend status and counts by node and facility, including frontend state, current router, reject status, retry count, and quorum rejects.
group         Shows server and transaction concurrency by partition.
ipc           Shows counts of interprocess communication (IPC) activity in the RTR ACP and active RTR applications.
ipcrate       Displays rate information on IPC messages, byte counts, and I/O primitive usage.
jcalls        Displays counts of successful (success), failed (fail), and total journal calls for local and remote journals.
jhuse         Displays bar graphs of Maximum, Allocated, In Use, and Peak usage for local and standby journals using the browser interface. Provides the same information as the juse monitor picture.
jnlhtml       Displays detail for local and standby journal usage with the browser interface. Data are the same as from the MONITOR JOURNAL CLI command.
journal       Displays detail for local and standby journal usage. Includes counts of the number of blocks in use, percent of journal used, number of blocks, top blocks, blocks available, transaction (tx) entries, transaction records, memory bytes, disk reads, blocks read, disk writes, blocks written, entries total, records total, records read, record bytes read, non-transaction entries, and open journals. Also provides bar graphs that show usage of journal blocks as a percent of the total.
juse          Displays bar graphs of Maximum, Allocated, In Use, and Peak usage for local and standby journals.
link          Displays a number of per-link data items. Use the /LINK=link-name qualifier to display the values for one specific link; otherwise, the total values for all links are displayed.
netbytes      Displays a list of the links to other nodes. For each link, the total number of bytes received and sent on that link and the number of bytes received and sent per second are displayed.
netstat       Displays for each link the connection status in detail, with the link state (up or down), and architecture type of remote node (such as VAX, I386, Alpha, and so on).
ortr          Displays a list of server/client transaction controllers.
partit        Displays the status of server partitions. Shows the partition identifiers, key ranges and key segments, and the status of the servers (active, recovering, and so on).
queues        Shows transaction queues on a partition basis.
quorum        Tracks (by facility) the configuration, reachability, and quorum status of one or more nodes.
recovery      Displays the status of server recovery procedures, such as waiting for quorum, catching up transactions, and so on.
rejects       Displays the last rtr_mt_rejected message received by each running process.
rejhist       Displays the last 10 rtr_mt_rejected messages received by the selected process.
response      Displays the elapsed time that a transaction has been active on the opened channels of a process.
rolequor      Displays a detailed picture of the various data items in the QUORUM picture, separated by roles. If a quorum problem is encountered, this picture may be useful for problem diagnosis.
routers       Displays information on a router node. It gives an indication of the utilization of the router in terms of transactions and broadcasts routed through this node. Useful to monitor performance or locate problems.
routing       Displays statistics of transaction and broadcast traffic by facility.
rscbe         Displays the most recent call's history for the RSC subsystem on a backend node.
stalls        Displays in real time any network links currently stalling in their outbound traffic, and provides a history of the stalls that the various links encountered during their lifetime.
stccalls      Displays a history of calls for a specified server transaction controller.
summary       Displays channel, transaction, and system environmental information.
system        Displays the state of critical resources within the RTR environment. If a resource has exceeded a predefined threshold, a warning indicator is displayed.
tps           Displays the rate of transaction commits performed by each process using RTR.
tpslo         Displays low end of the rate of transaction commits performed by each process using RTR.
traffic       Displays a list of the links to other nodes. Shown for each link are: byte rate, packet rate, message rate, and congestion, in both directions. Average packets per second is also shown.
trans         Displays transactions for a frontend, router, and backend.
upstream      Displays upstream flow-control-induced traffic stalls.
v2calls       Shows RTR Version 2 verb usage through the interoperability subsystem. The screen layout is identical to the RTR Version 2 Monitor Calls picture.
xa            Displays XA counter information including success and failure as well as call and read-only counters.

The following sections describe the more commonly used standard monitor pictures in detail.

7.2.1 Monitor Accfail

Displays link acceptance failures. When configuring RTR, nodes can sometimes fail to connect. Although the cause of the error can be viewed on the initiator side with the MONITOR NETSTAT picture, it can be difficult to pinpoint the problem when looking at the other end of the link. Use the MONITOR ACCFAIL picture to display the reason for the local node's refusal to accept connections.

Example 7-1 Monitor Accfail

================================================================================ 
                     U n a c c e p t a b l e    L i n k s 
 
         Most recent links on which a connection attempt was declined 
 
            Node: NODE11                Wed Jan  7 1998 10:51:00 
 
-------------------------------------------------------------------------------- 
Link Transport Name(s)                  Reason for failure 
-------------------------------------------------------------------------------- 
nodef                                   node is not configured for the facility 
marke.zko.dec.com                       node not recognized 
real1                                   facility name not matched 
MARKE DEC:.ZKO.MARKE                    node not recognized 
nodef                                   node is not configured for the facility 
marke.zko.dec.com                       node not recognized 
real1                                   facility name not matched 
nodez                                   node role definitions do not match for 
real1                                   facility name not matched 
MARKE DEC:.ZKO.MARKE                    node not recognized 
-------------------------------------------------------------------------------- 
 
         List entries are reused in a cyclical fashion; the most 
         recent entry is highlighted. 
 
================================================================================ 
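When the list is long, it can help to see which failure reason occurs most often. The following Python sketch tallies the reasons in text captured from a display like Example 7-1. It is not an RTR facility: the function name `tally_reasons` is hypothetical, and the sketch assumes the two columns are separated by at least two spaces, as they are in the example.

```python
import re
from collections import Counter

# Rows copied from Example 7-1 (link transport name, reason for failure).
SAMPLE_ROWS = """\
nodef                                   node is not configured for the facility
marke.zko.dec.com                       node not recognized
real1                                   facility name not matched
MARKE DEC:.ZKO.MARKE                    node not recognized
nodef                                   node is not configured for the facility
marke.zko.dec.com                       node not recognized
real1                                   facility name not matched
nodez                                   node role definitions do not match for
real1                                   facility name not matched
MARKE DEC:.ZKO.MARKE                    node not recognized
"""


def tally_reasons(rows: str) -> Counter:
    """Count how often each failure reason appears in captured ACCFAIL rows."""
    counts = Counter()
    for line in rows.splitlines():
        # Assumption: two or more spaces separate the link name from the reason,
        # so a link name containing a single space stays in one piece.
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
        if len(parts) == 2:
            counts[parts[1]] += 1
    return counts


counts = tally_reasons(SAMPLE_ROWS)
for reason, count in counts.most_common():
    print(f"{count:3d}  {reason}")
```

Because list entries are reused cyclically, such a tally reflects only the entries currently on the screen, not the full history.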

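Errors that can be displayed by ACCFAIL include those shown in Example 7-1, such as "node not recognized", "facility name not matched", and "node is not configured for the facility".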

7.2.2 Monitor Acp2app

Displays counts of messages and number of bytes from RTRACP to the application, as viewed from a specific node. Includes opened, closed, msg1, msg1_uncertain, msgn, reply (reply to client), rettosend (return to sender), accepted, prepared, rejected, user_event, rtr_event, prepare, request_info, and set_info messages as appropriate. For receive_message and user_wakeup, displays calls, active, fail, and timeout counts. Refer to the Reliable Transaction Router C Application Programmer's Reference Manual for an explanation of the message types.

The default is to display information on all PIDs, process names, and images. To display information on one process only, use the qualifier /IDENTIFICATION=process-id.

Example 7-2 Monitor Acp2app

RTR ACP to Application Messages, Node: NodeA  PID:  1568   Process name: -ALL- 
    Image: -ALL-                                      14:15:46 Tue Dec 04 2001 
Message Type        Client               Server             Other         Pend 
                    #      Bytes         #     Bytes        #      Bytes     # 
opened              4        760         3        38        0          0     8 
closed              0          0         1       136    43492     295745     1 
msg1                                     0       170                         2 
msg1_uncertain                           0         0                         0 
msgn                                     0       253                         3 
reply               0          0         0         0                         0 
rettosend           0          0                                             0 
prepared            0          0                                             0 
accepted            0          0         0         0                         0 
rejected            0          0         0        68                         1 
user_event          0          0         0         0                         0 
rtr_event           0          0         0         0                         0 
prepare                                                     0          0     0 
                    Other 
request_info                                            43347    5211764     0 
set_info                                                    0          0     0 
                  Calls   Active     Fail   Timeout 
receive_message   88607        1     1759      1759 
user_wakeup           0        0 
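The raw counters in a display like this are often easier to interpret as ratios. The Python sketch below derives two such figures from the values in Example 7-2. It is an offline illustration only, not monitor syntax. The `ratio` helper is hypothetical, and the failure percentage assumes that failed calls are included in the Calls count.

```python
def ratio(numerator: int, denominator: int) -> float:
    """Divide two counters, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


# Values copied from Example 7-2.
request_info_count = 43347
request_info_bytes = 5211764
receive_calls = 88607
receive_fail = 1759

# Average size of a request_info message, in bytes.
avg_request_info_bytes = ratio(request_info_bytes, request_info_count)

# Share of receive_message calls that failed (assumes Fail is a subset of Calls).
receive_fail_pct = 100 * ratio(receive_fail, receive_calls)

print(f"request_info: {avg_request_info_bytes:.1f} bytes per message")
print(f"receive_message: {receive_fail_pct:.2f}% of calls failed")
```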

7.2.3 Monitor Active

Displays a list of RTR processes and, for each process, the number of transactions it has started, the number it has completed, and the number still active.

OpenVMS systems show actual process names; on Windows and UNIX systems RTR produces a process name by combining user name and image name.

Example 7-3 Monitor Active

 
ACTIVE TRANSACTIONS BY PROCESS  Fri Mar 12 1999 19:32:41 
 
                                              Starts  Complete  Active 
Display totals:                                    5         5       0 
 
Node      ID      Process name   Image name 
NodeA     11141   SMITH_1        DISK01:[SMITH]RTR 5         5       0 
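A captured process row can be split into its fields for further checking. The Python sketch below parses the row from Example 7-3. The function name `parse_active_row` is hypothetical. The sketch assumes that the process name contains no spaces, which may not hold on Windows and UNIX systems where RTR builds the name from the user name and image name. It also assumes that Active equals Starts minus Complete, which is true of the example but is not stated by the display itself.

```python
# Row copied from Example 7-3.
ROW = "NodeA     11141   SMITH_1        DISK01:[SMITH]RTR 5         5       0"


def parse_active_row(row: str) -> dict:
    """Split one MONITOR ACTIVE process row into named fields."""
    # The three counters are the last whitespace-separated fields.
    head, starts, complete, active = row.rsplit(None, 3)
    # Assumption: node, ID, and process name contain no spaces.
    node, pid, process, image = head.split(None, 3)
    return {
        "node": node,
        "id": int(pid),
        "process": process,
        "image": image,
        "starts": int(starts),
        "complete": int(complete),
        "active": int(active),
    }


fields = parse_active_row(ROW)
# Consistency check under the stated assumption.
consistent = fields["active"] == fields["starts"] - fields["complete"]
print(fields["process"], fields["active"], consistent)
```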
 

7.2.4 Monitor App2acp

Displays counts of messages and number of bytes from the application to RTRACP, as viewed from a specific node. Includes open_channel, close_channel, prepare_tx (prepare transaction), accept_tx (accept transaction), reject_tx (reject transaction), broadcast_event, start_tx, send_to_server, reply_to_client, request_info, and set_info. Refer to the Reliable Transaction Router C Application Programmer's Reference Manual for an explanation of the message types.

The default is to display counts for all PIDs and processes, for client, server, and other roles.

Example 7-4 Monitor App2acp

RTR Application to ACP Messages, Node: NodeA  PID:  1568 Process name: -ALL- 
   Image: -ALL-                                      14:21:19 Tue Dec 04 2001 
RTR Message Type     Client               Server               Other        
                     #     Bytes          #     Bytes          #      Bytes 
open_channel        11      8228          7      5376          0          0 
close_channel        5       120          3        72          0          0 
prepare_tx           0         0 
accept_tx            0         0          0         0                        
reject_tx            0         0          0         0                        
broadcast_event      1        81          0         0                     
start_tx             2         0                             
send_to_server       5       548   
reply_to_client                           0          0   
request_info                                               43768   16015270 
set_info                                                       0          0 

7.2.5 Monitor Appdelay

Displays counts of flow-control-induced traffic stalls by an application process.

Example 7-5 Monitor Appdelay

FLOW CONTROL STALLS BY PROCESS Tue Dec 04 2001 15:38:37, NODE: rtrdoc 
 
                         Txn traffic stalls      Bdcst traffic stalls    KBytes 
PID      Process Name    reqs times secs  max    reqs times secs  max     sent 
Total    *                  1    0     0     0      1    0     0     0   45528 
1568     user rtr           1    0     0     0      1    0     0     0   45528 
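With several processes on a node, the Total row is presumably the column-wise sum of the per-process rows. That relationship is an assumption here, although it is consistent with Example 7-5. The Python sketch below parses the captured rows and checks it. The function name `parse_row` is hypothetical. The sketch relies on the nine numeric columns being the rightmost fields, so a process name that contains a space (such as "user rtr") is kept whole.

```python
# Rows copied from Example 7-5 (header lines omitted).
ROWS = """\
Total    *                  1    0     0     0      1    0     0     0   45528
1568     user rtr           1    0     0     0      1    0     0     0   45528
"""

NUMERIC_COLUMNS = 9  # four txn stall columns, four bdcst stall columns, KBytes sent


def parse_row(row: str):
    """Return (pid, process_name, counters) for one MONITOR APPDELAY row."""
    fields = row.rsplit(None, NUMERIC_COLUMNS)
    pid, name = fields[0].split(None, 1)
    return pid, name.strip(), [int(value) for value in fields[1:]]


parsed = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
total = next(counters for pid, _, counters in parsed if pid == "Total")
processes = [counters for pid, _, counters in parsed if pid != "Total"]

# Column-wise sum of the per-process rows.
summed = [sum(column) for column in zip(*processes)]
print("totals match:", summed == total)
```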

