The following sections describe the more commonly used standard monitor
pictures in detail.
5.2.1 Monitor ACCFAIL (Link Acceptance Failures)
When configuring RTR, nodes sometimes fail to connect. The cause of the error can be viewed on the initiating side with the MONITOR NETSTAT picture, but it can be difficult to pinpoint the problem from the other end of the link. The ACCFAIL monitor picture displays the reason why the local node refused to accept a connection. An example:
    ================================================================================
                          U n a c c e p t a b l e   L i n k s
              Most recent links on which a connection attempt was declined
    Node: LENGTH                                            Wed Jan 7 1998 10:51:00
    --------------------------------------------------------------------------------
    Link Transport Name(s)     Reason for failure
    --------------------------------------------------------------------------------
    ilira                      node is not configured for the facility
    dmark.zko.dec.com          node not recognized
    breal                      facility name not matched
    DMARK DEC:.ZKO.DMARK       node not recognized
    ilira                      node is not configured for the facility
    dmark.zko.dec.com          node not recognized
    breal                      facility name not matched
    sfranc                     node role definitions do not match for
    breal                      facility name not matched
    DMARK DEC:.ZKO.DMARK       node not recognized
    --------------------------------------------------------------------------------
    List entries are reused in a cyclical fashion; the most recent entry is
    highlighted.
    ================================================================================
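The ACCFAIL display notes that list entries are reused in a cyclical fashion and that the most recent entry is highlighted. The following Python sketch illustrates that general behavior with a fixed-size list. It is not RTR's implementation: the class name `RecentFailures`, its methods, and the list size of 3 are all hypothetical, and the `>` marker merely stands in for highlighting.

```python
class RecentFailures:
    """Fixed-size list whose slots are reused cyclically (hypothetical helper)."""

    def __init__(self, size):
        self.slots = [None] * size
        self.next_slot = 0   # slot that the next entry will overwrite
        self.latest = None   # index of the most recent entry, if any

    def record(self, link, reason):
        """Store an entry, overwriting the oldest one once the list is full."""
        self.slots[self.next_slot] = (link, reason)
        self.latest = self.next_slot
        self.next_slot = (self.next_slot + 1) % len(self.slots)

    def rows(self):
        """Yield (marker, link, reason); '>' marks the most recent entry."""
        for i, entry in enumerate(self.slots):
            if entry is not None:
                yield (">" if i == self.latest else " ", *entry)


# Entries taken from the example display; the list size of 3 is an assumption.
log = RecentFailures(size=3)
log.record("ilira", "node is not configured for the facility")
log.record("dmark.zko.dec.com", "node not recognized")
log.record("breal", "facility name not matched")
log.record("DMARK DEC:.ZKO.DMARK", "node not recognized")  # reuses the first slot

for marker, link, reason in log.rows():
    print(f"{marker} {link:<22} {reason}")
```

Because slots are overwritten in place, the newest entry is not necessarily the last row on the screen, which is why the display highlights it.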
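Errors that ACCFAIL can display include those shown in the example above: node is not configured for the facility, node not recognized, facility name not matched, and a mismatch in node role definitions.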
5.2.2 Monitor ACP2APP
    RTR ACP to Application Messages, Node: NodeA PID: -ALL- Process name: -ALL- Image: -ALL- 14:15:46 Mon Jan 25 1999
    messages  client  server  other  pend
              # Bytes  # Bytes  # Bytes  #
    opened            0  0  0  0  0  0  0
    closed            0  0  0  0  0  0  0
    msg1              0  0  0
    msg1_uncertain    0  0  0
    msgn              0  0  0
    repl_2_client     0  0  0
    rettosend         0  0  0
    accepted          0  0  0  0  0
    rejected          0  0  0  0  0
    user_event        0  0  0  0  0
    rtr_event         0  0  0  0  0
    mt_prepare        0  0  0
    other
    request_info      0  0  0
    set_info          0  0  0
    calls  active  fail  timeout
    receive_message   0  0  0  0
    user_wakeup       0  0
Displays counts of messages and number of bytes from RTRACP to the application, as viewed from a specific node. Includes opened, closed, msg1, msg1_uncertain, msgn, repl_2_client (reply to client), rettosend (return to sender), accepted, rejected, user_event, rtr_event, mt_prepare, request_info, and set_info messages as appropriate. For receive_message and user_wakeup, displays calls, active, fail, and timeout counts.
The default is to display information on all PIDs, process names, and
images. To display information on one process only, use the qualifier
/IDENTIFICATION=process-id.
5.2.3 Monitor Active
    ACTIVE TRANSACTIONS BY PROCESS                  Fri Mar 12 1999 19:32:41
                                            Starts  Completions  Active
    All processes:                               5            5       0
    Node     ID       Process  Image
    NodeA    11141    11141    rtr               5            5       0
Displays a list of RTR processes and, for each process, the number of
transactions it has started, the number of transactions it has
completed, and the number of transactions still active.
5.2.4 Monitor APP2ACP
    RTR Application to ACP Messages, Node: NodeA PID: -ALL- Process name: -ALL- Image: -ALL- 14:21:19 Mon Jan 25 1999
    messages  client  server  other
              # Bytes  # Bytes  # Bytes
    open_channel      0  0  0  0  0  0
    close_channel     0  0  0  0  0  0
    accept_tx         0  0  0  0
    reject_tx         0  0  0  0
    broadcast_event   0  0
    start_tx          0  0
    send_to_server    0  0
    reply_to_client   0  0
    request_info      0  0
    set_info          0  0
Displays counts of messages and number of bytes from the application to RTRACP, as viewed from a specific node. Includes open_channel, close_channel, accept_tx (accept transaction), reject_tx (reject transaction), broadcast_event, start_tx, send_to_server, reply_to_client, request_info, and set_info.
The default is to display counts for all PIDs and processes, for
client, server, and other roles.
5.2.5 Monitor Broadcast
    BROADCAST RECEPTION BY PROCESS                          15:20:27 6-APR-1999
    Node     ID         Process       Received  Queued  Lost  Rate of delivery
    Total                                 2750       5    17             850.0
    NODEA    20400249   RTRACP               0       0     0
    NODEA    2040024D   BATCH_2993        2750       5    17             850.0
    NODEA    2040024B   BATCH_2991           0       0     0
    NODEA    2040024C   BATCH_2992           0       0     0
Displays information about the RTR user events process. Fields
displayed include the number of user events enqueued for the application,
the number of user events received by the application, and the number of
user events discarded by RTR.
5.2.6 Monitor Calls
    RTR api calls, Node: nodea.zuo.dec.com , PID: 2162 , Process name: -ALL- Image: -ALL- Fri Feb 12 1999 16:38:05
    CALLS  client  server  fail         MESSAGES  client  server  pend
    open_channel       1  1  0          mt_opened          1  1  0
    close_channel      0  0  0          mt_closed          0  0  0
    start_tx           0  0             mt_msg1            0  1
    send_to_server     1  0             mt_msg1_uncertain  0  0
                                        mt_msgn            0  0
    reply_to_client    0  0             mt_reply           0  0
                                        mt_rettosend       0  0
    prepare_tx         0  0             mt_prepared        0  0
    accept_tx          0  0  0          mt_accepted        0  0  0
    reject_tx          0  0  0          mt_rejected        0  0  0
    broadcast_event    0  0  0          mt_user_event      0  0  0
    set_user_handle    0  0  0          mt_rtr_event       0  0  0
    get_tid            0  0  0          mt_prepare         0  0
    other                               other
    request_info       3  0             mt_request_info    2  0
    set_info           0  0             mt_set_info        0  0
    error_text         2                mt_closed          2
    set_wakeup         0
    calls  active  fail  timeout
    receive_message    9  1  2  2
    user wakeup        0  0
Displays the total number of RTR API calls and their outcome for the
processes on all the nodes being monitored. Use the /IDENTIFICATION=process-id qualifier to display
the values for one specific process; otherwise, the total values for all
processes are displayed.
5.2.7 Monitor Channel
    RTR CHANNELS BY TYPE PER PROCESS                  Fri Feb 12 1999 16:41:13
                                        Client    Server       Call-out
    Node     ID      Process  Image               Pri.  Sby.  Router  Backend
    nodea    2162    2162                    1       0     0       0        0
Displays the channels opened by RTR CALL RTR_OPEN_CHANNEL commands.
5.2.8 Monitor Connects
    C o n n e c t i o n   S t a t u s   S u m m a r y
    Node: nodea.zuo.dec.com                            Tue Feb 16 1999 13:02:18
    -Executive summary----------------------------------------
    Number of links up:    3 (100.%)
    Number of links down:  0 (0.0 %)
    ----------------------------------------------------------
    -Detail-------------------------------------------------------------------------
    Node -> Link   State Arch  T'port Fail-reason
    --------------:-----:-----:------:----------------------------------------------
    nodea->nodea   up    alpha -
    nodea->nodeb   up    alpha TCP
    nodea->nodec   up    i386  TCP
Displays the link protocol for connected links, and the fail reason as
a text message for any links on which a connection has failed.
Unconnected links on which connections have been attempted are highlighted.
Link state and architecture of the remote node are also displayed.
This picture summarizes link status and is less detailed than the
MONITOR NETSTAT display.
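The executive summary figures can be related to the Detail rows with a short calculation. The Python sketch below is illustrative only: the function name `summarize_links` is hypothetical, and it assumes that any link whose state is not `up` counts as down, which the example display does not confirm.

```python
def summarize_links(states):
    """Return (up, down, percent_up, percent_down) for a list of link states.

    Assumption: every state other than "up" is counted as down.
    """
    total = len(states)
    up = sum(1 for state in states if state == "up")
    down = total - up
    if total == 0:
        return 0, 0, 0.0, 0.0
    return up, down, 100.0 * up / total, 100.0 * down / total


# Link states copied from the Detail section of the example display.
links = {
    "nodea->nodea": "up",
    "nodea->nodeb": "up",
    "nodea->nodec": "up",
}

up, down, pct_up, pct_down = summarize_links(list(links.values()))
print(f"Number of links up:   {up} ({pct_up:.1f}%)")
print(f"Number of links down: {down} ({pct_down:.1f}%)")
```

With the three links from the example, this gives 3 links up and 0 down, matching the executive summary.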
5.2.9 Monitor Event
    EVENT ROUTING STATISTICS BY FACILITY              6-APR-1999 15:21:47
                              Destination             Transit
    Node     Facility      In    Out   Lost      In    Out   Lost
    Total                 180    175      5     180    180      0
    NODEA    FACCMS        25     25      0      25     25      0
    NODEA    TESTFAC      155    150      5     155    155      0
Displays event routing data by facility. Information includes events in
transit from RTR to a destination facility and destination information
showing number of events enqueued for the application (In column),
number of events processed by the application (Out column), and the
number of events discarded by RTR (Lost column).
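In the example display, the Total row equals the column-wise sum of the per-facility rows. The Python sketch below reproduces that sum and derives a lost fraction per facility. The `EventRow` type, its field names, and both helper functions are hypothetical. It also assumes that the first In/Out/Lost group is the Destination group and the second is the Transit group.

```python
from collections import namedtuple

# Field names are illustrative; the column grouping is an assumption.
EventRow = namedtuple(
    "EventRow",
    "node facility dest_in dest_out dest_lost transit_in transit_out transit_lost",
)

# Per-facility rows copied from the example display.
rows = [
    EventRow("NODEA", "FACCMS", 25, 25, 0, 25, 25, 0),
    EventRow("NODEA", "TESTFAC", 155, 150, 5, 155, 155, 0),
]


def totals(rows):
    """Sum each of the six counter columns across all facility rows."""
    return tuple(sum(row[i] for row in rows) for i in range(2, 8))


def lost_fraction(row):
    """Fraction of destination events discarded; 0.0 when none were enqueued."""
    return row.dest_lost / row.dest_in if row.dest_in else 0.0


print("Total:", totals(rows))
for row in rows:
    print(f"{row.facility}: {lost_fraction(row):.1%} of destination events lost")
```

A facility with a nonzero lost fraction, such as TESTFAC here, is one whose application is not keeping up with the events enqueued for it.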
5.2.10 Monitor Facility
    FACILITY COUNTERS 7-JAN-1999 14:04:27, NODE: -ALL- , FACILITY: -ALL-
    ASM_MSGS_TO_APPS   290005  TMBE_TX_RQ_COMMITS        0  NCF_TR_FE_LOSS     0
    ASM_MSGS_FROM_APPS 290404  TMBE_TX_RQ_ABORTS         0  NCF_TR_BE_LOSS     0
    BM_NCF_EVENTS_DELV      6  TMBE_TX_ACCEPTS       72501  NCF_TR_FE_GAIN     1
    BM_NCF_EVENTS_RCVD      7  TMBE_TX_REJECTS           0  NCF_TR_BE_GAIN     1
    TM_NCF_EVENTS_RCVD     13  TMBE_TX_RTR_FORGETS   72499  NCF_TR_FQM_LOSS    0
    TMFE_TX_RQ_STARTS   72700  TMBE_CRPS_REQUESTED       0  NCF_FE_TR_LOSS     0
    TMFE_TX_RQ_ENQS     72700  RSC_GETTXDST_CALLS    72700  NCF_FE_TR_NOCUR    0
    TMFE_TX_RQ_COMMITS  72700  RSC_GETTXDST_SUCCESS  72700  NCF_FE_TR_GAIN     1
    TMFE_TX_RQ_ABORTS       0  RSC_GETTXSRV_CALLS        0  NCF_BE_TR_LOSS     0
    TMFE_TX_ACCEPTS     72500  RSC_GETTXSRV_SUCCESS      0  NCF_BE_TR_GAIN     1
    TMFE_TX_REJECTS         0  RSC_GETEVTDST_CALLS       0  NCF_FQM_TR_LOSS    0
    TMFE_TX_REPLAYS         0  RSC_GETEVTDST_SUCCESS     0  QRM_REQ_LINK_QUEUE 0
    TMRT_TX_RQ_STARTS   72700  RSC_GETEVTRCV_CALLS       6  QRM_RSP_LINK_QUEUE 0
    TMRT_TX_RQ_ENQS     72700  RSC_GETEVTRCV_SUCCESS     0
    TMRT_TX_RQ_COMMITS  72700  RSC_FE_NODES              1
    TMRT_TX_RQ_ABORTS       0  RSC_BE_NODES              1
    TMRT_TX_ACCEPTS     72500  RSC_TR_SERVER_CLASSES     1
    TMRT_TX_REJECTS         0  RSC_BE_SERVER_CLASSES     1
    TMBE_TX_RQ_STARTS   72700  RSC_SERVER_CHANS          1
    TMBE_TX_RQ_ENQS     72700  RSC_REQUESTER_CHANS       2
Displays per-facility counters. Use the /FACILITY qualifier to specify a facility; if
none is specified, the totals of the counters for all facilities
are displayed.
5.2.11 Monitor Flow
    FLOW CONTROL COUNTERS 7-JAN-1999 14:08:06, NODE: -ALL- , FACILITY: -ALL-
                 CREDIT  DATA RATE            REQUESTS          GRANTS
    ROLE      AVAILABLE  BYTES/SEC  WAITS   SENT  PENDING   SENT  PENDING
    FE=>TR        15000       2065    307    966        0    966        0
    TR=>BE        15000       2065     70    998        0    998        0
    BE=>TR            0          0      0      0        0      0        0
    TR=>FE        15000          0      0      2        0      2        0
    LINK             DATA RATE  WAITS  PENDING  REQS SENT    CACHE IN USE
    NODEA =>NODEA            0      0        0          1    NODEA       0
    NODEA =>NODEB            0      0        0          0    NODEB   51456
    NODEB =>NODEB         2065    307        0        968
    NODEB =>NODEA         2065     70        0        999
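Displays the flow control internals: for each role, the credit available, the data rate in bytes per second, waits, and the requests and grants sent and pending; for each link, the data rate, waits, pending count, and requests sent; and the cache in use.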
5.2.12 Monitor Group
    % rtr monitor group
    Concurrency Measures                   Tue Apr 6 1999 10:04:26, NODE: NODEA
                                                               -- averages --
                           txn  -server-  -- transactions --   srv   txn   txn
    krid      state        cnt  cnt  act  vreq vote  ack /csn  act  /sec  /csn
    16777216  shd_rec_fail   0    1    0     0    0    0    0  0.0   0.0   0.0
    16777217  shd_rec_fail   0    1    0     0    0    0    0  0.0   0.0   0.0
Field | Meaning |
---|---|
krid | Key range (partition) identifier. |
state | Partition state. |
txn cnt | Number of transactions executed for this partition. |
srv cnt | Number of servers active for this partition. |
srv act | Number of servers that are currently busy processing transactions for this sample. |
The following fields track the progress of a transaction through the states: vote requested, voted, acknowledged. | |
vreq | Number of transactions that are waiting for the server to vote. |
vote | Number of transactions that have been voted on by the server but not committed by RTR. |
ack | Number of transactions that have been committed but have not been acknowledged by the server. Acknowledgment occurs on a subsequent rtr_receive_message() call by the server processing this transaction to get a message for a new transaction. |
/csn | Number of transactions which have been grouped under the same "commit sequence number" (CSN). This grouping determines the ordering of transactions submitted to a secondary shadow server. |
txn/sec | The average rate of transaction starts per second for this partition. |
txn/csn | Average number of transactions that have been grouped under the same commit sequence number (CSN) since this partition became active. This average is computed as the quotient of the txn cnt column and the total number of CSNs. |
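
The txn/csn average described in the table is a simple quotient, which the following Python sketch illustrates. The function name `txn_per_csn` is hypothetical, and the sample figures of 12 transactions and 4 CSNs are invented for illustration. Returning 0.0 when no CSN has been issued is an assumption, chosen to agree with the 0.0 shown in the example display for partitions with a txn cnt of 0.

```python
def txn_per_csn(txn_cnt, csn_total):
    """Average number of transactions grouped under one CSN.

    txn_cnt   -- value of the "txn cnt" column for the partition
    csn_total -- total number of CSNs since the partition became active
    Assumption: the average is reported as 0.0 when csn_total is 0.
    """
    if csn_total == 0:
        return 0.0
    return txn_cnt / csn_total


# Hypothetical figures: 12 transactions grouped under 4 CSNs.
print(f"{txn_per_csn(12, 4):.1f}")

# A partition that has executed no transactions, as in the example display.
print(f"{txn_per_csn(0, 0):.1f}")
```

A larger value means more transactions are being grouped under each CSN.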
5.2.13 Monitor IPC
    RTR> Monitor IPC
    Node: LENGTH            I P C   S u m m a r y           Fri Mar 5 1999 11:18:34
    +----------------------------------------------------------------------------+
     This screen displays usage information on IPC messages, byte counts and IO
     primitives. Display units are counts, kbytes and calls respectively.
    +----------------------------------------------------------------------------+
    | - - - - - O u t g o i n g / s - - - - | - - I n c o m i n g / s - -|
    Process  Messages ...kbytes  send() ...kbytes  Messages  recv() ...kbytes
    rtracp     110334     49744   73437     49744     73434  220299      5616
    3123B84F        0         0       0         0         0       0         0
    31232395    73282      5569   73280      5569    109930  293144     49685
Displays interprocess communication message information.
5.2.14 Monitor IPCRATE
    RTR> Monitor IPCRATE
    Node: LENGTH              I P C   R a t e s             Fri Mar 5 1999 11:18:53
    +----------------------------------------------------------------------------+
     This screen displays rate information on IPC messages, byte counts and IO
     primitive usage. Display units are counts, kbytes and calls per second
     respectively.
    +----------------------------------------------------------------------------+
    | - - - - - O u t g o i n g / s - - - - | - - I n c o m i n g / s - -|
    Process  Messages ...kbytes  send() ...kbytes  Messages  recv() ...kbytes
    rtracp         44        19      29        19        29      86         2
    3123B84F        0         0       0         0         0       0         0
    31232395       28         2      28         2        41     110        18
Displays interprocess communication rate information for messages.
5.2.15 Monitor Journal
    JOURNAL USAGE ON NODE NODEA AT 10:36:05 Tue Apr 6 1999
    LOCAL JOURNAL                      STANDBY JOURNAL(S)
    JNL_LCL_BLOCKS_IN_USE        128   JNL_RMT_BLOCKS_IN_USE          0
    [___ 13% ]                         [ 0% ]
    JNL_LCL_NR_BLOCKS            992   JNL_RMT_NR_BLOCKS            992
    JNL_LCL_TOP_BLOCKS_USED      128   JNL_RMT_TOP_BLOCKS_USED        0
    JNL_LCL_BLOCKS_AVAILABLE     864   JNL_RMT_BLOCKS_AVAILABLE       0
    JNL_LCL_TX_ENTRIES             1   JNL_RMT_TX_ENTRIES             0
    JNL_LCL_TX_RECORDS             2   JNL_RMT_TX_RECORDS             0
    JNL_LCL_MEMORY_BYTES      530121   JNL_RMT_MEMORY_BYTES        4197
    JNL_LCL_DISK_READS            31   JNL_RMT_DISK_READS            33
    JNL_LCL_BLOCKS_READ          992   JNL_RMT_BLOCKS_READ         1056
    JNL_LCL_DISK_WRITES           12   JNL_RMT_DISK_WRITES            0
    JNL_LCL_BLOCKS_WRITTEN        14   JNL_RMT_BLOCKS_WRITTEN         0
    JNL_LCL_ENTRIES_TOTAL        201   JNL_RMT_ENTRIES_TOTAL          5
    JNL_LCL_RECORDS_TOTAL        398   JNL_RMT_RECORDS_TOTAL          9
    JNL_LCL_RECORDS_READ          21   JNL_RMT_RECORDS_READ           0
    JNL_LCL_REC_BYTES_READ      8006   JNL_RMT_REC_BYTES_READ         0
    JNL_LCL_NONTX_ENTRIES          5   JNL_RMT_NONTX_ENTRIES          0
    JNL_LCL_OPEN_JOURNALS          2
Displays information about journal usage, including total number of entries and records written, number of records read, and how many bytes were involved. Bar graphs showing current usage of journal blocks (as a percentage of the total) are also provided.
The local journal figures refer to journal usage for the displayed node. Standby journals are journals of standby nodes that are being accessed due to restart or catch-up situations. Under normal conditions standby journal figures are all zero.
The bar graphs appear under the first line of the display (JNL_LCL_BLOCKS_IN_USE).
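The bar graph percentage can be checked against the block counters in the example display. The Python sketch below does this for the local journal figures. The function name `journal_usage` is hypothetical, and rounding to the nearest whole percent is an assumption that happens to reproduce the 13% shown. The "available" subtraction matches the local journal figures only: the standby column in the example shows 0 blocks available, so the relation should not be read as general.

```python
def journal_usage(blocks_in_use, nr_blocks):
    """Return (percent_in_use, blocks_available) for one journal.

    Assumptions: the percentage is rounded to the nearest whole number, and
    available blocks equal total blocks minus blocks in use (true for the
    local journal figures in the example, not for the standby figures).
    """
    if nr_blocks <= 0:
        raise ValueError("nr_blocks must be positive")
    percent = round(100 * blocks_in_use / nr_blocks)
    return percent, nr_blocks - blocks_in_use


# JNL_LCL_BLOCKS_IN_USE and JNL_LCL_NR_BLOCKS from the example display.
percent, available = journal_usage(128, 992)
print(f"{percent}% in use, {available} blocks available")
```

For the example values this yields 13% in use and 864 blocks available, which agrees with the bar graph and with JNL_LCL_BLOCKS_AVAILABLE.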