Serviceability and Logging

The DCE serviceability facilities allow server applications to display or log messages, to control message routing, and to associate actions with messages. A remote serviceability interface also makes it possible to control server message routing and filtering via dcecp or from application management clients.

The serviceability mechanism is designed to be used mainly for server informational and error messaging, that is, for messages that are of interest to those who are concerned with server maintenance and administration (in the broadest sense of these terms). The essential idea of the mechanism is that all server events that are significant for maintaining or restoring normal operation should be reported in self-documenting messages, so that (provided all significant events have been correctly identified and reported) users and administrators will always be able to learn what action they should take whenever anything out of the ordinary occurs. User-prompted, interactive, client-generated messaging should be handled through the DCE messaging interface.

Serviceability is also used by the DCE components themselves (for example, DTS and CDS). Consistent use of the same message mechanism by DCE implementations and applications should result in simplified DCE administration.

DCE components use the serviceability facilities according to the following guidelines; it is recommended that DCE applications use them also.

· All servers should report when they are started, and when they have completed their initialization and are ready to perform work. They should also indicate when they are going off-line.

· All program exits should be reported as fatal errors. Similarly, all calls to abort( ) should be replaced by calls to dce_svc_printf( ) with the svc_c_action_abort action attribute specified.

· Errors which make it impossible for the application to proceed should be reported as close as possible to the point of occurrence. This includes such conditions as: failure to allocate memory, failure to open a configuration file for reading, or a log file for writing, and so on.

· Conditions which may indicate system-level malfunction or poor performance must be reported.

· Routine administrative actions should be reported as informational messages. This includes: creation, modification and deletion of tickets, threads, files, sockets, RPC endpoints, or other objects; message transfer, including name lookup, binding, and forwarding; and database maintenance, including replication or synchronization.
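The guideline about reporting errors as close as possible to the point of occurrence can be illustrated with a small sketch. It does not use the DCE headers: demo_report_error( ) is a hypothetical stand-in for a serviceability call such as dce_svc_printf( ), and demo_open_config( ) and the message wording are likewise invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Counts reports so that callers (and tests) can see that one was made. */
static unsigned demo_reports_emitted = 0;

/* Hypothetical stand-in for a serviceability call such as dce_svc_printf().
 * It only writes to stderr; it performs no routing or logging. */
static void demo_report_error(const char *what, const char *path,
                              const char *reason)
{
    demo_reports_emitted++;
    fprintf(stderr, "error: %s \"%s\": %s\n", what, path, reason);
}

/* Opens a configuration file for reading. A failure is reported here,
 * at the point of occurrence, while the path and errno are still known. */
FILE *demo_open_config(const char *path)
{
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL) {
        demo_report_error("cannot open configuration file", path,
                          strerror(errno));
    }
    return fp;
}
```

The caller still decides whether the failure is fatal; the point of the sketch is only that the report is made where the path and the system error are both at hand, rather than several call levels up.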

The severity level attribute for each message can be determined according to the following criteria:

· Fatal error exit (svc_c_sev_fatal_error). An unrecoverable error has occurred requiring special manual recovery actions to take place, such as database restoration. The program usually terminates immediately.

· Error detected (svc_c_sev_error). An unexpected event that is nonterminal or is correctable via human intervention has occurred, such as a timeout. The program continues although some functions or services may not be available. This may also be used to indicate that a particular request or action could not be completed.

· Correctable error (svc_c_sev_warning). An error occurred that was automatically corrected, such as when a configuration file was not found and defaults were used instead. This may also be used to indicate a condition that may be an error if the effects are undesirable, such as removing all files when a nonempty directory is removed. It may also be used to indicate a condition that, if not corrected, will eventually result in an error, such as a printer running out of paper.

· Informational notice (svc_c_sev_notice). A predetermined major event has occurred, such as the startup of a server.

· Verbose information notice (svc_c_sev_notice_verbose). A predetermined event has occurred, such as the removal of a directory entry.

· Debug level 1 (svc_c_debug1) through debug level 9 (svc_c_debug9). Messages in the nine debug levels would not normally appear in production code.

An appropriate action may be associated with an error message by ORing one of the svc_c_action. . . values with the message attribute. Note that the svc_c_action_abort action, which results in a call to abort( ), does not provide any reliable means to clean up and should only be used where the default abort( ) action, which is typically to dump core, is appropriate. Cleanup for the svc_c_action_exit action can be implemented by supplying an atexit( ) handler.
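As a rough illustration of ORing an action into a message attribute and of cleaning up through atexit( ), consider the sketch below. The DEMO_ constants and their bit values are invented for this example and do not reproduce the DCE definitions; demo_emit( ) is a hypothetical stand-in for the serviceability call. Only atexit( ), abort( ) and exit( ) are the standard C library routines.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Invented bit layout for illustration only. The real svc_c_sev_... and
 * svc_c_action_... values are defined by the DCE headers. */
#define DEMO_SEV_MASK        0x000Fu
#define DEMO_SEV_FATAL_ERROR 0x0001u
#define DEMO_SEV_ERROR       0x0002u
#define DEMO_ACTION_MASK     0x0F00u
#define DEMO_ACTION_ABORT    0x0100u
#define DEMO_ACTION_EXIT     0x0200u

/* Combines a severity with an optional action by ORing the two fields. */
unsigned demo_make_attributes(unsigned severity, unsigned action)
{
    assert((severity & ~DEMO_SEV_MASK) == 0);
    assert((action & ~DEMO_ACTION_MASK) == 0);
    return severity | action;
}

/* Cleanup that runs on exit() or on return from main(), but not on abort(). */
static void demo_cleanup(void)
{
    fputs("cleanup: releasing resources before exit\n", stderr);
}

/* Registers the cleanup handler; returns 0 on success, as atexit() does. */
int demo_install_cleanup(void)
{
    return atexit(demo_cleanup);
}

/* Hypothetical stand-in for the serviceability call: prints the message,
 * then carries out whichever action was ORed into the attributes. */
void demo_emit(unsigned attributes, const char *text)
{
    fprintf(stderr, "%s\n", text);
    if (attributes & DEMO_ACTION_ABORT) {
        abort();             /* no atexit() handlers run on this path */
    }
    if (attributes & DEMO_ACTION_EXIT) {
        exit(EXIT_FAILURE);  /* demo_cleanup() runs on this path */
    }
}
```

A server following this pattern would call demo_install_cleanup( ) once during initialization and then pass, for example, demo_make_attributes(DEMO_SEV_FATAL_ERROR, DEMO_ACTION_EXIT) when reporting an unrecoverable condition, reserving the abort action for cases where a core dump is more useful than an orderly shutdown.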

In addition to these guidelines, a persistent server application that does message logging should consider exporting the remote serviceability interface as a means to simplify server administration.