Document revision date: 19 July 1999
In certain system configurations, it might be impossible to preserve the entire contents of memory in a disk file. For instance, a large memory system or a system with small disk capacity might not be able to supply enough disk space for a full memory dump. In normal circumstances, if the system dump file cannot accommodate all of memory, SDA cannot analyze the dump.
To preserve those portions of memory that contain information most useful in determining the causes of system failures, a system manager sets the static system parameter DUMPSTYLE to 1. When the DUMPSTYLE parameter is set, AUTOGEN attempts to create a dump file large enough to contain ample information for SDA to analyze a failure. When the DUMPSTYLE parameter is clear (the default), AUTOGEN attempts to create a dump file large enough to contain all of physical memory.
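As a rough sketch of how a system manager might request a subset dump, the following DCL fragment appends the setting to SYS$SYSTEM:MODPARAMS.DAT and then reruns AUTOGEN so that it can size the dump file accordingly. The AUTOGEN phase range shown is an assumption chosen for illustration, and the fragment assumes that MODPARAMS.DAT does not already define DUMPSTYLE. Because DUMPSTYLE is a static parameter, the new value takes effect only at the next reboot.

```dcl
$ ! Sketch only: assumes MODPARAMS.DAT has no existing DUMPSTYLE entry
$ OPEN/APPEND PARAMS SYS$SYSTEM:MODPARAMS.DAT
$ WRITE PARAMS "DUMPSTYLE = 1    ! Request a subset dump"
$ CLOSE PARAMS
$ ! Assumed phase range: regenerate parameters and system files, then set them
$ @SYS$UPDATE:AUTOGEN GETDATA SETPARAMS
```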
A comparison of full and subset style dump files appears in Table SDA-7.
Characteristic | Full | Subset |
---|---|---|
Available Information | Complete contents of physical memory in use, stored in order of increasing physical address (for instance, system and global page tables are stored last). | System page table, global page table, system space memory, and process and control regions (plus global pages) for all saved processes. |
Unavailable Information | Contents of paged-out memory at the time of the crash. | Contents of paged-out memory at the time of the crash, process and control regions of unsaved processes, and memory not mapped by a page table (such as the free and modified lists). |
SDA Command Limitations | None. | The following commands are not useful for unsaved processes: SHOW PROCESS/CHANNELS, SHOW PROCESS/RMS, SHOW STACK, and SHOW SUMMARY/IMAGE. |
Every time the operating system writes information to the system dump file, it writes over whatever was previously stored in the file. For this reason, as system manager, you need to save the contents of the file after a system failure has occurred.
You can use the SDA COPY command or the DCL COPY command in your site-specific startup procedure. Digital recommends using the SDA COPY command because it marks the dump file as copied. This is particularly important if the dump was written into the paging file, SYS$SYSTEM:PAGEFILE.SYS, because the SDA COPY command releases to the pager the pages that were occupied by the dump.
Because system dump files are set to NOBACKUP, the Backup utility (BACKUP) does not copy dump files to tape unless you use the qualifier /IGNORE=NOBACKUP when invoking BACKUP. When you use the SDA COPY command to copy the system dump file to another file, the new file is not set to NOBACKUP.
As included in the distribution kit, SYS$SYSTEM:SYSDUMP.DMP is
protected against world access. Because a dump file can contain
privileged information, Digital recommends that you continue to protect
dump files from universal read access.
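The following fragment is a minimal sketch of the housekeeping described above: backing up the system dump file despite its NOBACKUP attribute, and keeping a saved copy protected from world access. The tape device MTA0:, the save-set name, and the copy name SYS$SYSTEM:SAVEDUMP.DMP are assumptions for illustration; substitute names appropriate to your site.

```dcl
$ ! Assumed tape device; the tape must be mounted foreign for BACKUP
$ MOUNT/FOREIGN MTA0:
$ ! /IGNORE=NOBACKUP makes BACKUP copy the file even though it is marked NOBACKUP
$ BACKUP/IGNORE=NOBACKUP SYS$SYSTEM:SYSDUMP.DMP MTA0:SYSDUMP.BCK/SAVE_SET
$ ! Deny group and world access to a copy made with the SDA COPY command
$ SET PROTECTION=(S:RWED,O:RWED,G,W) SYS$SYSTEM:SAVEDUMP.DMP
$ ! Optional: exclude the copy from routine backups, like the original dump file
$ SET FILE/NOBACKUP SYS$SYSTEM:SAVEDUMP.DMP
```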
1.3 Invoking SDA in the Site-Specific Startup Command Procedure
Because a listing of the SDA output is an important source of information in determining the cause of a system failure, it is a good idea to have SDA produce such a listing after every failure. The system manager can ensure the creation of a listing by modifying the site-specific startup command procedure SYS$MANAGER:SYSTARTUP_VMS.COM so that it invokes SDA when the system is booted.
When invoked in the site-specific startup procedure, SDA executes the specified commands only if the system is booting immediately after a system failure. SDA examines a flag in the dump file's header that indicates whether it has already processed the file. If the flag is set, SDA merely exits. If the flag is clear, SDA executes the specified commands and sets the flag. This flag is clear when the operating system initially writes a crash dump, except for those resulting from an operator-requested shutdown (for instance, SYS$SYSTEM:SHUTDOWN.COM).
The following example shows typical commands that you might add to your site-specific startup command procedure to produce an SDA listing after each failure.
    $ !
    $ !      Print dump listing if system just failed
    $ !
    $ ANALYZE/CRASH_DUMP SYS$SYSTEM:SYSDUMP.DMP
    COPY SYS$SYSTEM:SAVEDUMP.DMP        ! Save dump file
    SET OUTPUT DISK1:SYSDUMP.LIS        ! Create listing file
    READ/EXEC                           ! Read symbols into the SDA symbol table
    SHOW CRASH                          ! Display crash information
    SHOW STACK                          ! Show current stack
    SHOW SUMMARY                        ! List all active processes
    SHOW PROCESS/PCB/PHD/REG            ! Display current process
    SHOW SYMBOL/ALL                     ! Print system symbol table
    EXIT
    $ PRINT DISK1:SYSDUMP.LIS
The COPY command in the preceding example saves the contents of the file SYS$SYSTEM:SYSDUMP.DMP. If your system's startup command file does not save a copy of the contents of this file, this crash dump information is lost in the next system failure, when the system saves the information about the new failure, overwriting the contents of SYS$SYSTEM:SYSDUMP.DMP.
If you are using SYS$SYSTEM:PAGEFILE.SYS as the crash dump file, you must include SDA commands in SYS$MANAGER:SYSTARTUP_VMS.COM that free the space occupied by the dump so that the pager can use it. For instance:

    $ ANALYZE/CRASH_DUMP SYS$SYSTEM:PAGEFILE.SYS
       .
       .
       .
    COPY dump_filespec
    EXIT
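SDA performs certain tasks prior to bringing a dump into memory, presenting its initial displays, and accepting command input. This section describes those tasks: checking that your process meets the requirements for reading the dump file, mapping the contents of the dump file, building the SDA symbol table, and executing the commands in the SDA initialization file.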
For detailed information about the investigation of a system failure, see Section 8.
To be able to analyze a dump file, your process must have the following:
If your process satisfies these conditions, you can issue the DCL command ANALYZE/CRASH_DUMP to invoke SDA. If you do not specify the name of a dump file in the command, SDA prompts you for the name of the file, as follows:
    $ ANALYZE/CRASH_DUMP
    _Dump File:
The default file specification is as follows:
    disk:[default-dir]SYSDUMP.DMP
disk and [default-dir] represent the disk and
directory specified in your last SET DEFAULT command.
2.2 Mapping the Contents of the Dump File
SDA first attempts to map the contents of physical memory as stored in the specified dump file. To do this, it must first locate the system page table (SPT) among its contents. The SPT contains one entry for each page of system virtual address space.
The SPT appears at the largest physical addresses in a typical configuration. As a result, if a dump file is too small, the SPT cannot be written to it in the event of system failure.
If SDA cannot find the SPT in the dump file, it displays either of the following messages:
    %SDA-E-SPTNOTFND, system page table not found in dump file

    %SDA-E-SHORTDUMP, the dump only contains m out of n pages of physical memory
If SDA displays either of these error messages, you cannot analyze the crash dump, but must take steps to ensure that any subsequent dump can be preserved. To do this, you must increase the size of the dump file, as indicated in Section 1.1, or adjust the system DUMPSTYLE parameter, as discussed in Section 1.1.2.
Under certain conditions, the system might not save some memory locations in the system dump file. For instance, during halt/restart bugchecks, the system does not preserve the contents of general registers. If such a bugcheck occurs, SDA indicates in the SHOW CRASH display that the contents of the registers were destroyed. Additionally, if a bugcheck occurs during system initialization, the contents of the register display might be unreliable. The symptom of such a bugcheck is a SHOW SUMMARY display that shows no processes or only the swapper process.
Also, if you use an SDA command to access a virtual address that has no corresponding physical address, SDA displays the following error message:
    %SDA-E-NOTINPHYS, 'location' not in physical memory
When you analyze a subset dump file, if you use an SDA command to access a virtual address that has a corresponding physical address but was not saved in the dump file, SDA displays the following error message:
    %SDA-E-MEMNOTSVD, memory not saved in the dump file
After locating and reading the system dump file, SDA attempts to read the system symbol table file into the SDA symbol table. This file, named SYS$SYSTEM:SYS.STB by default, contains most of the global symbols used by the operating system. SDA also reads into its symbol table a subset of SYS$SYSTEM:SYSDEF.STB, called SYS$SYSTEM:REQSYSDEF.STB, that it requires to identify locations in memory.
If SDA cannot find the system symbol table file, or if it is given a file that is not a system symbol table in the /SYMBOL qualifier to the ANALYZE command, it halts with a fatal error.
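As an illustration of the /SYMBOL qualifier mentioned above, the following sketch names the system symbol table explicitly when analyzing a saved dump. The directory DISK1:[OLDSYS] is hypothetical; it stands for wherever a symbol table matching the crashed system has been kept. SYS$SYSTEM:SAVEDUMP.DMP is the copy made by the startup example earlier in this document.

```dcl
$ ! DISK1:[OLDSYS] is a hypothetical location for a matching SYS.STB
$ ANALYZE/CRASH_DUMP/SYMBOL=DISK1:[OLDSYS]SYS.STB SYS$SYSTEM:SAVEDUMP.DMP
```

If the file given with /SYMBOL is not a system symbol table, expect the fatal error described above rather than an SDA> prompt.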
When SDA finishes building its symbol table, it displays a message identifying itself and the immediate cause of the crash. In the following example, the cause of the crash was an illegal exception occurring at an IPL above IPL$_ASTDEL or while using the interrupt stack.
    Dump taken on 28-Jan-1993 18:10:09.79
    INVEXCEPTN, Exception while above ASTDEL or on interrupt stack
After displaying the crash summary, SDA executes the commands in the SDA initialization file, if you have established one. SDA refers to its initialization file by using the logical name SDA$INIT. If SDA cannot find the file defined as SDA$INIT, it searches for the file SYS$LOGIN:SDA.INIT.
The initialization file can contain SDA commands that read symbols into SDA's symbol table, define keys, establish a log of SDA commands and output, or perform other tasks. For instance, you might want to use an SDA initialization file to augment SDA's symbol table with definitions helpful in locating system code.
If you issue the following command, SDA includes those symbols that define many of the system's data structures, including those in the I/O database:
    READ SYS$SYSTEM:SYSDEF.STB
You might also find it very helpful to define those symbols that identify the modules in the images that make up the executive. You can do this by issuing the following command:
    READ/EXECUTIVE SYS$LOADABLE_IMAGES
After SDA executes the commands in the initialization file, it displays its prompt, as follows:
    SDA>
The SDA> prompt indicates that you can use SDA interactively and enter SDA commands.
An SDA initialization file can invoke a command procedure with the @
command. However, such command procedures cannot themselves invoke a
command procedure (that is, you cannot have nested command procedures).
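The pieces described in this section can be combined along the following lines. This is a sketch under stated assumptions: the file name SYS$LOGIN:MYSDA.INIT and the log file name are hypothetical, and SET LOG is used here as the SDA command that records the session. The fragment is written as part of a DCL command procedure, where CREATE takes the lines up to the next line beginning with a dollar sign as the contents of the new file.

```dcl
$ ! Create a hypothetical initialization file containing SDA commands
$ CREATE SYS$LOGIN:MYSDA.INIT
READ SYS$SYSTEM:SYSDEF.STB
READ/EXECUTIVE SYS$LOADABLE_IMAGES
SET LOG SYS$LOGIN:SDA_SESSION.LOG
$ ! Point SDA at the file; without this logical name SDA looks for SYS$LOGIN:SDA.INIT
$ DEFINE SDA$INIT SYS$LOGIN:MYSDA.INIT
```

Defining the logical name in your login command procedure would make the same initialization apply to every SDA session.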
3 Analyzing a Running System
Occasionally, an internal problem hinders system performance but does not cause a system failure. By allowing you to examine the running system, SDA provides the means to search for the solution to the problem without disturbing the operating system. For example, you can use SDA to examine the stack and memory of a process that is stalled in a scheduler state, such as a miscellaneous wait (MWAIT) or a suspended (SUSP) state (see the Guide to OpenVMS Performance Management).
If your process has change-mode-to-kernel (CMKRNL) privilege, you can invoke SDA to examine the system. Use the following DCL command:
    $ ANALYZE/SYSTEM
    OpenVMS System analyzer

    SDA>
The SDA> prompt indicates that you can use SDA interactively and enter SDA commands. When analyzing a running system, SDA sets its process context to that of the process running SDA.
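A session of the kind described above might look like the following command procedure fragment. It is a sketch only: STALLED_JOB is a hypothetical process name, and the SET PROCESS/PRIVILEGES command succeeds only if CMKRNL is among the privileges authorized for your account.

```dcl
$ SET PROCESS/PRIVILEGES=CMKRNL     ! Requires CMKRNL to be authorized
$ ANALYZE/SYSTEM
SHOW SUMMARY                        ! List processes and their scheduling states
SET PROCESS STALLED_JOB             ! Hypothetical name of the stalled process
SHOW PROCESS/CHANNELS               ! Channels assigned by the selected process
SHOW STACK                          ! Stack of the selected process
EXIT
```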
If you are undertaking an analysis of a running system, take the following considerations into account:
When using SDA to analyze a running system, use caution in interpreting its displays. Because system states change frequently, it is possible that the information SDA displays might be inconsistent with the actual, volatile state of the system at any given moment.
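SDA rejects commands that do not apply to a running system, such as SET CPU and SHOW CPU, with the following error message:

    %SDA-E-CMDNOTVLD, command not valid on the running system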
When invoked to analyze either a crash dump or a running system, SDA establishes a default context from which it interprets certain commands.
When the subject of analysis is a uniprocessor system, SDA's context is solely process context. That is, SDA can interpret its process-specific commands in the context of either the process current on the uniprocessor or some other process in some other scheduling state.
When you initially invoke SDA to analyze a crash dump, its process context defaults to that of the process that was current at the time of the crash. When you invoke SDA to analyze a running system, its process context defaults to that of the current process; that is, the one executing SDA.
You can change SDA's process context by issuing any of the following commands: SET PROCESS name, SET PROCESS/INDEX=n, SHOW PROCESS name, or SHOW PROCESS/INDEX=n.
In a uniprocessor system only one CPU exists, and the concept of SDA CPU context is not an issue. However, for a multiprocessor system with more than one active CPU, SDA must maintain an idea of CPU context to provide a way of displaying information bound to a specific CPU, such as the reason for the bugcheck exception, the currently executing process, the current IPL, the contents of CPU registers, and any owned spin locks. When you first invoke SDA to analyze a crash dump, the "SDA current CPU" is the CPU that induced the system failure.
You can use several SDA commands to change the CPU context. When you change the CPU context, the "SDA current process" is changed to the current process on the "SDA current CPU" to synchronize CPU context and process context. If no current process is on the "SDA current CPU," the "SDA current process" is undefined; no process context information will be available until you set SDA process context to a specific process.
Type HELP PROCESS_CONTEXT for specific information about the "SDA current process."
The following SDA commands change the "SDA current CPU":
Command | Description |
---|---|
SET CPU cpu_id | Changes the "SDA current CPU" to CPU cpu_id |
SHOW CPU cpu_id | Changes the "SDA current CPU" to CPU cpu_id |
SHOW CRASH | Changes the "SDA current CPU" to the CPU that induced the system failure |
If you select a process that is the current process on a CPU, the following commands change the "SDA current CPU" to that CPU: SET PROCESS name, SET PROCESS/INDEX=n, SHOW PROCESS name, and SHOW PROCESS/INDEX=n.
No other SDA commands affect the "SDA current CPU."
When you analyze the running system, you cannot use the SET CPU and SHOW CPU commands because SDA does not have access to all the CPU-specific information about the running system.
In a uniprocessor system, process context might be the process that is current on the CPU or the process in whose context process-specific SDA commands are interpreted. For a multiprocessor system with more than one active CPU, however, the meaning of "SDA process context" changes so that it includes a way to display information relevant to a specific process both when the process is current on a processor and when the process is not.
You can use several SDA commands to change SDA process context. Following is a list of the results of some of these changes:
Type HELP CPU_CONTEXT for specific information about the "SDA current CPU."
The following SDA commands change the "SDA current process":
Command | Description |
---|---|
SET PROCESS name | Changes the "SDA current process" to the named process |
SET PROCESS /INDEX=n | Changes the "SDA current process" to the process with index n |
SHOW PROCESS name | Changes the "SDA current process" to the named process |
SHOW PROCESS /INDEX=n | Changes the "SDA current process" to the process with index n |
The following commands change the SDA process context if the "SDA current process" is not the current process on the selected CPU:
Command | Description |
---|---|
SET CPU cpu_id | Changes the "SDA current process" to the current process on CPU cpu_id |
SHOW CPU cpu_id | Changes the "SDA current process" to the current process on CPU cpu_id |
SHOW CRASH | Changes the "SDA current process" to the current process on the CPU that induced the system failure |
No other SDA commands affect the "SDA current process."
When you analyze the running system, CPU context is not used because all the CPU-specific information might not be available.
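The interaction between CPU context and process context can be followed in a short interactive session against a multiprocessor crash dump. The session below is illustrative only: the CPU ID 1 and the process index 21 are hypothetical values, and the comments restate the behavior described in the tables above rather than actual SDA output.

```dcl
SDA> SHOW CRASH              ! Current CPU becomes the CPU that induced the failure
SDA> SET CPU 1               ! Hypothetical CPU ID; current process follows that CPU
SDA> SHOW STACK              ! Stack in the context just selected
SDA> SET PROCESS/INDEX=21    ! Hypothetical index; changes the current process
SDA> SHOW PROCESS            ! Displays the process now selected
```

If the process selected by SET PROCESS/INDEX is current on some CPU, the "SDA current CPU" changes to that CPU as well.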
When you invoke SDA to analyze a crash dump from a multiprocessing system with more than one active CPU, SDA maintains a second dimension of context, its CPU context, that allows it to display certain processor-specific information, such as the reason for the bugcheck exception, the currently executing process, the current IPL, the contents of processor-specific registers, the interrupt stack pointer (ISP), and the spin locks owned by the processor. When you invoke SDA to analyze a multiprocessor's crash dump, its CPU context defaults to that of the processor that induced the system failure (see footnote 3).
You can change the SDA CPU context by using any of the following commands: SET CPU cpu_id, SHOW CPU cpu_id, or SHOW CRASH.
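Changing CPU context involves an implicit change in process context in either of the following ways: if the newly selected CPU has a current process, SDA's process context becomes that of that process; if it has no current process, SDA's process context is undefined, and no process-specific information is available until you set the process context to a specific process.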
Likewise, changing process context can involve a switch of CPU context as well. For instance, if you issue a SET PROCESS command for a process that is current on another CPU, SDA automatically changes its CPU context to that of the CPU on which that process is current. The following commands can have this effect if the name or index number (nn) refers to a current process: SET PROCESS name, SET PROCESS/INDEX=nn, SHOW PROCESS name, and SHOW PROCESS/INDEX=nn.
3 When you are analyzing a running system, CPU context is not accessible to SDA. Therefore, the SET CPU and SHOW CPU commands are not permitted.