Updated: 11 December 1998
OpenVMS Alpha System Analysis Tools Manual
There are two types of dump files: a physical memory dump (also known as a full dump) and a dump of selected virtual addresses (also known as a selective dump). Both full and selective dumps can be produced in either compressed or uncompressed form. Compressed dumps save disk space and reduce the time taken to write the dump, at the expense of a slight increase in the time SDA takes to access the dump. The SDA commands COPY/COMPRESS and COPY/DECOMPRESS can be used to convert an existing dump.
DUMPSTYLE, which specifies the method of writing system dumps, is a 32-bit mask. Table 2-1 shows how the bits are defined. Each bit can be set independently. The value of the SYSGEN parameter is the sum of the values of the bits that have been set. Remaining or undefined values are reserved to Compaq.
Bit | Value | Description |
---|---|---|
0 | 1 | 0 = Full dump (SYSGEN default). The entire contents of physical memory will be written to the dump file. 1 = Selective dump. The contents of memory will be written to the dump file selectively to maximize the usefulness of the dump file while conserving disk space. |
1 | 2 | 0 = Minimal console output. 1 = Full console output (includes stack dump, register contents, and so on). |
2 | 4 | This bit is ignored on Alpha systems. |
3 | 8 | 0 = Do not compress. 1 = Compress. |
4-31 | | Reserved to Compaq. |
For a physical memory dump, the DUMPSTYLE system parameter can be set to 0, 2, 8, or 10. Each value provides a full dump: the value of 0 yields an uncompressed dump with minimal console output; the value of 2 provides an uncompressed dump with full console output; the value of 8 provides a compressed dump with minimal console output; and the value of 10 provides a compressed dump with full console output. A physical memory dump requires that all physical memory be written to the dump file. This ensures the presence of all the page table pages required for SDA to emulate translation of system virtual addresses. These page table pages include the level 1 page table of the current process, the shared level 2 page table that maps the system page table (SPT), and the level 3 page table pages that constitute the SPT.
In certain system configurations, it may be impossible to preserve the entire contents of memory in a disk file. For instance, a large memory system or a system with small disk capacity may not be able to supply enough disk space for a full memory dump. If the system dump file cannot accommodate all of memory, information essential to determining the cause of the system failure may be lost.
To preserve those portions of memory that contain information most useful in determining the causes of system failures, a system manager sets the value of the DUMPSTYLE system parameter to 1, 3, 9, or 11 to specify a dump of selected virtual address spaces. Each value provides a selective dump: the value of 1 yields an uncompressed dump with minimal console output; the value of 3 provides an uncompressed dump with full console output; the value of 9 provides a compressed dump with minimal console output; and the value of 11 provides a compressed dump with full console output. In a selective dump, related pages of virtual address space are written to the dump file as a unit called a logical memory block (LMB). For example, one LMB consists of the system and global page tables; another is the address space of a particular process. Those LMBs most likely to be useful in crash dump analysis are written first.
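The eight DUMPSTYLE values listed above are all combinations of three independent bits. The following Python fragment is a minimal sketch of that decoding, not part of OpenVMS; the constant and function names are hypothetical and chosen only for illustration.

```python
# Bit weights as described in the text. The names below are illustrative,
# not OpenVMS symbols.
SELECTIVE = 1      # bit 0: 0 = full dump, 1 = selective dump
FULL_CONSOLE = 2   # bit 1: 0 = minimal console output, 1 = full console output
COMPRESS = 8       # bit 3: 0 = do not compress, 1 = compress


def describe_dumpstyle(value):
    """Return a short description of a DUMPSTYLE value."""
    kind = "selective" if value & SELECTIVE else "full"
    form = "compressed" if value & COMPRESS else "uncompressed"
    console = "full" if value & FULL_CONSOLE else "minimal"
    return f"{form} {kind} dump, {console} console output"


if __name__ == "__main__":
    # Walk through the full-dump values, then the selective-dump values.
    for value in (0, 2, 8, 10, 1, 3, 9, 11):
        print(f"DUMPSTYLE={value:2d}: {describe_dumpstyle(value)}")
```

Because the bits are independent, a value such as 11 is simply 1 + 2 + 8: selective, full console output, compressed.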
Table 2-2 compares full and selective style dump files.
Item | Full | Selective |
---|---|---|
Available Information | Complete contents of physical memory in use, stored in order of increasing physical address. | System page table, global page table, system space memory, and process and control regions (plus global pages) for all saved processes. |
Unavailable Information | Contents of paged-out memory at the time of the system failure. | Contents of paged-out memory at the time of the system failure, process and control regions of unsaved processes, L1 page tables, and memory not mapped by a page table. |
SDA Command Limitations | None. | The following commands are not useful for unsaved processes: SHOW PROCESS/CHANNELS, SHOW PROCESS/IMAGE, SHOW PROCESS/RMS, SHOW STACK, and SHOW SUMMARY/IMAGE. |
You can adjust the size of the system page file and dump file using AUTOGEN (the recommended method) or by using SYSGEN.
AUTOGEN automatically calculates the appropriate sizes for page and dump files. AUTOGEN invokes the System Generation utility (SYSGEN) to create or change the files. However, you can control sizes calculated by AUTOGEN by defining symbols in the MODPARAMS.DAT file. The file sizes specified in MODPARAMS.DAT are copied into the PARAMS.DAT file during AUTOGEN's GETDATA phase. AUTOGEN then makes appropriate adjustments in its calculations.
Although Compaq recommends using AUTOGEN to create and modify page and dump file sizes, you can use SYSGEN to directly create and change the sizes of those files.
The sections that follow discuss how you can calculate the size of a dump file.
See the OpenVMS System Manager's Manual for detailed information about using AUTOGEN and
SYSGEN to create and modify page and dump file sizes.
2.2.1.3 Writing to the System Dump File
OpenVMS Alpha writes the contents of the error-log buffers, processor registers, and memory into the system dump file, overwriting its previous contents. If the system dump file is too small, OpenVMS Alpha cannot copy all memory to the file when a system failure occurs.
SYS$SYSTEM:SYSDUMP.DMP (SYS$SPECIFIC:[SYSEXE]SYSDUMP.DMP) is furnished as an empty file in the OpenVMS Alpha software distribution kit. To successfully store a crash dump, SYS$SYSTEM:SYSDUMP.DMP must be enlarged to hold all of the page tables required for SDA to emulate system virtual address translation.
To calculate the correct size for a physical memory dump to SYS$SYSTEM:SYSDUMP.DMP, use the following formula:
size-in-blocks(SYS$SYSTEM:SYSDUMP.DMP)
    = size-in-pages(physical-memory) * blocks-per-page
    + number-of-error-log-buffers * blocks-per-buffer
    + 2
Use the DCL command SHOW MEMORY to determine the total size of physical
memory on your system. There is a variable number of error log buffers
in any given system, depending on the setting of the ERRORLOGBUFFERS
system parameter. The size of each buffer depends on the setting of the
ERLBUFFERPAGES parameter. (See the OpenVMS System Manager's Manual for additional
information about these parameters.)
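As a worked illustration of the formula for SYS$SYSTEM:SYSDUMP.DMP, the Python sketch below evaluates it for one hypothetical configuration. The function name and all input values are assumptions made for the example (8192 pages of memory, 16 blocks per page, and 4 error log buffers of 2 blocks each); substitute the figures reported for your own system.

```python
def sysdump_blocks(memory_pages, blocks_per_page,
                   errorlog_buffers, blocks_per_buffer):
    """Size in blocks of a dump file that can hold a physical memory dump.

    Follows the formula in the text: memory, plus error log buffers,
    plus 2 additional blocks.
    """
    return (memory_pages * blocks_per_page
            + errorlog_buffers * blocks_per_buffer
            + 2)


if __name__ == "__main__":
    # Hypothetical system: all four inputs are example values only.
    size = sysdump_blocks(memory_pages=8192, blocks_per_page=16,
                          errorlog_buffers=4, blocks_per_buffer=2)
    print(size)  # 131082
```

Here 8192 * 16 = 131072 blocks cover physical memory, 4 * 2 = 8 blocks cover the error log buffers, and the final 2 blocks complete the total of 131082.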
2.2.1.4 Writing to the Dump File off the System Disk
OpenVMS Alpha allows you to write the system dump file to a device other than the system disk. This is useful in large memory systems and in clusters with common system disks, where a single disk does not always have enough space to meet customer dump file requirements. To do this, the DUMPSTYLE system parameter must be set correctly so that the bugcheck code can write the system dump file to an alternative device.
The requirements for writing the system dump file off the system disk are the following:
DUMPFILE_DEVICE = $nnn$ddcuuuu
For information on how to write the system dump file to an alternative
device to the system disk, see the OpenVMS System Manager's Manual: Tuning, Monitoring, and Complex Systems.
2.2.1.5 Writing to the System Page File
If SYS$SYSTEM:SYSDUMP.DMP does not exist, the operating system writes the dump of physical memory into SYS$SYSTEM:PAGEFILE.SYS, the primary system page file, overwriting the contents of that file.
If the SAVEDUMP system parameter is set, the dump file is retained in PAGEFILE.SYS when the system is booted after a system failure. If the SAVEDUMP parameter is not set (clear), which is the default, OpenVMS Alpha uses the entire page file for paging and any dump written to the page file is lost. (To examine or change the value of the SAVEDUMP parameter, consult the OpenVMS System Manager's Manual: Tuning, Monitoring, and Complex Systems.)
To calculate the minimum size for a physical memory dump to SYS$SYSTEM:PAGEFILE.SYS, use the following formula:
size-in-blocks(SYS$SYSTEM:PAGEFILE.SYS)
    = size-in-pages(physical-memory) * blocks-per-page
    + number-of-error-log-buffers * blocks-per-buffer
    + 2
    + value of the system parameter RSRVPAGCNT
Note that this formula calculates the minimum size requirement for saving a physical dump in the system's page file. Compaq recommends that the page file be a bit larger than this minimum to avoid hanging the system. Also note that you can only write the dump of physical memory into the primary page file (SYS$SYSTEM:PAGEFILE.SYS). Secondary page files cannot be used to save dump file information.
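The page file formula differs from the dump file formula only by the RSRVPAGCNT term. The Python sketch below evaluates it for a hypothetical configuration; the function name and every input value, including the RSRVPAGCNT figure of 1024, are assumptions chosen for illustration. The result is a minimum, so a real page file should be sized somewhat larger.

```python
def pagefile_min_blocks(memory_pages, blocks_per_page,
                        errorlog_buffers, blocks_per_buffer, rsrvpagcnt):
    """Minimum size in blocks of a primary page file that can hold
    a physical memory dump, following the formula in the text."""
    dump_blocks = (memory_pages * blocks_per_page
                   + errorlog_buffers * blocks_per_buffer
                   + 2)
    # The page file formula adds the value of RSRVPAGCNT to the dump size.
    return dump_blocks + rsrvpagcnt


if __name__ == "__main__":
    # Hypothetical system: all inputs are example values only.
    minimum = pagefile_min_blocks(memory_pages=8192, blocks_per_page=16,
                                  errorlog_buffers=4, blocks_per_buffer=2,
                                  rsrvpagcnt=1024)
    print(minimum)  # 132106
```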
Using a selective dump style (DUMPSTYLE=1) with PAGEFILE.SYS is not recommended. If PAGEFILE.SYS is used for a selective dump and is not large enough to contain all the logical memory blocks, the dump fills the entire page file and the system may hang on reboot. When selective dumping is set up, all available space is used to write out the logical memory blocks. If the page file is large enough to contain all of physical memory, there is no reason to use selective dumping; use a full memory dump (DUMPSTYLE=0) instead.
Writing crash dumps to SYS$SYSTEM:PAGEFILE.SYS presumes that you will later free the space occupied by the dump for use by the pager. Otherwise, your system may hang during the startup procedure. To free this space, you can do one of the following:
Every time the operating system writes information to the system dump file, it writes over whatever was previously stored in the file. The system writes information to the dump file whenever the system fails or is shut down. For this reason, the system manager must save the contents of the file after a system failure has occurred.
The system manager can use the SDA COPY command or the DCL COPY command. Either command can be used in a site-specific startup procedure, but the SDA COPY command is preferred because it marks the dump file as copied. As mentioned earlier, this is particularly important if the dump was written into the page file, SYS$SYSTEM:PAGEFILE.SYS, because it releases those pages occupied by the dump to the pager. Another advantage of using the SDA COPY command is that this command copies only the saved number of blocks and not necessarily the whole allotted dump file. For instance, if the size of the SYSDUMP.DMP file is 100,000 blocks and the bugcheck wrote only 60,000 blocks to the dump file, then DCL COPY would create a file of 100,000 blocks. However, SDA COPY would generate a file of only 60,000 blocks.
Because system dump files are set to NOBACKUP, the Backup utility (BACKUP) does not copy them to tape unless you use the qualifier /IGNORE=NOBACKUP when invoking BACKUP. When you use the SDA COPY command to copy the system dump file to another file, OpenVMS Alpha does not set the new file to NOBACKUP.
As shipped by Compaq, the file SYS$SYSTEM:SYSDUMP.DMP is protected
against world access. Because a dump file can contain privileged
information, Compaq recommends that the system manager not change this
default protection.
2.2.3 Invoking SDA When Rebooting the System
When the system reboots after a system failure, SDA is automatically invoked by default. SDA archives information from the dump in a history file. In addition, a listing file with more detailed information about the system failure is created in the directory pointed to by the logical name CLUE$COLLECT. (Note that the default directory is SYS$ERRORLOG unless you redefine the logical name CLUE$COLLECT in the procedure SYS$MANAGER:SYLOGICALS.COM.) The file name is in the form CLUE$node_ddmmyy_hhmm.LIS where the timestamp (hhmm) corresponds to the system failure time and not the time when the file was created.
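The listing file name pattern CLUE$node_ddmmyy_hhmm.LIS can be illustrated with a short Python sketch. This is only a model of the naming convention described above, not code used by SDA or CLUE; the function name and the node name MYNODE are hypothetical.

```python
from datetime import datetime


def clue_listing_name(node, failure_time):
    """Build a name of the form CLUE$node_ddmmyy_hhmm.LIS.

    failure_time is the time of the system failure, not the time
    at which the listing file is created.
    """
    stamp = failure_time.strftime("%d%m%y_%H%M")
    return f"CLUE${node}_{stamp}.LIS"


if __name__ == "__main__":
    # Hypothetical node name and failure time.
    print(clue_listing_name("MYNODE", datetime(1993, 3, 27, 11, 22, 33)))
    # CLUE$MYNODE_270393_1122.LIS
```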
Directed by commands in a site-specific file, SDA can take additional steps to record information about the system failure. They include the following:
The following example shows SDA commands that can make up your site-specific command file to produce a more complete SDA listing after each system failure, and to save a copy of the dump file:
!
! SDA command file, to be executed as part of the system
! bootstrap from within CLUE. Commands in this file can
! be used to save the dump file after a system bugcheck, and
! to execute any additional SDA commands.
!
! Note that the logical name DMP$ must have been defined
! within SYS$MANAGER:SYLOGICALS.COM
!
READ/EXEC               ! read in the executive images' symbol tables
COPY DMP$:SAVEDUMP.DMP  ! copy and save dump file
SHOW STACK              ! display the stack
!
The SDA commands in this site-specific command file are executed first and then the CLUE HISTORY command is executed by default. See the reference section on CLUE HISTORY for details on the summary information that is generated and stored in the CLUE list file by the CLUE HISTORY command.
To point to your site-specific file, add a line such as the following to the file SYS$MANAGER:SYLOGICALS.COM:
$ DEFINE/SYSTEM CLUE$SITE_PROC SYS$MANAGER:SAVEDUMP.COM
In this example, the site-specific file is named SAVEDUMP.COM.
The CLUE list file can be printed immediately or saved for later examination.
SDA is invoked and executes the specified commands only when the system boots immediately after a system failure. If the system is booting for any other reason (such as a normal system shutdown and reboot), SDA exits.
If CLUE files occupy more space than the threshold allows (the default is 5000 blocks), the oldest files will be deleted until the threshold limit is reached. The threshold limit can be customized with the CLUE$MAX_BLOCK logical name.
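The threshold rule can be modeled as "delete the oldest file, then recheck the total". The Python sketch below shows that logic on an in-memory list; it is an assumption about how the rule behaves based only on the description above, and the function name, tuple layout, and sample files are hypothetical.

```python
def prune_clue_files(files, max_blocks=5000):
    """Model the CLUE threshold rule.

    files is a list of (name, created, blocks) tuples, where 'created'
    is any sortable timestamp. Returns (kept, deleted), with the oldest
    files deleted first until the total is within max_blocks.
    """
    kept = sorted(files, key=lambda f: f[1])  # oldest first
    deleted = []
    total = sum(f[2] for f in kept)
    while kept and total > max_blocks:
        oldest = kept.pop(0)
        total -= oldest[2]
        deleted.append(oldest)
    return kept, deleted


if __name__ == "__main__":
    # Hypothetical listing files: 7000 blocks in total, over the default limit.
    sample = [("A.LIS", 1, 3000), ("B.LIS", 2, 2500), ("C.LIS", 3, 1500)]
    kept, deleted = prune_clue_files(sample)
    print([f[0] for f in kept])     # ['B.LIS', 'C.LIS']
    print([f[0] for f in deleted])  # ['A.LIS']
```

Passing a different `max_blocks` value plays the role that the CLUE$MAX_BLOCK logical name plays in customizing the limit.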
To prevent CLUE from running at system startup, define the logical
name CLUE$INHIBIT as TRUE in the system logical name table (/SYSTEM) in the SYLOGICALS.COM file.
2.3 Analyzing a System Dump
SDA performs certain tasks before bringing a dump into memory, presenting its initial displays, and accepting command input. These tasks include the following:
For detailed information on investigating system failures, see
Section 2.7.
2.3.1 Requirements
To analyze a dump file, your process must have read access both to the
file that contains the dump and to copies of
SDA$READ_DIR:SYS$BASE_IMAGE.EXE and SDA$READ_DIR:REQSYSDEF.STB (the
required subset of the symbols in the file SYSDEF.STB). SDA reads these
tables by default.
2.3.2 Invoking SDA
If your process can access the files listed in Section 2.3.1, you can issue the DCL command ANALYZE/CRASH_DUMP to invoke SDA. If you do not specify the name of a dump file in the command, SDA prompts you:
$ ANALYZE/CRASH_DUMP
_Dump File:
The default file specification is as follows:
SYS$DISK and [default-dir] represent the disk and directory specified in your last SET DEFAULT command.
If you are rebooting after a system failure, SDA is automatically
invoked. See Section 2.2.3.
2.3.3 Mapping the Contents of the Dump File
SDA first attempts to map the contents of physical memory as stored in the specified dump file. To do this, it must first locate the system page table (SPT) among its contents. The SPT contains one entry for each page of system virtual address space.
%SDA-E-SPTNOTFND, system page table not found in dump file
%SDA-W-SHORTDUMP, the dump only contains m out of n blocks of physical memory
Under certain conditions, some memory locations might not be saved in the system dump file. Additionally, if a bugcheck occurs during system initialization, the contents of the register display may be unreliable. The symptom of such a bugcheck is a SHOW SUMMARY display that shows no processes or only the swapper process.
If you use an SDA command to access a virtual address that has no corresponding physical address, SDA generates the following error message:
%SDA-E-NOTINPHYS, 'location': virtual data not in physical memory
When analyzing a selective dump file, if you use an SDA command to access a virtual address that has a corresponding physical address not saved in the dump file, SDA generates the following error message:
%SDA-E-MEMNOTSVD, memory not saved in the dump file
After locating and reading the system dump file, SDA attempts to read the system symbol table file into the SDA symbol table. If SDA cannot find SDA$READ_DIR:SYS$BASE_IMAGE.EXE (or is given a file that is not a system symbol table in the /SYMBOL qualifier to the ANALYZE command), it displays a fatal error and exits. SDA also reads into its symbol table a subset of SDA$READ_DIR:SYSDEF.STB, called SDA$READ_DIR:REQSYSDEF.STB. This subset provides SDA with the information needed to access some of the data structures in the dump.
When SDA finishes building its symbol table, SDA displays a message identifying itself and the immediate cause of the system failure. In the following example, the cause of the system failure was the deallocation of a bad page file address.
OpenVMS Alpha System Dump Analyzer
Dump taken on 27-MAR-1993 11:22:33.92
BADPAGFILD, Bad page file address deallocated
Copyright © Compaq Computer Corporation 1998. All rights reserved.