Updated: 11 December 1998
OpenVMS VAX System Dump Analyzer Utility Manual
The MOVB instruction is part of a routine that reads characters from a buffer and writes them to the printer. The routine contains the loop of instructions that starts at the label 20$ and ends at 25$. This loop executes once for each character in the buffer, performing these steps:
Steps 1 and 2 are repeated until the contents of R1 are 0 or the printer signals that it is not ready.
If the printer signals that it is not ready, the driver transfers control to 30$ (line 598), the beginning of a routine that waits for an interrupt from the printer. When the printer becomes ready, it interrupts the driver and execution of the loop resumes.
Examine the code to determine which variables control the loop.
The byte count (BCNT) is the number of characters in the buffer. Note that BCNT is set by a function decision table (FDT) routine and that this routine sets the value of BCNT to the number of characters in the buffer. In line 586, the starting address of a buffer that is BCNT bytes in size is moved into R3.
Note also that the number of characters left to be printed is represented by the byte offset (BOFF), the offset into the buffer at which the driver finds the next character to be printed. This value controls the number of times the loop is executed.
Because the exception is an access violation, either R3 or R0 must contain an incorrect value. You can determine that R0 is probably valid by the following logic:
Thus, the contents of R3 seem to be the cause of the failure.
The most likely reason that the contents of R3 are wrong is that the
MOVB instruction at line 599 executes too many times. You can check
this by comparing the contents of UCB$W_BOFF and UCB$W_BCNT. If
UCB$W_BOFF contains a larger value than that in UCB$W_BCNT, then R3
contains a value that is too large, indicating that the MOVB
instruction has incremented the contents of R3 too many times.
9.4.2 Checking the Values of Key Variables
Because the start-I/O routine requires that R5 contain the address of the printer's UCB, and because several other instructions reference R5 without error before any instruction in the loop does, you can assume that R5 contains the address of the right UCB. To compare BOFF and BCNT, use the command FORMAT @R5 to display the contents of the UCB, as shown in the following session.
    SDA> READ SYS$SYSTEM:SYSDEF.STB
    SDA> FORMAT @R5

    8005D160   UCB$L_FQFL        800039A8
               UCB$L_RQFL
               UCB$W_MB_SEED
               UCB$W_UNIT_SEED
    8005D164   UCB$L_FQBL        800039A8
               UCB$L_RQBL
    8005D168   UCB$W_SIZE        0122
    8005D16A   UCB$B_TYPE        10
    8005D16B   UCB$B_FIPL        34
               UCB$B_FLCK
       .
       .
       .
    8005D1C8   UCB$L_SVAPTE      80062720
    8005D1CC   UCB$W_BOFF        0795
    8005D1CE   UCB$W_BCNT        006D
    8005D1D0   UCB$B_ERTCNT      00
    8005D1D1   UCB$B_ERTMAX      00
    8005D1D2   UCB$W_ERRCNT      0000
       .
       .
       .
    SDA>
If you have only one printer in your system configuration, you do not need to use the FORMAT command. Instead, you can use the command SHOW DEVICE LP. Because only one printer is connected to the processor, only one UCB is associated with a printer for SDA to display.
The output produced by the FORMAT @R5 command shows that UCB$W_BOFF contains a value greater than that in UCB$W_BCNT; it should be smaller. Therefore, the value stored in BOFF is incorrect.
Thus, the value of BOFF is not the number of characters that remain in
the buffer. This value is used in calculating an address that is
referenced at an elevated IPL. When this address is within a null page
(unreadable in all access modes), an attempt to reference it causes the
system to fail.
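The comparison of the two UCB words can also be sketched in a few lines of Python. This is an illustration only and not part of SDA: the helper name `boff_exceeds_bcnt` is hypothetical, and the two hexadecimal values are copied from the FORMAT @R5 display.

```python
# Hypothetical helper (not an SDA command): compares the two UCB words by hand.
def boff_exceeds_bcnt(boff_hex: str, bcnt_hex: str) -> bool:
    """Return True when UCB$W_BOFF holds a larger value than UCB$W_BCNT."""
    boff = int(boff_hex, 16)  # SDA displays these words in hexadecimal
    bcnt = int(bcnt_hex, 16)
    return boff > bcnt


# Values copied from the FORMAT @R5 display.
UCB_W_BOFF = "0795"
UCB_W_BCNT = "006D"

if __name__ == "__main__":
    print("BOFF =", int(UCB_W_BOFF, 16), "BCNT =", int(UCB_W_BCNT, 16))
    print("BOFF larger than BCNT:", boff_exceeds_bcnt(UCB_W_BOFF, UCB_W_BCNT))
```

In decimal, BOFF is 1941 and BCNT is 109, so the check reports that BOFF is the larger value, which matches the conclusion drawn from the display.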
9.4.3 Identifying and Correcting the Defective Code
Examine the printer driver code to locate all instructions that modify UCB$W_BOFF. The value changes in two circumstances:
When the printer times out, the driver should not modify UCB$W_BOFF. It does so, however, in line 631. The driver should modify the contents of UCB$W_BOFF only when it is certain that the printer printed the character. When the printer times out, this is not the case. Furthermore, the wait-for-interrupt routine preserves only registers R3, R4, and R5, so that only those registers can be used unmodified after the execution of the wait-for-interrupt routine. Thus, the use of R1 in line 631 is an error.
To correct the problem, change the WFIKPCH argument (line 616) so that, when the printer times out, the WFIKPCH macro transfers control to 50$ rather than to 40$.
    607
    608 30$:    BNEQ    40$                       ;If NEQ paper problem
    609         ADDW3   #1,R1,UCB$W_BOFF(R5)      ;Save number of characters remaining
    610         DEVICELOCK -
    611                 LOCKADDR=UCB$L_DLCK(R5),- ;Lock device interrupts
    612                 SAVIPL=-(SP)              ;Save current IPL
    613         BITW    #^X80,LP_CSR(R4)          ;Is it ready now?
    614         BNEQ    35$                       ;If NEQ, yes, it's ready
    615         BISB    #^X40,LP_CSR(R4)          ;Set interrupt enable
    616         WFIKPCH 40$,#12                   ;Wait for ready interrupt
    617         IOFORK                            ;Create a fork process
    618         BRB     10$                       ; ...and start next output
    619
    620 35$:
    621         DEVICEUNLOCK -
    622                 LOCKADDR=UCB$L_DLCK(R5),- ;Unlock device interrupts
    623                 NEWIPL=(SP)+              ;Restore IPL
    624         CLRW    LP_CSR(R4)                ;Disable device interrupts
    625         BRB     10$                       ;Go transfer more characters
    626 ;
    627 ; PRINTER HAS PAPER PROBLEM
    628 ;
    629
    630 40$:    CLRL    UCB$L_LP_OFLCNT(R5)       ;Clear offline counter
    631         ADDW3   #1,R1,UCB$W_BOFF(R5)      ;Save number of characters remaining
    632 50$:    CLRW    LP_CSR(R4)                ;Disable printer interrupt
    633         IOFORK                            ;Lower to fork level
    634         BBS     #UCB$V_CANCEL,UCB$W_STS(R5),80$ ;If set, cancel I/O operation
    635         TSTW    LP_CSR(R4)                ;Printer still have paper problem?
    636         BLSS    55$                       ;If LSS yes
    637         MOVL    #15,UCB$L_LP_TIMEOUT(R5)  ;Set timeout value
    638         BRB     10$                       ; ...and start next output
If the operating system is not performing well and you want to create a dump you can examine, you must induce a system failure. Occasionally, a device driver or other user-written, kernel-mode code can cause the system to execute a loop of code at a high priority, interfering with normal system operation. This can occur even though you have set a breakpoint in the code if the loop is encountered before the breakpoint. To gain control of the system in such circumstances, you must cause the system to fail and then reboot it.
If the system has suspended all noticeable activity (if it is "hung"), see the examples of causing system failures in Section 10.2.
If you are generating a system crash in response to a system hang, be
sure to record the PC at the time of the system halt as well as the
contents of the general registers. Submit this information to Digital,
along with the Software Performance Report (SPR) and a copy of the
generated system dump file.
10.1 Meeting Crash Dump Requirements
The following requirements must be met before the system can write a complete crash dump:
The following examples show the sequence of console commands needed to cause a system failure on each type of processor. In each instance, after halting the processor and examining its registers, you place the equivalent of -1 (for example, FFFFFFFF hexadecimal) into the PC. The value placed in the PSL sets the processor access mode to kernel and the IPL to 31. After these commands are executed, an INVEXCEPTN bugcheck is reported on the console terminal, followed by a listing of the contents of the processor registers.
The console volume of most processors contains a command file named either CRASH.COM or CRASH.CMD, which you can execute to perform these commands. Note that the console sessions recorded in this section omit much of the information the console displays in response to the listed commands.
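To see why the deposited values have the stated effect, the following Python sketch decodes a PSL value and shows the 32-bit form of -1. It is an illustration only. The field positions (IPL in bits 20:16, current access mode in bits 25:24, interrupt-stack flag in bit 26) are an assumption taken from the standard VAX PSL layout and are not stated in this manual, and the names `decode_psl` and `as_longword` are hypothetical.

```python
# Assumed VAX PSL layout (not stated in this manual):
#   bits 20:16 = IPL, bits 25:24 = current access mode, bit 26 = interrupt stack
MODES = ("kernel", "executive", "supervisor", "user")


def decode_psl(psl: int) -> dict:
    """Split a PSL longword into the fields relevant to forcing a crash."""
    return {
        "ipl": (psl >> 16) & 0x1F,
        "current_mode": MODES[(psl >> 24) & 0x3],
        "interrupt_stack": bool((psl >> 26) & 1),
    }


def as_longword(value: int) -> str:
    """Render a signed integer as a 32-bit two's-complement hexadecimal string."""
    return format(value & 0xFFFFFFFF, "08X")


if __name__ == "__main__":
    print(as_longword(-1))         # the value deposited into the PC
    print(decode_psl(0x041F0000))  # PSL value used on most processors
    print(decode_psl(0x001F0000))  # PSL value used on the VAX 8600/8650 and VAX-11/730
```

Under these assumptions, both PSL values decode to kernel mode at IPL 31, and 41F0000 additionally selects the interrupt stack, which agrees with the comments in the CRASH command procedures.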
VAX 85x0/8700/88x0
The following series of console commands causes a system failure on the VAX 85x0/8700/88x0 systems. (Note that the console prompt for the VAX 8810, 8820, and 8830 systems is PS-CIO-0> and not >>>.)
    $ [Ctrl/P]
    >>> SET CPU CURRENT_PRIMARY
    >>> HALT
    ?00 Left CPU -- CPU halted
    PC = 8001911C
    >>> @CRASH
    !
    ! Command procedure to force bugcheck via access violation
    !
    SET VERIFY
    SET CPU CURRENT_PRIMARY       !Select primary
    EXAMINE PSL                   !Display PSL
    M 00000000 00420008
    EXAMINE/I/NEXT 4 0
       .
       .
       .
    DEPOSIT PC FFFFFFFF           !Set PC=-1 to force ACCVIO
    DEPOSIT PSL 41F0000           !Set IPL=31, interrupt stack
    CONTINUE                      !Execute from PC=-1
VAX 82x0/83x0, VAXstation 3520/3540, 6000 Series, and 9000 Series
The following console commands cause a system failure on a VAX 82x0/83x0 system, a VAXstation 3520/3540 system, a VAX 6000 series system, or a VAX 9000 series system.
    $ [Ctrl/P]
    PC = 80008B1F
    >>> E P
    >>> E/I 0
    >>> E/I +
    >>> E/I +
    >>> E/I +
    >>> E/I +
    >>> D/G F FFFFFFFF
    >>> D P 41F0000
    >>> C
VAX 8600/8650
The following console commands cause a system failure on the VAX 8600/8650 systems.
    $ [Ctrl/P]
    >>> @CRASH
    SET QUIET OFF                 !Make clearer
    SET ABORT OFF                 !Don't abort on E/VIR command
    HALT
    CPU stopped, INVOKED BY CONSOLE (CSM code 11)
    PC 80008B1F
    UNJAM                         !Clear the way
    E PSL                         !Display PSL
    U PSL 00000000
    E/I/N:4 0                     !Display stack pointers
       .
       .
       .
    E SP                          !Get current stack pointers
    G 0E 80000C40
    E/vir/next:40 @               !Dump top of stack
       .
       .
       .
    D PC FFFFFFFF                 !Invalidate the PC
    D PSL 1F0000                  !Kernel mode, IPL 31
    SET ABORT ON                  !Restore abort flag
    SET QUIET ON                  !Shut output off
    CONTINUE                      !Force a machine check
VAX-11/780 and VAX-11/785
The following console commands cause a system failure on the VAX-11/780 and VAX-11/785 processors.
    $ [Ctrl/P]
    >>> @CRASH
    HALT                          !Halt system, examine PC,
    HALTED AT 80008A89
    EXAMINE PSL                   !PSL,
    00000000
    EXAMINE/INTERN/NEXT:4 0       !and all stack pointers
    DEPOSIT PC = -1               !Invalidate PC
    DEPOSIT PSL = 41F0000         !Kernel mode, IPL 31
    CONTINUE
VAX-11/750
The following code causes a system failure on a VAX-11/750. On this processor, the HALT command is a NOP; a Ctrl/P automatically halts the processor.
    $ [Ctrl/P]
    >>> H
    >>> E P
    >>> E/I 0
    >>> E/I +
    >>> E/I +
    >>> E/I +
    >>> E/I +
    >>> D/G F FFFFFFFF
    >>> D P 41F0000
    >>> C
MicroVAX 3400/3600/3900 Series, VAXstation/MicroVAX 3100, VAXstation/MicroVAX 2000, MicroVAX II, and VAX 4000 Series
To force a crash of a MicroVAX, you must first halt the processor. (After you halt the processor, press the HALT button again so that it is popped out and is not illuminated.) Then, issue the following console commands:
    >>> E PSL
    >>> E/I/N:4 0
    >>> D PC FFFFFFFF
    >>> D PSL 41F0000
    >>> C
VAX-11/730
The following console commands cause a system failure on a VAX-11/730. Ctrl/P automatically halts the processor.
    $ [Ctrl/P]
    >>> H
    >>> E PSL
    >>> E/I/N:4 0
    >>> D PC FFFFFFFF
    >>> D PSL 1F0000
    >>> C
The System Dump Analyzer is a utility that you can use to help determine the causes of system failures. This utility is also useful for examining the running system.
ANALYZE {/CRASH_DUMP [/RELEASE] filespec | /SYSTEM} [/SYMBOL=system-symbol-table]
Usage Summary
filespec
Name of the file that contains the dump you want to analyze. At least one field of the filespec is required, and it can be any field. The default filespec is the highest version of SYSDUMP.DMP in your default directory.
The following table summarizes how to perform key SDA operations.
The following qualifiers, described in this section, determine whether the object of an SDA session is a crash dump or a running system. They also help create the environment of an SDA session. Table SDA-10 briefly describes the SDA qualifiers.
Qualifier | Description |
---|---|
/CRASH_DUMP | Invokes SDA to analyze a specified dump file |
/RELEASE | Invokes SDA to release those blocks that are occupied by a crash dump in a specified system paging file |
/SYMBOL | Specifies a system symbol table for SDA to use in place of the system symbol table it uses by default (SYS$SYSTEM:SYS.STB) |
/SYSTEM | Invokes SDA to analyze a running system |
Invokes SDA to analyze the specified dump file.
/CRASH_DUMP filespec
filespec
Name of the crash dump file to be analyzed. The default file specification is:
- SYS$DISK:[default-dir]SYSDUMP.DMP

SYS$DISK and [default-dir] represent the disk and directory specified in your last SET DEFAULT command. If you do not specify filespec, SDA prompts you for it.
See Section 2 for additional information on crash dump analysis.
#1

    $ ANALYZE/CRASH_DUMP SYS$SYSTEM:SYSDUMP.DMP
    $ ANALYZE/CRASH SYS$SYSTEM
These commands invoke SDA to analyze the crash dump stored in SYS$SYSTEM:SYSDUMP.DMP.
#2

    $ ANALYZE/CRASH SYS$SYSTEM:PAGEFILE.SYS
This command invokes SDA to analyze a crash dump stored in the system paging file.
Invokes SDA to release those blocks in the specified system paging file occupied by a crash dump.
/RELEASE filespec
filespec
Name of the system page file (SYS$SYSTEM:PAGEFILE.SYS). The default file specification is:
- SYS$DISK:[default-dir]SYSDUMP.DMP

SYS$DISK and [default-dir] represent the disk and directory specified in your last SET DEFAULT command. If you do not specify filespec, SDA prompts you for it.
You use the /RELEASE qualifier to release from the system paging file those blocks occupied by a crash dump. When invoked with the /RELEASE qualifier, SDA immediately deletes the dump from the paging file and allows no opportunity to analyze its contents.

When you specify the /RELEASE qualifier in the ANALYZE command, you must also do the following:
- Use the /CRASH_DUMP qualifier.
- Include the name of the system paging file (SYS$SYSTEM:PAGEFILE.SYS) as the filespec.
If you do not specify the system paging file or the specified paging file does not contain a dump, SDA generates the following messages:
    %SDA-E-BLKSNRLSD, no dump blocks in page file to release, or not page file
    %SDA-E-NOTPAGFIL, specified file is not the page file
    $ ANALYZE/CRASH_DUMP/RELEASE SYS$SYSTEM:PAGEFILE.SYS
This command invokes SDA to release to the paging file those blocks in SYS$SYSTEM:PAGEFILE.SYS occupied by a crash dump.
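The qualifier rules can be summarized in a short Python sketch that assembles an ANALYZE command line and rejects the combinations this section rules out. It is an illustration only: the function `build_analyze_command` is hypothetical and is not part of SDA or DCL, and the rule that /SYSTEM takes no filespec is a reading of the format line rather than something the text states outright.

```python
from typing import Optional


# Hypothetical helper: assembles an ANALYZE command string for illustration.
def build_analyze_command(*, crash_dump: bool = False, release: bool = False,
                          system: bool = False,
                          filespec: Optional[str] = None,
                          symbol: Optional[str] = None) -> str:
    if crash_dump == system:
        raise ValueError("choose exactly one of /CRASH_DUMP or /SYSTEM")
    if release and not crash_dump:
        raise ValueError("/RELEASE must be used with /CRASH_DUMP")
    if release and not filespec:
        raise ValueError("/RELEASE needs the system paging file as filespec")
    if system and filespec:
        # Assumption: the format line shows filespec only with /CRASH_DUMP.
        raise ValueError("/SYSTEM takes no filespec")

    command = "ANALYZE" + ("/CRASH_DUMP" if crash_dump else "/SYSTEM")
    if release:
        command += "/RELEASE"
    if symbol:
        command += "/SYMBOL=" + symbol
    if filespec:
        command += " " + filespec
    return command


if __name__ == "__main__":
    print(build_analyze_command(crash_dump=True, release=True,
                                filespec="SYS$SYSTEM:PAGEFILE.SYS"))
```

The helper checks only the combination of qualifiers. It cannot tell whether the named file really is the system paging file or whether that file holds a dump; SDA itself reports those conditions with the messages shown earlier.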
Specifies a system symbol table for SDA to use in place of the system symbol table it uses by default (SYS$SYSTEM:SYS.STB).
/SYMBOL=system-symbol-table
system-symbol-table
File specification of the SDA system symbol table needed to define symbols required by SDA to analyze a dump from a particular system. The specified system-symbol-table must contain those symbols required by SDA to find certain locations in the executive image.

If you do not specify the /SYMBOL qualifier, SDA uses SYS$SYSTEM:SYS.STB by default. When you do specify the /SYMBOL qualifier, SDA assumes the default disk and directory to be SYS$DISK:, that is, the disk and directory specified in your last SET DEFAULT command. If the file named in the /SYMBOL qualifier is not a system symbol table, SDA halts with a fatal error.
The /SYMBOL qualifier allows you to specify a system symbol table, other than SYS$SYSTEM:SYS.STB, to load into the SDA symbol table. This might be necessary, for instance, to analyze a crash dump taken on a processor running a different version of OpenVMS.

You can use the /SYMBOL qualifier whether you are analyzing a system dump or a running system.
    $ ANALYZE/CRASH_DUMP/SYMBOL=SYS$CRASH:SYS.STB SYS$SYSTEM
This command invokes SDA to analyze the crash dump stored in SYS$SYSTEM:SYSDUMP.DMP, using the system symbol table at SYS$CRASH:SYS.STB.
Copyright © Compaq Computer Corporation 1998. All rights reserved.