Updated: 11 December 1998
OpenVMS Performance Management
This chapter describes corrective procedures for each of the various categories of resource limitations described in Chapter 5.
Wherever the corrective procedure suggests changing the value of one or more system parameters, the description explains briefly whether the parameter should be increased, decreased, or given a specific value. Relationships between parameters are identified and explained, if necessary. However, to avoid duplicating information available in the OpenVMS System Management Utilities Reference Manual: M--Z, complete explanations of parameters are not included.
You should review descriptions of system parameters, as necessary,
before changing the parameters.
10.1 Changing System Parameters
Before you make any changes to your system parameters, make a copy of the existing version of the file that is in the SYSGEN work area, using a technique such as the following:
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> WRITE SYS$SYSTEM:file-spec
SYSGEN> EXIT
You may want to use a date as part of the file name you specify for file-spec to readily identify the file later.
By creating a copy of the current values, you can always return to those values at some later time. Generally you use the following technique, specifying your parameter file as file-spec:
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE SYS$SYSTEM:file-spec
SYSGEN> WRITE ACTIVE
SYSGEN> EXIT
However, if some of the parameters you changed were not dynamic, to restore them from the copied file, you must instead use the SYSGEN command WRITE CURRENT, and then reboot the system.
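The nondynamic case can be sketched as follows. This is a minimal illustration that assumes the saved file is the one you wrote earlier as file-spec and that you shut down with the standard SYS$SYSTEM:SHUTDOWN procedure; adapt it to your site's shutdown practice.

```dcl
$ ! Load the saved values into the SYSGEN work area, write them to the
$ ! parameter file that is read at the next boot, then reboot.
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE SYS$SYSTEM:file-spec
SYSGEN> WRITE CURRENT
SYSGEN> EXIT
$ ! Orderly shutdown; answer YES when asked about an automatic reboot.
$ @SYS$SYSTEM:SHUTDOWN
```

The restored values take effect only after the system has rebooted.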
Do not directly modify system parameters using SYSGEN. AUTOGEN overrides system parameters set with SYSGEN, which can cause a setting to be lost months or years after it was made.
You should change only a few parameters at a time.
Whenever your changes are unsuccessful, make it a practice to restore the parameters to their previous values before you continue tuning. Otherwise, it can be difficult to determine which changes produce currently observed effects.
If you are planning to change a system parameter and you are uncertain of the ultimate target value or of the sensitivity of the specific parameter to changes, err on the conservative side in making initial changes. As a guideline, you might make a 10 percent change in the value first so that you can observe its effects on the system.
If... | Then ... |
---|---|
you see little or no effect | try doubling or halving the original value of the parameter depending on whether you are increasing or decreasing it. |
this magnitude of change had no effect | restore the parameter to its original value with the parameter file you saved before starting. |
you cannot affect your system performance with changes of this magnitude | you probably have not selected the right parameter for change. |
In most cases, you will want to use AUTOGEN to change system parameters
since AUTOGEN adjusts related parameters automatically. (For a
discussion of AUTOGEN, see the OpenVMS System Manager's Manual: Tuning, Monitoring, and Complex Systems.) In the few instances
where it is appropriate to change a parameter in the special parameter
group, further explanation of the parameter is given in this chapter,
since special parameters are otherwise undocumented.
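As a hedged sketch of the AUTOGEN route: parameter requests are normally recorded in SYS$SYSTEM:MODPARAMS.DAT and AUTOGEN is then run to recalculate dependent values. The parameter names below come from this chapter, but the numbers are purely illustrative assumptions, not recommendations.

```dcl
$ ! Lines added to SYS$SYSTEM:MODPARAMS.DAT with a text editor
$ ! (illustrative values only):
$ !
$ !     MIN_SYSMWCNT = 4000      ! lower bound for SYSMWCNT
$ !     ADD_NPAGEDYN = 200000    ! added to the value AUTOGEN calculates
$ !
$ ! Recalculate and set the parameters; they take effect at the next boot.
$ @SYS$UPDATE:AUTOGEN GETDATA SETPARAMS
$ ! Review what AUTOGEN calculated before you reboot.
$ TYPE SYS$SYSTEM:AGEN$PARAMS.REPORT
```

Using the MIN_ and ADD_ prefixes rather than a fixed assignment leaves AUTOGEN free to choose a larger value if its own calculations call for one.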
10.1.3 When to Use SYSGEN
If your tuning changes involve system parameters that are dynamic, plan to test the changes on a temporary basis first. This is the only instance where the use of SYSGEN is warranted for making tuning changes.
Once you are satisfied that the changes are working well, you should
invoke AUTOGEN with the REBOOT parameter to make the changes permanent.
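The trial-then-commit sequence might look like the following sketch. Here parameter-name and new-value are placeholders, and the step of recording the value in SYS$SYSTEM:MODPARAMS.DAT is an assumption about how your site feeds changes to AUTOGEN.

```dcl
$ ! Temporary trial of a dynamic parameter. WRITE ACTIVE changes only the
$ ! running system, so the trial value is gone after a reboot.
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW parameter-name
SYSGEN> SET parameter-name new-value
SYSGEN> WRITE ACTIVE
SYSGEN> EXIT
$ ! Once satisfied, record the value in SYS$SYSTEM:MODPARAMS.DAT, then:
$ @SYS$UPDATE:AUTOGEN GETDATA REBOOT
```

The SHOW display marks dynamic parameters, which is a quick way to confirm that a parameter qualifies for this kind of trial before you change it.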
10.2 Monitoring the Results
After you perform the recommended corrective actions in this and the following chapters, repeat the steps in the preceding chapters to observe the effects of the changes. As you repeat the steps, watch for new problems introduced by the corrective actions or previously undetected problems. Your goal should be to complete the steps in those chapters without uncovering a serious symptom or problem.
After you change system values or parameters, you must monitor the results, as described in Section 2.7.7. You have two purposes for monitoring:
You may want to return to the appropriate procedures in Chapters 5, 7, 8, and 9 as you evaluate your success after tuning and decide whether to pursue additional tuning efforts. However, always keep in mind that there is a point of diminishing returns in every tuning effort (see Section 2.7.7.1).
This chapter describes corrective procedures for memory resource
limitations described in Chapters 5 and 7.
11.1 Improving Memory Responsiveness
It is always good practice to check the four methods for improving
memory responsiveness to see if there are ways to free up more memory,
even if no problem seems to exist currently. The easiest way to improve
memory utilization significantly is to make sure that active memory
reclamation is enabled.
11.1.1 Equitable Memory Sharing
When active memory reclamation is enabled, the system distributes memory among active processes in an equitable and expeditious manner. If you feel page faulting is excessive with this policy enabled, make sure processes have not reached their WSEXTENT values. Note that precise WSQUOTA values are not very important when this policy is enabled, provided that GROWLIM and BORROWLIM are set equal to FREELIM using AUTOGEN.
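To see whether the policy is in effect and how the related limits are currently set, you can display the parameters with SYSGEN. A minimal sketch, assuming the policy is governed by MMG_CTLFLAGS (a value of 0 means it is disabled):

```dcl
$ ! Display only; nothing is changed by this session.
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW MMG_CTLFLAGS
SYSGEN> SHOW FREELIM
SYSGEN> SHOW GROWLIM
SYSGEN> SHOW BORROWLIM
SYSGEN> EXIT
```

If GROWLIM and BORROWLIM differ from FREELIM, running AUTOGEN is the usual way to bring them back into line.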
If active memory reclamation is not enabled (that is, the value of MMG_CTLFLAGS is 0), then overall system page fault behavior is highly dependent on current process WSQUOTA values. The following discussion can help you to determine if inequitable memory sharing is occurring.
Because page fault behavior is so heavily dependent on the page referencing patterns of user programs, the WSQUOTA values you assign may be satisfactory for some programs but not for others. Use the ACCOUNTING image report described in Section 4.3 to identify the programs (images) that are the heaviest faulters on your system, and then compensate by encouraging users to run such images as batch jobs on queues you have set up with large WSQUOTA values.
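A batch queue with generous working set values could be set up along the following lines. The queue name BIG$BATCH, the job limit, and the working set sizes are hypothetical; job-file-spec is a placeholder.

```dcl
$ ! Create and start a batch queue for heavy-faulting images.
$ INITIALIZE /QUEUE /BATCH /START /JOB_LIMIT=2 -
      /WSDEFAULT=2048 /WSQUOTA=8192 /WSEXTENT=16384 BIG$BATCH
$ ! Users then submit the job to that queue.
$ SUBMIT /QUEUE=BIG$BATCH job-file-spec
$ ! Confirm the working set values the queue will apply.
$ SHOW QUEUE /FULL BIG$BATCH
```

Working set values set on a queue remain subject to the system-wide WSMAX limit.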
You may be able to detect inequitable sharing by looking at the Faults column of the MONITOR PROCESSES display in a standard summary report (it is not contained in the multifile summary report). A process with a page fault accumulation much higher than that of other processes is suspect, although it depends on how long the process has been active.
A better means of detection is to use the MONITOR playback feature to view a display of the top page faulters during each collection interval:
$ MONITOR /INPUT=SYS$MONITOR:file-spec /VIEWING_TIME=1 PROCESSES /TOPFAULT
You may want to select a time interval using the /BEGINNING and /ENDING qualifiers when you suspect that a problem has occurred.
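For example, to replay only a suspect half hour, the playback command might be narrowed as in this sketch. The date and times are illustrative assumptions; substitute the period you are investigating.

```dcl
$ ! Replay the top page faulters recorded between 14:00 and 14:30.
$ MONITOR /INPUT=SYS$MONITOR:file-spec -
      /BEGINNING=11-DEC-1998:14:00:00 /ENDING=11-DEC-1998:14:30:00 -
      /VIEWING_TIME=1 PROCESSES /TOPFAULT
```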
Check to see whether the top process changes periodically. If it appears that one or two processes are consistently the top faulters, you may want to obtain more information about which images they are running and consider upgrading their WSQUOTA values, using the guidelines in Section 3.5. Sometimes a small adjustment in a WSQUOTA value can make a drastic difference in the page faulting behavior, if the original value was near the knee of the working-set/page-fault curve (see Figures 3-3 and 3-4).
If you find that the MONITOR collection interval is too large to provide sufficient detail, try entering the previous command on the running system (live mode) during a representative period, using the default 3-second collection interval. If you discover an inequity, try to obtain more information about the process and the image being run by entering the SHOW PROCESS /CONTINUOUS command.
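In live mode the same display is requested without an input file, as in this sketch. Here pid is a placeholder for the process identification shown in the MONITOR display.

```dcl
$ ! Live mode: no /INPUT qualifier, default collection interval.
$ MONITOR PROCESSES /TOPFAULT
$ ! After ending the display, examine a suspect process and its image.
$ SHOW PROCESS /CONTINUOUS /IDENTIFICATION=pid
```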
Another way to check for inequitable sharing of memory is to use the
WORKSET.COM command procedure described in Section 7.1.3. Examine the
various working set values and ensure that the allocation of memory,
even if not evenly distributed, is appropriate.
11.1.2 Reduction of Memory Consumption by the System
The operating system uses physical memory for storage of the code and
data structures it requires to support user processes. You have control
over the sizes of two of the memory areas reserved for the system: the
system working set and the nonpaged pool area. Both of these areas are
sized by AUTOGEN. The sizes set by AUTOGEN are normally adequate but
may not be optimal because AUTOGEN cannot anticipate all operational
requirements.
11.1.2.1 System Working Set
The system working set is an area of physical memory reserved to satisfy page faults of virtual addresses in system space.
Such virtual addresses can be code or data (paged pool, for example). Because the same system working set is used for all processes on the system, there is very little locality associated with it.
Therefore, the system fault rate can be expected to change slowly in relation to changes in the system working set size (as controlled by the system parameter SYSMWCNT). A rule of thumb is to try to keep the system fault rate to less than 2 per second.
Keep in mind, however, that pages allocated to the system working set
by raising the value of SYSMWCNT are considered permanently allocated
to the system and are therefore no longer available for process working
sets.
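One way to check the rule of thumb is to watch the System Fault Rate item in the MONITOR PAGE display and compare it with the current SYSMWCNT setting. A minimal sketch:

```dcl
$ ! Average rates, including System Fault Rate, for the PAGE class.
$ MONITOR PAGE /AVERAGE
$ ! Display the current size of the system working set.
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW SYSMWCNT
SYSGEN> EXIT
```

If the average stays well above 2 faults per second during representative periods, a modest increase in SYSMWCNT through AUTOGEN is a reasonable first step, weighed against the memory it removes from process working sets.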
11.1.2.2 Nonpaged Pool
The nonpaged pool area is a portion of physical memory permanently allocated to the system for the storage of data structures and device drivers.
AUTOGEN determines the initial size of the nonpaged pool, but automatic
expansion will occur if necessary. The system expands pool as required
by permanently allocating a page of memory from the free-page list.
Pages allocated in this manner are not available for use by process
working sets until the system is rebooted.
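To see whether pool has expanded since boot, you can compare the current and initial sizes reported by SHOW MEMORY. A minimal sketch:

```dcl
$ ! Reports current, initial, and maximum sizes for the dynamic memory
$ ! areas, including nonpaged pool.
$ SHOW MEMORY /POOL /FULL
```

A current size larger than the initial size suggests that expansion has occurred, in which case letting AUTOGEN raise NPAGEDYN avoids the expansion overhead after the next reboot.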
11.1.2.3 Adaptive Pool Management
The high-performance nonpaged pool allocator reduces the probability of system outages due to exhaustion of memory allocated for system data structures (pool). Adaptive pool management virtually eliminates the need to actively manage the allocation of pool resources. The nonpaged pool area and lookaside lists are combined into one region (defined by the system parameters NPAGEDYN and NPAGEVIR), allowing memory packets to migrate from lookaside lists to general pool and back again based on demand. As a result, the system is capable of tuning itself according to the current demand for pool, optimizing its use of these resources, and reducing the risk of running out of these resources.
Internal to the allocator is an array of lookaside lists that contiguously span an allocation range from 1 to 5120 bytes. These lookaside lists require no external tuning. They are automatically prepopulated during bootstrapping based on previous demand and each continuously adapts its number of packets based on changing demand during the life of the system. The result is very high performance due to a very high hit percentage on the internal lookaside lists, typically over 99 percent.
When deallocating nonpaged pool, the allocator requires that you pass an accurate packet size either in R1 or in the word starting at the eighth byte in the packet itself. The size of the packet determines to which internal lookaside list the packet will be deallocated.
Enabling and Disabling Pool Monitoring
The setting of the parameter POOLCHECK at boot time also controls which version of the pool allocator is loaded as follows:
For more information about the POOLCHECK parameter, refer to the OpenVMS System Management Utilities Reference Manual.
The granularity of nonpaged pool changed with OpenVMS Version 6.0. Any
code that explicitly assumes the granularity of nonpaged pool to be 16
bytes or makes use of the symbol EXE$C_ALCGRNMSK to perform (for
example) structure alignment must be changed to use the symbol
EXE$M_NPAGGRNMSK, which reflects the nonpaged pool's current
granularity.
11.1.2.4 Additional Consistency Checks
On Alpha, the system parameter SYSTEM_CHECK is used to investigate intermittent system failures by enabling a number of run-time consistency checks on system operation and recording some trace information.
Enabling SYSTEM_CHECK causes the system to behave as if the following system parameter values are set:
Parameter | Value | Description |
---|---|---|
BUGCHECKFATAL | 1 | Crashes the system on nonfatal bugchecks |
POOLCHECK | %X616400FF | Enables all pool checking with an allocated pool pattern of %X61616161 ('aaaa') and a deallocated pool pattern of %X64646464 ('dddd') |
MULTIPROCESSING | 2 | Enables full synchronization checking |
While SYSTEM_CHECK is enabled, the previous settings of the BUGCHECKFATAL and MULTIPROCESSING parameters are ignored.
Setting SYSTEM_CHECK causes certain image files to be loaded that are capable of the additional system monitoring. These image files are located in SYS$LOADABLE_IMAGES and can be identified by the suffix _MON.
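As a quick check, you can list the monitoring images and display the current settings of the related parameters. This sketch only displays information; the wildcard file type is an assumption made to avoid guessing the exact file names.

```dcl
$ ! List the loadable images that carry the _MON suffix.
$ DIRECTORY SYS$LOADABLE_IMAGES:*_MON.*
$ ! Display the consistency-check parameters on the running system.
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE ACTIVE
SYSGEN> SHOW SYSTEM_CHECK
SYSGEN> SHOW POOLCHECK
SYSGEN> SHOW BUGCHECKFATAL
SYSGEN> SHOW MULTIPROCESSING
SYSGEN> EXIT
```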
Note that enabling SYSTEM_CHECK, or any of the individual system checks listed, may have an impact on system performance because the system must do extra work to perform these run-time consistency checks. Also note that BUGCHECKFATAL should be used with care in a multiuser environment because it causes the entire system to crash.
These checks can be very helpful when working with applications or layered products that are causing problems, especially in the way they interact with the system. However, once the system has achieved stability they should generally be turned off.
For more information about the interaction of the SYSTEM_CHECK system
parameter with the ACP_DATACHECK system parameter, see the description
of ACP_DATACHECK in the OpenVMS System Management Utilities Reference Manual.
11.1.3 Memory Offloading
While the most common and probably most cost-effective type of offloading shifts CPU and disk load onto memory, you can also improve memory responsiveness by offloading memory demand onto disk. This procedure is recommended only when sufficient disk resource is available and using it is more cost effective than purchasing additional memory.
Some of the CPU offloading techniques described in Section 13.1.3 apply also to memory. Additional techniques are as follows:
When you increase swapping, it is important to evaluate the size of the
swapping file. If the swapping file is not large enough, system
performance will degrade. Use AUTOGEN feedback to size the swapping
file appropriately.
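A hedged sketch of that check: SHOW MEMORY reports current page and swapping file usage, and AUTOGEN can be run through its TESTFILES phase to report the file sizes it would choose without changing anything.

```dcl
$ ! Current size and usage of the installed page and swapping files.
$ SHOW MEMORY /FILES
$ ! Collect feedback and display the file sizes AUTOGEN calculates;
$ ! ending at TESTFILES makes no changes to the files.
$ @SYS$UPDATE:AUTOGEN SAVPARAMS TESTFILES FEEDBACK
```

Feedback is most meaningful when it is collected after the system has carried a representative workload for some time.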
11.1.4 Memory Load Balancing
You can balance the memory load by using some of the CPU load-balancing techniques for VMSclusters described in Section 13.1.5 to shift user demand.
To balance the load by reconfiguring memory hardware, perform the following steps:
The Free List Size item gives the relative amounts of free memory available on each CPU. If a system seems to be deficient in memory and is experiencing memory management problems, perhaps the best solution is to reconfigure the VMScluster by moving some memory from a memory-rich system to a memory-poor one---provided the memory type is compatible with both CPU types.
The Free List Size item is an average of levels, or snapshots. Because it is not a rate, its accuracy is dependent on the collection interval.
The following sections describe procedures to remedy specific
conditions that you might have detected as the result of the
investigation described in Chapter 7.
11.2 Reduce Number of Image Activations
There are several ways to reduce the number of image activations. You
and the programming staff should explore them all and apply those you
deem feasible and likely to produce the greatest results.
11.2.1 Programs Versus Command Procedures
Excessive image activations can result from running large command
procedures frequently, because all DCL commands (except those performed
within the command interpreter) require an image activation. If command
procedures are introducing the problem, consider writing programs to
replace them.
11.2.2 Code Sharing
When code is actively shared, the cost of image startups decreases. Your installation may not have designed its applications to share code. You should examine ways to employ code sharing wherever suitable. See Section 1.4.3 and Section 3.8.
You will not see the number of image activations drop when you begin to
use code sharing, but you should see an improvement in performance. The
effect of code sharing is to shift the type of faults at image
activation from hard faults to soft faults, a shift that results in
performance improvement.
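One common way to share code is to install a frequently run image as a known, shared image. In this minimal sketch, SYS$SYSTEM:APPLICATION.EXE is a hypothetical image name, and the session assumes that you hold the privileges the Install utility requires.

```dcl
$ ! Install the image open, shared, and with its header resident.
$ INSTALL
INSTALL> ADD SYS$SYSTEM:APPLICATION.EXE /OPEN /SHARED /HEADER_RESIDENT
INSTALL> LIST /FULL SYS$SYSTEM:APPLICATION.EXE
INSTALL> EXIT
```

To keep the image installed across reboots, the ADD command is typically placed in a site-specific startup procedure.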
11.2.3 Designing Applications for Native Mode
Yet another source of excessive image activations is migration of programs from other operating systems without any design changes. For example, programs that employ the chaining technique on another operating system will not use memory efficiently on an OpenVMS system if you simply recompile them and ignore design differences. When converting applications to run on an OpenVMS system, always consider the benefits of designing and coding each application for native-mode operation.
Copyright © Compaq Computer Corporation 1998. All rights reserved.