Document revision date: 19 July 1999
Perhaps the best method for managing resource assignment is to use the Galaxy APIs to write your own resource management routines. This allows you to base your decisions for resource management on your own criteria and application environment. The same push-model restriction described in Section 12.3 still exists, however, so your routines will need to be Galaxy aware, possibly using shared memory to coordinate their operations.
For information about CPU reassignment APIs, see Chapter 19.
12.5 Reassignment Faults
CPU reassignment can fail or be blocked for several reasons. Because the GCU buries its management actions in SYSMAN or DCL scripts, it may not always identify and report the reasons for a reassignment fault. The GCU does perform certain checks before allowing reassignment actions, for example, to prevent attempts to reassign the primary CPU. Other reassignment faults can be detected only by the operating system or console firmware. For example, if the operating system detects a fault while attempting to reassign a CPU that currently has process affinity or Fast Path duties, a DCL message is displayed on both the console and the user's terminal.
The Galaxy APIs for reassignment are capable of reporting most faults to the caller. However, even using the reassignment services, the console may reject a reassignment because of hardware platform dependencies not readily visible to the operating system.
The following DCL Commands are useful for managing an OpenVMS Galaxy:
CPUs are assignable resources in an OpenVMS Galaxy.
13.1.1 STOP/CPU/MIGRATE
The STOP/CPU/MIGRATE command stops and removes the specified secondary processor(s) from the active set in an OpenVMS SMP system.
For example:
A user enters:
GLX1$ stop/cpu/migrate=GLX0 4
The following message is displayed at the user's terminal:
%SYSTEM-I-CPUSTOPPING, trying to stop CPU 4 after it reaches quiescent state
The source console displays:
%SMP-I-STOPPED, CPU #04 has been stopped.
The destination console displays:
%SMP-I-SECMSG, CPU #04 message: P04>>>START
%SMP-I-CPUTRN, CPU #04 has joined the active set.
The SHOW CPU command displays information about the status, characteristics, and capabilities of the specified processor(s).
For example:
GLX0$ show cpu

GLX0, AlphaServer 8400 Model 5/440
Multiprocessing ENABLED. Full checking synchronization image loaded.
Minimum multiprocessing revision levels: CPU = 1

PRIMARY CPU = 00
Active CPUs:      00 01
Configured CPUs:  00 01
Potential CPUs:   00 01 03 04 05 06 07
The SHOW MEMORY command displays the uses of memory by the system.
For example:
CLSSIC$ SHOW MEMORY/physical
System Memory Resources on 5-OCT-1999 20:50:19.03

Physical Memory Usage (pages):     Total        Free      In Use    Modified
  Main Memory (2048.00Mb)         262144      228183       31494        2467

Of the physical pages in use, 11556 pages are permanently allocated to OpenVMS.

GALAXY$ show memory/physical
System Memory Resources on 5-OCT-1999 07:55:14.68

Physical Memory Usage (pages):     Total        Free      In Use    Modified
  Private Memory (512.00Mb)        65536       56146        8875         515
  Shared Memory (1024.00Mb)       131072      130344         728

Of the physical pages in use, 6421 pages are permanently allocated to OpenVMS.
$
Lexical Function Example Command Procedure
$ write sys$output ""
$ write sys$output "Instance = ",f$getsyi("scsnode")
$ write sys$output "Platform = ",f$getsyi("galaxy_platform")
$ write sys$output "Sharing Member = ",f$getsyi("galaxy_member")
$ write sys$output "Galaxy ID = ",f$getsyi("galaxy_id")
$ write sys$output "Community ID = ",f$getsyi("community_id")
$ write sys$output "Partition ID = ",f$getsyi("partition_id")
$ write sys$output ""
$ exit
Lexical Function Command Procedure Output
COBRA2$ @shoglx

Instance = COBRA2
Platform = 1
Sharing Member = 1
Galaxy ID = 5F5F30584C47018011D3CC8580F40383
Community ID = 0
Partition ID = 0

COBRA2$
The LIST option now returns the galaxywide sections as well as standard
global sections.
13.5 SET CPU
The CONFIGURE command invokes the Galaxy Configuration Utility (GCU) to monitor, display, and interact with an OpenVMS Galaxy system. The GCU requires DECwindows Motif V1.2-4 or later and OpenVMS Alpha V7.2 or later.
The optional model parameter specifies the location and name of a Galaxy Configuration Model to load and display. If no model is provided and the system is running as an OpenVMS Galaxy, the current active configuration is displayed.
If the system is not running as an OpenVMS Galaxy, the GCU will assist the user in creating a single-instance OpenVMS Galaxy system.
OpenVMS Galaxy Configuration Models are created with the Galaxy Configuration Utility. Refer to the OpenVMS Galaxy Guide and GCU online help for more information.
Format:
CONFIGURE GALAXY [model.gcm]
Parameters:
GALAXY [model.gcm]
Specifies the location and name of a Galaxy configuration model to load and display. If no model is provided and the system is running as an OpenVMS Galaxy, the current active configuration is displayed.
Qualifiers:
/ENGAGE
Causes the GCU to engage (load, validate, and activate) the specified OpenVMS Galaxy Configuration Model without displaying the graphical user interface. After validation, the specified model becomes the active system configuration. This qualifier allows system managers to restore the OpenVMS Galaxy system to a known configuration, regardless of what dynamic resource reassignments may have occurred since the system was booted. This command can be embedded in DCL command procedures to automate configuration operations.
/VIEW
When used in conjunction with /ENGAGE and a model parameter, causes the GCU to load, validate, activate, and display the specified configuration model.
Examples:
$ CONFIGURE GALAXY
Displays the GCU's graphical user interface. If the system is currently configured as an OpenVMS Galaxy, the active system configuration is displayed.
$ CONFIGURE GALAXY model.GCM
Displays the GCU's graphical user interface. The specified OpenVMS Galaxy Configuration Model is loaded and displayed, but does not become the active configuration until the user chooses to engage it.
$ CONFIGURE GALAXY/ENGAGE model.GCM
Invokes the GCU command line interface to engage the specified OpenVMS Galaxy Configuration Model without displaying the GCU's graphical user interface.
$ CONFIGURE GALAXY/ENGAGE/VIEW model.GCM
Invokes the GCU command line interface to engage the specified OpenVMS Galaxy Configuration Model and display the GCU's graphical user interface.
This chapter describes the following two OpenVMS internal mechanisms that use shared memory to communicate between instances in an OpenVMS Galaxy computing environment:
The Shared Memory Cluster Interconnect (SMCI) is a System
Communications Services (SCS) port for communications between Galaxy
instances. When an OpenVMS instance is booted as both a Galaxy and as
an OpenVMS Cluster member, the SMCI driver is loaded. This SCS port
driver communicates with other cluster instances in the same Galaxy
through shared memory. This capability provides one of the major
performance benefits of the OpenVMS Galaxy Software Architecture. The
ability to communicate to another clustered instance through shared
memory provides dramatic performance benefits over traditional cluster
interconnects.
14.1.1 SYS$PBDRIVER Port Devices
When booting as both a Galaxy and a cluster member, SYS$PBDRIVER is loaded by default. The loading of this driver creates a device PBAx, where x represents the Galaxy partition ID. As other instances are booted, they also create PBAx devices. The SMCI quickly identifies the other instances and creates communications channels to them. Unlike traditional cluster interconnects, a new device is created to communicate with the other instances. This device also has the name PBAx, where x represents the Galaxy partition ID for the instance with which this device is communicating.
For example, consider an OpenVMS Galaxy that consists of two instances: MILKY and WAY. MILKY is instance 0 and WAY is instance 1. When node MILKY boots, it creates device PBA0. When node WAY boots, it creates PBA1. As the two nodes "find" each other, MILKY creates PBA1 to talk to WAY and WAY creates PBA0 to talk to MILKY.
MILKY               WAY
PBA0:               PBA1:
PBA1:   <------->   PBA0:
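The naming rule in the MILKY and WAY example can be sketched in a few lines of Python. This is only an illustration of the rule as described in this section, not driver code; the function name smci_devices is hypothetical, and only the default PBA controller is assumed to be loaded.

```python
def smci_devices(partition_ids):
    """Return the PBAx devices each instance ends up with.

    partition_ids maps an instance name to its Galaxy partition ID.
    Each instance first creates PBAx for its own partition ID, then one
    PBAx device per peer, named after the peer's partition ID.
    """
    devices = {}
    for name, own_id in partition_ids.items():
        own_device = "PBA%d" % own_id
        peer_devices = [
            "PBA%d" % peer_id
            for peer_name, peer_id in sorted(partition_ids.items(), key=lambda kv: kv[1])
            if peer_name != name
        ]
        devices[name] = [own_device] + peer_devices
    return devices


if __name__ == "__main__":
    for instance, names in smci_devices({"MILKY": 0, "WAY": 1}).items():
        print(instance, names)
```

For the two-instance Galaxy above, MILKY is listed with PBA0 and then PBA1, and WAY with PBA1 and then PBA0.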
SYS$PBDRIVER can support multiple clusters in the same Galaxy. This is done in the same way that SYS$PEDRIVER allows support for multiple clusters on the same LAN. The cluster group number and password used by SYS$PEDRIVER are also used by SYS$PBDRIVER to distinguish different clusters in the same Galaxy community. If your Galaxy instances are also clustered with other OpenVMS instances over the LAN, the cluster group number is set appropriately by CLUSTER_CONFIG. To determine the current cluster group number:
$ MCR SYSMAN
SYSMAN> CONFIGURATION SHOW CLUSTER_AUTHORIZATION
Node: MILKY   Cluster group number: 0
Multicast address: xx-xx-xx-xx-xx-xx
SYSMAN>
If you are not clustering over a LAN and you want to run multiple clusters in the same Galaxy community, then you must set the cluster group number. You must ensure that the group number and password are the same for all Galaxy instances that you want to be in the same cluster.
$ MCR SYSMAN
SYSMAN> CONFIGURATION SET CLUSTER_AUTHORIZATION/GROUP_NUMBER=222/PASSWORD=xxxx
SYSMAN>
If your Galaxy instances are also clustering over the LAN,
CLUSTER_CONFIG asks for a cluster group number, and the Galaxy
instances use that group number. If you are not clustering over a LAN,
the group number defaults to zero. This means that all instances in the
Galaxy will be in the same cluster.
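As a rough illustration of the rule above, the following Python sketch groups Galaxy instances into clusters by their cluster authorization pair. It models only the matching rule stated in this section (same group number and same password means same cluster). The function name, the instance names other than MILKY and WAY, and the sample passwords are hypothetical.

```python
from collections import defaultdict


def clusters_by_authorization(instances):
    """Group instance names by (group number, password).

    instances maps an instance name to a (group_number, password) tuple.
    Instances whose tuples are equal are placed in the same cluster.
    """
    groups = defaultdict(list)
    for name, authorization in instances.items():
        groups[authorization].append(name)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    # With no LAN clustering and nothing set, every instance defaults to
    # group number 0, so all instances land in one cluster.
    print(clusters_by_authorization({
        "MILKY": (0, ""),
        "WAY": (0, ""),
    }))
    # Setting a distinct group number on some instances splits the Galaxy
    # community into separate clusters.
    print(clusters_by_authorization({
        "MILKY": (222, "xxxx"),
        "WAY": (222, "xxxx"),
        "NOVA": (333, "yyyy"),
    }))
```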
14.1.3 SYSGEN Parameters for SYS$PBDRIVER
In most cases, the default settings for SYS$PBDRIVER should be
appropriate; however, two SYSGEN parameters are provided to control
SYS$PBDRIVER: SMCI_PORTS and SMCI_FLAGS.
14.1.3.1 SMCI_PORTS
The SMCI_PORTS SYSGEN parameter controls initial loading of SYS$PBDRIVER. This parameter is a bitmask in which bits 0 through 25 each represent a controller letter. If bit 0 is set, PBAx will be loaded; this is the default setting. If bit 1 is set, PBBx will be loaded, and so on all the way up to bit 25, which will cause PBZx to be loaded. For OpenVMS Alpha Version 7.2-1, Compaq recommends leaving this parameter at the default value of 1.
Loading additional ports allows for multiple paths between Galaxy
instances. For OpenVMS Alpha Version 7.2-1, having multiple
communications channels does not provide any advantages because
SYS$PBDRIVER will initially not support Fast Path. A future release of
OpenVMS will provide Fast Path support for SYS$PBDRIVER. When Fast Path
support is enabled, instances with multiple CPUs can achieve improved
throughput by having multiple communications channels between instances.
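To make the SMCI_PORTS bitmask concrete, here is a minimal Python sketch that maps a parameter value to the controller names it selects. It only restates the bit-to-letter rule given above; the function name is hypothetical and nothing here queries a running system.

```python
import string


def smci_port_controllers(smci_ports):
    """Return the PB controller names selected by an SMCI_PORTS value.

    Bit 0 selects PBA, bit 1 selects PBB, and so on up to bit 25 (PBZ).
    """
    if smci_ports < 0 or smci_ports >= (1 << 26):
        raise ValueError("SMCI_PORTS uses bits 0 through 25 only")
    return [
        "PB" + letter
        for bit, letter in enumerate(string.ascii_uppercase)
        if smci_ports & (1 << bit)
    ]


if __name__ == "__main__":
    print(smci_port_controllers(1))   # the default value: PBA only
    print(smci_port_controllers(3))   # bits 0 and 1: PBA and PBB
```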
14.1.4 SMCI_FLAGS
The SMCI_FLAGS SYSGEN parameter controls operational aspects of SYS$PBDRIVER. The defined flag bits are listed in the following table. Bit 0 controls whether or not the port device supports communications with itself. SCS communication with the local instance is used primarily for test purposes. By default, this bit is off, so local SCS communication is disabled, which saves system resources. This parameter is dynamic; after this bit is turned on, an SCS virtual circuit should soon form.
Bit | Mask | Description |
---|---|---|
0 | 1 | 0 = Do not create local communications channels (SYSGEN default). Local SCS communications are primarily used in test situations and not needed for normal operations. Leaving this bit off saves resources and overhead. 1 = Create local communications channels. |
1 | 2 | 0 = Load SYS$PBDRIVER if booting into both a Galaxy and a Cluster (SYSGEN default). 1 = Load SYS$PBDRIVER if booting into a Galaxy. |
2 | 4 | 0 = Minimal console output (SYSGEN default). 1 = Full console output. SYS$PBDRIVER displays console messages when creating and tearing down communication channels. |
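The flag table above can be read mechanically. The following Python sketch decodes an SMCI_FLAGS value into the meanings listed for bits 0 through 2; the descriptions are shortened from the table, and the function name is hypothetical.

```python
# (meaning when the bit is 0, meaning when the bit is 1), indexed by bit number.
SMCI_FLAG_BITS = [
    ("Do not create local communications channels",
     "Create local communications channels"),
    ("Load SYS$PBDRIVER if booting into both a Galaxy and a Cluster",
     "Load SYS$PBDRIVER if booting into a Galaxy"),
    ("Minimal console output",
     "Full console output"),
]


def describe_smci_flags(value):
    """Return one description per defined bit for the given SMCI_FLAGS value."""
    return [
        meanings[(value >> bit) & 1]
        for bit, meanings in enumerate(SMCI_FLAG_BITS)
    ]


if __name__ == "__main__":
    for flags in (0, 1, 6):
        print(flags, describe_smci_flags(flags))
```

A value of 0 yields the three SYSGEN defaults; a value of 6 (mask 2 plus mask 4) loads the driver whenever booting into a Galaxy and enables full console output.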
Local Area Network (LAN) communications between OpenVMS Galaxy instances are supported by the Ethernet LAN shared memory driver. This LAN driver communicates to other instances in the same OpenVMS Galaxy system through shared memory. Communicating with other instances through shared memory provides performance benefits over traditional LANs.
To load the LAN shared memory driver SYS$EBDRIVER, enter the following command:
$ MCR SYSMAN SYSMAN> IO CONN EBA/DRIVER=SYS$EBDRIVER/NOADAPTER |
For OpenVMS Version 7.2-1, in order for LAN protocols to automatically start over this LAN device (EBAn, where n is the unit number), the procedure for loading this driver should be added to the configuration procedure: SYS$MANAGER:SYCONFIG.COM.
The LAN driver emulates an Ethernet LAN with frame formats the same as Ethernet/IEEE 802.3 but with maximum frame size increased from 1518 to 7360 bytes. The LAN driver presents a standard OpenVMS QIO and VCI interface to applications. All existing QIO and VCI LAN applications should work unchanged.
In a future release, the SYS$EBDRIVER device driver will be loaded automatically.
This chapter describes SDA information that is specific to an OpenVMS Galaxy computing environment.
For more information about using SDA, refer to the OpenVMS Alpha System Analysis Tools Manual.
15.1 Dumping Shared Memory
When a system crash occurs in a Galaxy instance, the default behavior of OpenVMS is to dump the contents of private memory of the failed instance and the contents of shared memory. In a full dump, every page of both shared and private memory is dumped; in a selective dump, only those pages in use at the time of the system crash are dumped.
Dumping of shared memory can be disabled by setting bit 4 of the dynamic SYSGEN parameter DUMPSTYLE. This bit should be set only on the advice of your Compaq support representative, as the resulting system dump may not contain the data required to determine the cause of the system crash.
Table 15-1 shows the definitions of all the bits in DUMPSTYLE and their meanings in OpenVMS Alpha. Bits can be combined in any combination.
Bit | Value | Description |
---|---|---|
0 | 1 | 0 = Full dump. The entire contents of physical memory will be written to the dump file. 1 = Selective dump. The contents of memory will be written to the dump file selectively to maximize the usefulness of the dump file while conserving disk space. (Only pages that are in use are written.) |
1 | 2 | 0 = Minimal console output. This consists of the bugcheck code; the identity of the CPU, process, and image where the crash occurred; the system date and time; plus a series of dots indicating progress writing the dump. 1 = Full console output. This includes the minimal output described above plus stack and register contents, system layout, and additional progress information such as the names of processes as they are dumped. |
2 | 4 | 0 = Dump to system disk. The dump will be written to SYS$SYSDEVICE:[SYSn.SYSEXE]SYSDUMP.DMP, or in its absence, SYS$SYSDEVICE:[SYSn.SYSEXE]PAGEFILE.SYS. 1 = Dump to alternate disk. The dump will be written to dump_dev:[SYSn.SYSEXE]SYSDUMP.DMP, where dump_dev is the value of the console environment variable DUMP_DEV. |
3 | 8 | 0 = Uncompressed dump. Pages are written directly to the dump file. 1 = Compressed dump. Each page is compressed before it is written, providing a saving in space and in the time taken to write the dump, at the expense of a slight increase in time taken to access the dump. |
4 | 16 | 0 = Dump shared memory. 1 = Do not dump shared memory. |
The default setting for DUMPSTYLE is 0 (an uncompressed full dump,
including shared memory, written to the system disk). Unless a value
for DUMPSTYLE is specified in MODPARAMS.DAT, AUTOGEN.COM will set
DUMPSTYLE to 1 (an uncompressed selective dump, including shared
memory, written to the system disk) if there is less than 128 megabytes
of memory on the system, or to 9 (a compressed selective dump,
including shared memory, written to the system disk) otherwise.
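The DUMPSTYLE values discussed above (0, 1, and 9) can be checked against Table 15-1 with a short Python sketch. It decodes a value bit by bit and reproduces the AUTOGEN choice described in the preceding paragraph. The function names are hypothetical, the descriptions are shortened from the table, and the sketch does not read or set any system parameter.

```python
# (meaning when the bit is 0, meaning when the bit is 1), indexed by bit number.
DUMPSTYLE_BITS = [
    ("Full dump", "Selective dump"),
    ("Minimal console output", "Full console output"),
    ("Dump to system disk", "Dump to alternate disk"),
    ("Uncompressed dump", "Compressed dump"),
    ("Dump shared memory", "Do not dump shared memory"),
]


def describe_dumpstyle(value):
    """Return one description per defined bit for the given DUMPSTYLE value."""
    return [
        meanings[(value >> bit) & 1]
        for bit, meanings in enumerate(DUMPSTYLE_BITS)
    ]


def autogen_dumpstyle(memory_mb):
    """Value AUTOGEN is described as choosing when MODPARAMS.DAT sets none."""
    return 1 if memory_mb < 128 else 9


if __name__ == "__main__":
    for value in (0, 1, 9, 16):
        print(value, describe_dumpstyle(value))
    print(autogen_dumpstyle(2048))
```

Adding 16 to any of these values sets bit 4 and therefore excludes shared memory from the dump.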
15.2 Summary of SDA Command Interface Changes or Additions
The following list summarizes how the System Dump Analyzer (SDA) has been enhanced to view shared memory and OpenVMS Galaxy data structures. For more details, see the appropriate commands.
This section describes OpenVMS Galaxy-specific SDA commands.
Displays a brief one-page summary of the state of the Galaxy and all the instances in the Galaxy.
Format:
SHOW GALAXY
Parameters:
None.
Qualifiers:
None.
Example:
SDA> SHOW GALAXY

Galaxy summary
--------------
GMDB address       Creator node ID  Revision  Creation time            State
-----------------  ---------------  --------  -----------------------  ---------------
FFFFFFFF.7F234000  00000001         1.0       31-MAR-1999 13:15:08.08  OPERATIONAL

Node ID   NODEB address      Name      Version   Join time                State
--------  -----------------  --------  --------  -----------------------  ---------------
00000000  FFFFFFFF.7F236000  ANDA1A    1.0       31-MAR-1999 14:11:09.08  MEMBER (current instance)
00000001  FFFFFFFF.7F236200  ANDA2A    1.0       31-MAR-1999 14:10:49.06  MEMBER
00000002  FFFFFFFF.7F236400  ANDA3A    1.0       31-MAR-1999 14:13:26.16  MEMBER
00000003  FFFFFFFF.7F236600  - Node block is empty -)
Displays the contents of the Galaxy Configuration Tree (GCT) either in summary (hierarchical) form or in detail node by node.
Format:
SHOW GCT [/ADDRESS=n | /ALL | /HANDLE=n | /OWNER=n | /SUMMARY (default) | /TYPE=type]
Parameters:
None.
Qualifiers:
/ADDRESS=n
Provides a detailed display of the GCT node at the given address.
/ALL
Provides a detailed display of all nodes in the GCT.
/HANDLE=n
Provides a detailed display of the GCT node with the given handle.
/OWNER=n
Provides a detailed display of all nodes in the GCT currently owned by the node with the given handle.
/SUMMARY
Provides a summary display of the GCT in hierarchical form. This qualifier is the default.
/TYPE=type
Provides a detailed display of all nodes in the GCT of the given type, which can be one of the following:
BUS CAB COMMUNITY CPU CPU_MODULE EXP_CHASSIS FRU_DESC FRU_ROOT HOSE HW_ROOT IO_CTRL IOP MEMORY_CTRL MEMORY_DESC MEMORY_SUB PARTITION POWER_ENVIR PSEUDO ROOT SBB SLOT SMB SW_ROOT SYS_CHASSIS TEMPLATE_ROOT
The type given may be an exact match, in which case just that type is displayed (for example, /TYPE=CPU); or a partial match, in which case all matching types are displayed (for example, /TYPE=CP displays both CPU and CPU_MODULE nodes).
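The /TYPE matching rule can be illustrated with a small Python sketch. It assumes that a partial match means a leading-substring (prefix) match and that matching is case-insensitive; both are assumptions drawn from the /TYPE=CP example rather than confirmed SDA behavior, and the function name is hypothetical.

```python
GCT_NODE_TYPES = (
    "BUS CAB COMMUNITY CPU CPU_MODULE EXP_CHASSIS FRU_DESC FRU_ROOT HOSE "
    "HW_ROOT IO_CTRL IOP MEMORY_CTRL MEMORY_DESC MEMORY_SUB PARTITION "
    "POWER_ENVIR PSEUDO ROOT SBB SLOT SMB SW_ROOT SYS_CHASSIS TEMPLATE_ROOT"
).split()


def match_gct_types(type_text):
    """Return the node types selected by a /TYPE value.

    An exact match selects only that type; otherwise every type that
    begins with the given text is selected (assumed prefix matching).
    """
    wanted = type_text.upper()
    if wanted in GCT_NODE_TYPES:
        return [wanted]
    return [name for name in GCT_NODE_TYPES if name.startswith(wanted)]


if __name__ == "__main__":
    print(match_gct_types("CPU"))   # exact match: CPU only
    print(match_gct_types("CP"))    # partial match: CPU and CPU_MODULE
```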
This example shows the summary configuration tree display for a three-instance OpenVMS Galaxy system.
2. SDA> SHOW GCT/HANDLE=00000700

Galaxy Configuration Tree
-------------------------
Handle: 00000700
Address: FFFFFFFF.83694740
Node type: Memory_Sub
Size: 0080
Id: 00000000.00000000
Flags: 00000000.00000001 Hardware

Related nodes:
Node relationship      Handle    Type                   Id
---------------------  --------  ---------------------  -----------------
Initial owner          00001580  Community              00000000.00000000
Current owner          -<Same>-
Parent                 00000240  HW_Root                00000000.00000000
Previous sibling       00000640  CPU_Module             00000000.00000003
Next sibling           -<None>-
Child                  00000780  Memory_Ctrl            00000000.00000005
Configuration binding  00000240  HW_Root                00000000.00000000
Affinity binding       00000240  HW_Root                00000000.00000000

Min. physical address: 00000000.00000000
Max. physical address: 00000000.FFFFFFFF
This example shows the detailed display for the memory subsystem node of a configuration tree.