Compaq COBOL
User Manual



Chapter 3
Handling Nonnumeric Data

Nonnumeric data in Compaq COBOL is evaluated with respect to a specified collating sequence of the operands.

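The following information is in this chapter:

How the Compiler Stores Nonnumeric Data (Section 3.1)
Data Organization (Section 3.2)
Special Characters (Section 3.3)
Testing Nonnumeric Items (Section 3.4)
Data Movement (Section 3.5)
Using the MOVE Statement (Section 3.6)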

3.1 How the Compiler Stores Nonnumeric Data

COBOL programs hold their data in items whose sizes are described in their source programs. The size of these items is thus fixed during compilation for the lifespan of the resulting object program.

Items in a COBOL program belong to any of the following three data classes:

Alphabetic
Numeric
Alphanumeric

The data description of an item specifies which class that item belongs to.

Classes are further subdivided into categories. Alphanumeric items can be numeric edited, alphanumeric edited, or alphanumeric. Every elementary item, except for an index data item, belongs to one of the classes and to one of that class's categories. A group item is treated as alphanumeric regardless of the classes of its subordinate elementary items.

If the data description of an alphanumeric item specifies that certain editing operations be performed on any value that is moved into it, that item is called an alphanumeric edited item.

As you read this chapter, keep in mind the distinction between the class or category of a data item and the actual value that the item contains.

Sometimes the text refers to alphabetic, alphanumeric, and alphanumeric edited data items as nonnumeric data items to distinguish them from items that are specifically numeric.

Regardless of the class of an item, it is usually possible at run time to store an invalid value in the item. Thus, nonnumeric ASCII characters can be placed in an item described as numeric, and an alphabetic item can be loaded with nonalphabetic characters. Invalid values can cause errors in output or run-time errors.

3.2 Data Organization

A Compaq COBOL record consists of a set of data description entries that describe record characteristics; it must have an 01 or 77 level number. A data description entry can be either a group item or an elementary item.

All of the records used by Compaq COBOL programs (except for certain registers and switches) must be described in the source program's Data Division. The compiler allocates memory space for these items (except for Linkage Section items) and fixes their size at compilation time.

The following sections explain how the compiler sets up storage for group and elementary data items.

3.2.1 Group Items

A group item is a data item that is followed by one or more elementary items or other group items, all of which have higher-valued level numbers than the group to which they are subordinate.

The size of a group item is the sum of the sizes of its subordinate elementary items. The compiler considers all group items to be alphanumeric DISPLAY items regardless of the class and usage of their subordinate elementary items.

3.2.2 Elementary Items

An elementary item is a data item that has no subordinate data item.

The size of an elementary item is determined by the number of symbols that represent character positions contained in the PICTURE character-string. For example, consider this record description:


01 TRANREC. 
   03 FIELD-1 PIC X(7). 
   03 FIELD-2 PIC S9(5)V99. 

Both elementary items require seven bytes of memory; however, item FIELD-1 contains seven alphanumeric characters while item FIELD-2 contains seven decimal digits, an operational sign, and an implied decimal point. Operations on such items are independent of the mapping of the item into memory words (32-bit words that hold four 8-bit bytes). An item can begin in the leftmost or rightmost byte of a word with no effect on the function of any operation that refers to that item. (However, the position of items in memory can have an effect on run-time performance.)
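The sizes described above can be checked with a short program. This is a minimal sketch, assuming the compiler supports the LENGTH intrinsic function and the column-1 layout used by the other examples in this chapter (comment lines begin with an asterisk in column 1). The program name SHOWSIZE and the item ITEM-SIZE are illustrative names, not part of the original record description.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SHOWSIZE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  TRANREC.
    03  FIELD-1   PIC X(7).
    03  FIELD-2   PIC S9(5)V99.
* ITEM-SIZE is a hypothetical work item that holds each length.
01  ITEM-SIZE     PIC 9(4).
PROCEDURE DIVISION.
A00-BEGIN.
* Each elementary item occupies seven bytes.
    COMPUTE ITEM-SIZE = FUNCTION LENGTH(FIELD-1).
    DISPLAY ITEM-SIZE.
    COMPUTE ITEM-SIZE = FUNCTION LENGTH(FIELD-2).
    DISPLAY ITEM-SIZE.
* The group size is the sum of its elementary items.
    COMPUTE ITEM-SIZE = FUNCTION LENGTH(TRANREC).
    DISPLAY ITEM-SIZE.
    STOP RUN.
```

Under these assumptions the program displays 0007, 0007, and 0014: the sign and the implied decimal point of FIELD-2 take no extra bytes, and the group TRANREC is the sum of its two elementary items.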

In effect, the compiler sees memory as a continuous array of bytes, not words. This becomes particularly important when you are defining a table using the OCCURS clause (see Chapter 4).

In Compaq COBOL, all records, and elementary items with level 01 or 77, begin at an address that is a multiple of 8 bytes (a quadword boundary). By default, the Compaq COBOL compiler will locate a subordinate data item at the next unassigned byte location.

Refer to Chapter 15, Chapter 16, and the SYNCHRONIZED clause in the Compaq COBOL Reference Manual for a complete discussion of alignment.

3.3 Special Characters

Compaq COBOL allows you to handle any of the 128 characters of the ASCII character set as alphanumeric data, even though many of the characters are control characters, which usually direct input/output devices. Generally, alphanumeric data manipulations attach no meaning to the 8th bit of an 8-bit byte. Thus, you can move and compare these control characters in the same manner as alphabetic and numeric characters.

Note

Some control characters have 0 in the high-order bit and are part of the ASCII character set, while others have 1 in the high order bit and are not part of the ASCII character set.

Although the object program can manipulate all ASCII characters, certain control characters cannot appear in nonnumeric literals because the compiler uses them to delimit the source text.

You can place special characters into items of the object program by defining symbolic characters in the SPECIAL-NAMES paragraph or by using the EXTERNAL clause. Refer to the Compaq COBOL Reference Manual for information on these two topics.

The ASCII character set listed in the Compaq COBOL Reference Manual indicates the decimal value for any ASCII character.
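As a hedged illustration of the symbolic-character approach, the sketch below places a horizontal tab between two fields. It assumes the standard COBOL rule that the integer in the SYMBOLIC CHARACTERS clause is the ordinal position in the character set, which is the decimal value plus 1 (tab is decimal 9, so ordinal 10). The names SYMCHAR, HT-CHAR, OUT-LINE, PART-1, SEP, and PART-2 are illustrative, and comment lines assume an asterisk in column 1.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SYMCHAR.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
* Ordinal position 10 corresponds to decimal value 9 (tab).
    SYMBOLIC CHARACTERS HT-CHAR IS 10.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  OUT-LINE.
    05  PART-1    PIC X(4) VALUE "NAME".
    05  SEP       PIC X.
    05  PART-2    PIC X(5) VALUE "VALUE".
PROCEDURE DIVISION.
A00-BEGIN.
* The symbolic character behaves like a figurative constant.
    MOVE HT-CHAR TO SEP.
    DISPLAY OUT-LINE.
    STOP RUN.
```

How the tab appears on output depends on the device that receives the line.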

3.4 Testing Nonnumeric Items

The following sections describe the relation and class tests as they apply to nonnumeric items.

3.4.1 Relation Tests of Nonnumeric Items

An IF statement with a relation condition can compare the value in a nonnumeric data item with another value and use the result to alter the flow of control in the program.

An IF statement with a relation condition compares two operands. Either of these operands can be an identifier or a literal, but they cannot both be literals. If the stated relation exists between the two operands, the relation condition is true.

When coding a relational operator, leave a space before and after each reserved word. When the reserved word NOT is present, the compiler considers it and the next key word or relational character to be a single relational operator defining the comparison. Table 3-1 shows the meanings of the relational operators.

Table 3-1 Relational Operator Descriptions

Operator                        Description
IS [NOT] GREATER THAN           The first operand is greater than (or not
IS [NOT] >                      greater than) the second operand.
IS [NOT] LESS THAN              The first operand is less than (or not
IS [NOT] <                      less than) the second operand.
IS [NOT] EQUAL TO               The first operand is equal to (or not
IS [NOT] =                      equal to) the second operand.
IS GREATER THAN OR EQUAL TO     The first operand is greater than or
IS >=                           equal to the second operand.
IS LESS THAN OR EQUAL TO        The first operand is less than or equal
IS <=                           to the second operand.

3.4.1.1 Classes of Data

Compaq COBOL allows comparison of both numeric class operands and nonnumeric class operands; however, it handles each class of data differently. For example, it allows a comparison of two numeric operands regardless of the formats specified in their respective USAGE clauses, but it requires that all other comparisons (including comparisons of any group items) be between operands with the same usage. It compares numeric class operands with respect to their algebraic values and nonnumeric (or numeric and nonnumeric) class operands with respect to a specified collating sequence. (See Section 2.5.1 for numeric comparisons.)

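If only one of the operands is numeric, it must be an integer data item or an integer literal, and it must be DISPLAY usage. In these cases, the manner in which the compiler handles the numeric operand depends on the nonnumeric operand, as follows:

If the nonnumeric operand is an elementary item or a literal, the compiler treats the numeric operand as if it had been moved into an alphanumeric item of the same size and then compared.
If the nonnumeric operand is a group item, the compiler treats the numeric operand as if it had been moved into a group item of the same size and then compared.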

The compiler does not accept a comparison between a noninteger numeric operand and a nonnumeric operand. If you try to compare these two items, you receive a diagnostic message at compile time.

3.4.1.2 Comparison Operations

If the two operands are acceptable, the compiler compares them character by character. The compiler starts at the first byte and compares the corresponding bytes until it either encounters a pair of unequal bytes or reaches the last byte of the longer operand.

If the compiler encounters a pair of unequal characters, it considers their relative position in the collating sequence. The operand with the character that is positioned higher in the collating sequence is the greater operand.

If the operands have different lengths, the comparison proceeds as though the shorter operand were extended on the right by sufficient ASCII spaces (decimal 32) to make both operands the same length.

If all character pairs are equal, the operands are equal.
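The comparison rules above can be sketched as follows. The program name CMPPAD and the three item names are illustrative, and comment lines assume an asterisk in column 1.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. CMPPAD.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  SHORT-ITEM    PIC X(3) VALUE "ABC".
01  LONG-ITEM     PIC X(6) VALUE "ABC".
01  OTHER-ITEM    PIC X(6) VALUE "ABD".
PROCEDURE DIVISION.
A00-BEGIN.
* SHORT-ITEM is treated as if extended with three spaces,
* so it matches LONG-ITEM, which holds "ABC" and three spaces.
    IF SHORT-ITEM = LONG-ITEM
        DISPLAY "EQUAL"
    ELSE
        DISPLAY "NOT EQUAL"
    END-IF.
* The first unequal pair is C and D; C is lower in ASCII.
    IF SHORT-ITEM < OTHER-ITEM
        DISPLAY "LESS"
    ELSE
        DISPLAY "NOT LESS"
    END-IF.
    STOP RUN.
```

With the default ASCII collating sequence, this displays EQUAL and then LESS.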

3.4.2 Class Tests for Nonnumeric Items

An IF statement with a class condition tests the value in a nonnumeric data item (USAGE DISPLAY only) to determine whether it contains numeric, alphabetic, or user-defined data and uses the result to alter the flow of control in the program. For example:


IF ITEM-1 IS NUMERIC... 
IF ITEM-2 IS ALPHABETIC... 
IF ITEM-3 IS NOT NUMERIC... 

If the data item consists entirely of the ASCII characters 0 to 9, with or without the operational sign, the class condition is NUMERIC. If the item consists entirely of the ASCII characters A to Z (upper- or lowercase) and spaces, the class condition is ALPHABETIC.

The ALPHABETIC-LOWER test is true if the operand contains any combination of the lowercase alphabetic characters a to z, and the space. Otherwise the test is false.

The ALPHABETIC-UPPER test is true if the operand contains any combination of the uppercase alphabetical characters A to Z, and the space. Otherwise, the test is false.

You can also test a data item against a class that you define with the CLASS clause of the SPECIAL-NAMES paragraph.

A class test is true if the operand consists entirely of the characters listed in the definition of the CLASS-NAME in the SPECIAL-NAMES paragraph. Otherwise, the test is false.

When the reserved word NOT is present, the compiler considers it and the next key word as one class condition defining the class test to be executed. For example, NOT NUMERIC determines if an operand contains at least one nonnumeric character.
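A user-defined class test might look like the following sketch. The class name HEX-DIGIT, the program name CLSTEST, and the data values are illustrative, and comment lines assume an asterisk in column 1.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. CLSTEST.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SPECIAL-NAMES.
* HEX-DIGIT is a hypothetical class of 0 to 9 and A to F.
    CLASS HEX-DIGIT IS "0" THRU "9", "A" THRU "F".
DATA DIVISION.
WORKING-STORAGE SECTION.
01  ITEM-1        PIC X(4) VALUE "1A2F".
01  ITEM-2        PIC X(4) VALUE "12G4".
PROCEDURE DIVISION.
A00-BEGIN.
* True: every character of ITEM-1 is in the class.
    IF ITEM-1 IS HEX-DIGIT
        DISPLAY "ITEM-1 IS HEX"
    END-IF.
* True: ITEM-2 contains G, which is outside the class.
    IF ITEM-2 IS NOT HEX-DIGIT
        DISPLAY "ITEM-2 IS NOT HEX"
    END-IF.
* True: ITEM-1 contains at least one nonnumeric character.
    IF ITEM-1 IS NOT NUMERIC
        DISPLAY "ITEM-1 IS NOT NUMERIC"
    END-IF.
    STOP RUN.
```

All three conditions are true for these values, so all three messages are displayed.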

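If the item being tested is described as a numeric data item, it can only be tested as NUMERIC or NOT NUMERIC. The NUMERIC test cannot examine either of the following:

An item described as alphabetic
A group item containing elementary items whose data descriptions indicate the presence of operational signs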

For further information on using class conditions with numeric items, refer to the Compaq COBOL Reference Manual.

3.5 Data Movement

Three Compaq COBOL statements (MOVE, STRING, and UNSTRING) perform most of the data movement operations required by business-oriented programs. The MOVE statement simply moves data from one item to another. The STRING statement concatenates a series of sending items into a single receiving item. The UNSTRING statement disperses a single sending item into multiple receiving items. Section 3.6 describes the MOVE statement. Chapter 5 describes STRING and UNSTRING.

The MOVE statement handles most data movement operations on character strings. However, it is limited in its ability to handle multiple items. For example, it cannot, by itself, concatenate a series of sending items into a single receiving item or disperse a single sending item into several receiving items.

Two MOVE statements will, however, bring the contents of two items together into a third (receiving) item if the receiving item has been subdivided with subordinate elementary items that match the two sending items in size. If other items are to be concatenated into the third item, and they differ in size from the first two items, then the receiving item requires additional subdivisions (through redefinition).

Example 3-1 demonstrates item concatenation using two MOVE statements.

Example 3-1 Item Concatenation Using Two MOVE Statements

01  SEND-1        PIC X(5) VALUE "FIRST". 
01  SEND-2        PIC X(6) VALUE "SECOND". 
01  RECEIVE-GROUP. 
    05  REC-1     PIC X(5). 
    05  REC-2     PIC X(6). 
PROCEDURE DIVISION. 
A00-BEGIN. 
    MOVE SEND-1 TO REC-1. 
    MOVE SEND-2 TO REC-2. 
    DISPLAY RECEIVE-GROUP. 
    STOP RUN. 

The result of the concatenation follows:


FIRSTSECOND 

Two MOVE statements can also disperse the contents of one sending item to several receiving items. The first MOVE statement moves the leftmost end of the sending item to a receiving item; then the second MOVE statement moves the rightmost end of the sending item to another receiving item. (The second receiving item must first be described with the JUSTIFIED clause.) Characters from the middle of the sending item cannot easily be moved to any receiving item without extensive redefinitions of the sending item or a reference modification loop (as with concatenation).

The STRING and UNSTRING statements handle concatenation and dispersion more easily than compound moves. Reference modification handles substring operations more easily than compound moves, STRING, or UNSTRING.
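The two-MOVE dispersal described above, together with a reference-modified move for the middle characters, can be sketched as follows. The program name DISPERSE and all item names are illustrative, and comment lines assume an asterisk in column 1.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. DISPERSE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  SEND-ITEM     PIC X(11) VALUE "FIRSTSECOND".
01  REC-LEFT      PIC X(5).
01  REC-RIGHT     PIC X(6) JUSTIFIED RIGHT.
01  REC-MID       PIC X(3).
PROCEDURE DIVISION.
A00-BEGIN.
* Left-aligned move: the rightmost six characters are truncated.
    MOVE SEND-ITEM TO REC-LEFT.
* JUSTIFIED RIGHT: the leftmost five characters are truncated.
    MOVE SEND-ITEM TO REC-RIGHT.
* Reference modification: three characters starting at position 5.
    MOVE SEND-ITEM(5:3) TO REC-MID.
    DISPLAY REC-LEFT.
    DISPLAY REC-RIGHT.
    DISPLAY REC-MID.
    STOP RUN.
```

This displays FIRST, SECOND, and TSE.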

3.6 Using the MOVE Statement

The MOVE statement moves the contents of one item into another. For example:


MOVE FIELD1 TO FIELD2 
 
MOVE CORRESPONDING FIELD1 TO FIELD2 

FIELD1 is the sending item name, and FIELD2 is the receiving item name.

The first statement causes the compiler to move the contents of FIELD1 into FIELD2. The two items need not be the same size, class, or usage; they can be either group or elementary items. If the two items are not the same length, the compiler aligns them on one end or the other. It also truncates or space-fills the other end. The movement of group items and nonnumeric elementary items is discussed in Section 3.6.1 and Section 3.6.2, respectively.

The MOVE statement alters the contents of every character position in the receiving item.

3.6.1 Group Moves

If either the sending or receiving item is a group item, the compiler considers the move to be a group move. It treats both the sending and receiving items as if they were alphanumeric items.

If the sending item is a group item, and the receiving item is an elementary item, the compiler ignores the receiving item description except for the size description, in bytes, and any JUSTIFIED clause. It conducts no conversion or editing on the sending item's data.
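The following sketch contrasts an elementary numeric move with a group move into the same storage. The program name GRPMOVE and the item names are illustrative, and comment lines assume an asterisk in column 1.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. GRPMOVE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  REC-GROUP.
    05  R-NUM     PIC 9(3).
PROCEDURE DIVISION.
A00-BEGIN.
* Elementary numeric move: the value is aligned and zero-filled.
    MOVE 7 TO R-NUM.
    DISPLAY "[" REC-GROUP "]".
* Group move: treated as alphanumeric, so the character is
* left-justified and space-filled with no conversion.
    MOVE "7" TO REC-GROUP.
    DISPLAY "[" REC-GROUP "]".
    STOP RUN.
```

The first DISPLAY shows [007]; the second shows [7  ]. After the group move, R-NUM holds spaces and is no longer a valid numeric value.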

3.6.2 Elementary Moves

If both items of a MOVE statement are elementary items, their PICTURE character-strings control their data movement. If the receiving item was described as numeric or numeric edited, the rules for numeric moves control the data movement (see Section 2.6). Nonnumeric receiving items are alphanumeric, alphanumeric edited, or alphabetic.

Table 3-2 shows the valid and invalid nonnumeric elementary moves.

Table 3-2 Nonnumeric Elementary Moves

                         Receiving Item Category
Sending Item Category    Alphabetic    Alphanumeric or Alphanumeric Edited
ALPHABETIC               Valid         Valid
ALPHANUMERIC             Valid         Valid
ALPHANUMERIC EDITED      Valid         Valid
NUMERIC INTEGER          Invalid       Valid
(DISPLAY ONLY)
NUMERIC EDITED           Invalid       Valid

In all valid moves, the compiler treats the sending item as though it had been described as PIC X(n). A JUSTIFIED clause in the sending item's description has no effect on the move. If the sending item's PICTURE character-string contains editing characters, the compiler uses them only to determine the item's size.

In valid nonnumeric elementary moves, the receiving item controls the movement of data. All of the following characteristics of the receiving item affect the move:

Its size
Editing characters in its description
The JUSTIFIED clause in its description

The JUSTIFIED clause and editing characters are mutually exclusive.

When an item that contains no editing characters or JUSTIFIED clause in its description is used as the receiving item of a nonnumeric elementary MOVE statement, the compiler moves the characters starting at the leftmost position in the item and proceeding, character by character, to the rightmost position. If the sending item is shorter than the receiving item, the compiler fills the remaining character positions with spaces. If the sending item is longer than the receiving item, truncation occurs on the right.
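A minimal sketch of left-aligned truncation and space filling follows. The program name ELEMMOVE and the item names are illustrative; the brackets in the DISPLAY statements only make trailing spaces visible. Comment lines assume an asterisk in column 1.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. ELEMMOVE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  SEND-ITEM     PIC X(5) VALUE "ABCDE".
01  REC-SHORT     PIC X(3).
01  REC-LONG      PIC X(8).
PROCEDURE DIVISION.
A00-BEGIN.
* Receiving item is shorter: truncation occurs on the right.
    MOVE SEND-ITEM TO REC-SHORT.
* Receiving item is longer: remaining positions get spaces.
    MOVE SEND-ITEM TO REC-LONG.
    DISPLAY "[" REC-SHORT "]".
    DISPLAY "[" REC-LONG "]".
    STOP RUN.
```

This displays [ABC] and then [ABCDE   ].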

Numeric items used in nonnumeric elementary moves must be integers in DISPLAY format.

If the description of the numeric data item indicates the presence of an operational sign (either as a separate character or as an overpunched character), or if there are P characters in its character-string, the compiler first moves the item to a temporary location. It removes the sign and fills out any P character positions with zero digits. It then uses the temporary value as the sending item as if it had been described as PIC X(n). The temporary value can be shorter than the original value if a separate sign was removed, or longer than the original value if P character positions were filled with zeros.

If the sending item is an unsigned numeric class item with no P characters in its character-string, the MOVE is accomplished directly from the sending item, and a temporary item is not required.

If the numeric sending item is shorter than the receiving item, the compiler fills the remaining character positions of the receiving item with spaces.
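The handling of a signed numeric sending item can be sketched as follows. The program name NUMMOVE and the item names are illustrative, and comment lines assume an asterisk in column 1.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. NUMMOVE.
DATA DIVISION.
WORKING-STORAGE SECTION.
* Stored as four characters: the sign followed by three digits.
01  NUM-ITEM      PIC S9(3) SIGN IS LEADING SEPARATE VALUE -42.
01  REC-ITEM      PIC X(6).
PROCEDURE DIVISION.
A00-BEGIN.
* The separate sign is removed, so only the three digits are
* moved; the remaining three positions are space-filled.
    MOVE NUM-ITEM TO REC-ITEM.
    DISPLAY "[" REC-ITEM "]".
    STOP RUN.
```

This displays [042   ]: the sign character does not reach the receiving item.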

3.6.2.1 Edited Moves

This section explains the following insertion editing characters:
B Blank insertion position
0 Zero insertion position
/ Slash insertion position

When an item with an insertion editing character in its PICTURE character-string is the receiving item of a nonnumeric elementary MOVE statement, each receiving character position corresponding to an editing character receives the insertion byte value. Table 3-3 illustrates the use of such symbols with the following statement, where FIELD1 is described as PIC X(7):


MOVE FIELD1 TO FIELD2 

Table 3-3 Data Movement with Editing Symbols

FIELD1      FIELD2 Character-String    FIELD2 Contents After MOVE
070476      XX/99/XX                   07/04/76
04JUL76     99BAAAB99                  04sJULs76
2351212     XXXBXXXX/XX/               235s1212/ss/
123456      0XB0XB0XB0X                01s02s03s04

Legend: s = space

Data movement always begins at the left end of the sending item and moves only to the byte positions described as A, 9, or X in the receiving item PICTURE character-string. When the sending item is exhausted, the compiler supplies space characters to fill any remaining character positions (not insertion positions) in the receiving item. If the receiving item is exhausted before the last character is moved from the sending item, the compiler ignores the remaining sending item characters.
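Two rows of Table 3-3 can be reproduced with the sketch below. The program name EDITMOVE and the receiving item names DATE-OUT and PHONE-OUT are illustrative; the brackets in the second DISPLAY only make the inserted and filled spaces visible. Comment lines assume an asterisk in column 1.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. EDITMOVE.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  FIELD1        PIC X(7).
01  DATE-OUT      PIC XX/99/XX.
01  PHONE-OUT     PIC XXXBXXXX/XX/.
PROCEDURE DIVISION.
A00-BEGIN.
* Six data positions receive 070476; the seventh sending
* character (a space) is ignored.
    MOVE "070476" TO FIELD1.
    MOVE FIELD1 TO DATE-OUT.
    DISPLAY DATE-OUT.
* Seven sending characters fill XXX and XXXX; the two
* remaining X positions receive spaces.
    MOVE "2351212" TO FIELD1.
    MOVE FIELD1 TO PHONE-OUT.
    DISPLAY "[" PHONE-OUT "]".
    STOP RUN.
```

This displays 07/04/76 and then [235 1212/  /], matching the first and third rows of the table.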

Any necessary conversion of data from one form of internal representation to another takes place during valid elementary moves, along with any editing specified for, or de-editing implied by, the receiving data item.

