Compaq COBOL
User Manual




Chapter 5
Using the STRING, UNSTRING, and INSPECT Statements

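The STRING, UNSTRING, and INSPECT statements give your Compaq COBOL programs the following capabilities: concatenating data (STRING, Section 5.1), separating data (UNSTRING, Section 5.2), and counting or replacing characters (INSPECT).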

5.1 Concatenating Data Using the STRING Statement

The STRING statement concatenates the contents of one or more sending items into a single receiving item.

The statement has many forms; the simplest is equivalent in function to a nonnumeric MOVE statement. Consider the following example:


STRING FIELD1 DELIMITED BY SIZE INTO FIELD2. 

If the two items are the same size, or if the sending item (FIELD1) is larger, the statement is equivalent to the following statement:


MOVE FIELD1 TO FIELD2. 

If the sending item of the string is shorter than the receiving item, the compiler does not replace unused positions in the receiving item with spaces. Thus, the STRING statement can leave some portion of the receiving item unchanged.

The receiving item of the string must be an elementary alphanumeric item with no JUSTIFIED clause or editing characters in its description. Thus, the data movement of the STRING statement always fills the receiving item with the sending item from left to right and with no editing insertions.
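The difference between MOVE and STRING on a longer receiving item can be seen in a short program. The following is a minimal sketch, not taken from the manual: the program name STRDEMO1, the item sizes, and the asterisk pre-fill are assumptions chosen for illustration, and the source is laid out for fixed (ANSI) format with Area A starting in column 8.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRDEMO1.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FIELD1      PIC X(3) VALUE "ABC".
       01  FIELD2      PIC X(6).
       PROCEDURE DIVISION.
       BEGIN.
      * MOVE space-fills the unused positions of FIELD2.
           MOVE "******" TO FIELD2.
           MOVE FIELD1 TO FIELD2.
           DISPLAY "MOVE:   [" FIELD2 "]".
      * STRING replaces positions 1-3 only; 4-6 keep the asterisks.
           MOVE "******" TO FIELD2.
           STRING FIELD1 DELIMITED BY SIZE INTO FIELD2.
           DISPLAY "STRING: [" FIELD2 "]".
           STOP RUN.
```

Under the rules described above, the first DISPLAY should show ABC followed by three spaces, and the second should show ABC followed by the three asterisks that STRING left unchanged.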

5.1.1 Multiple Sending Items

The STRING statement can concatenate a series of sending items into one receiving item. Consider the following example:


STRING FIELD1A FIELD1B FIELD1C DELIMITED BY SIZE 
                           INTO FIELD2. 

In this sample STRING statement, FIELD1A, FIELD1B, and FIELD1C are all sending items. The compiler moves them to the receiving item (FIELD2) in the order in which they appear in the statement, from left to right, resulting in the concatenation of their values.

If FIELD2 is not large enough to hold all three items, the operation stops when it is full. If the operation stops while moving one of the sending items, the compiler ignores the remaining characters of that item and any other sending items not yet processed. For example, if FIELD2 is filled while it is receiving FIELD1B, the compiler ignores the rest of FIELD1B and all of FIELD1C.

If the sending items do not fill the receiving item, the operation stops when the last character of the last sending item (FIELD1C) is moved. It does not alter the contents nor space-fill the remaining character positions of the receiving item.
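Both cases, a receiving item that is too short and one that is longer than the data sent, can be sketched as follows. The program name STRDEMO2, the receiving item names SHORT-FIELD and LONG-FIELD, and all values are hypothetical; the asterisk pre-fill is there only to make untouched positions visible.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRDEMO2.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FIELD1A     PIC X(3)  VALUE "ABC".
       01  FIELD1B     PIC X(3)  VALUE "DEF".
       01  FIELD1C     PIC X(3)  VALUE "GHI".
       01  SHORT-FIELD PIC X(5).
       01  LONG-FIELD  PIC X(12).
       PROCEDURE DIVISION.
       BEGIN.
      * Pre-fill both receiving items so untouched positions show.
           MOVE ALL "*" TO SHORT-FIELD LONG-FIELD.
      * Receiving item fills while FIELD1B is being moved:
      * the rest of FIELD1B and all of FIELD1C are ignored.
           STRING FIELD1A FIELD1B FIELD1C DELIMITED BY SIZE
               INTO SHORT-FIELD.
           DISPLAY "[" SHORT-FIELD "]".
      * Receiving item is longer than the 9 characters sent:
      * positions 10-12 are left unchanged.
           STRING FIELD1A FIELD1B FIELD1C DELIMITED BY SIZE
               INTO LONG-FIELD.
           DISPLAY "[" LONG-FIELD "]".
           STOP RUN.
```

The expected results are [ABCDE] for the short item and [ABCDEFGHI***] for the long one.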

The sending items can be nonnumeric literals and figurative constants (except for ALL literal). Example 5-1 sets up an address label by stringing the data items CITY, STATE, and ZIP into ADDRESS-LINE. The figurative constant SPACE and the literal ". " (a period followed by a space) separate the information.

Example 5-1 Using the STRING Statement and Literals

01 ADDRESS-GROUP. 
   03 CITY           PIC X(20). 
   03 STATE          PIC XX. 
   03 ZIP            PIC X(5). 
01 ADDRESS-LINE      PIC X(31). 
      . 
      . 
      . 
PROCEDURE DIVISION. 
BEGIN. 
   STRING CITY SPACE STATE ". " SPACE ZIP 
        DELIMITED BY SIZE INTO ADDRESS-LINE. 
   .
   .
   .

5.1.2 Using the DELIMITED BY Phrase

Although the sending items of the STRING statement are fixed in size at compile time, they are frequently filled with spaces. For example, if a 20-character city item contains the text MAYNARD followed by 13 spaces, the STRING statement using the DELIMITED BY SIZE phrase would move the text (MAYNARD) and the unwanted 13 spaces (assuming the receiving item is at least 20 characters long). The DELIMITED BY phrase, written with a data name or literal, eliminates this problem.

The delimiter can be a literal, a data item, a figurative constant, or the word SIZE. It cannot, however, be ALL literal, because ALL literal has an indefinite length. When the phrase contains the word SIZE, the compiler moves each sending item in total, until it either exhausts the characters in the sending item or fills the receiving item.

If you use the code in Example 5-1, and CITY is a 20-character item, the result of the STRING operation might look like Figure 5-1.

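Figure 5-1 Results of the STRING Operation

(With AYER in CITY, MA in STATE, and 01432 in ZIP, ADDRESS-LINE contains the following.)

AYER                 MA.  01432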


A more attractive and readable report can be produced by having the STRING operation produce this line:


AYER, MA. 01432 

To accomplish this, use the figurative constant SPACE as a delimiter on the sending item:


MOVE 1 TO P. 
STRING CITY DELIMITED BY SPACE 
        INTO ADDRESS-LINE WITH POINTER P. 
STRING ", " STATE ". " ZIP 
        DELIMITED BY SIZE 
        INTO ADDRESS-LINE WITH POINTER P. 

This example makes use of the POINTER phrase (see Section 5.1.3). The first STRING statement moves data characters until it encounters a space character, which matches the delimiter SPACE. The second STRING statement supplies the literal, the 2-character STATE item, another literal, and the 5-character ZIP item.

The delimiter can be varied for each item within a single STRING statement by repeating the DELIMITED BY phrase after each of the sending item names to which it applies. Thus, the shorter STRING statement in the following example has the same effect as the two STRING statements in the preceding example. (Placing the operands on separate source lines has no effect on the operation of the statement, but it improves program readability and simplifies debugging.)


STRING CITY DELIMITED BY SPACE 
        ", " STATE ". " 
        ZIP DELIMITED BY SIZE 
        INTO ADDRESS-LINE. 

The sample STRING statement cannot handle 2-word city names, such as San Francisco, because the compiler considers the space between the two words as a match for the delimiter SPACE. A longer delimiter, such as two or three spaces (nonnumeric literal), can solve this problem. Only when a sequence of characters matches the delimiter does the movement stop for that data item. With a 2-character delimiter, the same statement can be rewritten in a simpler form:


STRING CITY ", " STATE ". " ZIP 
        DELIMITED BY "  " INTO ADDRESS-LINE. 

Because only the CITY item contains two consecutive spaces, the delimiter's search of the other items will always be unsuccessful, and the effect is the same as moving the full item (delimiting by SIZE).

Data movement under control of a data name or literal generally executes more slowly than data movement delimited by SIZE.

Remember, the remainder of the receiving item is not space-filled, as with a MOVE statement. If ADDRESS-LINE is to be printed on a mailing label, for example, the STRING statement should be preceded by the statement:


MOVE SPACES TO ADDRESS-LINE. 

This statement guarantees a space-fill to the right of the concatenated result. Alternatively, the last item concatenated by the STRING statement can be an item previously set to SPACES. This sending item must either be moved under control of a delimiter other than SPACE or use the value of POINTER and reference modification.
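Putting the pieces of this section together, the sketch below clears the receiving item and then builds the address line with a 2-character delimiter. It is an illustration only: the program name STRDEMO3 and the sample values for CITY, STATE, and ZIP are assumptions, while the data descriptions follow Example 5-1.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRDEMO3.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ADDRESS-GROUP.
           03  CITY        PIC X(20) VALUE "SAN FRANCISCO".
           03  STATE       PIC XX    VALUE "CA".
           03  ZIP         PIC X(5)  VALUE "94105".
       01  ADDRESS-LINE    PIC X(31).
       PROCEDURE DIVISION.
       BEGIN.
      * Clear the receiving item; STRING will not space-fill it.
           MOVE SPACES TO ADDRESS-LINE.
      * Two consecutive spaces end the CITY value; the other
      * sending items contain no such pair and move in full.
           STRING CITY ", " STATE ". " ZIP
               DELIMITED BY "  " INTO ADDRESS-LINE.
           DISPLAY "[" ADDRESS-LINE "]".
           STOP RUN.
```

The brackets should enclose SAN FRANCISCO, CA. 94105 followed by seven spaces. With DELIMITED BY SPACE in place of the 2-character literal, the city would be cut off after SAN.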

5.1.3 Using the POINTER Phrase

Although the STRING statement normally starts scanning at the leftmost position of the receiving item, the POINTER phrase makes it possible to start scanning at another point within the item. The scanning, however, continues left to right. Consider the following example:


MOVE 5 TO P. 
STRING FIELD1A FIELD1B DELIMITED BY SIZE 
        INTO FIELD2 WITH POINTER P. 

The value of P determines the starting character position in the receiving item. In this example, the 5 in P causes the program to move the first character of FIELD1A into character position 5 of FIELD2 (the leftmost character position of the receiving item is character position 1), and leave positions 1 to 4 unchanged.

When the STRING operation is complete, P points to one character position beyond the last character replaced in the receiving item. If FIELD1A and FIELD1B are both four characters long, P contains a value of 13 (5+4+4) when the operation is complete (assuming that FIELD2 is at least 13 characters long).
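The pointer arithmetic above can be checked with a small program. This is a sketch under stated assumptions: the program name STRDEMO4, the 14-character size of FIELD2, the PIC 99 description of P, and the hyphen pre-fill are all chosen for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRDEMO4.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FIELD1A     PIC X(4)  VALUE "ABCD".
       01  FIELD1B     PIC X(4)  VALUE "EFGH".
       01  FIELD2      PIC X(14).
       01  P           PIC 99.
       PROCEDURE DIVISION.
       BEGIN.
           MOVE ALL "-" TO FIELD2.
           MOVE 5 TO P.
           STRING FIELD1A FIELD1B DELIMITED BY SIZE
               INTO FIELD2 WITH POINTER P.
      * Positions 1-4 and 13-14 still hold hyphens.
           DISPLAY "[" FIELD2 "]".
      * P now points one position past the last character moved.
           DISPLAY "P = " P.
           STOP RUN.
```

FIELD2 should contain ----ABCDEFGH-- and P should contain 13, so a following STRING statement that uses the same pointer would continue at position 13.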

5.1.4 Using the OVERFLOW Phrase

When the SIZE option of the DELIMITED BY phrase controls the STRING operation, and the pointer value is either known or the POINTER phrase is not used, you can add the PICTURE sizes of sending items together at program development time to see if the receiving item is large enough to hold the sending items. However, if the DELIMITED BY phrase contains a literal or an identifier, or if the pointer value is not predictable, it can be difficult to tell whether or not the size of the receiving item will be large enough at run time. If the size of the receiving item is not large enough, an overflow can occur.

An overflow occurs when the receiving item is full and the program is either about to move a character from a sending item or is considering a new sending item. Overflow can also occur if, during the initialization of the statement, the pointer contains a value that is either less than 1 or greater than the length of the receiving item. In this case, the program moves no data to the receiving item and terminates the operation immediately.

The ON OVERFLOW phrase at the end of the STRING statement tests for an overflow condition:


STRING FIELD1A FIELD1B DELIMITED BY "C" 
        INTO FIELD2 WITH POINTER PNTR 
        ON OVERFLOW GO TO 200-STRING-OVERFLOW. 

The ON OVERFLOW phrase cannot distinguish the overflow caused by a bad initial value in the pointer from the overflow caused by a receiving item that is too short. Only a separate test preceding the STRING statement can distinguish between the two.

You can also use the NOT ON OVERFLOW phrase to specify statements that execute when the STRING operation completes without an overflow condition.
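One way to tell the two causes of overflow apart is to test the pointer before the STRING statement and reserve ON OVERFLOW for the case of a receiving item that is too short. The sketch below assumes a hypothetical item FIELD2-LEN that holds the length of FIELD2 and must be kept in step with its PICTURE by hand; the program name, paragraph name, and values are also illustrative.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRDEMO5.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FIELD1A     PIC X(3)  VALUE "ABC".
       01  FIELD1B     PIC X(4)  VALUE "DEFC".
       01  FIELD2      PIC X(4).
      * Length of FIELD2, kept in step with its PICTURE by hand.
       01  FIELD2-LEN  PIC 99    VALUE 4.
       01  PNTR        PIC 99.
       PROCEDURE DIVISION.
       BEGIN.
      * Pointer outside FIELD2: caught by the explicit test.
           MOVE 7 TO PNTR.
           PERFORM 100-BUILD.
      * Valid pointer, but "AB" plus "DEF" needs 5 positions.
           MOVE 1 TO PNTR.
           PERFORM 100-BUILD.
           STOP RUN.
       100-BUILD.
           MOVE SPACES TO FIELD2.
           IF PNTR < 1 OR PNTR > FIELD2-LEN
               DISPLAY "bad pointer value"
           ELSE
               STRING FIELD1A FIELD1B DELIMITED BY "C"
                   INTO FIELD2 WITH POINTER PNTR
                   ON OVERFLOW
                       DISPLAY "receiving item too short"
                   NOT ON OVERFLOW
                       DISPLAY "no overflow"
               END-STRING
           END-IF.
```

The first PERFORM should report the bad pointer without executing the STRING statement; the second should report that the receiving item is too short.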

Example 5-2 illustrates the overflow condition.

Example 5-2 Sample Overflow Condition

DATA DIVISION. 
     . 
     . 
     . 
01 FIELD1 PIC XXX VALUE "ABC". 
01 FIELD2 PIC XXXX. 
PROCEDURE DIVISION. 
           . 
           . 
           . 
1.    STRING FIELD1 QUOTE DELIMITED BY SIZE INTO FIELD2 
             ON OVERFLOW DISPLAY "overflow at 1". 
2.    STRING FIELD1 FIELD1 DELIMITED BY SIZE INTO FIELD2 
             ON OVERFLOW DISPLAY "overflow at 2". 
3.    STRING FIELD1 FIELD1 DELIMITED BY "C" INTO FIELD2 
             ON OVERFLOW DISPLAY "overflow at 3". 
4.    STRING FIELD1 FIELD1 FIELD1 FIELD1 
             DELIMITED BY "B" INTO FIELD2 ON OVERFLOW DISPLAY "overflow at 4". 
5.    STRING FIELD1 FIELD1 "D" DELIMITED BY "C" 
             INTO FIELD2 ON OVERFLOW DISPLAY "overflow at 5". 
6.    MOVE 2 TO P. 
 
      MOVE ALL QUOTES TO FIELD2. 
 
      STRING FIELD1 "AC" DELIMITED BY "C" 
             INTO FIELD2 WITH POINTER P ON OVERFLOW DISPLAY "overflow at 6". 

The numbers that precede the STRING statements in Example 5-2 correspond to the numbered results in Table 5-1.

Table 5-1 Results of Sample Overflow Statements
Statement   Value of FIELD2 After the STRING Operation   Overflow?
---------   ------------------------------------------   ---------
1.          ABC"                                         No
2.          ABCA                                         Yes
3.          ABAB                                         No
4.          AAAA                                         No
5.          ABAB                                         Yes
6.          "ABA                                         No

5.1.5 Common STRING Statement Errors

The following are common errors made when writing STRING statements:

5.2 Separating Data Using the UNSTRING Statement

The UNSTRING statement disperses the contents of a single sending item into one or more receiving items.

The statement has many forms; the simplest is equivalent in function to a nonnumeric MOVE statement. Consider the following example:


UNSTRING FIELD1 INTO FIELD2. 

Regardless of the relative sizes of the two items, the sample statement is equivalent to the following MOVE statement:


MOVE FIELD1 TO FIELD2. 

The sending item (FIELD1) can be either (1) a group item, or (2) an alphanumeric or alphanumeric edited elementary item. The receiving item (FIELD2) can be alphabetic, alphanumeric, or numeric, but it cannot specify any type of editing.

If the receiving item is numeric, it must be DISPLAY usage. The PICTURE character-string of a numeric receiving item can contain any of the legal numeric description characters except P and the editing characters. The UNSTRING statement moves the sending item to the numeric receiving item as if the sending item had been described as an unsigned integer. It automatically truncates or zero-fills as required.

If the receiving item is not numeric, the statement follows the rules for elementary nonnumeric MOVE statements. It left-justifies the data in the receiving item, truncating or space-filling as required. If the data description of the receiving item contains a JUSTIFIED clause, the compiler right-justifies the data, truncating or space-filling to the left as required.

5.2.1 Multiple Receiving Items

The UNSTRING statement can disperse one sending item into several receiving items. Consider the following example of the UNSTRING statement written with multiple receiving items:


UNSTRING FIELD1 INTO FIELD2A FIELD2B FIELD2C. 

The compiler-generated code performs the UNSTRING operation by scanning across FIELD1, the sending item, from left to right. When the number of characters scanned equals the number of characters in the receiving item, the scanned characters are moved into that item and the next group of characters is scanned for the next receiving item.

If each of the receiving items in the preceding example (FIELD2A, FIELD2B, and FIELD2C) is 5 characters long, and FIELD1 is 15 characters long, FIELD1 is scanned until the number of characters scanned equals the size of FIELD2A (5). Those first five characters are moved to FIELD2A, and scanning is resumed at the sixth character position in FIELD1. Next, FIELD1 is scanned from character position 6, until the number of scanned characters equals the size of FIELD2B (5). The sixth through the tenth characters are then moved to FIELD2B, and the scanner is set to the next (eleventh) character position in FIELD1. For the last move in this example, characters 11 to 15 of FIELD1 are moved into FIELD2C.
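The scan described above can be sketched with three alphanumeric receiving items. The program name UNSDEMO1 and the value of FIELD1 are assumptions chosen so that each 5-character group is easy to recognize.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSDEMO1.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FIELD1      PIC X(15) VALUE "ABCDEFGHIJKLMNO".
       01  FIELD2A     PIC X(5).
       01  FIELD2B     PIC X(5).
       01  FIELD2C     PIC X(5).
       PROCEDURE DIVISION.
       BEGIN.
      * No DELIMITED BY phrase: the size of each receiving item
      * determines how many characters it takes from FIELD1.
           UNSTRING FIELD1 INTO FIELD2A FIELD2B FIELD2C.
           DISPLAY "[" FIELD2A "] [" FIELD2B "] [" FIELD2C "]".
           STOP RUN.
```

The three items should receive ABCDE, FGHIJ, and KLMNO respectively.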

Each data movement acts as an individual MOVE statement, the sending item of which is an alphanumeric item equal in size to the receiving item. If the receiving item is numeric, the move operation converts the data to numeric form. For example, consider what would happen if the items under discussion had the data descriptions and were manipulating the values shown in Table 5-2.

Table 5-2 Values Moved into the Receiving Items Based on the Sending Item Value
FIELD1             FIELD2A     FIELD2B             FIELD2C
PIC X(15)          PIC X(5)    PIC S9(5)           PIC S999V99
VALUE IS:                      LEADING SEPARATE
---------------    --------    ----------------    -----------
ABCDE1234512345    ABCDE       +12345              3450{
XXXXX0000100123    XXXXX       +00001              1230{

FIELD2A is an alphanumeric item. Therefore, the statement simply conducts an elementary nonnumeric move with the first five characters.

FIELD2B, however, has a leading separate sign that is not included in its size. Thus, the compiler moves only five numeric characters and generates a positive sign (+) in the separate sign position.

FIELD2C has an implied decimal point with two character positions to the right of it, plus an overpunched sign on the low-order digit. The sending item should supply five numeric digits. However, because the sending item is alphanumeric, the compiler treats it as an unsigned integer; it truncates the two high-order digits and supplies two zero digits for the decimal positions. Furthermore, it supplies a positive overpunch sign, making the low-order digit a +0 (ASCII {). A single UNSTRING statement cannot recognize a sign character or a decimal point in the sending item.

If the sending item is shorter than the sum of the sizes of the receiving items, the compiler ignores the remaining receiving items. If the compiler reaches the end of the sending item before it reaches the end of one of the receiving items, it moves the scanned characters into that receiving item. It either left-justifies and fills the remaining character positions with spaces for alphanumeric data, or else it decimal point-aligns and zero-fills the remaining character positions for numeric data.

Consider the following statement with reference to the corresponding PICTURE character-strings and values in Table 5-3:


UNSTRING FIELD1 INTO FIELD2A FIELD2B. 

FIELD2A is a 3-character alphanumeric item. It receives the first three characters of FIELD1 (ABC) in every operation. FIELD2B, however, runs out of characters every time before filling, as Table 5-3 illustrates.

Table 5-3 Handling a Short Sending Item
FIELD1       FIELD2B                    FIELD2B
PIC X(6)     PICTURE IS:                Value After UNSTRING Operation
VALUE IS:
---------    -----------------------    ------------------------------
ABCDEF       XXXXX                      DEF
ABC246       S99999                     0024F
ABC246       S9V999                     600{
ABC246       S9999 LEADING SEPARATE     +0246

5.2.2 Controlling Moved Data Using the DELIMITED BY Phrase

The size of the data to be moved can be controlled by a delimiter, rather than by the size of the receiving item. The DELIMITED BY phrase supplies the delimiter characters.

UNSTRING delimiters can be literals, figurative constants (including ALL literal), or identifiers (identifiers can even be subscripted data names). This section describes the use of these three types of delimiters. Subsequent sections cover multiple delimiters, the COUNT phrase, and the DELIMITER phrase.

Consider the following sample UNSTRING statement with the figurative constant SPACE as a delimiter:


UNSTRING FIELD1 DELIMITED BY SPACE 
         INTO FIELD2. 

In this example, the compiler scans the sending item (FIELD1), searching for a space character. If it encounters a space, it moves all of the scanned (nonspace) characters that precede that space to the receiving item (FIELD2). If it finds no space character, it moves the entire sending item. When the compiler has determined the size of the data to be moved, it moves that data following the rules for the MOVE statement, truncating, space-filling, or zero-filling as required by the description of the receiving item.

Table 5-4 shows the results of the following UNSTRING operation that uses a literal asterisk delimiter:


UNSTRING FIELD1 DELIMITED BY "*" 
         INTO FIELD2. 

Table 5-4 Results of Delimiting with an Asterisk
FIELD1       FIELD2                       FIELD2
PIC X(6)     PICTURE IS:                  Value After UNSTRING
VALUE IS:
---------    -------------------------    --------------------
ABCDEF       XXX                          ABC
ABCDEF       X(7)                         ABCDEF#
ABCDEF       XXX JUSTIFIED                DEF
******       XXX                          ###
*ABCDE       XXX                          ###
A*****       XXX JUSTIFIED                ##A
246***       S9999                        024F
12345*       S9999 TRAILING SEPARATE      2345+
2468**       S999V9 LEADING SEPARATE      +4680
*246**       9999                         0000

Legend: # = space

If the delimiter matches the first character in the sending item, the compiler considers the size of the sending item to be zero. The operation still takes place, however, and fills the receiving item with spaces (if it is nonnumeric) or zeros (if it is numeric).

A delimiter can also be applied to an UNSTRING statement that has multiple receiving items:


UNSTRING FIELD1 DELIMITED BY SPACE 
         INTO FIELD2A FIELD2B. 

The compiler generates code that scans FIELD1 searching for a character that matches the delimiter. If it finds a match, it moves the scanned characters to FIELD2A and sets the scanner to the next character position to the right of the character that matched. The compiler then resumes scanning FIELD1 for a character that matches the delimiter. If it finds a match, it moves all of the characters between the character that first matched the delimiter and the character that matched on the second scan to FIELD2B, and sets the scanner to the next character position to the right of the character that matched.

The DELIMITED BY phrase handles additional items in the same manner as it handled FIELD2B.

Table 5-5 illustrates the results of the following delimited UNSTRING operation into multiple receiving items:


UNSTRING FIELD1 DELIMITED BY "*" 
         INTO FIELD2A FIELD2B. 

Table 5-5 Results of Delimiting Multiple Receiving Items
             Values After UNSTRING Operation
FIELD1       FIELD2A      FIELD2B
PIC X(8)     PIC X(3)     PIC X(3)
VALUE IS:
---------    ---------    ---------
ABC*DEF*     ABC          DEF
ABCDE*FG     ABC          FG#
A*B****      A##          B##
*AB*CD**     ###          AB#
**ABCDEF     ###          ###
A*BCDEFG     A##          BCD
ABC**DEF     ABC          ###
A******B     A##          ###


Legend: # = space
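Several rows of Table 5-5 can be reproduced with a short program. This is a sketch for illustration: the program name UNSDEMO2 and the paragraph name 100-SPLIT are assumptions, and the DISPLAY statement shows actual spaces where the table uses the # symbol.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSDEMO2.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FIELD1      PIC X(8).
       01  FIELD2A     PIC X(3).
       01  FIELD2B     PIC X(3).
       PROCEDURE DIVISION.
       BEGIN.
           MOVE "ABC*DEF*" TO FIELD1.
           PERFORM 100-SPLIT.
           MOVE "ABCDE*FG" TO FIELD1.
           PERFORM 100-SPLIT.
           MOVE "*AB*CD**" TO FIELD1.
           PERFORM 100-SPLIT.
           MOVE "ABC**DEF" TO FIELD1.
           PERFORM 100-SPLIT.
           STOP RUN.
       100-SPLIT.
      * Each asterisk ends one field; an asterisk found at the
      * scan position yields an empty (space-filled) field.
           UNSTRING FIELD1 DELIMITED BY "*"
               INTO FIELD2A FIELD2B.
           DISPLAY FIELD1 " -> [" FIELD2A "] [" FIELD2B "]".
```

The last case shows the limitation of a single-character delimiter: in ABC**DEF the second asterisk ends an empty field, so FIELD2B receives spaces and DEF is never moved.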

The previous examples illustrate the limitations of a single-character delimiter. To overcome these limitations, a delimiter of more than one character or a delimiter preceded by the word ALL may be used.

Table 5-6 shows the results of the following UNSTRING operation using a 2-character delimiter:


UNSTRING FIELD1 DELIMITED BY "**" 
         INTO FIELD2A FIELD2B. 

Table 5-6 Results of Delimiting with Two Asterisks
             Values After UNSTRING Operation
FIELD1       FIELD2A      FIELD2B
PIC X(8)     PIC XXX      PIC XXX
VALUE IS:                 JUSTIFIED
---------    ---------    ---------
ABC**DEF     ABC          DEF
A*B*C*D*     A*B          ###
AB***C*D     AB#          C*D
AB**C*D*     AB#          *D*
AB**CD**     AB#          #CD
AB***CD*     AB#          CD*
AB****CD     AB#          ###


Legend: # = space
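The difference between a 2-character delimiter and a delimiter preceded by ALL can be sketched as follows. The program name UNSDEMO3 and the value of FIELD1 are assumptions, and the receiving items are plain PIC X(3) items without the JUSTIFIED clause used in Table 5-6. The behavior shown for ALL, treating a run of consecutive delimiter occurrences as a single delimiter, is the standard COBOL rule rather than something stated in this section.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSDEMO3.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  FIELD1      PIC X(8)  VALUE "A******B".
       01  FIELD2A     PIC X(3).
       01  FIELD2B     PIC X(3).
       PROCEDURE DIVISION.
       BEGIN.
      * "**" matches exactly two asterisks per delimiter, so the
      * second pair is found at once and FIELD2B gets no data.
           UNSTRING FIELD1 DELIMITED BY "**"
               INTO FIELD2A FIELD2B.
           DISPLAY "[" FIELD2A "] [" FIELD2B "]".
      * ALL "*" treats the whole run of asterisks as one delimiter.
           UNSTRING FIELD1 DELIMITED BY ALL "*"
               INTO FIELD2A FIELD2B.
           DISPLAY "[" FIELD2A "] [" FIELD2B "]".
           STOP RUN.
```

With the 2-character delimiter, FIELD2A should receive A and FIELD2B should be space-filled; with ALL "*", FIELD2A should receive A and FIELD2B should receive B.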

