Compaq COBOL
User Manual

Chapter 6
Processing Files and Records

The Compaq COBOL I/O system offers you a wide range of record management techniques while remaining transparent to you. You can select one of several file organizations and access modes, each of which is suited to a particular application. The file organizations available through Compaq COBOL are sequential, line sequential, relative, and indexed. The access modes are sequential, random, and dynamic.

This chapter introduces you to the following Compaq COBOL I/O features:

Defining files and records (Section 6.1)
Identifying files and records from your Compaq COBOL program (Section 6.2)
Creating and processing files (Section 6.3)
Reading files (Section 6.4)
Updating files (Section 6.5)
Backing up your files (Section 6.6)

For information about low-volume or terminal screen I/O using the ACCEPT and DISPLAY statements, see Chapter 11 and refer to the Compaq COBOL Reference Manual.

The operating system provides you with I/O services for handling, controlling, and spooling your I/O needs or requests. Compaq COBOL, through the I/O system, provides you with extensive capabilities for data storage, retrieval, and modification.

On the OpenVMS Alpha operating system, the Compaq COBOL I/O system consists of the Run-Time Library (RTL), which accesses Record Management Services (RMS). (On OpenVMS VAX, COBOL-generated code accesses RMS directly.) Refer to the OpenVMS Record Management Utilities Reference Manual and the OpenVMS Record Management Services Reference Manual for more information about RMS.

On the Tru64 UNIX operating system, the Compaq COBOL I/O system consists of the Run-Time Library (RTL) and facilities of Tru64 UNIX. In addition, the facilities of a third-party ISAM package are required for any use of ORGANIZATION INDEXED.

6.1 Defining Files and Records

A file is a collection of related records. You can specify the organization and size of a file as well as the record format and physical record size. The system creates a file with these characteristics and stores them with the file. Any program that accesses a file must specify the same characteristics as those that the system stored for that file when creating it.

A record is a group of related data elements. The space a record needs on a physical device depends on the file organization, the record format, and the number of bytes the record contains.

File organization is described in Section 6.1.1. Record format is described in Section 6.1.2.

6.1.1 File Organization

Compaq COBOL supports the following four types of file organization:

SEQUENTIAL---This organization requires that records be referenced in sequence from the first record to the last. This organization is useful for programs that normally access each record serially. (See the Sequential File Organization section in this chapter.)
LINE SEQUENTIAL (Alpha)---This organization is essentially the same as sequential. Line sequential allows you to treat files as collections of variable-length records, with each record containing one line of printable characters. This organization is useful for programs that access files created by text editors and similar programs. (See the Line Sequential File Organization (Alpha) section in this chapter.)
RELATIVE---This organization lets you access records randomly, or sequentially by record number values. While this organization is more flexible than sequential organization, it is less flexible than indexed organization because you cannot insert a record in the middle of your file unless you have an empty cell to contain it. (See the Relative File Organization section in this chapter.)
INDEXED---This organization lets you access records randomly or sequentially, by primary and alternate key values. This is a useful way to organize a file in which records will be added, changed, or deleted upon demand. (See the Indexed File Organization section in this chapter.)
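As a rough illustration of the four organizations listed above, the following sketch declares one file of each kind in the FILE-CONTROL paragraph. The program name, file names, and data names are hypothetical and chosen only for this example. It assumes a compiler that accepts ORGANIZATION IS LINE SEQUENTIAL and, for the indexed file, a platform with indexed run-time support. The program performs no I/O; it only shows where each organization is named.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ORGDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * All file names below are hypothetical.
           SELECT SEQ-FILE ASSIGN TO "SEQFILE.DAT"
               ORGANIZATION IS SEQUENTIAL.
           SELECT TXT-FILE ASSIGN TO "TXTFILE.TXT"
               ORGANIZATION IS LINE SEQUENTIAL.
      * Relative files are addressed by a record number held in
      * a data item outside the record (REL-KEY here).
           SELECT REL-FILE ASSIGN TO "RELFILE.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS REL-KEY.
      * Indexed files are addressed by key fields inside the
      * record.
           SELECT IDX-FILE ASSIGN TO "IDXFILE.DAT"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS IDX-PART-NUMBER
               ALTERNATE RECORD KEY IS IDX-SUPPLIER
                   WITH DUPLICATES.
       DATA DIVISION.
       FILE SECTION.
       FD  SEQ-FILE.
       01  SEQ-REC                PIC X(80).
       FD  TXT-FILE.
       01  TXT-REC                PIC X(80).
       FD  REL-FILE.
       01  REL-REC                PIC X(80).
       FD  IDX-FILE.
       01  IDX-REC.
           02  IDX-PART-NUMBER    PIC 9(4).
           02  IDX-SUPPLIER       PIC X(20).
       WORKING-STORAGE SECTION.
       01  REL-KEY                PIC 9(4) COMP.
       PROCEDURE DIVISION.
       MAIN-PARA.
           DISPLAY "FILE DECLARATIONS ONLY; NO I/O PERFORMED".
           STOP RUN.
```

The access modes shown (RANDOM and DYNAMIC) are only one possible choice; sequential and line sequential files are always accessed sequentially.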

Note

On Tru64 UNIX, a third-party product is required for INDEXED runtime support. Refer to the Read Before Installing... letter for up-to-date details on how to obtain the INDEXED runtime support.

Table 6-1 summarizes the advantages and disadvantages of these file organizations.

**Table 6-1 Compaq COBOL File Organizations---Advantages and Disadvantages**
File Organizations	Advantages	Disadvantages
Sequential	Uses disk and memory efficiently	Allows sequential access only
	Provides optimal usage if the application accesses all records sequentially on each run	Allows records to be added only to the end of a file
	Provides the most flexible record format
	Allows READ/WRITE sharing
	Allows data to be stored on many types of media, in a device-independent manner
	Allows easy file extension
Line Sequential (Alpha)	Most efficient storage format	Allows sequential access only
	Compatible with text editors	Used for printable characters only
		OPEN in I-O mode is not allowed
Relative	Allows sequential, random, and dynamic access	Allows data to be stored on disk only
	Provides random record deletion and insertion	Requires that record cells be the same size
	Allows READ/WRITE sharing
Indexed	Allows sequential, random, and dynamic access	Allows data to be stored on disk only
	Allows random record deletion and insertion on the basis of a user-supplied key	Requires more disk space
	Allows READ/WRITE sharing	Uses more memory to process records
	Allows variable-length records to change length on update	Generally requires multiple disk accesses to randomly process a record
	Allows easy file extension

Sequential File Organization

Sequential input/output, in which records are written and read in sequence, is the simplest and most common form of I/O. It can be performed on all I/O devices, including magnetic tape, disk, terminals, and line printers.

Sequential files consist of records that are arranged in the order in which they were written to the file. Figure 6-1 illustrates sequential file organization.

Figure 6-1 Sequential File Organization

Sequential files always contain an end-of-file (EOF) indication. On magnetic tapes, it is the EOF mark; on disk, it is a counter in the file header that designates the end of the file. Compaq COBOL statements can write over the EOF mark and, thus, extend the length of the file. Because the EOF indicates the end of useful data, Compaq COBOL provides no method for reading beyond it, even though the amount of space reserved for the file exceeds the amount actually used.
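The following sketch shows one way to extend a sequential file past its current end-of-file and then read it back until the AT END condition is raised. The program name, the file name "LOG.DAT", and all data names are hypothetical. It assumes a disk file that the program is allowed to create in the default directory.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EXTDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * "LOG.DAT" is a hypothetical file name.
           SELECT LOG-FILE ASSIGN TO "LOG.DAT"
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  LOG-FILE.
       01  LOG-REC                PIC X(40).
       WORKING-STORAGE SECTION.
       01  REC-COUNT              PIC 9(4) VALUE ZERO.
       01  EOF-FLAG               PIC X VALUE "N".
           88  AT-EOF             VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Create the file with two records.
           OPEN OUTPUT LOG-FILE.
           MOVE "FIRST RECORD" TO LOG-REC.
           WRITE LOG-REC.
           MOVE "SECOND RECORD" TO LOG-REC.
           WRITE LOG-REC.
           CLOSE LOG-FILE.
      * OPEN EXTEND positions the file after the last record,
      * so this WRITE lengthens the file.
           OPEN EXTEND LOG-FILE.
           MOVE "THIRD RECORD" TO LOG-REC.
           WRITE LOG-REC.
           CLOSE LOG-FILE.
      * Read until the end-of-file indication is reached.
           OPEN INPUT LOG-FILE.
           PERFORM UNTIL AT-EOF
               READ LOG-FILE
                   AT END
                       SET AT-EOF TO TRUE
                   NOT AT END
                       ADD 1 TO REC-COUNT
               END-READ
           END-PERFORM.
           CLOSE LOG-FILE.
           DISPLAY "RECORDS READ: " REC-COUNT.
           STOP RUN.
```

If the file did not exist before the run, the count displayed should be three: the two original records plus the one added in EXTEND mode.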

Occasionally a file with sequential organization, for example, a multiple-reel magnetic tape file, is so large that it requires more than one volume. An end-of-volume (EOV) label marks the end of recorded information on each volume and signals the file system to switch to a new volume. On multiple-volume files, the EOF mark appears only once, at the end of the last record on the last volume. Figure 6-2 depicts a multiple-volume, sequential file.

Figure 6-2 A Multiple-Volume, Sequential File

When you select the medium for your sequential file, consider the following:

Speed of access---Tape is significantly slower than disk. In general, most removable-media storage devices (magnetic, optical, and so forth) are slower than your fixed disks.
Frequency of use---Use removable media devices to store relatively static files, and save your fixed disk space for more dynamic files.
Cost---Fixed disks are generally more expensive than removable media devices. The more frequently you plan to access the data, the easier it is to justify maintaining the data on your fixed disks. For example, data that is accessed daily must be kept on readily available disks; quarterly or annual data could be offloaded to removable media.
Transportability---Use removable media if you need to use the file across systems that have no common disk devices (this technique is commonly referred to as "sneakernetting").

Refer to the OpenVMS I/O User's Reference Manual or the ltf(4) manpage for more information on magnetic tape formats.

Line Sequential File Organization (Alpha)

Line sequential file structure is essentially similar to the structure of sequential files, with the major difference being record length. Figure 6-3 illustrates line sequential file organization.

Figure 6-3 Line Sequential File Organization (Alpha)

A line sequential file consists of records of varying lengths arranged in the order in which they were written to the file. Each record is terminated with a "newline" character. The newline character is a line feed record terminator ('0A' hex).

Each record in a line sequential file should contain only printable characters and should not be written with a WRITE statement that contains either a BEFORE ADVANCING or an AFTER ADVANCING phrase.

Record length is determined by the maximum record length in the FD entry in the File Section and the number of characters in a line (not including the record terminator).

When your Compaq COBOL program reads a line from a line sequential file that is shorter than the record area, it reads up to the record terminator, discards the record terminator, and pads the rest of the record with a number of spaces necessary to equal the record's specified length. When your program reads a line from a line sequential file that is longer than the record area, it reads the number of characters necessary to fill the record area. The next READ, if any, will begin at the next printable character in the file that is not a record terminator.
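To observe the padding and splitting behavior described above, a sketch such as the following reads a text file into a deliberately small 20-character record area and displays each record between brackets. The program name, the file name "NOTES.TXT", and the data names are hypothetical, and the file is assumed to exist already (for example, created with a text editor). How a line longer than the record area is handled can differ between COBOL implementations, so treat the bracketed output as something to inspect rather than a guaranteed result.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LSREAD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * "NOTES.TXT" is a hypothetical, pre-existing text file.
           SELECT TEXT-FILE ASSIGN TO "NOTES.TXT"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  TEXT-FILE.
      * A small record area makes padding and splitting visible.
       01  TEXT-REC               PIC X(20).
       WORKING-STORAGE SECTION.
       01  EOF-FLAG               PIC X VALUE "N".
           88  AT-EOF             VALUE "Y".
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT TEXT-FILE.
           PERFORM UNTIL AT-EOF
               READ TEXT-FILE
                   AT END
                       SET AT-EOF TO TRUE
                   NOT AT END
      * The brackets show where the record area ends.
                       DISPLAY "[" TEXT-REC "]"
               END-READ
           END-PERFORM.
           CLOSE TEXT-FILE.
           STOP RUN.
```

Lines shorter than 20 characters should appear padded with spaces up to the closing bracket.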

Line sequential file organization is useful in reading and printing files that were created by an editor or word processor, which typically do not write fixed-length records.

Relative File Organization

A relative file consists of fixed-size record cells and uses a key to retrieve its records. The key, called a relative key, is an integer that specifies the record's storage cell or record number within the file. It is analogous to the subscript of a table. Relative file processing is available only on disk devices.

Any record on a relative file (unlike a sequential file) can be accessed with one READ operation. Also, relative files allow the program to read forward with respect to the current relative key. In addition to random access by relative key, relative files also permit you to delete and update records by relative key. Relative files are used primarily when records must be accessed in random order and the records can easily be associated with numbers that give the relative positions in the file.

In relative file organization, not every cell must contain a record. Although each cell occupies one record space, a field preceding the record on the storage medium indicates whether or not that cell contains a valid record. Thus, a file can contain fewer records than it has cells, and the empty cells can be anywhere in the file.

The numerical order of the cells remains the same during all operations on a relative file. However, accessing statements can move a record from one cell to another, delete a record from a cell, insert new records into empty cells, or rewrite existing cells.

With relative file processing, the I/O system organizes a file as a series of fixed-sized record cells. Cell size is based on the size specified as the maximum permitted length for a record in the file. The I/O system considers these cells as successively numbered from 1 (the first) to n (the last). A cell's relative record number (RRN) represents its location relative to the beginning of the file.

Because cell numbers in a relative file are unique, they can be used to identify both the cell and the record (if any) occupying that cell. Thus, record number 1 occupies the first cell in the file, record number 21 occupies the twenty-first cell, and so forth. Figure 6-4 illustrates relative file organization.

Figure 6-4 Relative File Organization

Relative files are used like tables. Their advantage over tables is that their size is limited to disk space rather than memory space. Also, their information can be saved from run to run. Relative files are best for records that are easily associated with ascending, consecutive numbers (so that the program conversion from data to cell number is easy), such as months (record keys 1 to 12), or the 50 U.S. states (record keys 1 to 50).
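The months example can be sketched as follows. The program writes records into cells 3 and 7 only, then reads cell 7 (occupied) and cell 5 (empty) by relative key. The program name, the file name "MONTHS.DAT", and the data names are hypothetical. The file is assumed to reside on disk, as relative organization requires.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RELDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * "MONTHS.DAT" is a hypothetical file name.
           SELECT MONTH-FILE ASSIGN TO "MONTHS.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS MONTH-NUMBER.
       DATA DIVISION.
       FILE SECTION.
       FD  MONTH-FILE.
       01  MONTH-REC.
           02  MONTH-NAME         PIC X(9).
       WORKING-STORAGE SECTION.
      * The relative key is the cell number; it is not part of
      * the record.
       01  MONTH-NUMBER           PIC 99.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN OUTPUT MONTH-FILE.
           MOVE 3 TO MONTH-NUMBER.
           MOVE "MARCH" TO MONTH-NAME.
           PERFORM WRITE-MONTH.
           MOVE 7 TO MONTH-NUMBER.
           MOVE "JULY" TO MONTH-NAME.
           PERFORM WRITE-MONTH.
           CLOSE MONTH-FILE.
           OPEN INPUT MONTH-FILE.
      * Cell 7 holds a record; cell 5 was never written.
           MOVE 7 TO MONTH-NUMBER.
           PERFORM READ-MONTH.
           MOVE 5 TO MONTH-NUMBER.
           PERFORM READ-MONTH.
           CLOSE MONTH-FILE.
           STOP RUN.
       WRITE-MONTH.
           WRITE MONTH-REC
               INVALID KEY
                   DISPLAY "WRITE FAILED FOR CELL " MONTH-NUMBER
           END-WRITE.
       READ-MONTH.
           READ MONTH-FILE
               INVALID KEY
                   DISPLAY "CELL " MONTH-NUMBER " IS EMPTY"
               NOT INVALID KEY
                   DISPLAY "CELL " MONTH-NUMBER ": " MONTH-NAME
           END-READ.
```

Each READ goes directly to the requested cell, with no need to read the preceding cells first; the read of cell 5 should take the INVALID KEY path because that cell contains no valid record.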

Indexed File Organization

An indexed file uses primary and alternate keys in the record to retrieve the contents of that record. Compaq COBOL allows sequential, random, and dynamic access to records. You access each record by one of its primary or alternate keys. Indexed file processing is available only on disk devices.

Unlike the sequential ordering of records in a sequential file or the relative positioning of records in a relative file, the physical location of records in indexed file organization is transparent to the program. You can add new records to an indexed file without recreating the file. You can also delete records, making room for new records.

Indexed file organization allows you to use a key to uniquely identify a record within the file. The location and length of the key are identical in all records. When creating an indexed file, you must select the data items to be the keys. Selecting such a data item indicates to the I/O system that the contents (key value) of that data item in any record written to the file can be used by the program to identify that record for subsequent retrieval. For more information, refer to the Environment Division clauses RECORD KEY IS and ALTERNATE RECORD KEY IS in the Compaq COBOL Reference Manual.

You must define at least one main key, called the primary key, for an indexed file. You can optionally define from 1 to 254 additional keys, called alternate keys. Each alternate key represents an additional data item in each record of the file. You can also use the key value in any of these alternate keys as a means of identifying the record for retrieval.

You define the primary and alternate key data items in the Record Description entry. Primary and alternate key values need not be unique if you specify the WITH DUPLICATES phrase on the corresponding RECORD KEY or ALTERNATE RECORD KEY clause. When duplicate key values are present, you can retrieve the first record written in the logical sort order of the records with the same key value and any subsequent records using the READ NEXT phrase. The logical sort order controls the order of sequential processing of the record. (For more information about retrieving records with duplicate key values, refer to the information about the READ statement in the Compaq COBOL Reference Manual.)

When you open a file, you must specify the same number and type of keys that were specified when the file was created. (This situation is subject to modification by the check duplicate keys and relax key checking options, as well as a duplicate key specification on an FD.) If the number or type of keys does not match, the system will issue a run-time diagnostic when you try to open the file.

As your program writes records into an indexed file, the I/O system locates the values contained in the primary and alternate keys. The I/O system builds these values into a tree-structured table or index, which consists of a series of entries. Each entry contains a key value copied from a record. With each key value is a pointer to the location in the file of the record from which the value was copied.
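A minimal sketch of keyed access follows. It writes two part records that share a supplier, then retrieves one record by primary key, one by alternate key, and the next record along the alternate key with READ NEXT. The record layout is borrowed from the part record used elsewhere in this chapter. The program name, the file name "PARTS.IDX", and the sample values are hypothetical. The sketch assumes indexed run-time support is available (on Tru64 UNIX this requires the third-party product mentioned earlier).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. IDXDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * "PARTS.IDX" is a hypothetical file name.
           SELECT PART-FILE ASSIGN TO "PARTS.IDX"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS DYNAMIC
               RECORD KEY IS PART-NUMBER
               ALTERNATE RECORD KEY IS PART-SUPPLIER
                   WITH DUPLICATES.
       DATA DIVISION.
       FILE SECTION.
       FD  PART-FILE.
       01  PART-RECORD.
           02  PART-NUMBER        PIC 9999.
           02  PART-SUPPLIER      PIC X(20).
           02  PART-QUANTITY      PIC 99999.
           02  PART-PRICE         PIC S9(5)V99.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN OUTPUT PART-FILE.
      * Sample values are invented for illustration.
           MOVE 1001 TO PART-NUMBER.
           MOVE "ACME" TO PART-SUPPLIER.
           MOVE 10 TO PART-QUANTITY.
           MOVE 2.50 TO PART-PRICE.
           PERFORM WRITE-PART.
           MOVE 1002 TO PART-NUMBER.
           MOVE "ACME" TO PART-SUPPLIER.
           MOVE 25 TO PART-QUANTITY.
           MOVE 4.75 TO PART-PRICE.
           PERFORM WRITE-PART.
           CLOSE PART-FILE.
           OPEN INPUT PART-FILE.
      * Random read by primary key.
           MOVE 1002 TO PART-NUMBER.
           READ PART-FILE
               INVALID KEY
                   DISPLAY "NO RECORD FOR PART " PART-NUMBER
               NOT INVALID KEY
                   DISPLAY "FOUND PART " PART-NUMBER
           END-READ.
      * Random read by alternate key: first record for the value.
           MOVE "ACME" TO PART-SUPPLIER.
           READ PART-FILE KEY IS PART-SUPPLIER
               INVALID KEY
                   DISPLAY "NO RECORD FOR SUPPLIER"
               NOT INVALID KEY
                   DISPLAY "FIRST ACME PART: " PART-NUMBER
           END-READ.
      * READ NEXT continues along the alternate key.
           READ PART-FILE NEXT RECORD
               AT END
                   DISPLAY "NO MORE RECORDS"
               NOT AT END
                   DISPLAY "NEXT PART: " PART-NUMBER
           END-READ.
           CLOSE PART-FILE.
           STOP RUN.
       WRITE-PART.
           WRITE PART-RECORD
               INVALID KEY
                   DISPLAY "WRITE FAILED FOR PART " PART-NUMBER
           END-WRITE.
```

The program never refers to a physical record position; every retrieval is expressed in terms of a key value, and the I/O system uses its index to locate the record.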

Figure 6-5 shows the general structure of an indexed file defined with a primary key only.

Figure 6-5 Indexed File Organization

For a more detailed explanation of indexed file structure on OpenVMS systems, refer to the Guide to OpenVMS File Applications.

For information about specifying file organization in your program, see Section 6.2.2.

6.1.2 Record Format

Compaq COBOL provides four record format types: fixed, variable, print-control, and stream. Table 6-2 shows the record format availability.

**Table 6-2 Record Format Availability**
	Sequential (Disk)	Sequential (Tape)	Line Sequential	Relative	Indexed
Fixed length	yes	yes	no	yes	yes
Variable length	yes	yes	no	yes	yes
Print control	yes	no	no	no	no
Stream	no	no	yes	no	no

The compiler determines the record format from the information that you specify as follows:

Fixed record format---Use the RECORD CONTAINS clause. This is the Compaq COBOL default.
Variable record format---Use the RECORD CONTAINS TO clause or the RECORD VARYING clause.
Print-control (VFC on OpenVMS systems or ASCII on Tru64 UNIX systems)---Use the Procedure Division ADVANCING phrase, the Environment Division APPLY PRINT-CONTROL or (on Tru64 UNIX) ASSIGN TO PRINTER clauses, or the Data Division LINAGE clause, or use Report Writer statements and phrases.
Stream (Alpha only)---Use the FILE-CONTROL ORGANIZATION IS LINE SEQUENTIAL clause. On OpenVMS Alpha you also get this format with /NOVFC.

If a file has more than one record description, the different record descriptions automatically share the same record area in memory. The I/O system does not clear this area before it executes the READ statement. Therefore, if the record read by the latest READ statement does not fill the entire record area, the area not overlaid by the incoming record remains unchanged.

The record format type that was specified when the file was created must be used for all subsequent accesses to the file.

In Example 6-1, a file contains a company's stock inventory information (part number, supplier, quantity, price). Within this file, the information is divided into records. All information for a single piece of stock constitutes a single record.

Example 6-1 Sample Record Description

01  PART-RECORD.
    02  PART-NUMBER      PIC 9999.
    02  PART-SUPPLIER    PIC X(20).
    02  PART-QUANTITY    PIC 99999.
    02  PART-PRICE       PIC S9(5)V99.

Each record in the stock file is itself divided into discrete pieces of information referred to as elementary items (02 level items). You give each elementary item a specific location in the record, give it a name, and define its size and type. The part number is an elementary item in the part record, as are supplier, quantity, and price. In this example, PART-RECORD contains four elementary items: PART-NUMBER, PART-SUPPLIER, PART-QUANTITY, and PART-PRICE.

Fixed-Length Records

Files with a fixed-length record format contain records that are all the same size. The compiler generates the fixed-length format when either of the following conditions is true:

The RECORD CONTAINS clause specifies a fixed number of characters.
The RECORD CONTAINS clause is omitted.

The compiler does not generate fixed-length format when any of the following conditions exist:

The file description contains a RECORD CONTAINS TO clause or a RECORD VARYING clause.
The program specifies a print-control file by referring to the file with:
- The ADVANCING phrase in a WRITE statement
- An APPLY PRINT-CONTROL clause in the Environment Division
- A LINAGE clause in the file description
- Report Writer statements and phrases
- ASSIGN TO PRINTER
LINE SEQUENTIAL organization is specified.

Fixed-length record size is determined by either the largest record description or the record size specified by the RECORD CONTAINS clause, whichever is larger. Example 6-2 shows how fixed-length record size is determined.

Example 6-2 Determining Fixed-Length Record Size

FD  FIXED-FILE
    RECORD CONTAINS 100 CHARACTERS.
01  FIXED-REC        PIC X(75).

For the file FIXED-FILE, the RECORD CONTAINS clause specifies a record size larger than the record description; therefore, the record size is 100 characters.

In Example 6-2, the following warning message is generated when the file FIXED-FILE is used:

"Record contains value is greater than length of longest record."

If multiple record descriptions are associated with the file, the size of the largest record description is used as the record size. In Example 6-3, for the file REC-FILE, the FIXED-REC2 record specifies the largest record size; therefore, the record size is 90 characters.

Example 6-3 Determining Fixed-Length Record Size for Files with Multiple Record Descriptions

FD  REC-FILE
    RECORD CONTAINS 80 CHARACTERS.
01  FIXED-REC1       PIC X(75).
01  FIXED-REC2       PIC X(90).

When the file REC-FILE is used, the following warning message is generated:

"Longest record is longer than RECORD CONTAINS value - longest record size used."

Variable-Length Records

Files with a variable-length record format can contain records of different lengths. The compiler generates the variable-length attribute for a file when the file description contains a RECORD VARYING clause or a RECORD CONTAINS TO clause.

Each record is written to the file with a 32-bit integer that specifies the size of the record. This integer is not counted in the size of the record.

Examples 6-4, 6-5, and 6-6 show you the three ways you can create a variable-length record file.

In Example 6-4, the DEPENDING ON phrase sets the OUT-REC record length. The IN-TYPE data field determines the OUT-LENGTH field's contents.

Example 6-4 Creating Variable-Length Records with the DEPENDING ON Phrase

FILE SECTION.
FD  INFILE.
01  IN-REC.
    03  IN-TYPE          PIC X.
    03  REST-OF-REC      PIC X(499).
FD  OUTFILE
    RECORD VARYING FROM 200 TO 500 CHARACTERS
    DEPENDING ON OUT-LENGTH.
01  OUT-REC              PIC X(500).
WORKING-STORAGE SECTION.
01  OUT-LENGTH           PIC 999 COMP VALUE ZEROES.
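Example 6-4 shows only the Data Division entries. As a hedged sketch of how the DEPENDING ON item might be used at run time, the following complete program sets OUT-LENGTH before each WRITE so that one 200-character record and one 500-character record are written. The program name, the file name "VARREC.DAT", and the record contents are hypothetical. Unlike Example 6-4, the sketch sets the length directly rather than deriving it from an input record type.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VARDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * "VARREC.DAT" is a hypothetical file name.
           SELECT OUTFILE ASSIGN TO "VARREC.DAT"
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  OUTFILE
           RECORD VARYING FROM 200 TO 500 CHARACTERS
           DEPENDING ON OUT-LENGTH.
       01  OUT-REC                PIC X(500).
       WORKING-STORAGE SECTION.
       01  OUT-LENGTH             PIC 999 COMP VALUE ZEROES.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN OUTPUT OUTFILE.
      * OUT-LENGTH must hold the desired size before each WRITE.
           MOVE "SHORT RECORD" TO OUT-REC.
           MOVE 200 TO OUT-LENGTH.
           WRITE OUT-REC.
           MOVE "LONG RECORD" TO OUT-REC.
           MOVE 500 TO OUT-LENGTH.
           WRITE OUT-REC.
           CLOSE OUTFILE.
           STOP RUN.
```

A value of OUT-LENGTH outside the 200 to 500 range declared in the FD would not describe a valid record for this file, so a real program should validate the length before writing.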

Example 6-5 shows how to create variable-length records using the RECORD VARYING phrase.

Example 6-5 Creating Variable-Length Records with the RECORD VARYING Phrase

FILE SECTION.
FD  OUTFILE
    RECORD VARYING FROM 200 TO 500 CHARACTERS.
01  OUT-REC-1            PIC X(200).
01  OUT-REC-2            PIC X(500).

Example 6-6 creates variable-length records by using the OCCURS clause with the DEPENDING ON phrase in the record description. Compaq COBOL determines record length by adding the size of the record's fixed portion to the size of the table, which is determined by the number of table occurrences at execution time.
