Updated: 11 December 1998

Chapter 1
Overview

The DEC C Run-Time Library utilities help you to manage localization and time zone data for international software applications. Localization and time zone data is defined separately from the application and is bound to it only at run time.

The DEC C Run-Time Library includes the following utilities:

XPG4-compliant utilities (Section 1.1)
ZIC utility (Section 1.2)

1.1 Creating XPG4-Compliant Localizing Applications

To help you develop localizing applications for use internationally, the OpenVMS operating system offers, as part of its DEC C Run-Time Library, several utilities that support the XPG4 model (X/Open Portability Guide Issue 4) of internationalization. The following XPG4-compliant utilities are provided:

GENCAT utility (Section 1.1.1)
ICONV utility (Section 1.1.2)
LOCALE utility (Section 1.1.3)

These tools are useful only for applications written to the XPG4 model.

1.1.1 Creating and Invoking Message Catalogs

A message catalog is a binary file that contains the messages an application displays or writes. This file includes all the messages that the application issues, for example, error messages, information messages, screen displays, and prompts. To create message catalogs, use the GENCAT command.

GENCAT reads one or more input source files and the existing catalog file, if one exists. The source file is a text file that you create to hold the messages that your program might print. Use any text editor to enter messages into the source file. If you specify multiple source files, GENCAT processes them one after the other in the order in which you specify them. Each successive source file modifies the catalog.

Before you or your application issues GENCAT, create the required input source file and, if appropriate at this time, the catalog file.

For more detailed information about the GENCAT command, see Chapter 4.

1.1.1.1 Message Source File

When you create an input source file, follow these guidelines:

Group your messages into sets to represent functional subsets of the program.
Give each message a numeric identifier, which must be unique within its set.
Add commands recognized by GENCAT for manipulating sets and individual messages.

1.1.1.2 Message Catalog File

If a message catalog with the name catfile exists, GENCAT creates a new version of the file that includes the contents of the older version and then modifies it. If the catalog does not exist, GENCAT creates it with the name catfile.

1.1.1.3 Retrieving Messages from a Message Catalog

OpenVMS applications retrieve messages from a message catalog using the following DEC C Run-Time Library routines:

catopen
catgets
catclose

For details, see the DEC C Run-Time Library Reference Manual for OpenVMS Systems.
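
As a minimal sketch of how these three routines fit together, the following C function opens a catalog, looks up one message, and closes the catalog again. It is written against the standard XPG4 interfaces declared in <nl_types.h>; the function name fetch_message, the buffer-copy approach, and the fallback handling are illustrative assumptions, not part of the DEC C Run-Time Library.

```c
#include <nl_types.h>
#include <stdio.h>

/* Hypothetical helper: copy one message into buf.
   Returns 1 if the text came from the catalog, 0 if the fallback was used. */
int fetch_message(const char *catalog_name, int set_id, int msg_id,
                  const char *fallback, char *buf, size_t buf_len)
{
    const char *text = fallback;
    nl_catd catalog = catopen(catalog_name, 0);

    /* catopen returns (nl_catd)-1 when the catalog cannot be opened. */
    if (catalog != (nl_catd)-1) {
        /* catgets returns the fallback pointer if the message is missing. */
        text = catgets(catalog, set_id, msg_id, fallback);
    }

    int from_catalog = (text != fallback);

    /* Copy before closing: the returned text may live in catalog storage. */
    snprintf(buf, buf_len, "%s", text);

    if (catalog != (nl_catd)-1) {
        catclose(catalog);
    }
    return from_catalog;
}
```

A caller might invoke fetch_message("myapp.cat", 1, 3, "File not found", buf, sizeof buf), where myapp.cat is a hypothetical catalog name, 1 is the set number, and 3 is the message number within that set. Supplying a default string keeps the application usable when the catalog is not installed.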

1.1.2 Performing Codeset Conversions

The ICONV utility provides the following commands to create a conversion table file from a conversion source file and, using this file, to convert characters from one codeset to another:

The ICONV COMPILE command creates a conversion table file.
Using this conversion table file, the ICONV CONVERT command then converts characters in another, specified file from one codeset to another.

The ICONV commands support any 1- to 4-byte codesets that are state independent.

Note

This implementation places a restriction on tocodeset encodings: characters in tocodeset must not use 0xFF in the fourth byte.

1.1.2.1 Creating Conversion Tables

To create a conversion table file, issue the DCL command ICONV COMPILE:
ICONV COMPILE sourcefile tablefile

See the description of the ICONV COMPILE command in Chapter 4 for the format of the conversion source file.

See the description of the ICONV CONVERT command in Chapter 4 for the tablefile naming convention.

1.1.2.2 Converting from One Codeset to Another

To convert characters in a file from one codeset to another codeset, issue the ICONV CONVERT command:
ICONV CONVERT infile outfile /FROMCODE=fromcodeset /TOCODE=tocodeset

The converted characters are written to the output file parameter outfile.

1.1.3 Setting International Environment Logical Names

The LOCALE utility is an OpenVMS XPG4 localization utility that:

Compiles a binary locale file for use by utilities and C routines dependent on the setting of the international environment logical names
Loads a locale name into system memory as shared, read-only global data
Displays a summary of the current international environment as defined on your system and details of locales on your system
Unloads a locale name from system memory

The LOCALE utility supports the following commands:

The LOCALE COMPILE command converts a locale source file into a binary locale file for use by utilities and C routines. This command allows you to add new locales to your system in addition to those specified by Digital.
To compile a locale, the LOCALE COMPILE command uses two source files:
- A locale definition source file that contains categories that describe a locale. Locale categories, described in Table 4-3, include LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and LC_TIME.
- A character set description (charmap) file that defines the character set for the locale. The charmap, which defines character symbols as character encodings, is the source file for a coded character set (see Chapter 3).
The LOCALE LOAD command loads a locale name into system memory as several shared, read-only, global sections. All processes that access the loaded locale use this one copy of the locale, thereby reducing overall demand on system memory.
The LOCALE UNLOAD command unloads a specified locale name from system memory.
The LOCALE SHOW CHARACTER_DEFINITIONS command lists the names of the character set description files (charmaps) in the public directory defined by the logical name SYS$I18N_LOCALE. A charmap defines the symbolic names and values of characters in a coded character set. A charmap file has the file type .CMAP.
The LOCALE SHOW CURRENT command displays a summary of the current international environment as defined by several logical names representing locale categories. This command lists the settings for each locale category and the values of the environment variables LC_ALL and LANG. The logical name that defines a category has the same name as the category. For example, the LC_MESSAGES logical name defines the setting for the LC_MESSAGES category.
The LOCALE SHOW PUBLIC command lists all the public locales on the system, including locales listed in the directory defined by the logical name SYS$I18N_LOCALE as well as system locales supplied by the DEC C Run-Time Library.
The LOCALE SHOW VALUE command displays the value of one or more keywords from the current international environment. Locale categories and keywords in each category are listed in Table 4-4.

For more information about LOCALE commands, see Chapter 4.
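
From the C side, a program typically adopts the international environment with the standard setlocale routine and can then query individual categories. The sketch below uses only ISO C calls; the helper names adopt_environment_locale and category_setting are hypothetical, and the assumption that an empty locale name selects the settings defined by the international environment follows the standard setlocale behavior rather than anything stated in this chapter.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: bind the program to the locale settings of the
   current environment. Returns 1 on success, 0 if no locale was applied. */
int adopt_environment_locale(void)
{
    /* An empty name asks the library to consult the environment. */
    return setlocale(LC_ALL, "") != NULL;
}

/* Hypothetical helper: return the name currently in effect for one
   category, for example LC_TIME or LC_NUMERIC. */
const char *category_setting(int category)
{
    /* A NULL name queries the setting without changing it. */
    const char *name = setlocale(category, NULL);
    return name != NULL ? name : "(unknown)";
}
```

Calling category_setting(LC_TIME) after adopt_environment_locale() gives a per-category view similar in spirit to the summary that LOCALE SHOW CURRENT displays.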

1.2 Creating Time Zone Conversion Information

The Zone Information Compiler (ZIC) utility creates binary files containing time zone conversion information. These files are generated from the time zone source files that you specify.

The lines in the source files consist of fields. To create a valid time zone source file, follow these formatting requirements:

Any number of white space characters separate the fields.
Leading and trailing white spaces on input lines are ignored.
An unquoted number sign (#), the sharp character, in the input line introduces a comment that extends to the end of the line where this sign appears.
White space characters and sharp characters can be enclosed in double quotation marks (" ") if they are to be used as part of a field.
Any line that is blank after comment stripping is ignored.
Non-blank lines are expected to be one of three types:
- Rule lines (see Section 1.2.1)
- Zone lines (see Section 1.2.2)
- Link lines (see Section 1.2.3)
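
The formatting requirements above can be sketched as a small field splitter in C. This is an illustration of the rules as stated, not the ZIC implementation; the function name split_fields and the in-place splitting strategy are assumptions made for the example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: split one source line into fields, in place.
   Returns the number of fields stored in fields[] (at most max_fields).
   A return value of 0 means the line is blank after comment stripping. */
int split_fields(char *line, char *fields[], int max_fields)
{
    int count = 0;
    char *src = line;

    while (count < max_fields) {
        /* Leading white space, and white space between fields, is skipped. */
        while (isspace((unsigned char)*src)) {
            src++;
        }
        /* End of line, or an unquoted number sign, ends the input. */
        if (*src == '\0' || *src == '#') {
            break;
        }

        char *dst = src;          /* quotes are dropped, so dst trails src */
        int in_quotes = 0;
        fields[count++] = dst;

        while (*src != '\0') {
            if (*src == '"') {
                in_quotes = !in_quotes;
                src++;
                continue;
            }
            if (!in_quotes && (isspace((unsigned char)*src) || *src == '#')) {
                break;
            }
            *dst++ = *src++;
        }

        char stop = *src;         /* remember the delimiter before overwriting */
        *dst = '\0';
        if (stop == '\0' || stop == '#') {
            break;
        }
        src++;                    /* step over the white space delimiter */
    }
    return count;
}
```

Applied to the line `Rule USA 1969 1973 - Apr lastSun 2:00 1:00 D  # comment`, the function yields ten fields, starting with Rule and ending with D. A field written as "My #1 Zone" in double quotation marks is returned as a single field that contains the spaces and the number sign.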

1.2.1 Rule Lines

A rule line has the following form:

Rule NAME FROM TO TYPE IN ON AT SAVE LETTER/S

An example is as follows:

Rule USA 1969 1973 - Apr lastSun 2:00 1:00 D

The rule line consists of the following fields:

NAME

Gives the arbitrary name of the set of rules that this rule is part of.

FROM

Gives the first year in which the rule applies. The word minimum, or an abbreviation, means the minimum year with a representable time value. The word maximum, or an abbreviation, means the maximum year with a representable time value.

TO

Gives the final year in which the rule applies. In addition to minimum and maximum (as defined in FROM), the word only (or an abbreviation) may be used to repeat the value of the FROM field.

TYPE

Gives the type of year in which the rule applies. If TYPE is - , then the rule applies in all years between FROM and TO inclusively. ZIC executes the following command to check the type of year:

yearistype year type

An exit status of 1 means that the year is of the given type; an exit status of 5 means that the year is not of the given type.

IN

Gives the month in which the rule takes effect. Month names may be abbreviated.

ON

Gives the day on which the rule takes effect. Table 1-1 shows the recognized forms.

Table 1-1 Day the Rule Becomes Effective
Form	Meaning
5	Fifth of the month
lastSun	Last Sunday in the month
lastMon	Last Monday in the month
Sun>=8	First Sunday on or after the 8th
Sun<=25	Last Sunday on or before the 25th

Names of days of the week may be abbreviated or spelled out in full. Note that there must be no spaces within the ON field.
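
To make the ON forms concrete, the following C sketch resolves an ON field to a day of the month for a given year and month. It is an illustration only: the function names are hypothetical, the Gregorian day-of-week calculation is a standard table-based method, ambiguous one-letter day abbreviations are rejected, and results that would fall outside the month are reported as errors rather than rolled into an adjacent month.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Day of week for a Gregorian date: 0 = Sunday ... 6 = Saturday. */
static int day_of_week(int y, int m, int d)
{
    static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    if (m < 3) {
        y -= 1;
    }
    return (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7;
}

static int days_in_month(int y, int m)
{
    static const int len[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    return (m == 2 && leap) ? 29 : len[m - 1];
}

/* Match a full or abbreviated day name; -1 if unknown or ambiguous. */
static int parse_weekday(const char *s, size_t len)
{
    static const char *names[] = {"sunday", "monday", "tuesday", "wednesday",
                                  "thursday", "friday", "saturday"};
    int found = -1;
    if (len == 0) {
        return -1;
    }
    for (int i = 0; i < 7; i++) {
        size_t k = 0;
        if (len > strlen(names[i])) {
            continue;
        }
        while (k < len && tolower((unsigned char)s[k]) == names[i][k]) {
            k++;
        }
        if (k == len) {
            if (found >= 0) {
                return -1;      /* prefix matches more than one day */
            }
            found = i;
        }
    }
    return found;
}

/* Hypothetical helper: resolve an ON field for the given year and month
   (1 to 12). Returns the day of the month, or -1 if it cannot be resolved. */
int resolve_on_field(const char *on, int year, int month)
{
    int last = days_in_month(year, month);
    char *end;

    if (isdigit((unsigned char)on[0])) {            /* form: 5 */
        long n = strtol(on, &end, 10);
        return (*end == '\0' && n >= 1 && n <= last) ? (int)n : -1;
    }
    if (strncmp(on, "last", 4) == 0) {              /* form: lastSun */
        int wd = parse_weekday(on + 4, strlen(on + 4));
        if (wd < 0) {
            return -1;
        }
        return last - (day_of_week(year, month, last) - wd + 7) % 7;
    }

    int forward = 1;                                /* form: Sun>=8 */
    const char *op = strstr(on, ">=");
    if (op == NULL) {                               /* form: Sun<=25 */
        op = strstr(on, "<=");
        forward = 0;
    }
    if (op == NULL) {
        return -1;
    }
    int wd = parse_weekday(on, (size_t)(op - on));
    long n = strtol(op + 2, &end, 10);
    if (wd < 0 || end == op + 2 || *end != '\0' || n < 1 || n > last) {
        return -1;
    }
    int dow = day_of_week(year, month, (int)n);
    int day = forward ? (int)n + (wd - dow + 7) % 7
                      : (int)n - (dow - wd + 7) % 7;
    return (day >= 1 && day <= last) ? day : -1;
}
```

For the example rule shown earlier, resolve_on_field("lastSun", 1969, 4) returns 27, because the last Sunday of April 1969 fell on the 27th.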

AT

Gives the time of day when the rule takes effect. Table 1-2 shows the recognized forms.

Table 1-2 Time of Day the Rule Becomes Effective
Form	Meaning
2	Time in hours
2:00	Time in hours and minutes
15:00	24-hour format time (for times after noon)
1:28:14	Time in hours, minutes, and seconds

Any of these forms may be followed by the letter w if the given time is local wall clock time, or the letter s if the time is local standard time. In the absence of either the letter w or the letter s, wall clock time is assumed.
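
The AT forms and the w and s suffixes can be illustrated with a short C parser that converts the field to seconds after midnight. This is a sketch under stated assumptions: the function name parse_at_field is hypothetical, and the accepted ranges (hours 0 to 24, minutes and seconds 0 to 59) are choices made for the example, not limits taken from ZIC.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse an AT field such as "2", "2:00", "15:00",
   or "1:28:14", optionally followed by w or s.
   Returns seconds after midnight, or -1 if the field is not recognized.
   *is_standard is set to 1 for an s suffix, and to 0 otherwise
   (wall clock time is assumed when no suffix is given). */
long parse_at_field(const char *at, int *is_standard)
{
    char buf[32];
    size_t len = strlen(at);
    int h = 0, m = 0, s = 0, used = 0;

    *is_standard = 0;
    if (len == 0 || len >= sizeof buf) {
        return -1;
    }
    memcpy(buf, at, len + 1);

    /* Strip the optional suffix and remember which clock it names. */
    if (buf[len - 1] == 'w' || buf[len - 1] == 's') {
        *is_standard = (buf[len - 1] == 's');
        buf[--len] = '\0';
    }
    if (len == 0 || !isdigit((unsigned char)buf[0])) {
        return -1;
    }

    /* Try the longest form first; %n records how much input was consumed. */
    if (!(sscanf(buf, "%d:%d:%d%n", &h, &m, &s, &used) == 3 && (size_t)used == len) &&
        !(sscanf(buf, "%d:%d%n", &h, &m, &used) == 2 && (size_t)used == len) &&
        !(sscanf(buf, "%d%n", &h, &used) == 1 && (size_t)used == len)) {
        return -1;
    }
    if (h < 0 || h > 24 || m < 0 || m > 59 || s < 0 || s > 59) {
        return -1;
    }
    return h * 3600L + m * 60L + s;
}
```

With this sketch, "2" and "2:00" both yield 7200 seconds, and "2:00s" yields the same value with the standard-time flag set.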

SAVE

Gives the amount of time to be added to local standard time when the rule is in effect. This field has the same format as the AT field, although, of course, the letter w and s suffixes are not used.

LETTER/S

Gives the variable part of time zone abbreviations to be used when this rule is in effect; for example, the S or D in EST or EDT. If this field is - , the variable part is null.

1.2.2 Zone Lines

A zone line has the following form:

Zone NAME GMTOFF RULES/SAVE FORMAT UNTIL

An example is as follows:

Zone Australia/South-west 9:30 Aus CST 1987 Mar 15 2:00

The zone line consists of the following fields:

NAME

Gives the name of the time zone. This name is used in creating the time conversion information file for the zone.

GMTOFF

Gives the amount of time to add to Greenwich mean time (GMT) to get standard time in this zone. This field has the same format as the AT and SAVE fields of rule lines. If time must be subtracted from GMT, begin the field with a minus sign.

RULES/SAVE

Gives the name of the rule(s) that apply in the time zone, or alternatively, an amount of time to add to local standard time. If this field is - , standard time always applies in the time zone.

FORMAT

Gives the format for time zone abbreviations in this time zone. The pair of characters %s is used to show where the variable part of the time zone abbreviation goes.

UNTIL

Gives the time at which the GMT offset or the rule(s) change for a location. It is specified as the following:

A year
A month
A day
A time of day

If UNTIL is specified, the time zone information is generated from the given GMT offset and rule change until the time specified.

If you specify UNTIL, the next line must be a continuation line. A continuation line has the same form as a zone line, except that the string Zone and the NAME field are omitted. The continuation line places information, starting at the time specified in the UNTIL field of the previous line, in the file used by the previous line. Continuation lines may contain an UNTIL field, just as zone lines do, indicating that the next line is a further continuation.

1.2.3 Link Lines

A link line has the following form:

Link LINK-FROM LINK-TO

An example is as follows:

Link US/Eastern EST5EDT

In the OpenVMS implementation, Link is interpreted as a copy. Thus, the previous line copies the information from US/Eastern to EST5EDT.

The LINK-FROM field should appear as the NAME field in some zone line. The LINK-TO field is used as an alternate name for that zone.

Except for continuation lines, lines may appear in any order in the input.

Note

For areas with more than two types of local time, use local standard time in the AT field of the earliest transition time's rule to ensure that the earliest transition time recorded in the compiled file is correct.

Chapter 2
Locale File Format

A locale definition source file contains categories that describe a locale. You can convert a locale definition source file into a locale by using the LOCALE COMPILE command. Locales can be modified only by editing a locale definition source file and then using the LOCALE COMPILE command again on the new source file. Each locale source file section defines a category of locale data. A source file cannot contain more than one section for the same category.

2.1 Locale Categories

The following standard locale categories are supported:

LC_COLLATE --- Defines character or string collation information
LC_CTYPE --- Defines character classification, case conversion, and other character attributes
LC_MESSAGES --- Defines the format for affirmative and negative responses
LC_MONETARY --- Defines rules and symbols for formatting monetary numeric information
LC_NUMERIC --- Defines rules and symbols for formatting nonmonetary numeric information
LC_TIME --- Defines rules and symbols for formatting time and date information

2.1.1 Overriding Defaults

You can include optional declarations at the beginning of your locale source file to override the default comment and escape characters used in locale category definitions:

Escape character
The escape character is used in decimal or hexadecimal constants when they are specified in the locale file. The default escape character is the backslash (\). To define another escape character, include a line with the following format:
escape_char <char_symbol>
Comment character
The comment character is the first character of each comment entry in the locale file. The default comment character is the number sign (#). To define another comment character, use the following format:
comment_char <char_symbol>

In the preceding formats, <char_symbol> is the character's symbolic name as defined in the charmap file used to build the locale's codeset. One or more blank characters (spaces or tabs) must separate escape_char or comment_char from <char_symbol>.

2.1.2 Category Source Definitions

Each category source definition consists of the following:

The category header (category_name)
The associated keyword or value pairs that comprise the category body
The category trailer (END category_name)

For example:

LC_CTYPE
<source for LC_CTYPE category>
END LC_CTYPE

The source for all of the categories is specified using keywords, strings, character literals, and character symbols. Each keyword identifies either a definition or a rule. The remainder of the statement containing the keyword contains the operands to the keyword. Operands are separated from the keyword by one or more blank characters (spaces or tabs). A statement may be continued on the next line by placing a backslash (\) as the last character before the new-line character that terminates the line. Lines containing the comment character (#) in the first column are treated as comment lines.

A symbolic name begins with the left angle-bracket character (<) and ends with the right angle-bracket character (>). The characters between the < and the > can be any characters from the Portable Character Set, except for the control and space characters. For example, <A-diaeresis> could be a symbolic name for a character. Any symbolic name referenced in the locale source file must be defined via the Portable Character Set or in the character set description (charmap) file for that locale.

A character literal is the character itself, or a decimal, hexadecimal, or octal constant. A decimal constant contains two or three decimal digits and has the following form, where n is any decimal digit:

\dnn or \dnnn

A hexadecimal constant contains two hexadecimal digits and has the following form, where n is any hexadecimal digit:

\xnn

An octal constant contains two or three octal digits and has the following form, where n is any octal digit:

\nn or \nnn
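
The three constant forms can be checked with a small C routine such as the following sketch. The function name parse_char_constant is hypothetical, the escape character is passed in so that an escape_char override can be honored, and the sketch assumes that each constant denotes a single byte value in the range 0 to 255.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: decode a decimal (\dnn, \dnnn), hexadecimal (\xnn),
   or octal (\nn, \nnn) constant. Returns the byte value, or -1 if the text
   is not a valid constant in one of those forms. */
int parse_char_constant(const char *text, char escape_char)
{
    const char *digits;
    int base = 8;                       /* no letter after the escape: octal */
    size_t min_len = 2, max_len = 3;
    size_t len;
    int value = 0;

    if (text[0] != escape_char) {
        return -1;
    }
    digits = text + 1;
    if (*digits == 'd') {               /* decimal: two or three digits */
        base = 10;
        digits++;
    } else if (*digits == 'x') {        /* hexadecimal: exactly two digits */
        base = 16;
        min_len = max_len = 2;
        digits++;
    }

    len = strlen(digits);
    if (len < min_len || len > max_len) {
        return -1;
    }
    for (size_t i = 0; i < len; i++) {
        int c = (unsigned char)digits[i];
        int digit;
        if (isdigit(c)) {
            digit = c - '0';
        } else if (isxdigit(c)) {
            digit = tolower(c) - 'a' + 10;
        } else {
            return -1;
        }
        if (digit >= base) {
            return -1;                  /* for example, 8 in an octal constant */
        }
        value = value * base + digit;
    }
    return value <= 255 ? value : -1;
}
```

With the default escape character, the constants \d65, \x41, and \101 all decode to the same value, 65.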

The explicit definition of each category in a locale definition source file is not required. When a category is undefined in a locale definition source file, the LOCALE COMPILE command will not store any data value for this category in the resulting locale file.

2.2 LC_COLLATE Category

The LC_COLLATE category defines the relative order between collation items. This category begins with the LC_COLLATE header and ends with the END LC_COLLATE trailer.

A collation item is the unit of comparison for collation. A collation item may be a character or a sequence of characters. Every collation item in the locale has a set of weights, which determine if the collation item collates before, equal to, or after the other collation items in the locale. Each collation item is assigned collation weights by the LOCALE COMPILE command when the locale definition source file is compiled. These collation weights are then used by applications programs that compare strings.

String comparison is performed by comparing the collation weights of each character in the string until either a difference is found or the strings are determined to be equal. This comparison may be performed several times if the locale defines multiple collation orders. For example, in the French locale, the strings are compared using a primary set of collation weights. If they are equal on the basis of this comparison, they are compared again using a secondary set of collation weights. The number of collation weights associated with a collation item equals the number of collation sort rules defined for the locale.
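
The multiple-pass comparison can be sketched in C as follows. The weights here are invented for illustration and do not come from any real locale: the primary weight ignores case, and the secondary weight places lowercase before uppercase. The names weight, collate_compare, and LEVELS are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

#define LEVELS 2    /* number of collation sort rules in this toy example */

/* Hypothetical weight table: one weight per character per level. */
static int weight(unsigned char c, int level)
{
    if (level == 0) {
        return tolower(c);          /* primary: letters compare without case */
    }
    return isupper(c) ? 1 : 0;      /* secondary: lowercase sorts first */
}

/* Compare two strings level by level. Returns a negative value, zero, or a
   positive value, in the manner of strcmp. */
int collate_compare(const char *a, const char *b)
{
    for (int level = 0; level < LEVELS; level++) {
        for (size_t i = 0; ; i++) {
            /* The end of a string sorts before any character. */
            int wa = a[i] ? weight((unsigned char)a[i], level) : -1;
            int wb = b[i] ? weight((unsigned char)b[i], level) : -1;
            if (wa != wb) {
                return wa < wb ? -1 : 1;
            }
            if (a[i] == '\0') {
                break;              /* equal at this level: try the next one */
            }
        }
    }
    return 0;
}
```

Under these toy weights, "cote" and "Cote" are equal on the primary pass and are separated only by the secondary pass, whereas "apple" sorts before "Banana" on the primary pass alone.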

Every character defined in the charmap file (or every character in the Portable Character Set if no charmap file is specified) is itself a collation item. Additional collation items can be defined using the collating-element statement (see the description that follows).

Table 2-1 lists the statement keywords recognized in the LC_COLLATE category.

Table 2-1 LC_COLLATE Category Keywords
Keyword	Description
copy	Specifies the name of an existing locale to be used as the definition of this category. If you specify a copy statement, you need not specify any other keywords in this category.
collating-element	Specifies multicharacter collation items.
collating-symbol	Specifies collation symbols for use in collation sequence statements.
order_start	Specifies collation order statements that assign collation weights to collation items.

The collating-element, collating-symbol, and order_start statements are further described in the following sections.
