Creating the Code Set Registry Source File
The Code Set Registry Compiler csrc creates a character and code set registry from the information supplied in a character and code set registry source file. Code set registry
source files are created for input to csrc in two stages:
· When a DCE licensee ports DCE to one or more operating system platforms
· During the creation of an internationalized DCE cell or when a DCE machine is being configured for use in an internationalized DCE cell
In the first stage, DCE licensees create code set registry source files when they are porting DCE to a specific operating system platform and plan for their DCE product to support internationalized
DCE applications. DCE licensees receive from OSF a template character and code set registry source file that contains the unique identifiers that OSF has assigned to the character sets and code sets
that have been registered with OSF. (This file exists in src/rpc/csrc/csr/code_set_registry.txt.) They modify this file so that it contains the local code set name for each code set that
their platform supports. They can also add to this file any vendor-specific, non-OSF registered code set names and values that their platform supports.
In the second stage, DCE cell administrators create code set registry source files when they are configuring machines that are part of an internationalized DCE cell. Cell administrators of
internationalized DCE cells create their site-specific code set registry source files from one or more DCE licensee code set registry source files. Only one code set registry source file exists on
each machine. The number of source files that need to be modified depends upon the number of DCE platforms that exist in the cell.
A code set registry source file is composed of a series of code set records. Each record describes, in human-readable form, the mapping between an OSF-registered, a licensee-defined, or a
site-specific unique code set value and the character string that a given operating system uses when referring to that code set. This character string is called the "local code set name". Each code
set record specifies one code set, and has the following form:
start
field_list
end
The field_list consists of the following keyword-value or keyword-text pairs:
description text
   A comment string that briefly describes the code set.
loc_name text
   A maximum 32-byte string (31 character data bytes plus a terminating NULL) that contains the operating system-specific name of a code set or the keyword NONE.
rgy_value value
   A 32-bit hexadecimal value that uniquely identifies this code set. A registry value can be one that OSF has assigned or one that a DCE licensee or cell administrator has assigned.
char_values value[:value]
   One or more 16-bit hexadecimal values that uniquely identify each character set that this code set encodes. A character value can be one that OSF has assigned or one that a DCE licensee or a cell administrator has assigned.
max_bytes value
   A 16-bit value that specifies the maximum number of bytes this code set uses to encode one character.
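The field rules above can be checked mechanically. The following Python fragment is a minimal sketch, not part of csrc: the function names parse_hex and validate_record are hypothetical, and it assumes that max_bytes is written as a decimal number, as in the sample records.

```python
import re

# Matches hexadecimal literals such as 0x0011 or 0x00010001.
HEX_RE = re.compile(r"^0[xX][0-9a-fA-F]+$")

def parse_hex(text, bits):
    """Parse a hexadecimal literal and check that it fits in `bits` bits."""
    if not HEX_RE.match(text):
        raise ValueError(f"not a hexadecimal value: {text!r}")
    value = int(text, 16)
    if value >= (1 << bits):
        raise ValueError(f"{text} does not fit in {bits} bits")
    return value

def validate_record(fields):
    """Check one code set record, given as a dict of keyword -> text."""
    loc_name = fields["loc_name"]
    # 31 data bytes plus the terminating NULL give the 32-byte maximum.
    if len(loc_name.encode("utf-8")) > 31:
        raise ValueError("loc_name is longer than 31 bytes")
    # Assumption: max_bytes is decimal; it must fit in 16 bits.
    max_bytes = int(fields["max_bytes"])
    if not 1 <= max_bytes <= 0xFFFF:
        raise ValueError("max_bytes is out of range")
    return {
        "description": fields["description"],
        "loc_name": loc_name,
        "rgy_value": parse_hex(fields["rgy_value"], 32),
        # Multiple character set values are separated by colons.
        "char_values": [parse_hex(v, 16)
                        for v in fields["char_values"].split(":")],
        "max_bytes": max_bytes,
    }
```

A record that passes comes back with its numeric fields converted to integers; a record that breaks one of the size limits raises ValueError.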
Here is a sample of a licensee-supplied source file.
start
description ISO 8859-1:1987; Latin Alphabet No. 1
loc_name    iso88591
rgy_value   0x00010001
char_values 0x0011
max_bytes   1
end

start
description ISO 8859-2:1987; Latin Alphabet No. 2
loc_name    iso88592
rgy_value   0x00010002
char_values 0x0012
max_bytes   1
end

start
description ISO 8859-3:1988; Latin Alphabet No. 3
loc_name    iso88593
rgy_value   0x00010003
char_values 0x0013
max_bytes   1
end

start
description ISO 8859-4:1988; Latin Alphabet No. 4
loc_name    NONE
rgy_value   0x00010004
char_values 0x0014
max_bytes   1
end
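To illustrate how a source file of this shape breaks down into records, here is a minimal Python sketch. It is not the csrc parser: the function name parse_registry_source is hypothetical, and the sketch assumes one keyword per line with start and end on lines of their own, and does not handle backslash continuation lines.

```python
# Keywords allowed inside a start/end record.
KEYWORDS = {"description", "loc_name", "rgy_value", "char_values", "max_bytes"}

def parse_registry_source(text):
    """Split registry source text into a list of records (keyword -> text)."""
    records = []
    current = None  # fields of the record being read; None outside start/end
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line == "start":
            if current is not None:
                raise ValueError(f"line {lineno}: start inside a record")
            current = {}
        elif line == "end":
            if current is None:
                raise ValueError(f"line {lineno}: end without start")
            records.append(current)
            current = None
        else:
            parts = line.split(None, 1)  # keyword, then the rest of the line
            keyword = parts[0]
            value = parts[1] if len(parts) > 1 else ""
            if current is None or keyword not in KEYWORDS:
                raise ValueError(f"line {lineno}: unexpected text {line!r}")
            current[keyword] = value
    if current is not None:
        raise ValueError("last record has no end")
    return records

SAMPLE = """\
start
description ISO 8859-1:1987; Latin Alphabet No. 1
loc_name    iso88591
rgy_value   0x00010001
char_values 0x0011
max_bytes   1
end
start
description ISO 8859-4:1988; Latin Alphabet No. 4
loc_name    NONE
rgy_value   0x00010004
char_values 0x0014
max_bytes   1
end
"""

records = parse_registry_source(SAMPLE)
# Records whose loc_name is NONE are the ones the platform does not support.
unsupported = [r["rgy_value"] for r in records if r["loc_name"] == "NONE"]
```

Listing the records that still carry NONE is one way to see which code sets have no local name on a given platform.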
For each different DCE platform that exists in the cell, the cell administrator takes that platform's licensee-generated character and code set registry source file and modifies the code set records
within it to add the local code set names of any additional code sets that the site supports. (Note that the DCE licensees will have already modified the code set records for each code set that
their DCE platform supports.) The cell administrator can also add to each platform-specific source file any site-specific, non-OSF registered code set names and values.
The cell administrator modifies the code set records that correspond to the code sets that the site supports as follows:
· For each code set that the site supports, replace the NONE keyword in the loc_name field of the code set record with the name that your site uses to refer to the
code set and the operating system code set converters associated with it. For example, in a UNIX environment, code set converters exist in the iconv directory. In this case, you would
examine this directory to determine the names of the code set converters. If the site does not support a given code set, you must leave the NONE keyword in the loc_name field of the code set record.
· Fill in the description field of the code set record to provide a detailed description of the code set and character set(s) that it supports. The text field can
contain multiple lines; use the backslash character (\) to continue the line.
· Fill in the max_bytes field of the code set record with the maximum number of bytes that the code set uses to encode one character. The count should include any
single-shift control characters, if used.
· Add any site-specific code set values or character set values that have not been registered with OSF to the appropriate rgy_value and
char_values fields. These values must be in the range 0xf5000000 through 0xffffffff so that they do not collide with OSF-registered values. Use the colon character (:) to separate
multiple character set values.
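As an illustration of the last step, the following Python sketch builds the text of a site-specific code set record. The names is_site_specific and format_site_record, the example local name sitecs, and the example values are all hypothetical. The sketch assumes that the site-specific range runs from 0xf5000000 to the top of the 32-bit range (0xffffffff), and it applies the range check to rgy_value only, because char_values entries are 16-bit values.

```python
SITE_MIN = 0xF5000000
SITE_MAX = 0xFFFFFFFF  # assumption: the top of the 32-bit range

def is_site_specific(rgy_value):
    """Return True if rgy_value lies in the range reserved for sites."""
    return SITE_MIN <= rgy_value <= SITE_MAX

def format_site_record(description, loc_name, rgy_value, char_values, max_bytes):
    """Return the text of one code set record for a site-defined code set."""
    if not is_site_specific(rgy_value):
        raise ValueError(f"0x{rgy_value:08x} could collide with an OSF value")
    lines = [
        "start",
        f"description {description}",
        f"loc_name    {loc_name}",
        f"rgy_value   0x{rgy_value:08x}",
        # Multiple character set values are joined with colons.
        "char_values " + ":".join(f"0x{v:04x}" for v in char_values),
        f"max_bytes   {max_bytes}",
        "end",
    ]
    return "\n".join(lines)

# Hypothetical site code set that encodes two character sets.
record_text = format_site_record(
    "Example site-specific code set", "sitecs",
    0xF5000001, [0x0011, 0x0012], 2)
```

Rejecting values outside the reserved range before writing the record keeps a site-defined value from shadowing one that OSF has assigned.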
For additional source file usage information, see the csrc(8dce) reference page in the OSF DCE Command Reference.