2.4 Conversion Specifications

Several of the Standard I/O functions (including the Terminal I/O functions) use conversion specifications to specify data formats for I/O. These functions are the formatted-input and formatted-output functions. Consider the following example:

int     x = 5;
FILE    *outfile;
   .
   .
   .
fprintf(outfile, "The answer is %d.\n", x);

The decimal value of the variable x replaces the conversion specification %d in the string to be written to the file associated with the identifier outfile.
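The same substitution can be sketched with a buffer as the destination, which makes the result easy to inspect. This is a minimal illustration, assuming a C99 library that provides snprintf; the helper name format_answer is hypothetical and not part of any library.

```c
#include <stdio.h>

/* Hypothetical helper: builds the same message as the fprintf call
   above, but writes it into a caller-supplied buffer. */
int format_answer(char *buf, size_t size, int x)
{
    /* %d is replaced by the decimal representation of x. */
    return snprintf(buf, size, "The answer is %d.\n", x);
}
```

With x equal to 5, the buffer receives "The answer is 5." followed by a new-line character, and the function returns 17, the number of characters written.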

Each conversion specification begins with a percent sign (%) and ends with a conversion specifier, which is a character that specifies the type of conversion to be performed. Optional characters can appear between the percent sign and the conversion specifier.

For the wide-character formatted I/O functions, the conversion specification is a string of wide characters. For the byte I/O equivalent functions, it is a string of bytes.

Sections 2.4.1 and 2.4.2 describe these optional characters and conversion specifiers.

2.4.1 Converting Input Information

The format specification string for the input of information can include three kinds of items:

White-space characters (spaces, tabs, and new-line characters), which match optional white-space characters in the input field.
Ordinary characters (not %), which must match the next nonwhite-space character in the input.
Conversion specifications, which govern the conversion of the characters in an input field and their assignment to an object indicated by a corresponding input pointer.

Each input pointer is an address expression indicating an object whose type matches that of a corresponding conversion specification. Conversion specifications form part of the format string. The indicated object is the target that receives the input value. There must be as many input pointers as there are conversion specifications, and the addressed objects must match the types of the conversion specifications.
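As a rough sketch of how input pointers pair with conversion specifications, the following hypothetical helper (read_record is an illustrative name, not a library function) reads three fields with sscanf. Each pointer type matches its conversion specification.

```c
#include <stdio.h>

/* Hypothetical helper: reads an int, a double, and a word from text.
   Returns the number of successful assignments (3 on full success). */
int read_record(const char *text, int *count, double *ratio, char word[10])
{
    /* %d  -> int *     (count is already a pointer)
       %lf -> double *  (the l modifier selects double rather than float)
       %9s -> char *    (width 9 leaves room for the terminating null) */
    return sscanf(text, "%d %lf %9s", count, ratio, word);
}
```

There are three conversion specifications and three input pointers. Passing a value instead of an address for any of them would be an error.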

A conversion specification consists of the following characters, in the order listed:

A percent character (%) or the sequence %n$ (where n is an integer).
The sequence %n$ denotes that the conversion is applied to the nth input pointer listed, where n is a decimal integer in the range [1, NL_ARGMAX] (see the <limits.h> header file). For example, a conversion specification beginning %5$ means that the conversion will be applied to the 5th input pointer listed after the format specification. The sequence %$ is invalid.
If the conversion specification does not begin with the sequence %n$, the conversion specification is matched to its input pointer in left-to-right order. Use only one type of conversion specification (% or %n$) in a format specification.
One or more optional characters (described in Table 2-2).
A conversion specifier (described in Table 2-3).

Table 2-2 shows the characters you can use between the percent sign (%) (or the sequence %n$), and the conversion specifier. These characters are optional but, if specified, must occur in the order shown in Table 2-2.

Table 2-2 Optional Characters Between % (or %n$) and the Input Conversion Specifier

Character	Meaning
*	An assignment-suppressing character.
field width	A nonzero decimal integer that specifies the maximum field width. For the wide-character input functions, the field width is measured in wide characters. For the byte input functions, the field width is measured in bytes, unless the directive is one of the following: %lc, %ls, %C, %S, or %[. In these cases, the field width is measured in multibyte character units.
h, l, or L (or ll)	Precede a conversion specifier of d, i, or n with an h if the corresponding argument is a pointer to short int rather than a pointer to int; with an l (lowercase ell) if it is a pointer to long int; or, for OpenVMS Alpha systems only, with an L or ll (two lowercase ells) if it is a pointer to __int64. Precede a conversion specifier of o, u, or x with an h if the corresponding argument is a pointer to unsigned short int rather than a pointer to unsigned int; with an l if it is a pointer to unsigned long int; or, for OpenVMS Alpha systems only, with an L or ll if it is a pointer to unsigned __int64. Precede a conversion specifier of c, s, or [ with an l (lowercase ell) if the corresponding argument is a pointer to a wchar_t. Finally, precede a conversion specifier of e, f, or g with an l (lowercase ell) if the corresponding argument is a pointer to double rather than a pointer to float, or with an L if it is a pointer to long double. If an h, l, L, or ll appears with any other conversion specifier, the behavior is undefined.
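The optional characters in Table 2-2 can be combined in one format specification. The following sketch uses a hypothetical helper, parse_fields, to show a maximum field width, assignment suppression, and the l modifier together.

```c
#include <stdio.h>

/* Hypothetical helper: splits a digit string into a two-digit head
   and a separate long value, discarding the digits in between.
   Returns the number of assignments made. */
int parse_fields(const char *text, int *head, long *tail)
{
    /* %2d : read at most two characters and convert them to an int
       %*d : read an int but discard it (no pointer is consumed)
       %ld : read a long int */
    return sscanf(text, "%2d%*d %ld", head, tail);
}
```

For the input "123456 789", head receives 12, the digits 3456 are read and discarded, and tail receives 789. The return value is 2 because the suppressed field is not counted as an assignment.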

Table 2-3 describes the conversion specifiers for formatted input.

Table 2-3 Conversion Specifiers for Formatted Input

Specifier	Input Type[1]	Description
d		Expects a decimal integer in the input whose format is the same as expected for the subject sequence of the strtol function with the value 10 for the base argument. The corresponding argument must be a pointer to int.
i		Expects an integer whose type is determined by the leading input characters. A leading 0 is equated to octal, a leading 0X or 0x is equated to hexadecimal, and all other forms are equated to decimal. The corresponding argument must be a pointer to int.
o		Expects an octal integer in the input (with or without a leading 0). The corresponding argument must be a pointer to unsigned int.
u		Expects a decimal integer in the input whose format is the same as expected for the subject sequence of the strtoul function with the value 10 for the base argument. The corresponding argument must be a pointer to unsigned int.
x		Expects a hexadecimal integer in the input (with or without a leading 0x). The corresponding argument must be a pointer to unsigned int.
c	Byte	Expects a single byte in the input. The corresponding argument must be a pointer to char. If a field width precedes the c conversion specifier, the number of characters specified by the field width is read. In this case, the corresponding argument must be a pointer to an array of char. If the optional character l (lowercase ell) precedes this conversion specifier, then the specifier expects a multibyte character in the input which is converted into a wide-character code. The corresponding argument must be a pointer to type wchar_t. If a field width also precedes the c conversion specifier, the number of characters specified by the field width is read. In this case, the corresponding argument must be a pointer to an array of wchar_t.
	Wide-character	Expects a sequence of the number of characters specified in the optional field width; this is 1 if not specified. If no l (lowercase ell) precedes the c specifier, then the corresponding argument must be a pointer to an array of char. If an l (lowercase ell) precedes the c specifier, the corresponding argument must be a pointer to an array of wchar_t.
C	Byte	The specifier expects a multibyte character in the input, which is converted into a wide-character code. The corresponding argument must be a pointer to type wchar_t. If a field width also precedes the C conversion specifier, the number of characters specified by the field width is read. In this case, the corresponding argument must be a pointer to an array of wchar_t.
	Wide-character	Expects a sequence of the number of characters specified in the optional field width; this is 1 if not specified. The corresponding argument must be a pointer to an array of wchar_t.
s	Byte	Expects a sequence of bytes in the input. The corresponding argument must be a pointer to an array of characters that is large enough to contain the sequence and a terminating null character (\0) that is automatically added. The input field is terminated by a space, tab, or new-line character. If the optional character l (lowercase ell) precedes this conversion specifier, the specifier expects a sequence of multibyte characters in the input, which are converted to wide-character codes. The corresponding argument must be a pointer to an array of wide characters (type wchar_t) that is large enough to contain the sequence plus the terminating null wide-character code that is automatically added. The input field is terminated by a space, tab, or new-line character.
	Wide-character	Expects (conceptually) a sequence of nonwhite-space characters in the input. If no l (lowercase ell) precedes the s specifier, then the corresponding argument must be a pointer to an array of char large enough to contain the sequence plus the terminating null byte that is automatically added. If an l (lowercase ell) precedes the s specifier, then the corresponding argument must be a pointer to an array of wchar_t large enough to contain the sequence plus the terminating null wide character that is automatically added.
S	Byte	The specifier expects a sequence of multibyte characters in the input, which are converted to wide-character codes. The corresponding argument must be a pointer to an array of wide characters (type wchar_t) that is large enough to contain the sequence plus a terminating null wide-character code which is added automatically. The input field is terminated by a space, tab, or new-line character.
	Wide-character	Expects a sequence of nonwhite-space characters in the input. The corresponding argument must be a pointer to an array of wchar_t large enough to contain the sequence plus the terminating null wide character that is automatically added.
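e, f, g		Expects a floating-point number in the input. The corresponding argument must be a pointer to float. The input format for floating-point numbers is an optional sign, a string of decimal digits that can contain a radix character, and an optional exponent consisting of the letter E or e followed by an optionally signed integer. The number can contain as many digits as indicated by the field width minus the signs and the letter E. The radix character is defined in the current locale.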
[ . . . ]		Expects a nonempty sequence of characters that is not delimited by a white-space character. The brackets enclose a set of characters (the scanset) expected in the input sequence. Any character in the input sequence that does not match a character in the scanset terminates the character sequence. All characters between the brackets comprise the scanset, unless the first character after the left bracket is a circumflex (^). In this case, the scanset contains all characters other than those that appear between the circumflex and the right bracket. Any character that does appear between the circumflex and the right bracket will terminate the input character sequence. If the conversion specifier begins with [] or [^], the right bracket character is in the scanset and the next right bracket character is the matching right bracket that ends the specification; otherwise, the first right bracket character ends the specification.
	Byte	If an l (lowercase ell) does not precede the [ specifier, then the characters in the scanset must be single-byte characters only. In this case, the corresponding argument must be a pointer to an array of char large enough to accept the sequence and the terminating null byte which is automatically added. If an l (lowercase ell) does precede the [ specifier, the characters in the input sequence are considered to be multibyte characters, which are then converted to a wide-character sequence for further processing. If character ranges are specified in the scanset, then the processing is done according to the LC_COLLATE category of the current program's locale. In this case, the corresponding argument must be a pointer to an array of wchar_t large enough to accept the sequence and the terminating null wide character which is automatically added.
	Wide-character	If no l (lowercase ell) precedes the [ conversion specifier, then processing is the same as described for the Byte-input type of the %l[ specifier, except that the corresponding argument must be an array of char large enough to accept the multibyte sequence plus the terminating null byte that is automatically added. If an l (lowercase ell) precedes the [ conversion specifier, then processing is the same as the preceding paragraph except that the corresponding argument must be an array of wchar_t large enough to accept the wide-character sequence plus the terminating null wide character that is automatically added.
p		Requires an argument that is a pointer to void. The input value is interpreted as a hexadecimal value.
n		No input is consumed. The corresponding argument is a pointer to an integer. The integer is assigned the number of characters read from the input stream so far by this call to the formatted input function. Execution of a %n directive does not increment the assignment count returned when the formatted input function completes execution.
%		Matches a single percent symbol. No conversion or assignment takes place. The complete conversion specification would be %%.
[1] Either Byte or Wide-character. Where neither is shown for a given specifier, the specifier description applies to both.
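Two of the specifiers in Table 2-3 are sketched below using sscanf. The helper names parse_auto_base and parse_pair are hypothetical and serve only as illustrations. The first relies on %i to choose the base from the leading input characters; the second uses a negated scanset to split a line at an equal sign and %n to report how much input was consumed.

```c
#include <stdio.h>

/* Hypothetical helper: %i picks octal, hexadecimal, or decimal
   from the leading characters of the input. */
int parse_auto_base(const char *text, int *value)
{
    return sscanf(text, "%i", value);
}

/* Hypothetical helper: splits "key=value". The scanset [^=] accepts
   every character except '='. %n stores the number of characters
   read so far and is not counted in the return value. */
int parse_pair(const char *line, char key[16], char value[32], int *consumed)
{
    *consumed = 0;
    return sscanf(line, "%15[^=]=%31s%n", key, value, consumed);
}
```

For example, parse_auto_base stores 31 for the input "0x1F", 15 for "017", and 17 for "17". For the line "color=blue", parse_pair returns 2, stores "color" and "blue", and sets consumed to 10.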

Remarks

You can change the delimiters of the input field with the bracket ([ ]) conversion specification. Otherwise, an input field is defined as a string of nonwhite-space characters. It extends either to the next white-space character or until the field width, if specified, is exhausted. The function reads across line and record boundaries, since the new-line character is a white-space character.
A call to one of the input conversion functions resumes searching immediately after the last character processed by a previous call.
If the assignment-suppression character (*) appears in the format specification, no assignment is made. The corresponding input field is interpreted and then skipped.
The arguments must be pointers or other address-valued expressions, since DEC C permits only calls by value. To read a number in decimal format and assign its value to n, you must use the following form:
```
scanf("%d", &n)
```
You cannot use the following form:
```
scanf("%d", n)
```
White space in a format specification matches optional white space in the input field. Consider the following format specification:
```
field = %x
```
This format specification matches the following forms:
```
field = 5218
field=5218
field= 5218
field =5218
```
The format specification does not match the following form:
```
fiel d=5218
```
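The white-space matching described in the last remark can be checked with a short sketch. The helper match_field is a hypothetical name; it applies the format specification shown above to a line of text with sscanf.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 and stores the hexadecimal value
   if line matches the format "field = %x"; otherwise returns 0. */
int match_field(const char *line, unsigned int *value)
{
    /* Each space in the format matches zero or more white-space
       characters in the input; the letters must match exactly. */
    return sscanf(line, "field = %x", value) == 1;
}
```

The four matching forms each yield the value 0x5218. The form "fiel d=5218" fails because the space in the input is compared with the ordinary character d in the format.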

2.4.2 Converting Output Information

The format specification string for the output of information can contain:

Ordinary characters, which are copied to the output.
Conversion specifications, each of which causes the conversion of a corresponding output source to a character string in a particular format. Conversion specifications are matched to output sources in left-to-right order.

A conversion specification consists of the following, in the order listed:

A percent character (%) or the sequence %n$.
The sequence %n$ denotes that the conversion is applied to the nth output source listed, where n is a decimal integer in the range [1, NL_ARGMAX] (see the <limits.h> header file). For example, a conversion specification beginning %5$ means that the conversion will be applied to the 5th output source listed after the format specification.
If the conversion specification does not begin with the sequence %n$, the conversion specification is matched to its output source in left-to-right order. Use only one type of conversion specification (% or %n$) in a format specification.
One or more optional characters (described in Table 2-4).
A conversion specifier (described in Table 2-5) concludes the conversion specification.

For examples of conversion specifications, see the sample programs in Section 2.6.
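As a small illustration of the %n$ form, the following sketch reorders two output sources. It assumes a library that supports numbered conversion specifications, which are an X/Open extension rather than part of ISO C, and that provides snprintf; the helper name format_name is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: writes "last, first" into buf although the
   output sources are passed in the order first, last. */
int format_name(char *buf, size_t size, const char *first, const char *last)
{
    /* %2$s selects the second output source, %1$s the first. */
    return snprintf(buf, size, "%2$s, %1$s", first, last);
}
```

Calling format_name with "Ada" and "Lovelace" produces the string "Lovelace, Ada". Every conversion specification in the format uses the %n$ form, in keeping with the rule against mixing the two types.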

Table 2-4 shows the characters you can use between the percent sign (%) (or the sequence %n$) and the conversion specifier. These characters are optional, but if specified, they must occur in the order shown in Table 2-4.

Table 2-4 Optional Characters Between % (or %n$) and the Output Conversion Specifier

Character

Meaning

flags

You can use the following flag characters, alone or in any combined order, to modify the conversion specification:

' (single quote)	Requests that a numeric conversion be formatted with the thousands separator character. Only the numbers to the left of the radix character are formatted with the separator character. The character used as a separator and the positioning of the separators are defined in the program's current locale.
- (hyphen)	Left-justifies the converted output source in its field.
+	Requests that an explicit sign be present on a signed conversion. If this flag is not specified, the result of a signed conversion begins with a sign only when a negative value is converted.
space	Prefixes a space to the result of a signed conversion, if the first character of the conversion is not a sign, or if the conversion results in no characters. If you specify both the space and the + flag, the space flag is ignored.
#	Requests an alternate conversion format. The effect depends on the conversion specified. For the o (octal) conversion, the precision is increased to force the first digit to be a zero. For the x (or X) conversion, a nonzero result is prefixed with 0x (or 0X). For e, E, f, g, and G conversions, the result contains a decimal point even at the end of an integer value. For g and G conversions, trailing zeros are not trimmed. For other conversions, the effect of # is undefined.
0	Uses zeros rather than spaces to pad the field width for d, i, o, u, x, X, e, E, f, g, and G conversions. If both the 0 and the - flags are specified, then the 0 flag is ignored. For d, i, o, u, x, and X conversions, if a precision is specified, the 0 flag is ignored. For other conversions, the behavior of the 0 flag is undefined.

field width

The minimum field width can be designated by a decimal integer constant, or by an output source. To specify an output source, use an asterisk (*) or the sequence *n$, where n refers to the nth output source listed after the format specification.

If the converted output source is wider than the minimum field width, it is written in full and is not truncated.

If the converted output source is narrower than the minimum width, it is padded to make up the field width. Padding is with spaces by default, or with zeros if the 0 flag is specified; a leading 0 does not mean that the width is an octal number. Padding is on the left by default, and on the right if the - flag is specified.

For the wide-character output functions, the field width is measured in wide characters; for the byte output functions, it is measured in bytes.

. (period)

Separates the field width from the precision.

precision

The precision defines any of the following:

Minimum number of digits to appear for d, i, o, u, x, and X conversions
Number of digits to appear after the decimal-point character for e, E, and f conversions
Maximum number of significant digits for g and G conversions
Maximum number of characters to be written from a string in an s or S conversion

If a precision appears with any other conversion specifier, the behavior is undefined.

Precision can be designated by a decimal integer constant, or by an output source. To specify an output source, use an asterisk (*) or the sequence *n$, where n refers to the nth output source listed after the format specification.

If only the period is specified, the precision is taken as 0.

h, l, or L (or ll)

An h specifies that a following d, i, o, u, x, or X conversion specifier applies to a short int or unsigned short int argument; an h can also specify that a following n conversion specifier applies to a pointer to a short int argument.

An l (lowercase ell) specifies that a following d, i, o, u, x, or X conversion specifier applies to a long int or unsigned long int argument; an l can also specify that a following n conversion specifier applies to a pointer to a long int argument.

On OpenVMS Alpha systems only, an L or ll (two lowercase ells) specifies that a following d, i, o, u, x, or X conversion specifier applies to an __int64 or unsigned __int64 argument.

An L specifies that a following e, E, f, g, or G conversion specifier applies to a long double argument.

An l specifies that a following c or s conversion specifier applies to a wchar_t argument.

If an h, l, or L appears with any other conversion specifier, the behavior is undefined.

On OpenVMS VAX and Alpha systems, DEC C int values are equivalent to long values.
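The flags, field width, and precision from Table 2-4 can be sketched as follows, assuming a C99 library that provides snprintf. The helper names show_flags and show_star are hypothetical. Vertical bars and brackets are ordinary characters, included so that the padding is visible.

```c
#include <stdio.h>

/* Hypothetical helper: formats one value with several flag and
   field-width combinations, separated by '|'. */
int show_flags(char *buf, size_t size, int value)
{
    /* %5d  minimum width 5, padded on the left with spaces
       %-5d minimum width 5, left-justified
       %+d  explicit sign
       %05d minimum width 5, padded with zeros
       %#x  alternate form: 0x prefix
       %#o  alternate form: leading zero */
    return snprintf(buf, size, "%5d|%-5d|%+d|%05d|%#x|%#o",
                    value, value, value, value,
                    (unsigned int)value, (unsigned int)value);
}

/* Hypothetical helper: takes the field width and the precision
   from output sources by using an asterisk for each. */
int show_star(char *buf, size_t size, int width, int precision,
              const char *text)
{
    return snprintf(buf, size, "[%*.*s]", width, precision, text);
}
```

For the value 42, show_flags produces "   42|42   |+42|00042|0x2a|052". With a width of 6 and a precision of 3, show_star writes "[   abc]" for the string "abcdef": the precision limits the output to three characters and the width pads the result to six.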

Table 2-5 describes the conversion specifiers for formatted output.

Table 2-5 Conversion Specifiers for Formatted Output

Specifier	Output Type[1]	Description
d, i		Converts an int argument to signed decimal format.
o		Converts an unsigned int argument to unsigned octal format.
u		Converts an unsigned int argument to unsigned decimal format (giving a number in the range 0 to 4,294,967,295).
x, X		Converts an unsigned int argument to unsigned hexadecimal format (with or without a leading 0x). The letters abcdef are used for x conversion, and the letters ABCDEF are used for X conversion.
f		Converts a float or double argument to the format [-]mmm.nnnnnn. The number of n's is equal to the precision specification: If no precision is specified, the default is 6. If the precision is 0 and the # flag is specified, the decimal point appears but no n's appear. If the precision is 0 and the # flag is not specified, the decimal point also does not appear. If a decimal point appears, at least one digit appears before it. The value is rounded to the appropriate number of digits.
e, E		Converts a float or double argument to the format [-]m.nnnnnnE[+|-]xx. The number of n's is specified by the precision. If no precision is specified, the default is 6. If the precision is explicitly 0 and the # flag is specified, the decimal point appears but no n's appear. If the precision is explicitly 0 and the # flag is not specified, the decimal point also does not appear. An 'e' is printed for e conversion; an 'E' is printed for E conversion. The exponent always contains at least two digits. If the value is 0, the exponent is 0.
g, G		Converts a float or double argument to format f or e (or E if the G conversion specifier is used), with the precision specifying the number of significant digits. If the precision is 0, it is taken as 1. The format used depends on the value of the argument: format e (or E) is used only if the exponent resulting from such a conversion is less than -4, or is greater than or equal to the precision; otherwise, format f is used. Trailing zeros are suppressed in the fractional portion of the result. A decimal point appears only if it is followed by a digit.
c	Byte	Converts an int argument to an unsigned char, and writes the resulting byte. If the optional character l (lowercase ell) precedes this conversion specifier, then the specifier converts a wchar_t argument to an array of bytes representing the character, and writes the resulting character. If the field width is specified and the resulting character occupies fewer bytes than the field width, it will be padded to the given width with space characters. If the precision is specified, the behavior is undefined.
	Wide-character	If an l (lowercase ell) does not precede the c specifier, then the int argument is converted to a wide character as if by calling btowc, and the resulting character is written. If an l (lowercase ell) precedes the c specifier, then the specifier converts a wchar_t argument to an array of bytes representing the character, and writes the resulting character. If the field width is specified and the resulting character occupies fewer characters than the field width, it will be padded to the given width with space characters. If the precision is specified, the behavior is undefined.
C	Byte	Converts a wchar_t argument to an array of bytes representing the character, and writes the resulting character. If the field width is specified and the resulting character occupies fewer bytes than the field width, it will be padded to the given width with space characters. If the precision is specified, the behavior is undefined.
	Wide-character	Converts a wchar_t argument to an array of bytes representing the character, and writes the resulting character. If the field width is specified and the resulting character occupies fewer wide characters than the field width, it will be padded to the given width with space characters. If the precision is specified, the behavior is undefined.
s	Byte	Requires an argument that is a pointer to an array of characters of type char. The argument is used to write characters until a null character is encountered or until the number of characters indicated by the precision specification is exhausted. If the precision specification is 0 or omitted, all characters up to a null are output. If the optional character l (lowercase ell) precedes this conversion specifier, then the specifier converts an array of wide-character codes to multibyte characters, and writes the multibyte characters. Requires an argument that is a pointer to an array of wide characters of type wchar_t. Characters are written until a null wide character is encountered or until the number of bytes indicated by the precision specification is exhausted. If the precision specification is omitted or is greater than the size of the array of converted bytes, the array of wide characters must be terminated by a null wide character.
	Wide-character	If an l (lowercase ell) does not precede the s specifier, then the specifier converts an array of multibyte characters, as if by calling mbrtowc for each multibyte character, and writes the resulting characters until a null wide character is encountered or the number of wide characters indicated by the precision specification is exhausted. If the precision specification is omitted or is greater than the size of the array of converted characters, the converted array must be terminated by a null wide character. If an l precedes this conversion specifier, then the argument is a pointer to an array of wchar_t. Characters from this array are written until a null wide character is encountered or the number of wide characters indicated by the precision specification is exhausted. If the precision specification is omitted or is greater than the size of the array, the array must be terminated by a null wide character.
S	Byte	Converts an array of wide-character codes to multibyte characters, and writes the multibyte characters. Requires an argument that is a pointer to an array of wide characters of type wchar_t. Characters are written until a null wide character is encountered or until the number of bytes indicated by the precision specification is exhausted. If the precision specification is omitted or is greater than the size of the array of converted bytes, the array of wide characters must be terminated by a null wide character.
	Wide-character	The argument is a pointer to an array of wchar_t. Characters from this array are written until a null wide character is encountered or the number of wide characters indicated by the precision specification is exhausted. If the precision specification is omitted or is greater than the size of the array, the array must be terminated by a null wide character.
p		Requires an argument that is a pointer to void. The value of the pointer is output as a hexadecimal number.
n		Requires an argument that is a pointer to an integer. The integer is assigned the number of characters written to the output stream so far by this call to the formatted output function. No argument is converted.
%		Writes out the percent symbol. No conversion is performed. The complete conversion specification would be %%.
[1] Either Byte or Wide-character. Where neither is shown for a given specifier, the specifier description applies to both.
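The difference between the f, e, and g conversions in Table 2-5 is easiest to see side by side. The following sketch assumes a C99 library that provides snprintf and a locale whose radix character is a period; the helper name show_float is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: formats one value with the f, e, and g
   conversions and with an explicit precision, separated by '|'. */
int show_float(char *buf, size_t size, double value)
{
    /* %f   fixed notation, default precision 6
       %e   exponent notation, default precision 6
       %g   f or e style depending on the exponent,
            with trailing zeros suppressed
       %.2f fixed notation, two digits after the radix character */
    return snprintf(buf, size, "%f|%e|%g|%.2f", value, value, value, value);
}
```

For 1234.5 the result is "1234.500000|1.234500e+03|1234.5|1234.50". For 0.00001 the g conversion switches to the e style because the exponent is less than -4, and the result is "0.000010|1.000000e-05|1e-05|0.00".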
