119 lines
9.6 KiB
Markdown
119 lines
9.6 KiB
Markdown
|
||
SOURCE: man 3 sscanf
|
||
|
||
# Format String
|
||
|
||
The format string consists of a sequence of directives which describe how to process the sequence of input characters. If processing of a directive fails, no further input is read, and sscanf() returns.
|
||
A "failure" can be either of the following: input failure, meaning that input characters were unavailable, or matching failure, meaning that the input was inappropriate (see below).
|
||
|
||
A directive is one of the following:
|
||
|
||
• A sequence of white-space characters (space, tab, newline, etc.; see isspace(3)). This directive matches any amount of white space, including none, in the input.
|
||
|
||
• An ordinary character (i.e., one other than white space or '%'). This character must exactly match the next character of input.
|
||
|
||
• A conversion specification, which commences with a '%' (percent) character. A sequence of characters from the input is converted according to this specification, and the result is placed in the
|
||
corresponding pointer argument. If the next item of input does not match the conversion specification, the conversion fails—this is a matching failure.
|
||
|
||
Each conversion specification in format begins with either the character '%' or the character sequence "%n$" (see below for the distinction) followed by:
|
||
|
||
• An optional '*' assignment-suppression character: sscanf() reads input as directed by the conversion specification, but discards the input. No corresponding pointer argument is required, and this
|
||
specification is not included in the count of successful assignments returned by scanf().
|
||
|
||
• For decimal conversions, an optional quote character ('). This specifies that the input number may include thousands' separators as defined by the LC_NUMERIC category of the current locale. (See
|
||
setlocale(3).) The quote character may precede or follow the '*' assignment-suppression character.
|
||
|
||
• An optional 'm' character. This is used with string conversions (%s, %c, %[), and relieves the caller of the need to allocate a corresponding buffer to hold the input: instead, sscanf() allocates
|
||
a buffer of sufficient size, and assigns the address of this buffer to the corresponding pointer argument, which should be a pointer to a char * variable (this variable does not need to be ini‐
|
||
tialized before the call). The caller should subsequently free(3) this buffer when it is no longer required.
|
||
|
||
• An optional decimal integer which specifies the maximum field width. Reading of characters stops either when this maximum is reached or when a nonmatching character is found, whichever happens
|
||
first. Most conversions discard initial white space characters (the exceptions are noted below), and these discarded characters don't count toward the maximum field width. String input conver‐
|
||
sions store a terminating null byte ('\0') to mark the end of the input; the maximum field width does not include this terminator.
|
||
|
||
• An optional type modifier character. For example, the l type modifier is used with integer conversions such as %d to specify that the corresponding pointer argument refers to a long rather than a
|
||
pointer to an int.
|
||
|
||
• A conversion specifier that specifies the type of input conversion to be performed.
|
||
|
||
The conversion specifications in format are of two forms, either beginning with '%' or beginning with "%n$". The two forms should not be mixed in the same format string, except that a string containing
|
||
"%n$" specifications can include %% and %*. If format contains '%' specifications, then these correspond in order with successive pointer arguments. In the "%n$" form (which is specified in
|
||
POSIX.1-2001, but not C99), n is a decimal integer that specifies that the converted input should be placed in the location referred to by the n-th pointer argument following format.
|
||
|
||
|
||
## Conversions
|
||
The following type modifier characters can appear in a conversion specification:
|
||
|
||
h Indicates that the conversion will be one of d, i, o, u, x, X, or n and the next pointer is a pointer to a short or unsigned short (rather than int).
|
||
|
||
hh As for h, but the next pointer is a pointer to a signed char or unsigned char.
|
||
|
||
j As for h, but the next pointer is a pointer to an intmax_t or a uintmax_t. This modifier was introduced in C99.
|
||
|
||
l Indicates either that the conversion will be one of d, i, o, u, x, X, or n and the next pointer is a pointer to a long or unsigned long (rather than int), or that the conversion will be one of e,
|
||
f, or g and the next pointer is a pointer to double (rather than float). If used with %c or %s, the corresponding parameter is considered as a pointer to a wide character or wide-character string
|
||
respectively.
|
||
|
||
ll (ell-ell) Indicates that the conversion will be one of b, d, i, o, u, x, X, or n and the next pointer is a pointer to a long long or unsigned long long (rather than int).
|
||
|
||
L Indicates that the conversion will be either e, f, or g and the next pointer is a pointer to long double or (as a GNU extension) the conversion will be d, i, o, u, or x and the next pointer is a
|
||
pointer to long long.
|
||
|
||
q equivalent to L. This specifier does not exist in ANSI C.
|
||
|
||
t As for h, but the next pointer is a pointer to a ptrdiff_t. This modifier was introduced in C99.
|
||
|
||
z As for h, but the next pointer is a pointer to a size_t. This modifier was introduced in C99.
|
||
|
||
The following conversion specifiers are available:
|
||
|
||
% Matches a literal '%'. That is, %% in the format string matches a single input '%' character. No conversion is done (but initial white space characters are discarded), and assignment does not
|
||
occur.
|
||
|
||
d Matches an optionally signed decimal integer; the next pointer must be a pointer to int.
|
||
|
||
i Matches an optionally signed integer; the next pointer must be a pointer to int. The integer is read in base 16 if it begins with 0x or 0X, in base 8 if it begins with 0, and in base 10 other‐
|
||
wise. Only characters that correspond to the base are used.
|
||
|
||
o Matches an unsigned octal integer; the next pointer must be a pointer to unsigned int.
|
||
|
||
u Matches an unsigned decimal integer; the next pointer must be a pointer to unsigned int.
|
||
|
||
x Matches an unsigned hexadecimal integer (that may optionally begin with a prefix of 0x or 0X, which is discarded); the next pointer must be a pointer to unsigned int.
|
||
|
||
X Equivalent to x.
|
||
|
||
f Matches an optionally signed floating-point number; the next pointer must be a pointer to float.
|
||
|
||
e Equivalent to f.
|
||
|
||
g Equivalent to f.
|
||
|
||
E Equivalent to f.
|
||
|
||
a (C99) Equivalent to f.
|
||
|
||
s Matches a sequence of non-white-space characters; the next pointer must be a pointer to the initial element of a character array that is long enough to hold the input sequence and the terminating
|
||
null byte ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
|
||
|
||
c Matches a sequence of characters whose length is specified by the maximum field width (default 1); the next pointer must be a pointer to char, and there must be enough room for all the characters
|
||
(no terminating null byte is added). The usual skip of leading white space is suppressed. To skip white space first, use an explicit space in the format.
|
||
|
||
|
||
|
||
|
||
***[ Matches a nonempty sequence of characters from the specified set of accepted characters; the next pointer must be a pointer to char, and there must be enough room for all the characters in the
|
||
string, plus a terminating null byte. The usual skip of leading white space is suppressed. The string is to be made up of characters in (or not in) a particular set; the set is defined by the
|
||
characters between the open bracket [ character and a close bracket ] character. The set excludes those characters if the first character after the open bracket is a circumflex (^). To include a
|
||
close bracket in the set, make it the first character after the open bracket or the circumflex; any other position will end the set. The hyphen character - is also special; when placed between
|
||
two other characters, it adds all intervening characters to the set. To include a hyphen, make it the last character before the final close bracket. For instance, [^]0-9-] means the set "every‐
|
||
thing except close bracket, zero through nine, and hyphen". The string ends with the appearance of a character not in the (or, with a circumflex, in) set or when the field width runs out.***
|
||
|
||
|
||
|
||
p Matches a pointer value (as printed by %p in printf(3)); the next pointer must be a pointer to a pointer to void.
|
||
|
||
n Nothing is expected; instead, the number of characters consumed thus far from the input is stored through the next pointer, which must be a pointer to int, or variant whose size matches the (op‐
|
||
tionally) supplied integer length modifier. This is not a conversion and does not increase the count returned by the function. The assignment can be suppressed with the * assignment-suppression
|
||
character, but the effect on the return value is undefined. Therefore %*n conversions should not be used.
|