character constant
Syntax
(1) 'c-char'
(2) u'c-char' (since C11)
(3) U'c-char' (since C11)
(4) L'c-char'
(5) 'c-char-sequence'
where
- c-char is either
  - a character from the basic source character set other than single-quote ('), backslash (\), or the newline character,
  - an escape sequence: one of the special character escapes \' \" \? \\ \a \b \f \n \r \t \v, hex escapes \x..., or octal escapes \..., as defined in escape sequences, or
  - a universal character name, \u... or \U..., as defined in escape sequences (since C99).
- c-char-sequence is a sequence of two or more c-chars.
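The escape forms listed above can be compared directly, since octal and hex escapes name a character by its numeric value. The following is a minimal sketch, not a complete program; the helper names `optional_escapes_agree` and `looks_like_ascii` are hypothetical and introduced only for illustration.

```c
#include <assert.h>

/* Octal and hexadecimal escapes name a character by its numeric value,
   so two spellings of the same number always compare equal. */
static_assert('\101' == '\x41', "octal 101 and hex 41 are both 65");
static_assert('\12' == '\xA', "octal 12 and hex A are both 10");

/* \" and \? are optional inside a character constant: the escaped and
   unescaped spellings denote the same character. */
int optional_escapes_agree(void)
{
    return '\"' == '"' && '\?' == '?';
}

/* Hypothetical helper: reports whether the execution character set
   places 'A' and newline at their ASCII positions. This holds on
   ASCII-based systems but is not guaranteed by the C standard. */
int looks_like_ascii(void)
{
    return 'A' == '\x41' && '\n' == '\12';
}
```

Calling `optional_escapes_agree()` from `main` should yield 1 on any conforming implementation, while the result of `looks_like_ascii()` depends on the execution character set.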
1) single-byte integer character constant, e.g. 'a' or '\n' or '\13'. Such a constant has type int and a value equal to the representation of c-char in the execution character set as a value of type char mapped to int. If c-char is not representable as a single byte in the execution character set, the value is implementation-defined.
2) 16-bit wide character constant, e.g. u'貓', but not u'🍌' (u'\U0001f34c'). Such a constant has type char16_t and a value equal to the value of c-char in the 16-bit encoding produced by mbrtoc16 (normally UTF-16). If c-char is not representable or maps to more than one 16-bit character, the behavior is implementation-defined.
3) 32-bit wide character constant, e.g. U'貓' or U'🍌'. Such a constant has type char32_t and a value equal to the value of c-char in the 32-bit encoding produced by mbrtoc32 (normally UTF-32). If c-char is not representable or maps to more than one 32-bit character, the behavior is implementation-defined.
4) wide character constant, e.g. L'β' or L'貓'. Such a constant has type wchar_t and a value equal to the value of c-char in the execution wide character set (that is, the value that would be produced by mbtowc). If c-char is not representable or maps to more than one wide character (e.g. a non-BMP value on Windows, where wchar_t is 16-bit), the behavior is implementation-defined.
5) multicharacter constant, e.g. 'AB', has type int and implementation-defined value.
Notes
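The types of the prefixed forms (2)-(4) described above can be checked at compile time by comparing sizes, and their values can be inspected by spelling the character as a universal character name, which avoids depending on the source file encoding. This is a sketch under the assumption that the implementation uses UTF-16 and UTF-32 for char16_t and char32_t (signalled by the macros `__STDC_UTF_16__` and `__STDC_UTF_32__`); the function names `cent_utf16` and `banana_utf32` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>

/* Each prefix selects the type of the constant. */
static_assert(sizeof(u'a') == sizeof(char16_t), "u'a' has type char16_t");
static_assert(sizeof(U'a') == sizeof(char32_t), "U'a' has type char32_t");
static_assert(sizeof(L'a') == sizeof(wchar_t), "L'a' has type wchar_t");

/* U+00A2 (cent sign) fits in one 16-bit code unit.
   Assumption: char16_t uses UTF-16, so the value is 0xA2. */
unsigned long cent_utf16(void)
{
    return (unsigned long)u'\u00A2';
}

/* U+1F34C lies outside the BMP, so it needs U'...' rather than u'...'.
   Assumption: char32_t uses UTF-32, so the value is 0x1F34C. */
unsigned long banana_utf32(void)
{
    return (unsigned long)U'\U0001F34C';
}
```

On an implementation that defines both macros, `cent_utf16()` is expected to return 0xA2 and `banana_utf32()` 0x1F34C; elsewhere the values are implementation-defined.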
Many implementations of multicharacter constants use the values of each char in the constant to initialize successive bytes of the resulting integer, in big-endian order, e.g. the value of '\1\2\3\4' is 0x01020304.
In C++, ordinary character constants have type char, rather than int.
Unlike integer constants, a character constant may have a negative value if char is signed: on such implementations '\xFF' is an int with the value -1.
When used in a controlling expression of #if or #elif, character constants may be interpreted in terms of the source character set, the execution character set, or some other implementation-defined character set.
Example
#include <stddef.h>
#include <stdio.h>
#include <uchar.h>

int main (void)
{
    printf("constant value \n");
    printf("-------- ----------\n");

    // integer character constants
    int c1='a'; printf("'a': %#010x\n", c1);
    int c2='🍌'; printf("'🍌': %#010x\n\n", c2); // implementation-defined

    // multicharacter constant
    int c3='ab'; printf("'ab': %#010x\n\n", c3); // implementation-defined

    // 16-bit wide character constants
    char16_t uc1 = u'a'; printf("'a': %#010x\n", (int)uc1);
    char16_t uc2 = u'¢'; printf("'¢': %#010x\n", (int)uc2);
    char16_t uc3 = u'猫'; printf("'猫': %#010x\n", (int)uc3);
    // implementation-defined (🍌 maps to two 16-bit characters)
    char16_t uc4 = u'🍌'; printf("'🍌': %#010x\n\n", (int)uc4);

    // 32-bit wide character constants
    char32_t Uc1 = U'a'; printf("'a': %#010x\n", (int)Uc1);
    char32_t Uc2 = U'¢'; printf("'¢': %#010x\n", (int)Uc2);
    char32_t Uc3 = U'猫'; printf("'猫': %#010x\n", (int)Uc3);
    char32_t Uc4 = U'🍌'; printf("'🍌': %#010x\n\n", (int)Uc4);

    // wide character constants
    wchar_t wc1 = L'a'; printf("'a': %#010x\n", (int)wc1);
    wchar_t wc2 = L'¢'; printf("'¢': %#010x\n", (int)wc2);
    wchar_t wc3 = L'猫'; printf("'猫': %#010x\n", (int)wc3);
    wchar_t wc4 = L'🍌'; printf("'🍌': %#010x\n\n", (int)wc4);
}
Possible output:
constant value
-------- ----------
'a': 0x00000061
'🍌': 0xf09f8d8c

'ab': 0x00006162

'a': 0x00000061
'¢': 0x000000a2
'猫': 0x0000732b
'🍌': 0x0000df4c

'a': 0x00000061
'¢': 0x000000a2
'猫': 0x0000732b
'🍌': 0x0001f34c

'a': 0x00000061
'¢': 0x000000a2
'猫': 0x0000732b
'🍌': 0x0001f34c
References
- C11 standard (ISO/IEC 9899:2011):
- 6.4.4.4 Character constants (p: 67-70)
- C99 standard (ISO/IEC 9899:1999):
- 6.4.4.4 Character constants (p: 59-61)
- C89/C90 standard (ISO/IEC 9899:1990):
- 3.1.3.4 Character constants
See also
C++ documentation for character literal
© cppreference.com
Licensed under the Creative Commons Attribution-ShareAlike Unported License v3.0.
http://en.cppreference.com/w/c/language/character_constant