C++ char16_t Keyword
The char16_t keyword in C++ names a distinct character type, introduced in C++11, for storing UTF-16 code units. It is at least 16 bits wide and is primarily used to handle UTF-16 encoded text, which is common in applications requiring internationalization. Unlike the traditional char, whose width and encoding suit single-byte text, char16_t gives UTF-16 data a dedicated type. A character above U+FFFF does not fit in one char16_t and is stored as two code units (a surrogate pair).
String literals prefixed with u have type const char16_t[N], which decays to const char16_t*. Character literals prefixed with u have type char16_t.
Syntax
char16_t variable_name = u'character';
const char16_t* string_name = u"string";
- char16_t: The keyword used to declare a variable that stores a 16-bit UTF-16 code unit.
- variable_name: The name of the variable that stores the character.
- string_name: The name of the pointer to the first code unit of a UTF-16 string literal.
- u: A prefix used for UTF-16 encoded strings or characters.
Examples
Example 1: Declaring a UTF-16 Character
This example demonstrates how to declare a char16_t variable and print its value as a Unicode character and integer.
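#include <iostream>
using namespace std;
int main() {
    char16_t ch = u'A'; // Declare a UTF-16 character
    // Cast to char to print the glyph (safe here because 'A' is ASCII)
    cout << "Character: " << static_cast<char>(ch) << endl;
    // Cast to int to print the numeric value of the code unit
    cout << "Unicode Value: " << static_cast<int>(ch) << endl;
    return 0;
}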
Output:
Character: A
Unicode Value: 65
Explanation:
- The char16_t variable ch is initialized with u'A', representing a UTF-16 character.
- The character is cast to char for display in the first output line.
- The character is cast to int to display its Unicode value, which is 65.
Example 2: Declaring and Printing a UTF-16 String
This example demonstrates how to declare and print a UTF-16 encoded string using char16_t.
#include <iostream>
#include <string>
int main() {
    // Create a UTF-16 encoded string
    const char16_t* greeting = u"Hello, UTF-16!";
    // Copy each UTF-16 code unit into a wide string.
    // This is adequate for the ASCII text used here; it is not a general
    // UTF-16 conversion, because the width of wchar_t is platform-specific.
    std::wstring wide_greeting(greeting, greeting + std::char_traits<char16_t>::length(greeting));
    // Print using wcout
    std::wcout << L"Message: " << wide_greeting << std::endl;
    return 0;
}
Output:
Message: Hello, UTF-16!
Explanation:
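- The string u"Hello, UTF-16!" is a UTF-16 encoded string literal accessed through a const char16_t* variable.
- Each code unit is copied into a std::wstring, which std::wcout can print. This works here because every character is ASCII; text containing surrogate pairs would need a real UTF-16 conversion on platforms where wchar_t is 32 bits wide.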
Example 3: Working with Non-ASCII Characters
This example shows how to use char16_t with UTF-16 encoded non-ASCII characters.
#include <iostream>
#include <string>
#include <codecvt> // For std::codecvt_utf8_utf16 (deprecated since C++17)
#include <locale> // For std::wstring_convert (deprecated since C++17)
int main() {
    const char16_t* japanese = u"こんにちは"; // "Hello" in Japanese
    // Convert UTF-16 to UTF-8
    std::u16string utf16_string(japanese); // Convert to std::u16string
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> converter;
    std::string utf8_string = converter.to_bytes(utf16_string);
    // Output the UTF-8 encoded string (the terminal must interpret UTF-8)
    std::cout << "UTF-8 String: " << utf8_string << std::endl;
    return 0;
}
Output:
UTF-8 String: こんにちは
Explanation:
- A UTF-16 string is declared using the char16_t* type: const char16_t* japanese = u"こんにちは";.
- The UTF-16 string is converted into a std::u16string for compatibility with the conversion utilities.
- The std::wstring_convert class is used with std::codecvt_utf8_utf16 to perform the UTF-16 to UTF-8 conversion.
- The to_bytes method of std::wstring_convert is called to convert the std::u16string into a UTF-8 encoded std::string.
- The resulting UTF-8 encoded string is printed to the console using std::cout.
- The program successfully outputs the UTF-8 encoded version of the Japanese text “こんにちは”.
Key Points about char16_t Keyword
- char16_t is a 16-bit data type introduced in C++11 for handling UTF-16 encoded characters.
- String and character literals of type char16_t are written with the u prefix.
- char16_t gives UTF-16 text a dedicated type, distinct from char and wchar_t.
- Outputting char16_t data typically requires casting or conversion, as std::cout does not natively support it.
- Use char16_t when working with UTF-16 strings for internationalization and Unicode compliance in modern applications.
