What do we mean by “wide string” in a C++ app? What is a u16string? How can I convert a u16string to a wstring? Let’s answer these questions.
What is a wstring?
A wide string, represented by std::wstring, is the string class for 'wide' characters. It is the instantiation of the basic_string class template that uses wchar_t as the character type. The size of wchar_t is implementation-defined: it is 2 bytes on Windows and typically 4 bytes on Linux and macOS, so a wstring stores text in 2- or 4-byte elements. Inside the std namespace, wstring is defined as below,
    // inside namespace std
    typedef basic_string<wchar_t> wstring;
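Because the element size depends on the platform, it can help to check it rather than assume it. The sketch below is only an illustration: `wide_payload_bytes` and `check_wide_payload_bytes` are hypothetical names, not standard functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// std::wstring is only another name for std::basic_string<wchar_t>.
static_assert(std::is_same<std::wstring, std::basic_string<wchar_t>>::value,
              "std::wstring is std::basic_string<wchar_t>");

// Hypothetical helper: bytes occupied by the characters of a wide string
// (excluding the terminating null). The result depends on sizeof(wchar_t).
std::size_t wide_payload_bytes(const std::wstring& s)
{
    return s.size() * sizeof(wchar_t);
}

// Hypothetical usage: three wide characters take 3 * sizeof(wchar_t) bytes,
// which is 6 bytes where wchar_t is 2 bytes and 12 bytes where it is 4.
void check_wide_payload_bytes()
{
    assert(wide_payload_bytes(L"abc") == 3 * sizeof(wchar_t));
}
```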
What is u16string?
std::u16string and std::pmr::u16string are the string class types for 16-bit characters, defined in the std and std::pmr namespaces respectively.
This is an instantiation of the basic_string class template that uses char16_t as the character type, with its default char_traits and allocator types. For example, std::string uses one byte (8 bits) per element of the text, while std::u16string uses two bytes (16 bits) per element. In UTF-16, a character outside the Basic Multilingual Plane occupies two such elements (a surrogate pair). In other words, std::u16string is defined as std::basic_string<char16_t>. The type definition can be shown as below,
    typedef basic_string<char16_t> u16string;
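As a small illustration of the 16-bit elements, the sketch below shows that size() counts UTF-16 code units, not characters. The helper name `count_code_points` is hypothetical, and it assumes the input is well-formed UTF-16.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a UTF-16 string by
// skipping low (trailing) surrogates. Assumes well-formed UTF-16 input.
std::size_t count_code_points(const std::u16string& s)
{
    std::size_t n = 0;
    for (char16_t cu : s) {
        if (cu < 0xDC00 || cu > 0xDFFF) {
            ++n;  // not a trailing surrogate, so it starts a new code point
        }
    }
    return n;
}

// Hypothetical usage: U+1F600 lies outside the Basic Multilingual Plane,
// so it is stored as two code units but counts as one code point.
void check_count_code_points()
{
    const std::u16string s = u"A\U0001F600";
    assert(s.size() == 3);
    assert(count_code_points(s) == 2);
}
```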
Is there a simple example of using wstring and u16string in Modern C++?
Here is a simple C++ example of how to use wstring and u16string,
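    #include <iostream>
    #include <string>
    #include <memory_resource>  // std::pmr string types (C++17)

    int main()
    {
        // Wide and UTF-16 string literals use the L and u prefixes
        std::wstring str2 = L"This is a String";
        std::u16string str3 = u"This is a String";

        // The same types using a polymorphic allocator
        std::pmr::wstring pstr2 = L"This is a String";
        std::pmr::u16string pstr3 = u"This is a String";

        return 0;
    }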
How do I convert u16string to a wstring?
First, we should say that we do not recommend converting a string with a wider character type to a string with a narrower character type, because data can be lost. Modern C++ has no dedicated, non-deprecated standard function for converting a u16string to a wstring. The conversion can be done with the std::wstring_convert and std::codecvt_utf16 templates. Both have been deprecated since C++17; however, the standard offers no good alternative, so most popular C++ compilers still ship them. This means you can use them if you really need to, but do so with caution.
We can use the std::wstring_convert class template to convert a u16string to a wstring. The wstring_convert class template converts between byte strings and wide strings using a code conversion facet, Codecvt.
The following standard facets are suitable for use with std::wstring_convert,
- std::codecvt_utf8 for the UTF-8/UCS2 and UTF-8/UCS4 conversions
- std::codecvt_utf8_utf16 for the UTF-8/UTF-16 conversions.
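As a rough sketch of how one of these facets plugs into std::wstring_convert, the helper below converts UTF-16 text to a UTF-8 byte string with std::codecvt_utf8_utf16. The name `u16_to_utf8` is hypothetical. Because these facilities are deprecated, compilers may emit deprecation warnings, and some toolchains have had trouble with the char16_t specialization, so treat this as an illustration rather than a recommendation.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: UTF-16 -> UTF-8 using the deprecated
// std::wstring_convert with the std::codecvt_utf8_utf16 facet.
// to_bytes() throws std::range_error if the input cannot be converted.
std::string u16_to_utf8(const std::u16string& src)
{
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.to_bytes(src);
}

// Hypothetical usage: U+00E9 becomes the two UTF-8 bytes C3 A9.
void check_u16_to_utf8()
{
    assert(u16_to_utf8(u"abc") == "abc");
    assert(u16_to_utf8(u"\u00E9") == "\xC3\xA9");
}
```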
Syntax of std::wstring_convert class template (deprecated in C++17),
    template<
        class Codecvt,
        class Elem = wchar_t,
        class Wide_alloc = std::allocator<Elem>,
        class Byte_alloc = std::allocator<char>
    > class wstring_convert;
Syntax of std::codecvt_utf16 class template (deprecated in C++17),
    template<
        class Elem,
        unsigned long Maxcode = 0x10ffff,
        std::codecvt_mode Mode = (std::codecvt_mode)0
    > class codecvt_utf16 : public std::codecvt<Elem, char, std::mbstate_t>;
Is there an example of how to convert u16string to a wstring?
Here is an example that converts a u16string to a wstring,
    #include <cstdio>
    #include <iostream>
    #include <string>
    #include <locale>
    #include <codecvt>

    // Converts UTF-16 code units to a wide string with the (deprecated)
    // std::wstring_convert and std::codecvt_utf16 facilities.
    // std::little_endian assumes char16_t is stored in little-endian order.
    std::wstring u16_to_wstring (const std::u16string& str)
    {
        std::wstring_convert< std::codecvt_utf16<wchar_t, 0x10ffff, std::little_endian>, wchar_t> conv;

        // View the char16_t buffer as raw bytes. The end pointer is computed
        // in char16_t units before the cast, so the range covers every byte.
        std::wstring wstr = conv.from_bytes(
            reinterpret_cast<const char*> (str.data()),
            reinterpret_cast<const char*> (str.data() + str.size()) );
        return wstr;
    }
    //-----------------------------------------------------------------------------
    int main()
    {
        std::wstring wstr;
        std::u16string ustr16 = u"This is a Unicode16 String";

        wstr = u16_to_wstring(ustr16);
        std::wcout << wstr << std::endl;

        getchar();  // wait for a key press before exiting
        return 0;
    }
This example works with most C++ compilers, including C++ Builder.
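Because std::wstring_convert and std::codecvt_utf16 are deprecated, one possible alternative is to do the conversion by hand. The sketch below is not a standard function: `u16_to_wstring_manual` is a hypothetical name. It assumes wchar_t holds UTF-16 code units when it is 2 bytes and UTF-32 code points otherwise, and it replaces unpaired surrogates with U+FFFD.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: converts UTF-16 text to std::wstring without <codecvt>.
std::wstring u16_to_wstring_manual(const std::u16string& src)
{
    std::wstring out;
    out.reserve(src.size());

    if (sizeof(wchar_t) == 2) {
        // Assumption: 2-byte wchar_t stores UTF-16, so copy units unchanged.
        for (char16_t cu : src) {
            out.push_back(static_cast<wchar_t>(cu));
        }
        return out;
    }

    // Assumption: wider wchar_t stores UTF-32, so combine surrogate pairs.
    for (std::size_t i = 0; i < src.size(); ++i) {
        char32_t cp = src[i];
        const bool is_high = cp >= 0xD800 && cp <= 0xDBFF;
        const bool has_low = i + 1 < src.size() &&
                             src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF;
        if (is_high && has_low) {
            // Combine the high and low surrogate into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            ++i;  // the low surrogate has been consumed
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // unpaired surrogate: use the replacement character
        }
        out.push_back(static_cast<wchar_t>(cp));
    }
    return out;
}

// Hypothetical usage: plain ASCII text converts one element to one element.
void check_u16_to_wstring_manual()
{
    assert(u16_to_wstring_manual(u"This is a Unicode16 String")
           == L"This is a Unicode16 String");
}
```

This version avoids the reinterpret_cast and the byte-order assumption of the facet-based example, at the cost of maintaining the surrogate handling yourself.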
C++ Builder is the easiest and fastest C and C++ IDE for building simple or professional applications on the Windows, MacOS, iOS & Android operating systems. It is also easy for beginners to learn with its wide range of samples, tutorials, help files, and LSP support for code. RAD Studio’s C++ Builder version comes with the award-winning VCL framework for high-performance native Windows apps and the powerful FireMonkey (FMX) framework for cross-platform UIs.
There is a free C++ Builder Community Edition for students, beginners, and startups. For professional developers, there are Professional, Architect, and Enterprise editions of C++ Builder, and a trial version is also available for download.