Boost.Locale
Each locale is defined by a specific locale identifier, which contains a mandatory part (Language) and several optional parts (Country, Variant, keywords and character encoding of std::string). Boost.Locale uses the POSIX naming convention for locales, i.e. a locale is defined as language[_COUNTRY][.encoding][@variant], where language is the ISO-639 language name like "en" or "ru", COUNTRY is the ISO-3166 country identifier like "US" or "DE", encoding is the eight-bit character encoding like UTF-8 or ISO-8859-1, and variant is a set of additional options for specializing the locale, like euro or calendar=hebrew (see Variant).
Note that each locale should include the encoding in order to handle char-based strings correctly.
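To make the naming convention concrete, here is a minimal sketch that splits an identifier of the form language[_COUNTRY][.encoding][@variant] into its four parts. It uses only the C++17 standard library. The names locale_parts and split_locale_name are hypothetical helpers made up for this illustration, not part of Boost.Locale, and the sketch does not validate the parts against ISO-639 or ISO-3166.

```cpp
#include <string_view>

// Hypothetical helper type, not part of Boost.Locale: the four parts of a
// POSIX-style locale name. Optional parts that are absent stay empty.
struct locale_parts {
    std::string_view language;
    std::string_view country;
    std::string_view encoding;
    std::string_view variant;
};

// Splits "language[_COUNTRY][.encoding][@variant]" into its parts.
// The returned views point into `name`, so `name` must outlive the result.
constexpr locale_parts split_locale_name(std::string_view name) {
    locale_parts parts{};

    // The variant is everything after the first '@'.
    const auto at = name.find('@');
    if (at != std::string_view::npos) {
        parts.variant = name.substr(at + 1);
        name = name.substr(0, at);
    }

    // The encoding is everything after the first '.' in what remains.
    const auto dot = name.find('.');
    if (dot != std::string_view::npos) {
        parts.encoding = name.substr(dot + 1);
        name = name.substr(0, dot);
    }

    // The country follows the first '_'; what is left is the language.
    const auto underscore = name.find('_');
    if (underscore != std::string_view::npos) {
        parts.country = name.substr(underscore + 1);
        name = name.substr(0, underscore);
    }

    parts.language = name;
    return parts;
}
```

With this sketch, split_locale_name("he_IL.UTF-8@calendar=hebrew") yields the language "he", the country "IL", the encoding "UTF-8" and the variant "calendar=hebrew". The variant is stripped first so that any '.' or '_' inside it cannot be mistaken for a separator.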
The class generator provides tools to generate the locales we need. The simplest way to use generator is to create a locale and set it as the global one:
    #include <boost/locale.hpp>

    using namespace boost::locale;

    int main()
    {
        generator gen;                 // Create locale generator
        std::locale::global(gen(""));  // "" - the system default locale, set it globally
    }
Of course, we can also specify the locale manually:
    std::locale loc = gen("en_US.UTF-8"); // Use English, United States locale
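Note: Even if your application uses wide strings everywhere, you should still specify the 8-bit encoding, because it is used for 8-bit stream I/O operations such as cout or fstream.
The default locale is determined by the environment variables LC_CTYPE, LC_ALL, and LANG, in that order (i.e. LC_CTYPE first and LANG last). On Windows, the library also queries the LOCALE_USER_DEFAULT option in the Win32 API when these variables are not set.
Tip: Prefer using UTF-8 Unicode encoding over 8-bit encodings like the ISO-8859-X ones.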
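The lookup order described above can be sketched as follows. This is an illustration of the order only, not the library's actual implementation: pick_default_locale and default_locale_from_environment are hypothetical names, treating an empty variable as unset is an assumption of this sketch, the "C" fallback is likewise an assumption, and the Win32 LOCALE_USER_DEFAULT query is not modelled.

```cpp
#include <cstdlib>
#include <string>
#include <string_view>

// A variable counts as set only if it exists and is non-empty
// (treating empty values as unset is an assumption of this sketch).
constexpr bool is_set(const char* value) {
    return value != nullptr && *value != '\0';
}

// Hypothetical helper, not part of Boost.Locale: returns the first value
// that is set, checked in the order LC_CTYPE, LC_ALL, LANG.
// The fallback name "C" is an assumption made for this sketch.
constexpr std::string_view pick_default_locale(const char* lc_ctype,
                                               const char* lc_all,
                                               const char* lang,
                                               std::string_view fallback = "C") {
    if (is_set(lc_ctype)) return lc_ctype;
    if (is_set(lc_all)) return lc_all;
    if (is_set(lang)) return lang;
    return fallback;
}

// Applies the same order to the real process environment. The result is
// copied into a std::string so it stays valid if the environment changes.
inline std::string default_locale_from_environment() {
    return std::string(pick_default_locale(std::getenv("LC_CTYPE"),
                                           std::getenv("LC_ALL"),
                                           std::getenv("LANG")));
}
```

Keeping the ordering logic separate from std::getenv makes it easy to check the precedence without touching the real environment.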
By default the generated locales include all supported categories and character types. However, if your application uses only 8-bit encodings, only wide-character encodings, or only specific facets, you can limit the facet generation to specific categories and character types by calling the categories and characters member functions of the generator class.
For example:
    generator gen;
    gen.characters(wchar_t_facet);
    gen.categories(collation_facet | formatting_facet);
    std::locale::global(gen("de_DE.UTF-8"));
The variant part of the locale (the part that comes after the @ symbol) is localization back-end dependent.
The POSIX and std back-ends use their own OS-specific naming conventions and depend on the current OS configuration. For example, a typical Linux distribution provides euro for currency selection, and cyrillic and latin for specifying the language script.
The winapi back-end does not support any variants.
ICU provides a wide range of locale variant options. For detailed instructions, see the ICU manual pages.
In general, however, the variant is represented as a set of key=value pairs separated by semicolons (";"). For example: "@collation=phonebook;calendar=islamic-civil".
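As an illustration of the key=value format, the following sketch looks up one key in a variant string (the text after the @ symbol). It uses only the C++17 standard library. variant_value is a hypothetical helper written for this example, not a Boost.Locale or ICU function, and it does not check whether a key or value is one that ICU actually accepts.

```cpp
#include <string_view>

// Hypothetical helper, not part of Boost.Locale or ICU: finds `key` in a
// variant string such as "collation=phonebook;calendar=islamic-civil"
// and returns its value, or an empty view when the key is absent.
constexpr std::string_view variant_value(std::string_view variant,
                                         std::string_view key) {
    while (!variant.empty()) {
        // Take the next "key=value" item, up to the next ';' or the end.
        const auto semicolon = variant.find(';');
        const std::string_view item = variant.substr(0, semicolon);
        variant = (semicolon == std::string_view::npos)
                      ? std::string_view{}
                      : variant.substr(semicolon + 1);

        // Split the item at the first '=' and compare the key part.
        const auto equals = item.find('=');
        if (equals != std::string_view::npos && item.substr(0, equals) == key) {
            return item.substr(equals + 1);
        }
    }
    return {};
}
```

For the variant "collation=phonebook;calendar=islamic-civil", looking up "calendar" gives "islamic-civil", and looking up a key that is not present, such as "currency", gives an empty view.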
ICU currently supports the following keys:
calendar - the calendar used for the current locale, for example gregorian, japanese, buddhist, islamic, hebrew, chinese, islamic-civil.
collation - the collation order used for this locale, for example phonebook, pinyin, traditional, stroke, direct, posix.
currency - the currency used in this locale, given as the standard 3-letter code like USD or JPY.
numbers - the numbering system used, for example latn, arab, thai.
Please refer to the CLDR and ICU documentation for the exact list of keys and values.