ICU4X
International Components for Unicode
Using ICU4X from C++

ICU4X's core functionality is completely available from C++, with headers generated by Diplomat. The port is header-only; no additional C++ translation units need to be compiled to use ICU4X from C++.

Typically C++ users can build ICU4X by building the icu_capi Rust crate, and linking the resultant static library to their C++ application. This crate contains all of the relevant Diplomat-generated extern "C" declarations, as well as an idiomatic C++ wrapper using these functions.

Using ICU4X in C++ is best demonstrated via the examples. For instance, one example shows off decimal formatting in ICU4X and is built with a Makefile.

We are still working on improving the experience of using ICU4X from other languages. As such, this tutorial may be a bit sparse, but we are happy to answer questions on our discussions forum and help you out.

Building ICU4X

After installing Rust, create a local Rust configuration for ICU4X by creating a Cargo.toml file:

touch Cargo.toml

Then give it the following contents:

[package]
name = "unused"
version = "0.0.0"

[lib]
path = "unused"

[dependencies]
icu_capi = { version = "1.4", default-features = false }

Some of the keys are required by the parser, but won't be used by us.

icu_capi supports a list of optional features:

  • default enables a default set of features
  • std [default] set this when building for a target with a Rust standard library, otherwise see below
  • compiled_data [default] to include data (ICU4XDataProvider::create_compiled())
  • simple_logger [default] enable basic stdout logging of error metadata. Further loggers can be added on request.
  • default_components [default] activate all stable ICU4X components. For smaller builds, this can be disabled, and components can be added with features like icu_list.
  • buffer_provider for working with blob data providers (ICU4XDataProvider::create_from_byte_slice())

You can now set features using the --features icu_capi/<feature> syntax to build the library:

cargo rustc --release -p icu_capi --crate-type staticlib --features icu_capi/default,icu_capi/buffer_provider
  • Be sure to pass --release to get an optimized build
  • Set CARGO_PROFILE_RELEASE_LTO=true to enable link-time optimization
  • Set CARGO_PROFILE_RELEASE_OPT_LEVEL="s" to optimize for size
  • See the Cargo profiles documentation for more options

You should now have a target/release/libicu_capi.a, ready to compile into your C++ binary.

Using ICU4X from C++

Here's an annotated, shorter version of the fixed decimal example:

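#include "ICU4XDataProvider.hpp"
#include "ICU4XFixedDecimal.hpp"
#include "ICU4XFixedDecimalFormatter.hpp"
#include "ICU4XLocale.hpp"
#include "ICU4XLogger.hpp"
#include <iostream>
#include <string>

int main() {
    // For basic logging
    ICU4XLogger::init_simple_logger();

    // Create a locale object representing Bangla
    ICU4XLocale locale = ICU4XLocale::create_from_string("bn").ok().value();

    // Use compiled data
    ICU4XDataProvider dp = ICU4XDataProvider::create_compiled();

    // Create a formatter object with the appropriate settings
    ICU4XFixedDecimalFormatter fdf = ICU4XFixedDecimalFormatter::create_with_grouping_strategy(
        dp, locale, ICU4XFixedDecimalGroupingStrategy::Auto).ok().value();

    // Create a decimal representing the number 1,000,007
    ICU4XFixedDecimal decimal = ICU4XFixedDecimal::create_from_u64(1000007);

    // Format it to a string
    std::string out = fdf.format(decimal).ok().value();

    // Report formatted value
    std::cout << "Formatted value is " << out << std::endl;
    if (out != "১০,০০,০০৭") {
        std::cout << "Output does not match expected output" << std::endl;
        return 1;
    }
    return 0;
}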

Compiling with the ICU4X header files

The header files are shipped inside the crate, which Cargo has put somewhere on your system. You can find its location with

HEADERS=$(cargo metadata --format-version 1 | jq '.packages[] | select(.name == "icu_capi").manifest_path' | xargs dirname)/bindings/cpp

Then you can build with

g++ -Ltarget/release -I"$HEADERS" main.cpp -licu_capi

C++17 and later versions are supported, as are other C++ compilers.

Embedded platforms (no_std)

Users wishing to use ICU4X on a no_std platform will need to provide an allocator and a panic hook in order to build a linkable library. The icu_capi crate can provide a looping panic handler, and a malloc-backed allocator, under the looping_panic_handler and libc_alloc features, respectively.

cargo rustc --release -p icu_capi --crate-type staticlib --features icu_capi/default_components,icu_capi/buffer_provider,icu_capi/looping_panic_handler,icu_capi/libc_alloc

Tips

The C++ API documentation mirrors the Rust code in the icu_capi crate, which can be explored on docs.rs, though the precise types used may differ.

Fallible methods return diplomat::result, a Result type that can most commonly be converted to a std::optional over its Ok/Err types by calling .ok() or .err(). Most methods use either ICU4XError (an enum of error codes) or std::monostate as their error type. Further error details can be logged by enabling a logger via ICU4XLogger; further loggers may be added on request.
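The .ok().value() chain used in the fixed decimal example throws std::bad_optional_access when the call failed, because .value() is called on an empty std::optional. The sketch below shows one way to check the optional instead. So that it compiles without the ICU4X headers, it uses mini_result, a hypothetical stand-in that only imitates the .ok()/.err() conversions described above; parse_tag, DemoError, tag_or_default, and is_invalid are likewise illustrative names, not part of ICU4X or Diplomat.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <variant>

// HYPOTHETICAL error enum, standing in for an error-code enum such as ICU4XError.
enum class DemoError { InvalidTag };

// HYPOTHETICAL stand-in for diplomat::result. It only mimics the
// .ok()/.err() conversions to std::optional; the real class may differ.
template <class T, class E>
class mini_result {
public:
    static mini_result success(T v) {
        return mini_result(std::variant<T, E>(std::in_place_index<0>, std::move(v)));
    }
    static mini_result failure(E e) {
        return mini_result(std::variant<T, E>(std::in_place_index<1>, std::move(e)));
    }
    // Consumes the result and yields the Ok value, if any.
    std::optional<T> ok() && {
        if (val_.index() == 0) return std::get<0>(std::move(val_));
        return std::nullopt;
    }
    // Consumes the result and yields the Err value, if any.
    std::optional<E> err() && {
        if (val_.index() == 1) return std::get<1>(std::move(val_));
        return std::nullopt;
    }

private:
    explicit mini_result(std::variant<T, E> v) : val_(std::move(v)) {}
    std::variant<T, E> val_;
};

using TagResult = mini_result<std::string, DemoError>;

// HYPOTHETICAL fallible call: accepts only non-empty lowercase ASCII tags.
TagResult parse_tag(std::string_view tag) {
    if (tag.empty()) return TagResult::failure(DemoError::InvalidTag);
    for (char c : tag) {
        if (c < 'a' || c > 'z') return TagResult::failure(DemoError::InvalidTag);
    }
    return TagResult::success(std::string(tag));
}

// Falls back to "und" instead of throwing when parsing fails.
std::string tag_or_default(std::string_view tag) {
    return parse_tag(tag).ok().value_or("und");
}

// Inspects the error side of the result via .err().
bool is_invalid(std::string_view tag) {
    std::optional<DemoError> e = parse_tag(tag).err();
    return e.has_value() && *e == DemoError::InvalidTag;
}

// Usage: both outcomes are handled without exceptions.
void check_result_handling() {
    assert(tag_or_default("bn") == "bn");
    assert(tag_or_default("") == "und");
    assert(is_invalid("B N"));
}
```

Calling check_result_handling() from main exercises both the success and the failure path. With the real headers, the same pattern would presumably apply to calls such as ICU4XLocale::create_from_string, with ICU4XError in place of DemoError.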

The C++ headers include C headers for the underlying APIs as well, under namespace capi, found in the .h files. While these can be used directly, we recommend against it unless you are writing C code. These headers are not intended to be ergonomic and primarily exist for the C++ headers to use internally.

Slices are represented using std::span if available, otherwise a simple wrapper called diplomat::span is used.
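As a rough illustration of the "std::span if available, otherwise a simple wrapper" approach, the sketch below picks std::span when the compiler is in C++20 mode and falls back to a small pointer-plus-length class otherwise. The names demo_span, DEMO_HAS_STD_SPAN, sum_bytes, and sum_vector are hypothetical, and the detection logic is an assumption made for illustration; it is not copied from the Diplomat headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Use std::span only when the header exists and C++20 is active.
#if __has_include(<span>) && __cplusplus >= 202002L
#  include <span>
#  define DEMO_HAS_STD_SPAN 1
#endif

#ifdef DEMO_HAS_STD_SPAN
template <class T>
using demo_span = std::span<T>;
#else
// HYPOTHETICAL minimal fallback: a non-owning view over contiguous elements.
template <class T>
class demo_span {
public:
    constexpr demo_span(T* data, std::size_t size) : data_(data), size_(size) {}
    constexpr T* data() const { return data_; }
    constexpr std::size_t size() const { return size_; }
    constexpr T* begin() const { return data_; }
    constexpr T* end() const { return data_ + size_; }
    T& operator[](std::size_t i) const {
        assert(i < size_);  // bounds check in debug builds only
        return data_[i];
    }

private:
    T* data_;
    std::size_t size_;
};
#endif

// Sums a byte slice; works with either span type because both are
// constructible from a pointer and a length and support range-for.
std::uint64_t sum_bytes(demo_span<const std::uint8_t> bytes) {
    std::uint64_t total = 0;
    for (std::uint8_t b : bytes) total += b;
    return total;
}

// Wraps a vector's storage in a span without copying it.
std::uint64_t sum_vector(const std::vector<std::uint8_t>& v) {
    return sum_bytes(demo_span<const std::uint8_t>(v.data(), v.size()));
}
```

Callers construct the slice from a pointer and a length in both configurations, so code such as sum_vector does not need to know which span type was selected.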

These bindings may be customized by running diplomat-tool directly (including replacing the types used with alternate types like mozilla::Span). Please ask on our discussions forum for more help on this.