ICU4X
International Components for Unicode
Using ICU4X from C++

ICU4X's core functionality is completely available from C++, with headers generated by Diplomat. The port is header-only; no additional C++ translation units need to be compiled to use ICU4X from C++.

Typically C++ users can build ICU4X by building the icu_capi Rust crate, and linking the resultant static library to their C++ application. This crate contains all of the relevant Diplomat-generated extern "C" declarations, as well as an idiomatic C++ wrapper using these functions.
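The layering described above (a C ABI exported by a static library, plus a header-only C++ wrapper over it) can be sketched in a few lines. This is only an illustration of the general pattern, not Diplomat's generated code: every name below (`sketch_Counter`, `sketch_counter_create`, `Counter`, and so on) is hypothetical, and the C functions are defined inline here only so the sketch is self-contained, whereas in a real build they would come from the linked library.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// --- Stand-in for the C ABI a static library would export (hypothetical names) ---
extern "C" {
struct sketch_Counter { std::uint64_t value; };

sketch_Counter* sketch_counter_create(std::uint64_t start) {
    return new sketch_Counter{start};
}
void sketch_counter_increment(sketch_Counter* c) { ++c->value; }
std::uint64_t sketch_counter_get(const sketch_Counter* c) { return c->value; }
void sketch_counter_destroy(sketch_Counter* c) { delete c; }
}

// --- Header-only C++ wrapper: owns the opaque pointer and frees it automatically ---
class Counter {
public:
    static Counter create(std::uint64_t start) {
        return Counter(sketch_counter_create(start));
    }
    void increment() { sketch_counter_increment(ptr_.get()); }
    std::uint64_t get() const { return sketch_counter_get(ptr_.get()); }

private:
    // Custom deleter so unique_ptr calls the C destroy function, not `delete`.
    struct Deleter {
        void operator()(sketch_Counter* c) const { sketch_counter_destroy(c); }
    };
    explicit Counter(sketch_Counter* raw) : ptr_(raw) { assert(raw != nullptr); }
    std::unique_ptr<sketch_Counter, Deleter> ptr_;
};
```

Because the wrapper lives entirely in a header, a C++ application only needs to include it and link the library that provides the `extern "C"` symbols.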

Using ICU4X in C++ is best demonstrated via the examples. For instance, one example shows off decimal formatting in ICU4X and is built with a Makefile.

We are still working on improving the user experience of using ICU4X from other languages. As such, this tutorial may be a bit sparse, but we are happy to answer questions on our discussions forum and help you out.

Building ICU4X

After installing Rust, create a local Rust configuration for ICU4X. First create an empty Cargo.toml:

touch Cargo.toml

Then give it the following contents:

[package]
name = "unused"
version = "0.0.0"
resolver = "2"
[lib]
path = "unused"
[dependencies]
icu_capi = { version = "2.0.0-dev", default-features = false, features = [] }

Some of the keys are required by the parser, but won't be used by us.

icu_capi supports a list of optional features:

  • default enables a default set of features
  • std [default] set this when building for a target with a Rust standard library, otherwise see below
  • compiled_data [default] to include data (ICU4XDataProvider::create_compiled())
  • simple_logger [default] enable basic stdout logging of error metadata. Further loggers can be added on request.
  • default_components [default] activate all stable ICU4X components. For smaller builds, this can be disabled, and components can be added with features like list.
  • buffer_provider for working with blob data providers (ICU4XDataProvider::create_from_byte_slice())

You can now set features by updating the features key in Cargo.toml:

icu_capi = { version = "2.0.0-dev", default-features = false, features = ["default", "buffer_provider"] }

You can now build a staticlib with the following command:

cargo rustc --release -p icu_capi --crate-type staticlib

  • Be sure to pass --release to get an optimized build
  • Set CARGO_PROFILE_RELEASE_LTO=true to enable link-time optimization
  • Set CARGO_PROFILE_RELEASE_OPT_LEVEL="s" to optimize for size
  • See the Cargo profiles documentation for more options

You should now have a target/release/libicu_capi.a, ready to compile into your C++ binary.

Using ICU4X from C++

Here's an annotated, shorter version of the fixed decimal example:

#include "FixedDecimalFormatter.hpp"
#include "DataStruct.hpp"
#include "Logger.hpp"
#include <iostream>
#include <array>
#include <string>

int main() {
    // For basic logging
    Logger::init_simple_logger();

    // Create a locale object representing Bangla
    Locale locale = Locale::create_from_string("bn").ok().value();

    // Use compiled data
    DataProvider dp = DataProvider::create_compiled();

    // Create a formatter object with the appropriate settings
    FixedDecimalFormatter fdf = FixedDecimalFormatter::create_with_grouping_strategy(
        dp, locale, FixedDecimalGroupingStrategy::Auto).ok().value();

    // Create a decimal representing the number 1,000,007
    FixedDecimal decimal = FixedDecimal::create_from_u64(1000007);

    // Format it to a string
    std::string out = fdf.format(decimal).ok().value();

    // Report formatted value
    std::cout << "Formatted value is " << out << std::endl;
    if (out != "১০,০০,০০৭") {
        std::cout << "Output does not match expected output" << std::endl;
        return 1;
    }
    return 0;
}
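To see why the example expects ১০,০০,০০৭ for 1000007, it can help to reproduce that string by hand. The sketch below is not how ICU4X formats numbers and uses no ICU4X API. It simply hard-codes the two things visible in the expected output: Bangla digits, and a grouping of the last three digits followed by pairs. The function name `format_bn_sketch` is hypothetical, the source file is assumed to be saved as UTF-8, and locale rules beyond this simple grouping are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hand-rolled illustration only; this is NOT how ICU4X formats numbers.
// Assumes this source file is saved as UTF-8.
std::string format_bn_sketch(std::uint64_t n) {
    // Bangla digits for 0 through 9.
    static const char* const kBanglaDigits[10] = {
        "০", "১", "২", "৩", "৪", "৫", "৬", "৭", "৮", "৯"};

    const std::string ascii = std::to_string(n);
    const std::size_t len = ascii.size();
    std::string out;
    for (std::size_t i = 0; i < len; ++i) {
        // Number of digits from position i to the end, inclusive.
        const std::size_t remaining = len - i;
        // Place a separator before the final group of three, and before
        // every pair of digits to the left of that group.
        if (i > 0 && remaining >= 3 && (remaining - 3) % 2 == 0) {
            out += ',';
        }
        assert(ascii[i] >= '0' && ascii[i] <= '9');
        out += kBanglaDigits[ascii[i] - '0'];
    }
    return out;
}
```

Calling `format_bn_sketch(1000007)` walks the digits 1, 0, 0, 0, 0, 0, 7 and inserts separators after the second and fourth digits, which yields the same string the example compares against.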

Compiling with the ICU4X header files

The header files are shipped inside the crate, which Cargo has put somewhere on your system. You can find its location with:

HEADERS=$(cargo metadata --format-version 1 | jq '.packages[] | select(.name == "icu_capi").manifest_path' | xargs dirname)/bindings/cpp

Then you can build with:

g++ -Ltarget/release -I"$HEADERS" main.cpp -licu_capi

C++17 and later versions are supported, as are other C++ compilers.

Embedded platforms (no_std)

Users wishing to use ICU4X on a no_std platform will need to provide an allocator and a panic hook in order to build a linkable library. The icu_capi crate can provide a looping panic handler, and a malloc-backed allocator, under the looping_panic_handler and libc_alloc features, respectively.

icu_capi = { version = "2.0.0-dev", default-features = false, features = ["default_components", "buffer_provider", "looping_panic_handler", "libc_alloc"] }

This can be built the same way, with an explicitly specified --target (in this case thumbv7em-none-eabi, but it can be any no_std target):

cargo rustc --release -p icu_capi --crate-type staticlib --target thumbv7em-none-eabi

Tips

Documentation can be found here. These docs mirror the Rust code in the icu_capi crate, which can be explored on docs.rs, though the precise types used may be different.

Fallible methods return diplomat::result, a Result type that can most commonly be converted to a std::optional over its Ok/Err types by calling .ok() or .err(). Most methods use either ICU4XError (an enum of error codes) or std::monostate as their error type. Further error details can be logged by enabling a logger via ICU4XLogger; further loggers may be added on request.
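Since .ok() yields a std::optional, calling .value() on it throws std::bad_optional_access when the call actually failed, so production code may prefer to check the optional first. The sketch below shows that pattern against a small stand-in type. `sketch::result`, `sketch::Error`, `lookup_language_name`, and `describe` are all hypothetical names written for this illustration; the stand-in mimics only the .ok()/.err() behavior described above and is not diplomat::result itself.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <variant>

namespace sketch {

enum class Error { UnknownLocale };

// Minimal stand-in for a Result type exposing ok()/err() as std::optional.
template <class T, class E>
class result {
public:
    static result Ok(T v) {
        return result(std::variant<T, E>(std::in_place_index<0>, std::move(v)));
    }
    static result Err(E e) {
        return result(std::variant<T, E>(std::in_place_index<1>, std::move(e)));
    }
    std::optional<T> ok() const {
        if (inner_.index() == 0) return std::get<0>(inner_);
        return std::nullopt;
    }
    std::optional<E> err() const {
        if (inner_.index() == 1) return std::get<1>(inner_);
        return std::nullopt;
    }

private:
    explicit result(std::variant<T, E> v) : inner_(std::move(v)) {}
    std::variant<T, E> inner_;
};

// Hypothetical fallible function: only knows about "bn".
inline result<std::string, Error> lookup_language_name(const std::string& tag) {
    if (tag == "bn") return result<std::string, Error>::Ok("Bangla");
    return result<std::string, Error>::Err(Error::UnknownLocale);
}

// Check the optional before using it instead of calling .value() blindly.
inline std::string describe(const std::string& tag) {
    const auto res = lookup_language_name(tag);
    if (auto name = res.ok()) {
        return *name;
    }
    // Not Ok, so the stand-in guarantees an error value is present.
    assert(res.err().has_value());
    return "error: unknown locale";
}

}  // namespace sketch
```

The `.ok().value()` chains in the example above are fine for a demo that should abort on failure; the checked form is useful when the error needs to be reported or recovered from.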

The C++ headers include C headers for the underlying APIs as well, under namespace capi, found in the .h files. While these can be used directly, we recommend against it unless you are writing C code. These headers are not intended to be ergonomic and primarily exist for the C++ headers to use internally.

Slices are represented using std::span if available, otherwise a simple wrapper called diplomat::span is used.
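The "use std::span when available, otherwise a small wrapper" approach can be sketched with the standard feature-test macro. This is an assumption about how such a fallback might be written, not a copy of the Diplomat headers: the `sketch::span` alias and class, and the `sum` and `sum_vector` helpers, are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pull in the feature-test macros if the <version> header exists.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>
#  endif
#endif

#if defined(__cpp_lib_span)
#  include <span>
namespace sketch {
// The standard library provides span: just alias it.
template <class T>
using span = std::span<T>;
}  // namespace sketch
#else
namespace sketch {
// Minimal pointer-plus-length fallback for older standard libraries.
template <class T>
class span {
public:
    constexpr span(T* data, std::size_t size) : data_(data), size_(size) {}
    constexpr T* data() const { return data_; }
    constexpr std::size_t size() const { return size_; }
    constexpr T* begin() const { return data_; }
    constexpr T* end() const { return data_ + size_; }

private:
    T* data_;
    std::size_t size_;
};
}  // namespace sketch
#endif

// Code written against sketch::span compiles the same way in both modes.
inline long long sum(sketch::span<const int> values) {
    assert(values.data() != nullptr || values.size() == 0);
    long long total = 0;
    for (int v : values) total += v;
    return total;
}

// Usage: view a vector's storage as a slice without copying it.
inline long long sum_vector(const std::vector<int>& v) {
    return sum(sketch::span<const int>(v.data(), v.size()));
}
```

Constructing the slice from a pointer and a length works with both the alias and the fallback, which keeps calling code independent of the language version it is compiled with.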

These bindings may be customized by running diplomat-tool directly (including replacing the types used with alternate types like mozilla::Span). Please ask on our discussions forum for more help on this.