ICU 78

ICU is the premier library for software internationalization, used by a wide array of companies and organizations.

Release Candidate

This is a release candidate. Please use it for testing, but do not use it in production.

Release Overview

ICU 78 updates to Unicode 17 (blog), including new characters and scripts, emoji, collation & IDNA changes, and corresponding APIs and implementations.

It also updates to CLDR 48 (beta blog) locale data with new locales, and various additions and corrections.

In Java, there is a new Segmenter API which is easier and safer to use than BreakIterator.
In C++, there is a new set of APIs for Unicode string (UTF-8/16/32) code point iteration that works seamlessly with modern C++ iterators and ranges.
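
For context on the Java change, the following is a minimal sketch of the classic stateful iteration pattern that the new Segmenter API is meant to replace. It is an illustration under assumptions: it uses the JDK's java.text.BreakIterator (standing in for ICU's class of the same name, which follows the same pattern) so that it runs without ICU on the classpath, and the class and method names BreakIteratorSketch and segments are hypothetical. It does not show the new Segmenter API itself.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BreakIteratorSketch {
    // Hypothetical helper: collects word-level segments of the text.
    public static List<String> segments(String text, Locale locale) {
        // A BreakIterator is mutable: it holds the text and a current position,
        // so a single instance must not be shared between threads.
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);

        List<String> result = new ArrayList<>();
        int start = it.first();
        // Callers must track start/end indexes and the DONE sentinel themselves.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            result.add(text.substring(start, end));
        }
        // Return an immutable snapshot so callers cannot disturb iteration state.
        return List.copyOf(result);
    }

    public static void main(String[] args) {
        for (String s : segments("Hello, world!", Locale.ENGLISH)) {
            System.out.println("[" + s + "]");
        }
    }
}
```

The index bookkeeping and shared mutable state in this loop are the kind of boilerplate that an API built on immutable objects and streams can remove.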

The Java implementation of the CLDR MessageFormat 2.0 specification has been updated to CLDR 48. The core API has been upgraded to “draft”, while the Data Model API remains in technology preview.

The C++ implementation of MessageFormat 2.0 is at CLDR 47 level and remains in technology preview.

ICU 78 and CLDR 48 are major releases, including a new version of Unicode and major locale data improvements.

For more details, including migration issues, see below.

Please use the icu-support mailing list, and find or submit error reports.

Attention: Future Changes

Beginning with CLDR 49 / ICU 79 (2026-mar), CLDR and ICU are planning to make changes in time formatting options for the hour cycle (details of 12/24 hour formats), make the week-of-year numbering always follow ISO rules, and remove the pre-Meiji Japanese eras.

See the CLDR V49 advance warnings.

Version Number

The initial release has library version number 78.1.

If there are maintenance releases, they will be 78.2, 78.3, etc. (During ICU 78 development, the library version number was 78.0.x.)

Note: There may be additional commits on the maint/maint-78 branch that are not included in the prepackaged download files.

Common Changes

  • Unicode 17 (blog):
    • Adds two modern-use scripts: Beria Erfe & Tolong Siki
    • Adds two historic scripts & more than 4300 new CJK ideographs
    • New currency sign: SAUDI RIYAL SIGN
    • Eight new emoji characters
    • Segmentation improvements (word & line breaking)
    • Identifiers & security: After thorough review, the majority of CJK ideographs, the Bopomofo script, and more than 1100 other pre-Unicode 17 characters are no longer recommended for use in identifiers by default (Identifier_Status & Identifier_Type properties).
      • For CJK ideographs, only the 19,842 that are in modern common use (out of what is now more than 100,000) remain recommended for default identifier use.
  • CLDR 48 (beta blog):
    • Significant data updates across all locales
    • Locales which are now at modern coverage level: Akan, Bashkir, Chuvash, Kazakh (Arabic), Romansh, Shan, Quechua
    • Locales which are now at moderate coverage level: Anii, Esperanto
    • Many new measurement units, for scientific contexts (coulombs, farads, teslas, etc.) and for English systems (fortnights, imperial pints, etc.)
    • Some measurement unit identifiers changed, see CLDR 48 Migration
  • Time zone data (tzdata) version 2025b (2025-mar).

ICU4C Specific Changes

  • API changes since ICU4C 77 (Markdown) / (HTML)
    • The widely used class Locale has been optimized. Locale objects for most common locale IDs now only use 40 bytes (down from at least 224 bytes). (ICU-20392)
    • New set of “C++ Header-Only APIs” for Unicode string (UTF-8/16/32) code point iteration that works seamlessly with modern C++ iterators and ranges. As with the existing C macros, there are versions which validate the code unit sequences on the fly, as well as fast but “unsafe” versions which assume & require well-formed strings. (ICU-23004):
      unicode/utfiterator.h
      (For an introduction to “C++ Header-Only APIs” see this section of the ICU 76 Migration Issues.)
    • Additional Unicode helper APIs (ICU-23152):
      • (unicode/utf.h, unicode/utf8.h): U_IS_CODE_POINT(cp), U_IS_SCALAR_VALUE(cp), U8_LENGTH_FROM_LEAD_BYTE[_UNSAFE]
      • (icu::UnicodeString): begin(), end(), rbegin(), rend(), push_back(c) (a UnicodeString is now a C++ “range” of char16_t code units), toUTF8String() without output string parameter
      • (new unicode/utfiterator.h): “Range” classes AllCodePoints & AllScalarValues
      • (new unicode/utfstring.h): Functions that write a code point to a string object

ICU4J Specific Changes

  • The minimum required Java version has been upgraded from Java 8 to Java 11. (ICU-23072)
    This is a significant, useful update in terms of the Java language and standard library, and simplifies ICU tooling. Note that Android desugaring supports at least Java 11 since late 2023.
  • API Changes since ICU4J 77
    • New Segmenter API which is easier and safer to use than BreakIterator. The new API builds immutable objects and returns Java Streams of boundaries and segments. (ICU-22789)
      See package com.ibm.icu.segmenter, interface Segmenter and its implementing classes, etc.
    • All clone() functions now explicitly return their class types, rather than Object (“covariant return types”), so that call sites no longer need to downcast. (ICU-23140)
    • Additional Unicode helper functions (class UCharacter): isNoncharacter(cp), isScalarValue(cp), allCodePoints(), allScalarValuesStream(), etc. (ICU-23152)
  • We have removed the ICU4J Locale Service Provider. (ICU-23071)
    It had become much less useful than when we added it and had very low usage. Projects that used it should call ICU4J directly instead.
  • The Java implementation of the CLDR MessageFormat 2.0 specification has been updated to CLDR 48. The core API has been upgraded to “draft”, while the Data Model API remains in technology preview.
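
To illustrate what the covariant clone() return types mean for call sites, here is a minimal sketch. The classes LegacyFormat and TypedFormat are hypothetical stand-ins, not ICU classes; they only contrast a clone() declared to return Object with one declared to return its own class type.

```java
public class CovariantCloneSketch {
    // Hypothetical class in the older style: clone() is declared to return Object.
    static class LegacyFormat implements Cloneable {
        int width = 3;

        @Override
        public Object clone() {
            try {
                return super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: class is Cloneable
            }
        }
    }

    // Hypothetical class in the newer style: clone() returns its own type.
    static class TypedFormat implements Cloneable {
        int width = 3;

        @Override
        public TypedFormat clone() {
            try {
                return (TypedFormat) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: class is Cloneable
            }
        }
    }

    public static void main(String[] args) {
        // Older style: every call site needs a downcast.
        LegacyFormat a = (LegacyFormat) new LegacyFormat().clone();
        // Covariant return type: no cast at the call site.
        TypedFormat b = new TypedFormat().clone();
        System.out.println(a.width + " " + b.width);
    }
}
```

Existing call sites that still downcast should continue to compile, since a cast to the same type is redundant rather than an error.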
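
The terms behind the new Unicode helper functions can be sketched from the Unicode Standard's definitions. The code below is an illustration written for these notes, not ICU source: the method names mirror the helpers listed above, but the class CodePointPredicatesSketch and its counting helpers are hypothetical, and treating out-of-range inputs as false is an assumption.

```java
public class CodePointPredicatesSketch {
    // A code point is any integer in the range U+0000..U+10FFFF.
    public static boolean isCodePoint(int c) {
        return 0 <= c && c <= 0x10FFFF;
    }

    // A scalar value is a code point that is not a surrogate (U+D800..U+DFFF).
    public static boolean isScalarValue(int c) {
        return isCodePoint(c) && !(0xD800 <= c && c <= 0xDFFF);
    }

    // Noncharacters are U+FDD0..U+FDEF plus the last two code points
    // of each of the 17 planes (U+xxFFFE and U+xxFFFF).
    public static boolean isNoncharacter(int c) {
        return isCodePoint(c)
                && ((0xFDD0 <= c && c <= 0xFDEF) || (c & 0xFFFE) == 0xFFFE);
    }

    // Hypothetical helper: counts scalar values by brute force.
    public static int countScalarValues() {
        int n = 0;
        for (int c = 0; c <= 0x10FFFF; c++) {
            if (isScalarValue(c)) n++;
        }
        return n;
    }

    // Hypothetical helper: counts noncharacters by brute force.
    public static int countNoncharacters() {
        int n = 0;
        for (int c = 0; c <= 0x10FFFF; c++) {
            if (isNoncharacter(c)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countScalarValues());  // 0x110000 - 0x800 = 1112064
        System.out.println(countNoncharacters()); // 32 + 17 * 2 = 66
    }
}
```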

Known Issues

  • (none yet)

Migration Issues

  • (none yet)

ICU4C Platform Support

ICU4C requires C++17 and has been tested with up to C++23.

We routinely test on recent versions of Linux, macOS, and Windows.

We accept patches for other platforms.

Windows: The minimum supported version is Windows 7. (See How To Build And Install On Windows for more details.)

ICU4J Platform Support

ICU4J works on Java 11..25 (at least).

ICU4J should work on Android API level 21 and later but may require “library desugaring”.

Download

GitHub

Source and binary downloads are available on the git/GitHub tag page: https://github.com/unicode-org/icu/releases/tag/release-78.1rc

See the Source Code Setup page for how to download the ICU file tree directly from GitHub.

ICU locale data was generated from CLDR data equivalent to:

Maven

  • TODO: not published yet
  • https://mvnrepository.com/artifact/com.ibm.icu/icu4j/78.1
  • https://mvnrepository.com/artifact/com.ibm.icu/icu4j-charset/78.1