Why Use ICU4J?

Summary

Fully implements current standards
- Unicode collation, normalization, break iteration
- Updated more frequently than Java
- Full CLDR Locale data
Improved performance

Details

Normalization
- Addresses lack of Unicode normalization support in Java 5
- Addresses outdated Unicode normalization support in Java 6
Up-To-Date Unicode version
- Java 5 & 6 are Unicode 4.0, while ICU 4.0 is Unicode 5.1
- Characters added after Unicode 4.0 do not have character properties in Java
IDNA and StringPrep
- Addresses lack of Internationalized Domain Name support in Java 5
- Addresses generic stringprep (RFC3454) support. stringprep is required for supporting various internet protocols (NFS, LDAP…)
Collation
- Provides Unicode standard compliant collation support
- ICU Collator fully implements UTR#10, while the Java implementation is outdated and not compatible.
Provides ICU UnicodeSet for easy character range validation
- much more flexible and convenient for validating identifiers/text tokens with a given syntax
- full boolean operations (union, intersection, difference)
- all Unicode properties supported
Locales
- BCP47 (language tag) support in locale class (supporting “script”, 3-letter language codes, 3-digit region codes)
- Locale data coverage - much better, many more locales, up-to-date
Broader charset converter coverage
- In ICU4J 4.2, also output charset selection
- Custom fallback in charset converter
Other features missing in the JDK
- Dates:
  - Many more date formats: month+day, year+month,…
  - Date interval formats: “Dec 15-17, 2009”
  - APIs for returning time zone transitions
- Other formatting
  - Plural formatting, including units: “1 hour” / “2 hours”
  - Rule based number format (“three thousand two hundred”)
  - Extensive Non-Gregorian calendar support
- Transliterator (for flexible text/script transformations)
- Collation-sensitive string search
- Same data as ICU4C, allowing same behavior across programming languages
- All Unicode character properties - over 80, Java provides access to only about 10
- Thai wordbreak

Performance & Size

Instantiation times are comparable
- Common instantiate and reuse model
- ICU4J and Java both use caches to limit impact
Collation performance many times faster
- sorting: 2 to 20 times faster
- sort key generation: 1.5 to 4 times faster
- sort key length: 2/3 to 1/4 the length of Java sort keys
Property access much faster (isLetter, isWhitespace,…)
Can easily produce scaled-down version (removing data)

API

Subclasses of JDK classes where possible
Drop-in (change of import) if not

Summary

ICU4J is not for you if
- you have tight size constraints
- you require the Java runtime behavior
ICU4J is for you if
- you need full compliance with current standards
- you need current or additional locale and property data
- you need customizability
- you need features missing from Java (normalization, collation,…)
- you need better performance