Struct icu_locale::LanguageIdentifier

pub struct LanguageIdentifier {
    pub language: Language,
    pub script: Option<Script>,
    pub region: Option<Region>,
    pub variants: Variants,
}

A core struct representing a Unicode BCP47 Language Identifier.

§Parsing

Unicode recognizes three levels of standard conformance for any language identifier:

  • well-formed - syntactically correct
  • valid - well-formed and only uses registered language, region, script and variant subtags…
  • canonical - valid and no deprecated codes or structure.

At the moment, parsing normalizes a well-formed language identifier by converting _ separators to - and adjusting casing to conform to the Unicode standard.

Any syntactically invalid subtags will cause the parsing to fail with an error.

This operation normalizes syntax to be well-formed. No legacy subtag replacements are performed. For validation and canonicalization, see LocaleCanonicalizer.
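
The normalization described above can be illustrated with a small standard-library sketch. This is not the ICU4X implementation: normalize_sketch is a hypothetical helper that assumes its input is already well-formed, performs no validation, and only rewrites separators and casing (lowercase language, title-case script, uppercase region, lowercase variants).

```rust
/// Hypothetical helper, not part of icu_locale: rewrites separators and casing
/// of an identifier that is assumed to be well-formed already.
fn normalize_sketch(input: &str) -> String {
    let mut out: Vec<String> = Vec::new();
    for (i, subtag) in input.split(|c: char| c == '-' || c == '_').enumerate() {
        let is_alpha = subtag.chars().all(|c| c.is_ascii_alphabetic());
        let normalized = if i == 0 {
            // Language subtag: lowercase.
            subtag.to_ascii_lowercase()
        } else if i == 1 && subtag.len() == 4 && is_alpha {
            // Script subtag (four letters right after the language): title case.
            let mut s = subtag.to_ascii_lowercase();
            s[..1].make_ascii_uppercase();
            s
        } else if i <= 2 && subtag.len() == 2 && is_alpha {
            // Two-letter region subtag: uppercase.
            subtag.to_ascii_uppercase()
        } else {
            // Variants (and numeric regions, which have no case): lowercase.
            subtag.to_ascii_lowercase()
        };
        out.push(normalized);
    }
    out.join("-")
}

fn main() {
    // Same inputs as the examples on this page.
    assert_eq!(normalize_sketch("eN_latn_Us-Valencia"), "en-Latn-US-valencia");
    assert_eq!(normalize_sketch("pL_latn_pl"), "pl-Latn-PL");
}
```

A real parser would additionally reject syntactically invalid subtags, which this sketch passes through unchanged.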

§Ordering

This type deliberately does not implement Ord or PartialOrd because there are multiple possible orderings, and the team did not want to favor one over any other.

Instead, there are functions available that return these different orderings:

  • LanguageIdentifier::strict_cmp
  • LanguageIdentifier::total_cmp

See issue: https://github.com/unicode-org/icu4x/issues/1215
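
To see why more than one ordering is plausible, consider the following standard-library sketch. IdSketch is a hypothetical stand-in, not the icu_locale type, and its derived field-by-field ordering is only one possible choice; the actual order produced by total_cmp is not assumed here. The point is that string order and field order can disagree for the same pair of identifiers.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a language identifier (variants omitted).
/// The derived Ord compares language, then script, then region.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct IdSketch {
    language: &'static str,
    script: Option<&'static str>,
    region: Option<&'static str>,
}

impl IdSketch {
    /// Serializes the subtags with `-` separators.
    fn to_bcp47(&self) -> String {
        let mut s = String::from(self.language);
        if let Some(script) = self.script {
            s.push('-');
            s.push_str(script);
        }
        if let Some(region) = self.region {
            s.push('-');
            s.push_str(region);
        }
        s
    }
}

/// Ordering by the bytes of the serialized string.
fn string_order(a: &IdSketch, b: &IdSketch) -> Ordering {
    a.to_bcp47().as_bytes().cmp(b.to_bcp47().as_bytes())
}

/// Ordering by comparing the fields in declaration order.
fn field_order(a: &IdSketch, b: &IdSketch) -> Ordering {
    a.cmp(b)
}

fn main() {
    let en_us = IdSketch { language: "en", script: None, region: Some("US") };
    let en_latn = IdSketch { language: "en", script: Some("Latn"), region: None };

    // As strings, "en-US" sorts after "en-Latn" because 'U' > 'L'.
    assert_eq!(string_order(&en_us, &en_latn), Ordering::Greater);
    // Field by field, a missing script (None) sorts before Some("Latn").
    assert_eq!(field_order(&en_us, &en_latn), Ordering::Less);
}
```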

§Examples

Simple example:

use icu::locale::{
    langid,
    subtags::{language, region},
};

let li = langid!("en-US");

assert_eq!(li.language, language!("en"));
assert_eq!(li.script, None);
assert_eq!(li.region, Some(region!("US")));
assert_eq!(li.variants.len(), 0);

More complex example:

use icu::locale::{
    langid,
    subtags::{language, region, script, variant},
};

let li = langid!("eN_latn_Us-Valencia");

assert_eq!(li.language, language!("en"));
assert_eq!(li.script, Some(script!("Latn")));
assert_eq!(li.region, Some(region!("US")));
assert_eq!(li.variants.get(0), Some(&variant!("valencia")));

Fields§

§language: Language

Language subtag of the language identifier.

§script: Option<Script>

Script subtag of the language identifier.

§region: Option<Region>

Region subtag of the language identifier.

§variants: Variants

Variant subtags of the language identifier.

Implementations§

§

impl LanguageIdentifier

pub fn try_from_str(s: &str) -> Result<LanguageIdentifier, ParseError>

A constructor which takes a string slice, parses it, and produces a well-formed LanguageIdentifier.

§Examples
use icu::locale::LanguageIdentifier;

LanguageIdentifier::try_from_str("en-US").expect("Parsing failed");

pub fn try_from_utf8( code_units: &[u8], ) -> Result<LanguageIdentifier, ParseError>

pub fn try_from_locale_bytes(v: &[u8]) -> Result<LanguageIdentifier, ParseError>

A constructor which takes a UTF-8 byte slice that may contain extension keys, parses it, and produces a well-formed LanguageIdentifier.

§Examples
use icu::locale::{langid, LanguageIdentifier};

let li = LanguageIdentifier::try_from_locale_bytes(b"en-US-x-posix")
    .expect("Parsing failed.");

assert_eq!(li, langid!("en-US"));

This method should be used for input that may be a locale identifier. All extensions will be lost.
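
As a rough illustration of what "all extensions will be lost" means, the sketch below cuts a locale string at its first extension. It relies on the fact that extension sequences begin with a single-character subtag (a singleton such as u, t, or x), which never occurs inside the language identifier part. language_id_part_sketch is a hypothetical helper, not how try_from_locale_bytes is implemented, and it assumes - separators and well-formed input.

```rust
/// Hypothetical helper, not part of icu_locale: returns the part of a locale
/// string that precedes the first singleton (single-character) subtag.
fn language_id_part_sketch(locale: &str) -> &str {
    let mut end = 0; // byte offset just past the last subtag we keep
    let mut pos = 0; // byte offset where the current subtag starts
    for (i, subtag) in locale.split('-').enumerate() {
        // A one-character subtag after the language starts an extension.
        if i > 0 && subtag.len() == 1 {
            break;
        }
        end = pos + subtag.len();
        pos = end + 1; // skip the separator
    }
    &locale[..end]
}

fn main() {
    assert_eq!(language_id_part_sketch("en-US-x-posix"), "en-US");
    assert_eq!(language_id_part_sketch("en-US-u-hc-h12"), "en-US");
    // Without extensions the input is returned unchanged.
    assert_eq!(language_id_part_sketch("pl-Latn-PL"), "pl-Latn-PL");
}
```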

pub const fn default() -> LanguageIdentifier

Const-friendly version of Default::default.

pub const fn is_default(&self) -> bool

Whether this language identifier equals Self::default.

pub fn normalize_utf8(input: &[u8]) -> Result<Cow<'_, str>, ParseError>

Normalize the language identifier (operating on UTF-8 formatted byte slices).

This operation will normalize casing and the separator.

§Examples
use icu::locale::LanguageIdentifier;

assert_eq!(
    LanguageIdentifier::normalize_utf8(b"pL_latn_pl").as_deref(),
    Ok("pl-Latn-PL")
);

pub fn normalize(input: &str) -> Result<Cow<'_, str>, ParseError>

Normalize the language identifier (operating on strings).

This operation will normalize casing and the separator.

§Examples
use icu::locale::LanguageIdentifier;

assert_eq!(
    LanguageIdentifier::normalize("pL_latn_pl").as_deref(),
    Ok("pl-Latn-PL")
);

pub fn strict_cmp(&self, other: &[u8]) -> Ordering

Compare this LanguageIdentifier with BCP-47 bytes.

The return value is equivalent to what would happen if you first converted this LanguageIdentifier to a BCP-47 string and then performed a byte comparison.

This function is case-sensitive and results in a total order, so it is appropriate for binary search. The only argument producing Ordering::Equal is self.to_string().

§Examples
use icu::locale::LanguageIdentifier;
use std::cmp::Ordering;

let bcp47_strings: &[&str] = &[
    "pl-Latn-PL",
    "und",
    "und-Adlm",
    "und-GB",
    "und-ZA",
    "und-fonipa",
    "zh",
];

for ab in bcp47_strings.windows(2) {
    let a = ab[0];
    let b = ab[1];
    assert!(a.cmp(b) == Ordering::Less);
    let a_langid = a.parse::<LanguageIdentifier>().unwrap();
    assert!(a_langid.strict_cmp(a.as_bytes()) == Ordering::Equal);
    assert!(a_langid.strict_cmp(b.as_bytes()) == Ordering::Less);
}
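
Because the comparison is defined as a byte comparison against the serialized form, it can drive a binary search over a sorted list of BCP-47 strings. The sketch below uses only the standard library: strict_cmp_sketch is a hypothetical stand-in that takes an already serialized identifier, so it mirrors the documented equivalence rather than calling the real method.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for strict_cmp: compares the serialized form of an
/// identifier with raw BCP-47 bytes, byte by byte.
fn strict_cmp_sketch(serialized: &str, other: &[u8]) -> Ordering {
    serialized.as_bytes().cmp(other)
}

/// Looks up `needle` in a list sorted by byte order.
fn find_sketch(sorted: &[&str], needle: &str) -> Result<usize, usize> {
    // binary_search_by expects the ordering of the probed element relative to
    // the target, hence the reverse().
    sorted.binary_search_by(|probe| strict_cmp_sketch(needle, probe.as_bytes()).reverse())
}

fn main() {
    let sorted = [
        "pl-Latn-PL",
        "und",
        "und-Adlm",
        "und-GB",
        "und-ZA",
        "und-fonipa",
        "zh",
    ];

    assert_eq!(find_sketch(&sorted, "und-GB"), Ok(3));
    // The comparison is case-sensitive, so an unnormalized needle is not found.
    assert_eq!(find_sketch(&sorted, "und-gb"), Err(6));
}
```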

pub fn total_cmp(&self, other: &LanguageIdentifier) -> Ordering

Compare this LanguageIdentifier with another LanguageIdentifier field-by-field. The result is a total ordering sufficient for use in a BTreeSet.

Unlike LanguageIdentifier::strict_cmp, the ordering may or may not be equivalent to string ordering, and it may or may not be stable across ICU4X releases.

§Examples

Using a wrapper to add one of these to a BTreeSet:

use icu::locale::LanguageIdentifier;
use std::cmp::Ordering;
use std::collections::BTreeSet;

#[derive(PartialEq, Eq)]
struct LanguageIdentifierTotalOrd(LanguageIdentifier);

impl Ord for LanguageIdentifierTotalOrd {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.total_cmp(&other.0)
    }
}

impl PartialOrd for LanguageIdentifierTotalOrd {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

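// Insert a few identifiers; equal entries collapse into one.
let mut set: BTreeSet<LanguageIdentifierTotalOrd> = BTreeSet::new();
set.insert(LanguageIdentifierTotalOrd("en-US".parse().unwrap()));
set.insert(LanguageIdentifierTotalOrd("pl".parse().unwrap()));
set.insert(LanguageIdentifierTotalOrd("en-US".parse().unwrap()));
assert_eq!(set.len(), 2);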

pub fn normalizing_eq(&self, other: &str) -> bool

Compare this LanguageIdentifier with a potentially unnormalized BCP-47 string.

The return value is equivalent to what would happen if you first parsed the BCP-47 string to a LanguageIdentifier and then performed a structural comparison.

§Examples
use icu::locale::LanguageIdentifier;

let bcp47_strings: &[&str] = &[
    "pl-LaTn-pL",
    "uNd",
    "UnD-adlm",
    "uNd-GB",
    "UND-FONIPA",
    "ZH",
];

for a in bcp47_strings {
    assert!(a.parse::<LanguageIdentifier>().unwrap().normalizing_eq(a));
}

Trait Implementations§

§

impl Bake for LanguageIdentifier

§

fn bake(&self, env: &CrateEnv) -> TokenStream

Returns a TokenStream that would evaluate to self. Read more
§

impl Clone for LanguageIdentifier

§

fn clone(&self) -> LanguageIdentifier

Returns a copy of the value. Read more
1.0.0 · source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
§

impl Debug for LanguageIdentifier

§

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter. Read more
§

impl Default for LanguageIdentifier

§

fn default() -> LanguageIdentifier

Returns the “default value” for a type. Read more
§

impl<'de> Deserialize<'de> for LanguageIdentifier

§

fn deserialize<D>( deserializer: D, ) -> Result<LanguageIdentifier, <D as Deserializer<'de>>::Error>
where D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer. Read more
§

impl Display for LanguageIdentifier

This trait is implemented for compatibility with the std::fmt formatting macros such as format!. To create a string, Writeable::write_to_string is usually more efficient.

§

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter. Read more
§

impl From<&LanguageIdentifier> for LocalePreferences

§

fn from(lid: &LanguageIdentifier) -> LocalePreferences

Converts to this type from the input type.
§

impl From<(Language, Option<Script>, Option<Region>)> for LanguageIdentifier

Convert from an LSR (language, script, region) tuple to a LanguageIdentifier.

§Examples

use icu::locale::{
    langid,
    subtags::{language, region, script},
    LanguageIdentifier,
};

let lang = language!("en");
let script = script!("Latn");
let region = region!("US");
assert_eq!(
    LanguageIdentifier::from((lang, Some(script), Some(region))),
    langid!("en-Latn-US")
);
§

fn from(lsr: (Language, Option<Script>, Option<Region>)) -> LanguageIdentifier

Converts to this type from the input type.
§

impl From<Language> for LanguageIdentifier

§Examples

use icu::locale::{langid, subtags::language, LanguageIdentifier};

assert_eq!(LanguageIdentifier::from(language!("en")), langid!("en"));
§

fn from(language: Language) -> LanguageIdentifier

Converts to this type from the input type.
§

impl From<LanguageIdentifier> for Locale

§

fn from(id: LanguageIdentifier) -> Locale

Converts to this type from the input type.
§

impl From<Locale> for LanguageIdentifier

§

fn from(loc: Locale) -> LanguageIdentifier

Converts to this type from the input type.
§

impl From<Option<Region>> for LanguageIdentifier

§Examples

use icu::locale::{langid, subtags::region, LanguageIdentifier};

assert_eq!(
    LanguageIdentifier::from(Some(region!("US"))),
    langid!("und-US")
);
§

fn from(region: Option<Region>) -> LanguageIdentifier

Converts to this type from the input type.
§

impl From<Option<Script>> for LanguageIdentifier

§Examples

use icu::locale::{langid, subtags::script, LanguageIdentifier};

assert_eq!(
    LanguageIdentifier::from(Some(script!("latn"))),
    langid!("und-Latn")
);
§

fn from(script: Option<Script>) -> LanguageIdentifier

Converts to this type from the input type.
§

impl FromStr for LanguageIdentifier

§

type Err = ParseError

The associated error which can be returned from parsing.
§

fn from_str( s: &str, ) -> Result<LanguageIdentifier, <LanguageIdentifier as FromStr>::Err>

Parses a string s to return a value of this type. Read more
§

impl Hash for LanguageIdentifier

§

fn hash<__H>(&self, state: &mut __H)
where __H: Hasher,

Feeds this value into the given Hasher. Read more
1.3.0 · source§

fn hash_slice<H>(data: &[Self], state: &mut H)
where H: Hasher, Self: Sized,

Feeds a slice of this type into the given Hasher. Read more
§

impl PartialEq for LanguageIdentifier

§

fn eq(&self, other: &LanguageIdentifier) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
§

impl Serialize for LanguageIdentifier

§

fn serialize<S>( &self, serializer: S, ) -> Result<<S as Serializer>::Ok, <S as Serializer>::Error>
where S: Serializer,

Serialize this value into the given Serde serializer. Read more
§

impl Writeable for LanguageIdentifier

§

fn write_to<W>(&self, sink: &mut W) -> Result<(), Error>
where W: Write + ?Sized,

Writes a string to the given sink. Errors from the sink are bubbled up. The default implementation delegates to write_to_parts, and discards any Part annotations.
§

fn writeable_length_hint(&self) -> LengthHint

Returns a hint for the number of UTF-8 bytes that will be written to the sink. Read more
§

fn write_to_string(&self) -> Cow<'_, str>

Creates a new String with the data from this Writeable. Like ToString, but smaller and faster. Read more
§

fn write_to_parts<S>(&self, sink: &mut S) -> Result<(), Error>
where S: PartsWrite + ?Sized,

Write bytes and Part annotations to the given sink. Errors from the sink are bubbled up. The default implementation delegates to write_to, and doesn’t produce any Part annotations.
§

impl Eq for LanguageIdentifier

§

impl StructuralPartialEq for LanguageIdentifier

Auto Trait Implementations§

Blanket Implementations§

source§

impl<T> Any for T
where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
source§

impl<T> Borrow<T> for T
where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
source§

impl<T> CloneToUninit for T
where T: Clone,

source§

unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst. Read more
source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T, U> Into<U> for T
where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

source§

impl<T> IntoEither for T

source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
source§

impl<T> Serialize for T
where T: Serialize + ?Sized,

source§

fn erased_serialize(&self, serializer: &mut dyn Serializer) -> Result<Ok, Error>

source§

impl<T> ToOwned for T
where T: Clone,

source§

type Owned = T

The resulting type after obtaining ownership.
source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
source§

impl<T> ToString for T
where T: Display + ?Sized,

source§

default fn to_string(&self) -> String

Converts the given value to a String. Read more
source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

source§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
source§

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,

§

impl<T> ErasedDestructor for T
where T: 'static,

§

impl<T> MaybeSendSync for T
where T: Send + Sync,