Package com.ibm.icu.text
Class UnicodeSetSpanner
- java.lang.Object
-
- com.ibm.icu.text.UnicodeSetSpanner
-
public class UnicodeSetSpanner extends Object
A helper class used to count, replace, and trim CharSequences based on UnicodeSet matches. An instance is immutable (and thus thread-safe) iff the source UnicodeSet is frozen.
Note: The counting, deletion, and replacement depend on alternating a UnicodeSet.SpanCondition with its inverse. That is, the code spans, then spans for the inverse, then spans, and so on. For the inverse, the following mapping is used:
- UnicodeSet.SpanCondition.SIMPLE → UnicodeSet.SpanCondition.NOT_CONTAINED
- UnicodeSet.SpanCondition.CONTAINED → UnicodeSet.SpanCondition.NOT_CONTAINED
- UnicodeSet.SpanCondition.NOT_CONTAINED → UnicodeSet.SpanCondition.SIMPLE
Spans found by each condition (brackets mark the spanned text):
- SIMPLE: xxx[ab]cyyy
- CONTAINED: xxx[abc]yyy
- NOT_CONTAINED: [xxx]ab[cyyy]
So here is what happens when you alternate (| marks the current position):
- start: |xxxabcyyy
- NOT_CONTAINED: xxx|abcyyy
- CONTAINED: xxxabc|yyy
- NOT_CONTAINED: xxxabcyyy|
The entire string is traversed.
- Status:
- Stable ICU 54.
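The alternation described in the class note can be modeled without ICU. The sketch below is an illustration only, not the ICU implementation: it assumes a set made of single characters (given as the hypothetical `members` string), so it ignores multi-character strings and supplementary code points, and the class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of alternating spans over single chars; not the ICU implementation. */
class AlternatingSpanSketch {
    /** Returns the end of the longest run starting at start whose chars all have membership == wantMember. */
    static int span(CharSequence s, int start, String members, boolean wantMember) {
        int i = start;
        while (i < s.length() && (members.indexOf(s.charAt(i)) >= 0) == wantMember) {
            i++;
        }
        return i;
    }

    /** Lists the positions reached when alternating, starting with the non-matching condition. */
    static List<Integer> boundaries(CharSequence s, String members) {
        List<Integer> result = new ArrayList<>();
        int pos = 0;
        boolean wantMember = false; // models spanning NOT_CONTAINED first
        result.add(pos);
        while (pos < s.length()) {
            pos = span(s, pos, members, wantMember);
            result.add(pos);
            wantMember = !wantMember; // switch to the inverse condition
        }
        return result;
    }

    public static void main(String[] args) {
        // Positions 0, 3, 6, 9 correspond to |xxxabcyyy, xxx|abcyyy, xxxabc|yyy, xxxabcyyy|
        System.out.println(boundaries("xxxabcyyy", "abc")); // prints [0, 3, 6, 9]
    }
}
```

Because each span starts exactly where the previous one ended, no character is skipped and the loop always reaches the end of the string.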
-
-
Nested Class Summary
Nested Classes
- static class UnicodeSetSpanner.CountMethod: Options for replaceFrom and countIn to control how to treat each matched span.
- static class UnicodeSetSpanner.TrimOption: Options for the trim() method.
-
Constructor Summary
Constructors
- UnicodeSetSpanner(UnicodeSet source): Create a spanner from a UnicodeSet.
-
Method Summary
Methods
- int countIn(CharSequence sequence): Returns the number of matching characters found in a character sequence, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE.
- int countIn(CharSequence sequence, UnicodeSetSpanner.CountMethod countMethod): Returns the number of matching characters found in a character sequence, using SpanCondition.SIMPLE.
- int countIn(CharSequence sequence, UnicodeSetSpanner.CountMethod countMethod, UnicodeSet.SpanCondition spanCondition): Returns the number of matching characters found in a character sequence.
- String deleteFrom(CharSequence sequence): Delete all the matching spans in sequence, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- String deleteFrom(CharSequence sequence, UnicodeSet.SpanCondition spanCondition): Delete all matching spans in sequence, according to the spanCondition.
- boolean equals(Object other)
- UnicodeSet getUnicodeSet(): Returns the UnicodeSet used for processing.
- int hashCode()
- String replaceFrom(CharSequence sequence, CharSequence replacement): Replace all matching spans in sequence by the replacement, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE.
- String replaceFrom(CharSequence sequence, CharSequence replacement, UnicodeSetSpanner.CountMethod countMethod): Replace all matching spans in sequence by replacement, according to the CountMethod, using SpanCondition.SIMPLE.
- String replaceFrom(CharSequence sequence, CharSequence replacement, UnicodeSetSpanner.CountMethod countMethod, UnicodeSet.SpanCondition spanCondition): Replace all matching spans in sequence by replacement, according to the countMethod and spanCondition.
- CharSequence trim(CharSequence sequence): Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start and end of the string, using TrimOption.BOTH and SpanCondition.SIMPLE.
- CharSequence trim(CharSequence sequence, UnicodeSetSpanner.TrimOption trimOption): Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, using the trimOption and SpanCondition.SIMPLE.
- CharSequence trim(CharSequence sequence, UnicodeSetSpanner.TrimOption trimOption, UnicodeSet.SpanCondition spanCondition): Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, depending on the trimOption and spanCondition.
-
-
-
Constructor Detail
-
UnicodeSetSpanner
public UnicodeSetSpanner(UnicodeSet source)
Create a spanner from a UnicodeSet. For speed and safety, the UnicodeSet should be frozen. However, this class can be used with a non-frozen version to avoid the cost of freezing.
- Parameters:
- source - the original UnicodeSet
- Status:
- Stable ICU 54.
-
-
Method Detail
-
getUnicodeSet
public UnicodeSet getUnicodeSet()
Returns the UnicodeSet used for processing. It is frozen iff the original was.
- Returns:
- the construction set.
- Status:
- Stable ICU 54.
-
countIn
public int countIn(CharSequence sequence)
Returns the number of matching characters found in a character sequence, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
- sequence - the sequence to count characters in
- Returns:
- the count. Zero if there are none.
- Status:
- Stable ICU 54.
-
countIn
public int countIn(CharSequence sequence, UnicodeSetSpanner.CountMethod countMethod)
Returns the number of matching characters found in a character sequence, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
- sequence - the sequence to count characters in
- countMethod - whether to treat an entire span as a match, or individual elements as matches
- Returns:
- the count. Zero if there are none.
- Status:
- Stable ICU 54.
-
countIn
public int countIn(CharSequence sequence, UnicodeSetSpanner.CountMethod countMethod, UnicodeSet.SpanCondition spanCondition)
Returns the number of matching characters found in a character sequence. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
- sequence - the sequence to count characters in
- countMethod - whether to treat an entire span as a match, or individual elements as matches
- spanCondition - the spanCondition to use. SIMPLE or CONTAINED means only count the elements in the span; NOT_CONTAINED is the reverse.
WARNING: when a UnicodeSet contains strings, there may be unexpected behavior in edge cases.
- Returns:
- the count. Zero if there are none.
- Status:
- Stable ICU 54.
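The difference between counting whole spans and counting individual elements can be sketched in plain Java. This is a simplified model under stated assumptions, not ICU code: the set is a hypothetical `members` string of single characters, and the local `Mode` enum is only a stand-in for the countMethod parameter.

```java
/** Simplified model of span counting over single chars; not the ICU implementation. */
class CountSketch {
    /** Local stand-in for the countMethod choice: one match per span, or one per element. */
    enum Mode { WHOLE_SPAN, MIN_ELEMENTS }

    static boolean isMember(String members, char c) {
        return members.indexOf(c) >= 0;
    }

    static int countIn(CharSequence s, String members, Mode mode) {
        int count = 0;
        int i = 0;
        while (i < s.length()) {
            // Skip the non-matching run (the inverse span).
            while (i < s.length() && !isMember(members, s.charAt(i))) {
                i++;
            }
            // Measure the matching run.
            int spanStart = i;
            while (i < s.length() && isMember(members, s.charAt(i))) {
                i++;
            }
            int spanLength = i - spanStart;
            if (spanLength > 0) {
                count += (mode == Mode.WHOLE_SPAN) ? 1 : spanLength;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Matching runs in "abacatbab" for members "ab": "aba", "a", "bab"
        System.out.println(countIn("abacatbab", "ab", Mode.WHOLE_SPAN));   // prints 3
        System.out.println(countIn("abacatbab", "ab", Mode.MIN_ELEMENTS)); // prints 7
    }
}
```

With a set that also contains multi-character strings, element counts depend on how the set matches those strings, which this model does not attempt to reproduce.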
-
deleteFrom
public String deleteFrom(CharSequence sequence)
Delete all the matching spans in sequence, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
- sequence - the character sequence to delete matching spans from.
- Returns:
- modified string.
- Status:
- Stable ICU 54.
-
deleteFrom
public String deleteFrom(CharSequence sequence, UnicodeSet.SpanCondition spanCondition)
Delete all matching spans in sequence, according to the spanCondition. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
- sequence - the character sequence to delete matching spans from.
- spanCondition - specify whether to modify the matching spans (CONTAINED or SIMPLE) or the non-matching (NOT_CONTAINED)
- Returns:
- modified string.
- Status:
- Stable ICU 54.
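The effect of the spanCondition on deletion can be illustrated with a small stdlib-only model. It is a sketch, not the ICU implementation: it assumes a set of single characters (the hypothetical `members` string), in which case deleting spans reduces to filtering characters, and the boolean `deleteMatching` stands in for choosing SIMPLE or CONTAINED (true) versus NOT_CONTAINED (false).

```java
/** Simplified model of span deletion over single chars; not the ICU implementation. */
class DeleteSketch {
    /**
     * Deletes runs of member chars when deleteMatching is true,
     * or runs of non-member chars when it is false.
     */
    static String deleteFrom(CharSequence s, String members, boolean deleteMatching) {
        StringBuilder kept = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean isMember = members.indexOf(c) >= 0;
            // Keep the char only if it belongs to the runs that are not being deleted.
            if (isMember != deleteMatching) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        System.out.println(deleteFrom("abacatbab", "ab", true));  // prints ct
        System.out.println(deleteFrom("abacatbab", "ab", false)); // prints abaabab
    }
}
```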
-
replaceFrom
public String replaceFrom(CharSequence sequence, CharSequence replacement)
Replace all matching spans in sequence by the replacement, counting by CountMethod.MIN_ELEMENTS using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
- sequence - the character sequence to replace matching spans in.
- replacement - replacement sequence. To delete, use ""
- Returns:
- modified string.
- Status:
- Stable ICU 54.
-
replaceFrom
public String replaceFrom(CharSequence sequence, CharSequence replacement, UnicodeSetSpanner.CountMethod countMethod)
Replace all matching spans in sequence by replacement, according to the CountMethod, using SpanCondition.SIMPLE. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
- sequence - the character sequence to replace matching spans in.
- replacement - replacement sequence. To delete, use ""
- countMethod - whether to treat an entire span as a match, or individual elements as matches
- Returns:
- modified string.
- Status:
- Stable ICU 54.
-
replaceFrom
public String replaceFrom(CharSequence sequence, CharSequence replacement, UnicodeSetSpanner.CountMethod countMethod, UnicodeSet.SpanCondition spanCondition)
Replace all matching spans in sequence by replacement, according to the countMethod and spanCondition. The code alternates spans; see the class doc for UnicodeSetSpanner for a note about boundary conditions.
- Parameters:
- sequence - the character sequence to replace matching spans in.
- replacement - replacement sequence. To delete, use ""
- countMethod - whether to treat an entire span as a match, or individual elements as matches
- spanCondition - specify whether to modify the matching spans (CONTAINED or SIMPLE) or the non-matching (NOT_CONTAINED)
- Returns:
- modified string.
- Status:
- Stable ICU 54.
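How the countMethod changes the result of a replacement can be seen in a stdlib-only model. The sketch below is not the ICU implementation; it assumes a set of single characters (the hypothetical `members` string), and the boolean `wholeSpan` stands in for choosing between one replacement per span and one replacement per matched element.

```java
/** Simplified model of span replacement over single chars; not the ICU implementation. */
class ReplaceSketch {
    static String replaceFrom(CharSequence s, CharSequence replacement,
                              String members, boolean wholeSpan) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            // Span the matching run and emit the replacement for it.
            int spanStart = i;
            while (i < s.length() && members.indexOf(s.charAt(i)) >= 0) {
                i++;
            }
            int matched = i - spanStart;
            if (matched > 0) {
                int copies = wholeSpan ? 1 : matched;
                for (int k = 0; k < copies; k++) {
                    out.append(replacement);
                }
            }
            // Span the inverse (non-matching) run and copy it unchanged.
            int copyStart = i;
            while (i < s.length() && members.indexOf(s.charAt(i)) < 0) {
                i++;
            }
            out.append(s, copyStart, i);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceFrom("abacatbab", "-", "ab", true));  // prints -c-t-
        System.out.println(replaceFrom("abacatbab", "-", "ab", false)); // prints ---c-t---
        System.out.println(replaceFrom("abacatbab", "", "ab", false));  // prints ct
    }
}
```

Passing an empty replacement gives the same result as deletion, matching the "To delete, use \"\"" note in the parameter description.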
-
trim
public CharSequence trim(CharSequence sequence)
Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start and end of the string, using TrimOption.BOTH and SpanCondition.SIMPLE. For example:
new UnicodeSetSpanner(new UnicodeSet("[ab]")).trim("abacatbab")
... returns "cat".
- Parameters:
- sequence - the sequence to trim
- Returns:
- a subsequence
- Status:
- Stable ICU 54.
-
trim
public CharSequence trim(CharSequence sequence, UnicodeSetSpanner.TrimOption trimOption)
Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, using the trimOption and SpanCondition.SIMPLE. For example:
new UnicodeSetSpanner(new UnicodeSet("[ab]")).trim("abacatbab", TrimOption.LEADING)
... returns "catbab".
- Parameters:
- sequence - the sequence to trim
- trimOption - LEADING, TRAILING, or BOTH
- Returns:
- a subsequence
- Status:
- Stable ICU 54.
-
trim
public CharSequence trim(CharSequence sequence, UnicodeSetSpanner.TrimOption trimOption, UnicodeSet.SpanCondition spanCondition)
Returns a trimmed sequence (using CharSequence.subSequence()) that omits matching elements at the start or end of the string, depending on the trimOption and spanCondition. For example:
new UnicodeSetSpanner(new UnicodeSet("[ab]")).trim("abacatbab", TrimOption.LEADING, SpanCondition.SIMPLE)
... returns "catbab".
- Parameters:
- sequence - the sequence to trim
- trimOption - LEADING, TRAILING, or BOTH
- spanCondition - SIMPLE, CONTAINED or NOT_CONTAINED
- Returns:
- a subsequence
- Status:
- Stable ICU 54.
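The three trim options can be modeled in plain Java to check the documented results. This is a sketch under stated assumptions rather than the ICU implementation: the set is a hypothetical `members` string of single characters, and the local `Side` enum is a stand-in for TrimOption.

```java
/** Simplified model of trimming over single chars; not the ICU implementation. */
class TrimSketch {
    /** Local stand-in for the LEADING, TRAILING, and BOTH trim options. */
    enum Side { LEADING, TRAILING, BOTH }

    static CharSequence trim(CharSequence s, String members, Side side) {
        int start = 0;
        int end = s.length();
        if (side != Side.TRAILING) {
            // Span forward over matching chars at the start.
            while (start < end && members.indexOf(s.charAt(start)) >= 0) {
                start++;
            }
        }
        if (side != Side.LEADING) {
            // Span backward over matching chars at the end.
            while (end > start && members.indexOf(s.charAt(end - 1)) >= 0) {
                end--;
            }
        }
        return s.subSequence(start, end);
    }

    public static void main(String[] args) {
        System.out.println(trim("abacatbab", "ab", Side.BOTH));     // prints cat
        System.out.println(trim("abacatbab", "ab", Side.LEADING));  // prints catbab
        System.out.println(trim("abacatbab", "ab", Side.TRAILING)); // prints abacat
    }
}
```

The matching "a" inside "cat" is left alone because trimming only spans inward from the ends and stops at the first non-matching character.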
-
-