Class SearchIterator
- java.lang.Object
-
- com.ibm.icu.text.SearchIterator
-
- Direct Known Subclasses:
StringSearch
public abstract class SearchIterator extends Object
SearchIterator is an abstract base class that provides methods to search for a pattern within a text string. Instances of SearchIterator maintain a current position and scan over the target text, returning the indices at which the pattern is matched and the length of each match.
SearchIterator defines a protocol for text searching. Subclasses provide concrete implementations of various search algorithms. For example, StringSearch implements language-sensitive pattern matching based on the comparison rules defined in a RuleBasedCollator object.
Other options for searching include using a BreakIterator to restrict the points at which matches are detected.
SearchIterator provides an API that is similar to that of other text iteration classes such as BreakIterator. Using this class, it is easy to scan through text looking for all occurrences of a given pattern. The following example uses a StringSearch object to find all instances of "fox" in the target string. Any other subclass of SearchIterator can be used in an identical manner.
    String target = "The quick brown fox jumped over the lazy fox";
    String pattern = "fox";
    SearchIterator iter = new StringSearch(pattern, target);
    for (int pos = iter.first(); pos != SearchIterator.DONE; pos = iter.next()) {
        // pos is the start index of the current match in the target text
        System.out.println("Found match at " + pos +
                           ", length is " + iter.getMatchLength());
    }
- Author:
- Laura Werner, synwee
- See Also:
- BreakIterator, RuleBasedCollator
- Status:
- Stable ICU 2.0.
-
-
Nested Class Summary
Nested Classes (modifier and type, class, description):
- static class SearchIterator.ElementComparisonType: Option to control how collation elements are compared.
-
Field Summary
Fields (modifier and type, field, description):
- protected BreakIterator breakIterator: The BreakIterator to define the boundaries of a logical match.
- static int DONE: DONE is returned by previous() and next() after all valid matches have been returned, and by first() and last() if there are no matches at all.
- protected int matchLength: Length of the most recent match in target text.
- protected CharacterIterator targetText: Target text for searching.
-
Constructor Summary
Constructors (modifier, constructor, description):
- protected SearchIterator(CharacterIterator target, BreakIterator breaker): Protected constructor for use by subclasses.
-
Method Summary
Methods (modifier and type, method, description):
- int first(): Returns the first index at which the string text matches the search pattern.
- int following(int position): Returns the first index equal to or greater than position at which the string text matches the search pattern.
- BreakIterator getBreakIterator(): Returns the BreakIterator that is used to restrict the indexes at which matches are detected.
- SearchIterator.ElementComparisonType getElementComparisonType(): Returns the collation element comparison type.
- abstract int getIndex(): Return the current index in the text being searched.
- String getMatchedText(): Returns the text that was matched by the most recent call to first(), next(), previous(), or last().
- int getMatchLength(): Returns the length of text in the string which matches the search pattern.
- int getMatchStart(): Returns the index to the match in the text string that was searched.
- CharacterIterator getTarget(): Return the string text to be searched.
- protected abstract int handleNext(int start): Abstract method which subclasses override to provide the mechanism for finding the next match in the target text.
- protected abstract int handlePrevious(int startAt): Abstract method which subclasses override to provide the mechanism for finding the previous match in the target text.
- boolean isOverlapping(): Return true if the overlapping property has been set.
- int last(): Returns the last index in the target text at which it matches the search pattern.
- int next(): Returns the index of the next point at which the text matches the search pattern, starting from the current position. The iterator is adjusted so that its current index (as returned by getIndex()) is the match position if one was found.
- int preceding(int position): Returns the first index less than position at which the string text matches the search pattern.
- int previous(): Returns the index of the previous point at which the string text matches the search pattern, starting at the current position.
- void reset(): Resets the iteration.
- void setBreakIterator(BreakIterator breakiter): Set the BreakIterator that will be used to restrict the points at which matches are detected.
- void setElementComparisonType(SearchIterator.ElementComparisonType type): Sets the collation element comparison type.
- void setIndex(int position): Sets the position in the target text at which the next search will start.
- protected void setMatchLength(int length): Sets the length of the most recent match in the target text.
- protected void setMatchNotFound(): Deprecated. This API is ICU internal only.
- void setOverlapping(boolean allowOverlap): Determines whether overlapping matches are returned.
- void setTarget(CharacterIterator text): Set the target text to be searched.
-
-
-
Field Detail
-
breakIterator
protected BreakIterator breakIterator
The BreakIterator to define the boundaries of a logical match. This value can be null. See class documentation for more information.
- See Also:
- setBreakIterator(BreakIterator), getBreakIterator(), BreakIterator
- Status:
- Stable ICU 2.0.
-
targetText
protected CharacterIterator targetText
Target text for searching.
- See Also:
- setTarget(CharacterIterator), getTarget()
- Status:
- Stable ICU 2.0.
-
matchLength
protected int matchLength
Length of the most recent match in target text. The default value is 0.
- See Also:
- setMatchLength(int), getMatchLength()
- Status:
- Stable ICU 2.0.
-
DONE
public static final int DONE
DONE is returned by previous() and next() after all valid matches have been returned, and by first() and last() if there are no matches at all.
- See Also:
- previous(), next(), Constant Field Values
- Status:
- Stable ICU 2.0.
-
-
Constructor Detail
-
SearchIterator
protected SearchIterator(CharacterIterator target, BreakIterator breaker)
Protected constructor for use by subclasses. Initializes the iterator with the argument target text for searching and sets the BreakIterator. See class documentation for more details on the use of the target text and BreakIterator.
- Parameters:
- target - The target text to be searched.
- breaker - A BreakIterator that is used to determine the boundaries of a logical match. This argument can be null.
- Throws:
- IllegalArgumentException - thrown when argument target is null, or of length 0
- See Also:
- BreakIterator
- Status:
- Stable ICU 2.0.
-
-
Method Detail
-
setIndex
public void setIndex(int position)
Sets the position in the target text at which the next search will start. This method clears any previous match.
- Parameters:
- position - position from which to start the next search
- Throws:
- IndexOutOfBoundsException - thrown if argument position is out of the target text range.
- See Also:
- getIndex()
- Status:
- Stable ICU 2.8.
-
setOverlapping
public void setOverlapping(boolean allowOverlap)
Determines whether overlapping matches are returned. See the class documentation for more information about overlapping matches.
The default setting of this property is false.
- Parameters:
- allowOverlap - flag indicating whether overlapping matches are allowed
- See Also:
- isOverlapping()
- Status:
- Stable ICU 2.8.
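The difference between the two settings can be sketched without ICU. The example below is a minimal illustration, assuming the usual meaning of "overlapping" (a later match may reuse characters of an earlier one). It uses plain String.indexOf rather than collation-based matching, and the class and method names (OverlapSketch, findAll) are hypothetical, not part of ICU.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not an ICU class: literal matching via String.indexOf.
final class OverlapSketch {

    /**
     * Collects the start index of every match of pattern in target.
     * With allowOverlap the scan resumes one position past the previous
     * match start; otherwise it resumes at the previous match end.
     */
    static List<Integer> findAll(String target, String pattern, boolean allowOverlap) {
        List<Integer> starts = new ArrayList<>();
        if (pattern.isEmpty()) {
            return starts; // an empty pattern would never advance the scan
        }
        int from = 0;
        while (true) {
            int pos = target.indexOf(pattern, from);
            if (pos < 0) {
                break; // no further matches
            }
            starts.add(pos);
            from = allowOverlap ? pos + 1 : pos + pattern.length();
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(findAll("ababa", "aba", false)); // prints [0]
        System.out.println(findAll("ababa", "aba", true));  // prints [0, 2]
    }
}
```

With overlap disabled, the second "aba" (starting at index 2) is skipped because it shares a character with the first match.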
-
setBreakIterator
public void setBreakIterator(BreakIterator breakiter)
Set the BreakIterator that will be used to restrict the points at which matches are detected.
- Parameters:
- breakiter - A BreakIterator that will be used to restrict the points at which matches are detected. If a match is found, but the match's start or end index is not a boundary as determined by the BreakIterator, the match will be rejected and another will be searched for. If this parameter is null, no break detection is attempted.
- See Also:
- BreakIterator
- Status:
- Stable ICU 2.0.
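The rejection rule described for breakiter can be sketched as follows. This is only an illustration of the idea, assuming word boundaries as the restriction: it uses the JDK's java.text.BreakIterator and String.indexOf as stand-ins for the ICU classes, and the names BoundaryFilterSketch and wholeWordMatches are hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in, not an ICU class: keeps only matches whose start
// and end both fall on word boundaries.
final class BoundaryFilterSketch {

    static List<Integer> wholeWordMatches(String target, String pattern) {
        List<Integer> starts = new ArrayList<>();
        if (pattern.isEmpty()) {
            return starts; // nothing meaningful to match
        }
        BreakIterator words = BreakIterator.getWordInstance(Locale.ROOT);
        words.setText(target);
        for (int pos = target.indexOf(pattern); pos >= 0;
                 pos = target.indexOf(pattern, pos + 1)) {
            int end = pos + pattern.length();
            // Reject the candidate unless both edges are boundaries.
            if (words.isBoundary(pos) && words.isBoundary(end)) {
                starts.add(pos);
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // "cat" occurs inside "concatenate" (index 3) and as a word (index 12).
        System.out.println(wholeWordMatches("concatenate cat", "cat")); // prints [12]
    }
}
```

The candidate at index 3 is rejected because neither of its edges is a word boundary, so the search moves on to the next candidate.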
-
setTarget
public void setTarget(CharacterIterator text)
Set the target text to be searched. Text iteration will then begin at the start of the text string. This method is useful if you want to reuse an iterator to search within a different body of text.
- Parameters:
- text - new text iterator to look for match
- Throws:
- IllegalArgumentException - thrown when text is null or has 0 length
- See Also:
- getTarget()
- Status:
- Stable ICU 2.4.
-
getMatchStart
public int getMatchStart()
Returns the index to the match in the text string that was searched. This call returns a valid result only after a successful call to first(), next(), previous(), or last(). Just after construction, or after a searching method returns DONE, this method will return DONE.
Use getMatchLength() to get the matched string length.
- Returns:
- index of a substring within the text string that is being searched.
- See Also:
- first(), next(), previous(), last()
- Status:
- Stable ICU 2.0.
-
getIndex
public abstract int getIndex()
Return the current index in the text being searched. If the iteration has gone past the end of the text (or past the beginning for a backwards search), DONE is returned.
- Returns:
- current index in the text being searched.
- Status:
- Stable ICU 2.8.
-
getMatchLength
public int getMatchLength()
Returns the length of text in the string which matches the search pattern. This call returns a valid result only after a successful call to first(), next(), previous(), or last(). Just after construction, or after a searching method returns DONE, this method will return 0.
- Returns:
- The length of the match in the target text, or 0 if there is no match currently.
- See Also:
- first(), next(), previous(), last()
- Status:
- Stable ICU 2.0.
-
getBreakIterator
public BreakIterator getBreakIterator()
Returns the BreakIterator that is used to restrict the indexes at which matches are detected. This will be the same object that was passed to the constructor or to setBreakIterator(com.ibm.icu.text.BreakIterator). If the BreakIterator has not been set, null will be returned. See setBreakIterator(com.ibm.icu.text.BreakIterator) for more information.
- Returns:
- the BreakIterator set to restrict logical matches
- See Also:
- setBreakIterator(com.ibm.icu.text.BreakIterator), BreakIterator
- Status:
- Stable ICU 2.0.
-
getTarget
public CharacterIterator getTarget()
Return the string text to be searched.
- Returns:
- text string to be searched.
- Status:
- Stable ICU 2.0.
-
getMatchedText
public String getMatchedText()
Returns the text that was matched by the most recent call to first(), next(), previous(), or last(). If the iterator is not pointing at a valid match (e.g. just after construction or after DONE has been returned), returns an empty string.
- Returns:
- the substring in the target text of the most recent match, or null if there is no match currently.
- See Also:
- first(), next(), previous(), last()
- Status:
- Stable ICU 2.0.
-
next
public int next()
Returns the index of the next point at which the text matches the search pattern, starting from the current position. The iterator is adjusted so that its current index (as returned by getIndex()) is the match position if one was found. If a match is not found, DONE will be returned and the iterator will be adjusted to a position after the end of the text string.
- Returns:
- The index of the next match after the current position, or DONE if there are no more matches.
- See Also:
- getIndex()
- Status:
- Stable ICU 2.0.
-
previous
public int previous()
Returns the index of the previous point at which the string text matches the search pattern, starting at the current position. The iterator is adjusted so that its current index (as returned by getIndex()) is the match position if one was found. If a match is not found, DONE will be returned and the iterator will be adjusted to the index DONE.
- Returns:
- The index of the previous match before the current position, or DONE if there are no more matches.
- See Also:
- getIndex()
- Status:
- Stable ICU 2.0.
-
isOverlapping
public boolean isOverlapping()
Return true if the overlapping property has been set. See setOverlapping(boolean) for more information.
- Returns:
- true if the overlapping property has been set, false otherwise
- See Also:
- setOverlapping(boolean)
- Status:
- Stable ICU 2.8.
-
reset
public void reset()
Resets the iteration. If a forward iteration is initiated first after the reset, the search begins at the start of the text string. If a backwards iteration is initiated first, the search begins at the end of the text string.
- Status:
- Stable ICU 2.0.
-
first
public final int first()
Returns the first index at which the string text matches the search pattern. The iterator is adjusted so that its current index (as returned by getIndex()) is the match position if one was found. If a match is not found, DONE will be returned and the iterator will be adjusted to the index DONE.
- Returns:
- The character index of the first match, or DONE if there are no matches.
- See Also:
- getIndex()
- Status:
- Stable ICU 2.0.
-
following
public final int following(int position)
Returns the first index equal to or greater than position at which the string text matches the search pattern. The iterator is adjusted so that its current index (as returned by getIndex()) is the match position if one was found. If a match is not found, DONE will be returned and the iterator will be adjusted to the index DONE.
- Parameters:
- position - where search is to start from.
- Returns:
- The character index of the first match following position, or DONE if there are no matches.
- Throws:
- IndexOutOfBoundsException - If position is outside the text range for searching.
- See Also:
- getIndex()
- Status:
- Stable ICU 2.0.
-
last
public final int last()
Returns the last index in the target text at which it matches the search pattern. The iterator is adjusted so that its current index (as returned by getIndex()) is the match position if one was found. If a match is not found, DONE will be returned and the iterator will be adjusted to the index DONE.
- Returns:
- The index of the last match, or DONE if there are no matches.
- See Also:
- getIndex()
- Status:
- Stable ICU 2.0.
-
preceding
public final int preceding(int position)
Returns the first index less than position at which the string text matches the search pattern. The iterator is adjusted so that its current index (as returned by getIndex()) is the match position if one was found. If a match is not found, DONE will be returned and the iterator will be adjusted to the index DONE.
When the overlapping option (isOverlapping()) is off, the last index of the result match is always less than position. When the overlapping option is on, the result match may span across position.
- Parameters:
- position - where search is to start from.
- Returns:
- The character index of the first match preceding position, or DONE if there are no matches.
- Throws:
- IndexOutOfBoundsException - If position is outside the text range for searching.
- See Also:
- getIndex()
- Status:
- Stable ICU 2.0.
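The effect of the overlapping option on preceding(int) can be sketched with plain string operations. This is an illustration of the rule stated above, not ICU's implementation: it matches literally with String.lastIndexOf, the class PrecedingSketch is hypothetical, and the value -1 for DONE is an assumption of the sketch.

```java
// Hypothetical stand-in, not an ICU class.
final class PrecedingSketch {
    static final int DONE = -1; // assumed stand-in for SearchIterator.DONE

    /**
     * Returns the start of the nearest match before position, or DONE.
     * Without overlap the whole match must end before position; with
     * overlap only the match start must be less than position.
     */
    static int preceding(String target, String pattern, int position, boolean allowOverlap) {
        if (position < 0 || position > target.length()) {
            throw new IndexOutOfBoundsException("position out of range: " + position);
        }
        if (pattern.isEmpty()) {
            return DONE;
        }
        int latestStart = allowOverlap ? position - 1 : position - pattern.length();
        // lastIndexOf returns -1 (DONE here) when nothing is found or latestStart < 0.
        return target.lastIndexOf(pattern, latestStart);
    }

    public static void main(String[] args) {
        String target = "fox fox"; // matches start at 0 and 4
        System.out.println(preceding(target, "fox", 5, false)); // prints 0
        System.out.println(preceding(target, "fox", 5, true));  // prints 4
    }
}
```

In the second call the match occupies indices 4 to 6 and therefore spans position 5, which is only permitted when overlap is on.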
-
setMatchLength
protected void setMatchLength(int length)
Sets the length of the most recent match in the target text. Subclasses' handleNext() and handlePrevious() methods should call this after they find a match in the target text.
- Parameters:
- length - new length to set
- See Also:
- handleNext(int), handlePrevious(int)
- Status:
- Stable ICU 2.0.
-
handleNext
protected abstract int handleNext(int start)
Abstract method which subclasses override to provide the mechanism for finding the next match in the target text. This allows different subclasses to provide different search algorithms.
If a match is found, the implementation should return the index at which the match starts and should call setMatchLength(int) with the number of characters in the target text that make up the match. If no match is found, the method should return DONE.
- Parameters:
- start - The index in the target text at which the search should start.
- Returns:
- index at which the match starts, or DONE if no match is found
- See Also:
- setMatchLength(int)
- Status:
- Stable ICU 2.0.
-
handlePrevious
protected abstract int handlePrevious(int startAt)
Abstract method which subclasses override to provide the mechanism for finding the previous match in the target text. This allows different subclasses to provide different search algorithms.
If a match is found, the implementation should return the index at which the match starts and should call setMatchLength(int) with the number of characters in the target text that make up the match. If no match is found, the method should return DONE.
- Parameters:
- startAt - The index in the target text at which the search should start.
- Returns:
- index at which the match starts, or DONE if no match is found
- See Also:
- setMatchLength(int)
- Status:
- Stable ICU 2.0.
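The contract shared by handleNext(int) and handlePrevious(int) (return the match start and record its length, or return DONE) can be sketched in a self-contained class. This is a simplified stand-in that does not extend the ICU SearchIterator: LiteralSearchSketch is a hypothetical name, matching is literal via String.indexOf and String.lastIndexOf, DONE is assumed to be -1, and resetting the length to 0 on failure is a choice made by the sketch.

```java
// Hypothetical stand-in that mirrors the handleNext/handlePrevious contract.
final class LiteralSearchSketch {
    static final int DONE = -1; // assumed stand-in for SearchIterator.DONE

    private final String pattern;
    private final String target;
    private int matchLength; // 0 while there is no current match

    LiteralSearchSketch(String pattern, String target) {
        this.pattern = pattern;
        this.target = target;
    }

    /** Finds the first match starting at or after start. */
    int handleNext(int start) {
        int pos = pattern.isEmpty() ? DONE : target.indexOf(pattern, start);
        setMatchLength(pos == DONE ? 0 : pattern.length());
        return pos;
    }

    /** Finds the last match that ends at or before startAt. */
    int handlePrevious(int startAt) {
        int pos = pattern.isEmpty()
                ? DONE
                : target.lastIndexOf(pattern, startAt - pattern.length());
        setMatchLength(pos == DONE ? 0 : pattern.length());
        return pos;
    }

    void setMatchLength(int length) { matchLength = length; }

    int getMatchLength() { return matchLength; }

    public static void main(String[] args) {
        LiteralSearchSketch s = new LiteralSearchSketch(
                "fox", "The quick brown fox jumped over the lazy fox");
        System.out.println(s.handleNext(0));      // prints 16
        System.out.println(s.getMatchLength());   // prints 3
        System.out.println(s.handlePrevious(44)); // prints 41
        System.out.println(s.handleNext(42));     // prints -1
    }
}
```

A real subclass would additionally implement getIndex() and pass the target text and BreakIterator to the protected constructor.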
-
setMatchNotFound
@Deprecated protected void setMatchNotFound()
Deprecated. This API is ICU internal only.
- Status:
- Internal. This API is ICU internal only.
-
setElementComparisonType
public void setElementComparisonType(SearchIterator.ElementComparisonType type)
Sets the collation element comparison type.
The default comparison type is SearchIterator.ElementComparisonType.STANDARD_ELEMENT_COMPARISON.
- See Also:
- SearchIterator.ElementComparisonType, getElementComparisonType()
- Status:
- Stable ICU 53.
-
getElementComparisonType
public SearchIterator.ElementComparisonType getElementComparisonType()
Returns the collation element comparison type.
- See Also:
- SearchIterator.ElementComparisonType, setElementComparisonType(ElementComparisonType)
- Status:
- Stable ICU 53.
-
-