ICU 76.1
This class allows one to iterate through all the strings that are canonically equivalent to a given string.
#include <caniter.h>
Public Member Functions | |
CanonicalIterator (const UnicodeString &source, UErrorCode &status) | |
Construct a CanonicalIterator object. | |
virtual | ~CanonicalIterator () |
Destructor. Cleans pieces. | |
UnicodeString | getSource () |
Gets the NFD form of the current source we are iterating over. | |
void | reset () |
Resets the iterator so that one can start again from the beginning. | |
UnicodeString | next () |
Get the next canonically equivalent string. | |
void | setSource (const UnicodeString &newSource, UErrorCode &status) |
Set a new source for this iterator. | |
virtual UClassID | getDynamicClassID () const override |
ICU "poor man's RTTI", returns a UClassID for the actual class. | |
Public Member Functions inherited from icu::UObject | |
virtual | ~UObject () |
Destructor. | |
Static Public Member Functions | |
static void | permute (UnicodeString &source, UBool skipZeros, Hashtable *result, UErrorCode &status, int32_t depth=0) |
Dumb recursive implementation of permutation. | |
static UClassID | getStaticClassID () |
ICU "poor man's RTTI", returns a UClassID for this class. | |
This class allows one to iterate through all the strings that are canonically equivalent to a given string.
For example, here are some sample results:
Results for: {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}
1: \u0041\u030A\u0064\u0307\u0327 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}
2: \u0041\u030A\u0064\u0327\u0307 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D}{COMBINING CEDILLA}{COMBINING DOT ABOVE}
3: \u0041\u030A\u1E0B\u0327 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D WITH DOT ABOVE}{COMBINING CEDILLA}
4: \u0041\u030A\u1E11\u0307 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D WITH CEDILLA}{COMBINING DOT ABOVE}
5: \u00C5\u0064\u0307\u0327 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}
6: \u00C5\u0064\u0327\u0307 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D}{COMBINING CEDILLA}{COMBINING DOT ABOVE}
7: \u00C5\u1E0B\u0327 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D WITH DOT ABOVE}{COMBINING CEDILLA}
8: \u00C5\u1E11\u0307 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D WITH CEDILLA}{COMBINING DOT ABOVE}
9: \u212B\u0064\u0307\u0327 = {ANGSTROM SIGN}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}
10: \u212B\u0064\u0327\u0307 = {ANGSTROM SIGN}{LATIN SMALL LETTER D}{COMBINING CEDILLA}{COMBINING DOT ABOVE}
11: \u212B\u1E0B\u0327 = {ANGSTROM SIGN}{LATIN SMALL LETTER D WITH DOT ABOVE}{COMBINING CEDILLA}
12: \u212B\u1E11\u0307 = {ANGSTROM SIGN}{LATIN SMALL LETTER D WITH CEDILLA}{COMBINING DOT ABOVE}
Note: This code is intended for use with small strings. It has not been optimized for larger ones and is not suitable for them.
Note: CanonicalIterator is not intended to be subclassed.
icu::CanonicalIterator::CanonicalIterator | ( | const UnicodeString & | source, |
UErrorCode & | status | ||
) |
Construct a CanonicalIterator object.
source | string to get results for |
status | Fill-in parameter which receives the status of this operation. |
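virtual icu::CanonicalIterator::~CanonicalIterator | ( | ) |
Destructor. Cleans pieces.
virtual UClassID icu::CanonicalIterator::getDynamicClassID | ( | ) | const override |
ICU "poor man's RTTI", returns a UClassID for the actual class.
Reimplemented from icu::UObject.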
UnicodeString icu::CanonicalIterator::getSource | ( | ) |
Gets the NFD form of the current source we are iterating over.
static UClassID icu::CanonicalIterator::getStaticClassID | ( | ) |
ICU "poor man's RTTI", returns a UClassID for this class.
UnicodeString icu::CanonicalIterator::next | ( | ) |
Get the next canonically equivalent string.
Warning: The strings are not guaranteed to be in any particular order.
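static void icu::CanonicalIterator::permute | ( | UnicodeString & | source, |
UBool | skipZeros, | ||
Hashtable * | result, | ||
UErrorCode & | status, | ||
int32_t | depth = 0 | ||
) |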
Dumb recursive implementation of permutation.
TODO: optimize
source | the string to find permutations for |
skipZeros | whether to skip zeros (characters with canonical combining class zero) |
result | the set that receives the results. |
status | Fill-in parameter which receives the status of this operation. |
depth | depth of the call. |
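To make the idea of a recursive permutation concrete, here is a minimal standalone sketch in standard C++. It is not ICU's implementation: the name `permuteSketch` is hypothetical, it uses `std::u32string` and `std::set` in place of UnicodeString and Hashtable, and it leaves out the skipZeros, status, and depth parameters.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical helper (not part of ICU): inserts every distinct ordering of
// the code points in `source` into `result`.
void permuteSketch(const std::u32string &source, std::set<std::u32string> &result) {
    // A string of zero or one code points has exactly one ordering.
    if (source.size() <= 1) {
        result.insert(source);
        return;
    }
    for (std::size_t i = 0; i < source.size(); ++i) {
        // Remove the i-th code point and permute the remainder recursively.
        std::u32string rest = source.substr(0, i) + source.substr(i + 1);
        std::set<std::u32string> sub;
        permuteSketch(rest, sub);
        // Prefix each permutation of the remainder with the removed code point.
        for (const std::u32string &tail : sub) {
            result.insert(source[i] + tail);
        }
    }
}
```

The number of orderings grows factorially with the input length, which is consistent with the note that this class is meant for small strings.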
void icu::CanonicalIterator::reset | ( | ) |
Resets the iterator so that one can start again from the beginning.
void icu::CanonicalIterator::setSource | ( | const UnicodeString & | newSource, |
UErrorCode & | status | ||
) |
Set a new source for this iterator.
Allows object reuse.
newSource | the source string to iterate against. This allows the same iterator to be used while changing the source string, saving object creation. |
status | Fill-in parameter which receives the status of this operation. |