public final class Edits extends Object
There are two types of edits: change edits and no-change edits. Add edits to instances of this class using addReplace(int, int) (for change edits) and addUnchanged(int) (for no-change edits). Change edits are retained with full granularity, whereas adjacent no-change edits are always merged together. In no-change edits, there is a one-to-one mapping between code points in the source and destination strings.
After all edits have been added, instances of this class should be considered immutable, and an Edits.Iterator can be used for queries.
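The recording rules above can be illustrated with a small stand-in model. This is a minimal sketch, not ICU's implementation: the class `EditListSketch`, its internal record layout, and the `recordCount()` helper are hypothetical names introduced only for this example, and the handling of zero and negative lengths is an assumption of the sketch. It shows that adjacent no-change edits collapse into one record while every change edit is kept separately.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; this is not ICU's Edits implementation.
final class EditListSketch {
    // Each record is {1 for change or 0 for no-change, oldLength, newLength}.
    private final List<int[]> records = new ArrayList<>();

    // Records a no-change edit; a preceding no-change record is extended instead.
    void addUnchanged(int unchangedLength) {
        if (unchangedLength < 0) {
            throw new IllegalArgumentException("negative length");
        }
        if (unchangedLength == 0) {
            return; // nothing to record (assumption of this sketch)
        }
        int last = records.size() - 1;
        if (last >= 0 && records.get(last)[0] == 0) {
            records.get(last)[1] += unchangedLength;
            records.get(last)[2] += unchangedLength;
        } else {
            records.add(new int[] {0, unchangedLength, unchangedLength});
        }
    }

    // Records a change edit; change records are never merged in this model.
    void addReplace(int oldLength, int newLength) {
        if (oldLength < 0 || newLength < 0) {
            throw new IllegalArgumentException("negative length");
        }
        if (oldLength == 0 && newLength == 0) {
            return; // nothing to record (assumption of this sketch)
        }
        records.add(new int[] {1, oldLength, newLength});
    }

    int recordCount() {
        return records.size();
    }

    int numberOfChanges() {
        int count = 0;
        for (int[] r : records) {
            count += r[0];
        }
        return count;
    }

    boolean hasChanges() {
        return numberOfChanges() > 0;
    }

    // Destination length minus source length, summed over all records.
    int lengthDelta() {
        int delta = 0;
        for (int[] r : records) {
            delta += r[2] - r[1];
        }
        return delta;
    }

    public static void main(String[] args) {
        EditListSketch edits = new EditListSketch();
        edits.addUnchanged(2);  // "ab"
        edits.addUnchanged(1);  // "c", merged into the record above
        edits.addReplace(1, 2); // U+00DF (sharp s) -> "ss"
        edits.addReplace(1, 1); // "D" -> "d"
        edits.addUnchanged(1);  // "e"
        edits.addReplace(1, 1); // "F" -> "f"
        System.out.println(edits.recordCount());     // 5
        System.out.println(edits.numberOfChanges()); // 3
        System.out.println(edits.lengthDelta());     // 1
    }
}
```

In the model, the two `addUnchanged` calls at the start produce a single record of length 3, whereas the two consecutive `addReplace` calls remain distinct records.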
There are four flavors of Edits.Iterator:
- getFineIterator() retains full granularity of change edits.
- getFineChangesIterator() retains full granularity of change edits, and when calling next() on the iterator, skips over no-change edits (unchanged regions).
- getCoarseIterator() treats adjacent change edits as a single edit. (Adjacent no-change edits are automatically merged during the construction phase.)
- getCoarseChangesIterator() treats adjacent change edits as a single edit, and when calling next() on the iterator, skips over no-change edits (unchanged regions).
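For example, consider the string "abcßDeF", which case-folds to "abcssdef". This string has the following fine edits:
- abc ⇨ abc (no-change)
- ß ⇨ ss (change)
- D ⇨ d (change)
- e ⇨ e (no-change)
- F ⇨ f (change)

and the following coarse edits (note how adjacent change edits are merged together):
- abc ⇨ abc (no-change)
- ßD ⇨ ssd (change)
- e ⇨ e (no-change)
- F ⇨ f (change)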
The "fine changes" and "coarse changes" iterators step through only the change edits when their Edits.Iterator.next() methods are called. They behave identically to the fine and coarse iterators that include no-change edits when their Edits.Iterator.findSourceIndex(int) or Edits.Iterator.findDestinationIndex(int) methods are used to walk through the string.
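The difference between the four iterator flavors can be modeled on the "abcßDeF" example. The following is a simplified sketch under stated assumptions: `GranularitySketch`, its nested `Edit` type, and the `coarsen`, `describe`, and `exampleFineEdits` helpers are hypothetical names for this illustration and do not reflect how Edits.Iterator is implemented. It covers only what `next()` would visit, not the `findSourceIndex`/`findDestinationIndex` behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration; this is not ICU's Edits.Iterator.
final class GranularitySketch {
    static final class Edit {
        final boolean changed;
        final int oldLength;
        final int newLength;

        Edit(boolean changed, int oldLength, int newLength) {
            this.changed = changed;
            this.oldLength = oldLength;
            this.newLength = newLength;
        }
    }

    // Coarse view: each run of adjacent change edits collapses into one change edit.
    static List<Edit> coarsen(List<Edit> fine) {
        List<Edit> out = new ArrayList<>();
        for (Edit e : fine) {
            int last = out.size() - 1;
            if (e.changed && last >= 0 && out.get(last).changed) {
                Edit p = out.get(last);
                out.set(last, new Edit(true, p.oldLength + e.oldLength, p.newLength + e.newLength));
            } else {
                out.add(e);
            }
        }
        return out;
    }

    // Renders edits as "source -> destination (kind)". With onlyChanges set,
    // no-change edits are not listed but still advance both string positions.
    static List<String> describe(List<Edit> edits, boolean onlyChanges, String src, String dest) {
        List<String> out = new ArrayList<>();
        int s = 0;
        int d = 0;
        for (Edit e : edits) {
            if (e.changed || !onlyChanges) {
                out.add(src.substring(s, s + e.oldLength) + " -> "
                        + dest.substring(d, d + e.newLength)
                        + (e.changed ? " (change)" : " (no-change)"));
            }
            s += e.oldLength;
            d += e.newLength;
        }
        return out;
    }

    // Fine edits for case-folding "abc\u00DFDeF" to "abcssdef".
    static List<Edit> exampleFineEdits() {
        List<Edit> fine = new ArrayList<>();
        fine.add(new Edit(false, 3, 3)); // abc
        fine.add(new Edit(true, 1, 2));  // U+00DF (sharp s) -> ss
        fine.add(new Edit(true, 1, 1));  // D -> d
        fine.add(new Edit(false, 1, 1)); // e
        fine.add(new Edit(true, 1, 1));  // F -> f
        return fine;
    }

    public static void main(String[] args) {
        String src = "abc\u00DFDeF";
        String dest = "abcssdef";
        List<Edit> fine = exampleFineEdits();
        List<Edit> coarse = coarsen(fine);
        System.out.println("fine:           " + describe(fine, false, src, dest));
        System.out.println("fine changes:   " + describe(fine, true, src, dest));
        System.out.println("coarse:         " + describe(coarse, false, src, dest));
        System.out.println("coarse changes: " + describe(coarse, true, src, dest));
    }
}
```

In this model the fine view has five entries and the coarse view four, and the two "changes" views list three and two change edits respectively.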
For examples of how to use this class, see the test TestCaseMapEditsIteratorDocs in UCharacterCaseTest.java.
Modifier and Type | Class and Description |
---|---|
static class | Edits.Iterator Access to the list of edits. |
Constructor and Description |
---|
Edits() Constructs an empty object. |
Modifier and Type | Method and Description |
---|---|
void | addReplace(int oldLength, int newLength) Adds a change edit: a record for a text replacement/insertion/deletion. |
void | addUnchanged(int unchangedLength) Adds a no-change edit: a record for an unchanged segment of text. |
Edits.Iterator | getCoarseChangesIterator() Returns an Iterator for coarse-grained change edits (adjacent change edits are treated as one). |
Edits.Iterator | getCoarseIterator() Returns an Iterator for coarse-grained change and no-change edits (adjacent change edits are treated as one). |
Edits.Iterator | getFineChangesIterator() Returns an Iterator for fine-grained change edits (full granularity of change edits is retained). |
Edits.Iterator | getFineIterator() Returns an Iterator for fine-grained change and no-change edits (full granularity of change edits is retained). |
boolean | hasChanges() |
int | lengthDelta() How much longer is the new text compared with the old text? |
Edits | mergeAndAppend(Edits ab, Edits bc) Merges the two input Edits and appends the result to this object. |
int | numberOfChanges() |
void | reset() Resets the data but may not release memory. |
public void reset()
public void addUnchanged(int unchangedLength)
public void addReplace(int oldLength, int newLength)
public int lengthDelta()
public boolean hasChanges()
public int numberOfChanges()
public Edits.Iterator getCoarseChangesIterator()
public Edits.Iterator getCoarseIterator()
public Edits.Iterator getFineChangesIterator()
public Edits.Iterator getFineIterator()
public Edits mergeAndAppend(Edits ab, Edits bc)
Consider two string transformations (for example, normalization and case mapping) where each records Edits in addition to writing an output string.
Edits ab reflect how substrings of input string a map to substrings of intermediate string b.
Edits bc reflect how substrings of intermediate string b map to substrings of output string c.
This function merges ab and bc such that the additional edits recorded in this object reflect how substrings of input string a map to substrings of output string c.
If unrelated Edits are passed in where the output string of the first has a different length than the input string of the second, then an IllegalArgumentException is thrown.
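The length requirement described above can be sketched as a standalone check. This is an illustration under stated assumptions, not ICU's merge algorithm: `MergeCheckSketch` and its methods are hypothetical, edits are modeled as plain `{oldLength, newLength}` pairs, and the check compares total lengths up front, which may differ from when and how the real method detects a mismatch. It also shows that, for compatible inputs, the overall length delta from a to c is the sum of the two individual deltas.

```java
// Hypothetical helper for illustration; not part of ICU's API.
final class MergeCheckSketch {
    // Each edit is {oldLength, newLength}; no-change edits have equal lengths.
    static int sourceLength(int[][] edits) {
        int total = 0;
        for (int[] e : edits) {
            total += e[0];
        }
        return total;
    }

    static int destinationLength(int[][] edits) {
        int total = 0;
        for (int[] e : edits) {
            total += e[1];
        }
        return total;
    }

    // Returns length(c) - length(a), rejecting inputs whose intermediate lengths disagree.
    static int composedLengthDelta(int[][] ab, int[][] bc) {
        int bFromAb = destinationLength(ab);
        int bFromBc = sourceLength(bc);
        if (bFromAb != bFromBc) {
            throw new IllegalArgumentException(
                    "ab produces length " + bFromAb + " but bc consumes length " + bFromBc);
        }
        return destinationLength(bc) - sourceLength(ab);
    }

    public static void main(String[] args) {
        int[][] ab = {{3, 3}, {1, 2}, {3, 3}}; // a (7 units) -> b (8 units)
        int[][] bc = {{2, 2}, {6, 3}};         // b (8 units) -> c (5 units)
        System.out.println(composedLengthDelta(ab, bc)); // -2

        int[][] unrelated = {{7, 7}};          // consumes 7 units, but b has 8
        try {
            composedLengthDelta(ab, unrelated);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```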
ab - reflects how substrings of input string a map to substrings of intermediate string b.
bc - reflects how substrings of intermediate string b map to substrings of output string c.

Copyright © 2016 Unicode, Inc. and others.