ICU 78.1
C++ header-only API: C++ iterators over Unicode strings (=UTF-8/16/32 if well-formed).
#include "unicode/utypes.h"
#include <iterator>
#include <string>
#include <string_view>
#include <type_traits>
#include "unicode/utf16.h"
#include "unicode/utf8.h"
#include "unicode/uversion.h"
Typedefs | |
| typedef enum UTFIllFormedBehavior | UTFIllFormedBehavior |
| Some defined behaviors for handling ill-formed Unicode strings. | |
| template<typename Iter > | |
| using | U_HEADER_ONLY_NAMESPACE::prv::iter_value_t = typename std::iterator_traits< Iter >::value_type |
| template<typename Iter > | |
| using | U_HEADER_ONLY_NAMESPACE::prv::iter_difference_t = typename std::iterator_traits< Iter >::difference_type |
Enumerations | |
| enum | UTFIllFormedBehavior { UTF_BEHAVIOR_NEGATIVE , UTF_BEHAVIOR_FFFD , UTF_BEHAVIOR_SURROGATE } |
| Some defined behaviors for handling ill-formed Unicode strings. | |
Functions | |
| template<typename CP32 , UTFIllFormedBehavior behavior, typename UnitIter , typename LimitIter = UnitIter> | |
| auto | U_HEADER_ONLY_NAMESPACE::utfIterator (UnitIter start, UnitIter p, LimitIter limit) |
| UTFIterator factory function for start <= p < limit. | |
| template<typename CP32 , UTFIllFormedBehavior behavior, typename UnitIter , typename LimitIter = UnitIter> | |
| auto | U_HEADER_ONLY_NAMESPACE::utfIterator (UnitIter p, LimitIter limit) |
| UTFIterator factory function for start = p < limit. | |
| template<typename CP32 , UTFIllFormedBehavior behavior, typename UnitIter > | |
| auto | U_HEADER_ONLY_NAMESPACE::utfIterator (UnitIter p) |
| UTFIterator factory function for a start or limit sentinel. | |
| template<typename CP32 , typename UnitIter > | |
| auto | U_HEADER_ONLY_NAMESPACE::unsafeUTFIterator (UnitIter iter) |
| UnsafeUTFIterator factory function. | |
Variables | |
| template<typename Iter > | |
| constexpr bool | U_HEADER_ONLY_NAMESPACE::prv::forward_iterator |
| template<typename Iter > | |
| constexpr bool | U_HEADER_ONLY_NAMESPACE::prv::bidirectional_iterator |
| template<typename Range > | |
| constexpr bool | U_HEADER_ONLY_NAMESPACE::prv::range = range_type<Range>::value |
| template<typename T > | |
| constexpr bool | U_HEADER_ONLY_NAMESPACE::prv::is_basic_string_view_v = is_basic_string_view<T>::value |
| template<typename CP32 , UTFIllFormedBehavior behavior> | |
| constexpr UTFStringCodePointsAdaptor< CP32, behavior > | U_HEADER_ONLY_NAMESPACE::utfStringCodePoints |
| Range adaptor function object returning a UTFStringCodePoints object that represents a "range" of code points in a code unit range, which validates while decoding. | |
| template<typename CP32 > | |
| constexpr UnsafeUTFStringCodePointsAdaptor< CP32 > | U_HEADER_ONLY_NAMESPACE::unsafeUTFStringCodePoints |
| Range adaptor function object returning an UnsafeUTFStringCodePoints object that represents a "range" of code points in a code unit range. | |
C++ header-only API: C++ iterators over Unicode strings (=UTF-8/16/32 if well-formed).
Sample code:
Definition in file utfiterator.h.
using U_HEADER_ONLY_NAMESPACE::prv::iter_difference_t = typename std::iterator_traits<Iter>::difference_type
This API is for internal use only.
Definition at line 203 of file utfiterator.h.
using U_HEADER_ONLY_NAMESPACE::prv::iter_value_t = typename std::iterator_traits<Iter>::value_type
This API is for internal use only.
Definition at line 199 of file utfiterator.h.
| typedef enum UTFIllFormedBehavior UTFIllFormedBehavior |
Some defined behaviors for handling ill-formed Unicode strings.
This is a template parameter for UTFIterator and related classes.
When a validating UTFIterator encounters an ill-formed code unit sequence, CodeUnits.codePoint() returns a value determined by this parameter.
| enum UTFIllFormedBehavior |
Some defined behaviors for handling ill-formed Unicode strings.
This is a template parameter for UTFIterator and related classes.
When a validating UTFIterator encounters an ill-formed code unit sequence, CodeUnits.codePoint() returns a value determined by this parameter.
| Enumerator | |
|---|---|
| UTF_BEHAVIOR_NEGATIVE | Returns a negative value (-1=U_SENTINEL) instead of a code point. If the CP32 template parameter for the relevant classes is an unsigned type, then the negative value becomes 0xffffffff=UINT32_MAX. |
| UTF_BEHAVIOR_FFFD | Returns U+FFFD Replacement Character. |
| UTF_BEHAVIOR_SURROGATE | UTF-8: Not allowed; UTF-16: returns the unpaired surrogate; UTF-32: returns the surrogate code point, or U+FFFD if out of range. |
Definition at line 149 of file utfiterator.h.
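The three behaviors can be illustrated with a small self-contained sketch that decodes the first code point of a UTF-16 string. This is not ICU's implementation: the names `IllFormed`, `decodeFirst16`, `kLoneLead`, and `kPair` are hypothetical and exist only for this example. It also shows why a signed CP32 is the safer choice with the negative behavior, since -1 converts to 0xffffffff in an unsigned type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical stand-in for UTFIllFormedBehavior; not the ICU enum.
enum class IllFormed { Negative, FFFD, Surrogate };

// Sample inputs: an unpaired lead surrogate, and a well-formed pair for U+1F600.
constexpr char16_t kLoneLead[] = {0xD800};
constexpr char16_t kPair[] = {0xD83D, 0xDE00};

// Decodes the first code point of a non-empty UTF-16 string.
// For an unpaired surrogate, the result depends on `behavior`.
template <typename CP32>
CP32 decodeFirst16(std::u16string_view s, IllFormed behavior) {
    assert(!s.empty());
    char16_t lead = s[0];
    if (lead < 0xD800 || lead > 0xDFFF) {
        return static_cast<CP32>(lead);  // BMP code point, not a surrogate
    }
    if (lead <= 0xDBFF && s.size() > 1 && s[1] >= 0xDC00 && s[1] <= 0xDFFF) {
        // Well-formed surrogate pair: combine lead and trail.
        return static_cast<CP32>(0x10000 + ((lead - 0xD800) << 10) + (s[1] - 0xDC00));
    }
    // Unpaired surrogate: ill-formed UTF-16.
    switch (behavior) {
    case IllFormed::Negative:  return static_cast<CP32>(-1);      // 0xffffffff if CP32 is unsigned
    case IllFormed::FFFD:      return static_cast<CP32>(0xFFFD);  // replacement character
    case IllFormed::Surrogate: return static_cast<CP32>(lead);    // pass the surrogate through
    }
    return static_cast<CP32>(-1);  // not reached for valid enum values
}
```

With a signed `CP32`, a caller can test for ill-formed input with `c < 0`; with an unsigned type, the comparison has to be against `0xffffffff` instead.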
auto U_HEADER_ONLY_NAMESPACE::unsafeUTFIterator (UnitIter iter)
UnsafeUTFIterator factory function.
Deduces the UnitIter template parameter from the input.
| CP32 | Code point type: UChar32 (=int32_t) or char32_t or uint32_t |
| UnitIter | Can usually be omitted/deduced: An iterator (often a pointer) that returns a code unit type: UTF-8: char or char8_t or uint8_t; UTF-16: char16_t or uint16_t or (on Windows) wchar_t; UTF-32: char32_t or UChar32=int32_t or (on Linux) wchar_t |
| iter | code unit iterator |
Definition at line 2486 of file utfiterator.h.
References U_HEADER_ONLY_NAMESPACE::unsafeUTFIterator().
Referenced by U_HEADER_ONLY_NAMESPACE::unsafeUTFIterator().
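To make "unsafe" concrete, here is a minimal sketch of decoding that assumes well-formed UTF-8 and performs no validation or bounds checks. It is an illustration only, not the code behind UnsafeUTFIterator; `unsafeNext8` and `unsafeFirst8` are hypothetical names.

```cpp
// Decodes one code point from well-formed UTF-8 and advances p past it.
// No validation: the number of trail bytes is taken from the lead byte on trust.
inline char32_t unsafeNext8(const char *&p) {
    unsigned char b = static_cast<unsigned char>(*p++);
    if (b < 0x80) {
        return b;  // ASCII
    }
    int trailCount = b >= 0xF0 ? 3 : b >= 0xE0 ? 2 : 1;
    char32_t c = b & (0x3F >> trailCount);  // payload bits of the lead byte
    while (trailCount-- > 0) {
        c = (c << 6) | (static_cast<unsigned char>(*p++) & 0x3F);
    }
    return c;
}

// Convenience wrapper: decodes only the first code point of s.
inline char32_t unsafeFirst8(const char *s) {
    return unsafeNext8(s);
}
```

On ill-formed input such a decoder can read past the end of the buffer or return garbage, which is why this style is only appropriate for strings already known to be well-formed.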
auto U_HEADER_ONLY_NAMESPACE::utfIterator (UnitIter p)
UTFIterator factory function for a start or limit sentinel.
Deduces the UnitIter template parameter from the input. Requires UnitIter to be copyable.
| CP32 | Code point type: UChar32 (=int32_t) or char32_t or uint32_t |
| behavior | How to handle ill-formed Unicode strings |
| UnitIter | Can usually be omitted/deduced: An iterator (often a pointer) that returns a code unit type: UTF-8: char or char8_t or uint8_t; UTF-16: char16_t or uint16_t or (on Windows) wchar_t; UTF-32: char32_t or UChar32=int32_t or (on Linux) wchar_t |
| p | code unit iterator. When using a code unit sentinel, then that sentinel also works as a sentinel for the code point iterator. |
Definition at line 1745 of file utfiterator.h.
References U_HEADER_ONLY_NAMESPACE::utfIterator().
auto U_HEADER_ONLY_NAMESPACE::utfIterator (UnitIter p, LimitIter limit)
UTFIterator factory function for start = p < limit.
Deduces the UnitIter and LimitIter template parameters from the inputs.
| CP32 | Code point type: UChar32 (=int32_t) or char32_t or uint32_t |
| behavior | How to handle ill-formed Unicode strings |
| UnitIter | Can usually be omitted/deduced: An iterator (often a pointer) that returns a code unit type: UTF-8: char or char8_t or uint8_t; UTF-16: char16_t or uint16_t or (on Windows) wchar_t; UTF-32: char32_t or UChar32=int32_t or (on Linux) wchar_t |
| LimitIter | Either the same as UnitIter, or an iterator sentinel type. |
| p | start and current-position code unit iterator |
| limit | limit (exclusive-end) code unit iterator. When using a code unit sentinel (UnitIter≠LimitIter), then that sentinel also works as a sentinel for the code point iterator. |
Definition at line 1715 of file utfiterator.h.
References U_HEADER_ONLY_NAMESPACE::utfIterator().
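The idea of a limit whose type differs from UnitIter can be sketched with a sentinel for NUL-terminated strings. The code below is a standalone illustration and does not use the ICU classes; `NulSentinel`, `countCodePoints8`, and `kText` are hypothetical names, and the counting loop assumes well-formed UTF-8.

```cpp
#include <cstddef>

// Hypothetical sentinel type: "equal" to a pointer that points at the terminating NUL.
struct NulSentinel {};

inline bool operator==(const char *p, NulSentinel) { return *p == 0; }
inline bool operator!=(const char *p, NulSentinel) { return *p != 0; }

// Sample text: "a", U+00E9 (two bytes), "z".
constexpr char kText[] = "a\xC3\xA9z";

// Counts code points in [p, limit), assuming well-formed UTF-8:
// every byte that is not a trail byte (10xxxxxx) starts a code point.
// LimitIter may be the same type as UnitIter or a sentinel type.
template <typename UnitIter, typename LimitIter>
std::size_t countCodePoints8(UnitIter p, LimitIter limit) {
    std::size_t n = 0;
    for (; p != limit; ++p) {
        if ((static_cast<unsigned char>(*p) & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

Both `countCodePoints8(kText, kText + 4)` and `countCodePoints8(kText, NulSentinel{})` walk the same code units; the second form never needs to know the length up front.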
auto U_HEADER_ONLY_NAMESPACE::utfIterator (UnitIter start, UnitIter p, LimitIter limit)
UTFIterator factory function for start <= p < limit.
Deduces the UnitIter and LimitIter template parameters from the inputs. Only enabled if UnitIter is a (multi-pass) forward_iterator or better.
| CP32 | Code point type: UChar32 (=int32_t) or char32_t or uint32_t |
| behavior | How to handle ill-formed Unicode strings |
| UnitIter | Can usually be omitted/deduced: An iterator (often a pointer) that returns a code unit type: UTF-8: char or char8_t or uint8_t; UTF-16: char16_t or uint16_t or (on Windows) wchar_t; UTF-32: char32_t or UChar32=int32_t or (on Linux) wchar_t |
| LimitIter | Either the same as UnitIter, or an iterator sentinel type. |
| start | start code unit iterator |
| p | current-position code unit iterator |
| limit | limit (exclusive-end) code unit iterator. When using a code unit sentinel (UnitIter≠LimitIter), then that sentinel also works as a sentinel for the code point iterator. |
Definition at line 1688 of file utfiterator.h.
References U_HEADER_ONLY_NAMESPACE::utfIterator().
Referenced by U_HEADER_ONLY_NAMESPACE::utfIterator().
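One plausible reason for passing a separate start is backward iteration: stepping back from p must not read code units before the beginning of the string. The following sketch shows that bound check for UTF-8. It is an assumption-laden illustration, not the UTFIterator implementation; `previousBoundary8` and `kText` are hypothetical names, and the input is assumed to be well-formed.

```cpp
#include <cassert>

// Sample text: "a", U+00E9 (two bytes), "z".
constexpr char kText[] = "a\xC3\xA9z";

// Moves p back to the first byte of the previous code point.
// Never reads before start. Assumes well-formed UTF-8 and start < p.
inline const char *previousBoundary8(const char *start, const char *p) {
    assert(start < p);
    do {
        --p;  // skip trail bytes (10xxxxxx) until a lead byte or start is reached
    } while (p != start && (static_cast<unsigned char>(*p) & 0xC0) == 0x80);
    return p;
}
```

Stepping back from `kText + 3` lands on `kText + 1`, the lead byte of the two-byte sequence, and stepping back from `kText + 1` stops at `kText`.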
|
constexpr |
This API is for internal use only.
Definition at line 214 of file utfiterator.h.
|
constexpr |
This API is for internal use only.
Definition at line 207 of file utfiterator.h.
|
constexpr |
This API is for internal use only.
Definition at line 244 of file utfiterator.h.
|
constexpr |
This API is for internal use only.
Definition at line 232 of file utfiterator.h.
constexpr UnsafeUTFStringCodePointsAdaptor<CP32> U_HEADER_ONLY_NAMESPACE::unsafeUTFStringCodePoints
Range adaptor function object returning an UnsafeUTFStringCodePoints object that represents a "range" of code points in a code unit range.
The string must be well-formed. Deduces the Range template parameter from the input, taking into account the value category: the code units will be referenced if possible, and moved if necessary.
| CP32 | Code point type: UChar32 (=int32_t) or char32_t or uint32_t |
| Range | A C++ "range" of Unicode UTF-8/16/32 code units |
| unitRange | input range |
Definition at line 2658 of file utfiterator.h.
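The phrase "referenced if possible, and moved if necessary" describes a common forwarding-reference pattern, sketched below in isolation. This is not the adaptor's actual code: `UnitsHolder`, `holdUnits`, and `demoHoldUnits` are hypothetical names used only to show how the value category of the argument can select between storing a reference and taking ownership.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical holder. Range is deduced as T& for an lvalue argument
// (the holder refers to the caller's object) and as T for an rvalue
// argument (the holder owns a moved-in copy).
template <typename Range>
struct UnitsHolder {
    Range units;
};

template <typename Range>
UnitsHolder<Range> holdUnits(Range &&r) {
    return UnitsHolder<Range>{std::forward<Range>(r)};
}

// Returns true if the lvalue was referenced and the temporary was owned.
inline bool demoHoldUnits() {
    std::string s = "abc";
    auto byRef = holdUnits(s);                   // refers to s
    auto owned = holdUnits(std::string("xyz"));  // owns the moved-in string
    static_assert(std::is_same_v<decltype(byRef), UnitsHolder<std::string &>>);
    static_assert(std::is_same_v<decltype(owned), UnitsHolder<std::string>>);
    s[0] = 'A';  // the change is visible through the reference
    return byRef.units == "Abc" && owned.units == "xyz";
}
```

The practical consequence is the usual one for reference-holding views: when an lvalue is passed, the original string has to outlive the object that refers to it.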
constexpr UTFStringCodePointsAdaptor<CP32, behavior> U_HEADER_ONLY_NAMESPACE::utfStringCodePoints
Range adaptor function object returning a UTFStringCodePoints object that represents a "range" of code points in a code unit range, which validates while decoding.
Deduces the Range template parameter from the input, taking into account the value category: the code units will be referenced if possible, and moved if necessary.
| CP32 | Code point type: UChar32 (=int32_t) or char32_t or uint32_t; should be signed if UTF_BEHAVIOR_NEGATIVE |
| behavior | How to handle ill-formed Unicode strings |
| Range | A C++ "range" of Unicode UTF-8/16/32 code units |
| unitRange | input range |
Definition at line 1926 of file utfiterator.h.