ICU 68.2
C++ API: MessagePattern class: Parses and represents ICU MessageFormat patterns.
Data Structures | |
class | icu::MessagePattern |
Parses and represents ICU MessageFormat patterns. | |
class | icu::MessagePattern::Part |
A message pattern "part", representing a pattern parsing event. | |
Namespaces | |
icu | |
Macros | |
#define | UMSGPAT_ARG_TYPE_HAS_PLURAL_STYLE(argType) ((argType)==UMSGPAT_ARG_TYPE_PLURAL || (argType)==UMSGPAT_ARG_TYPE_SELECTORDINAL) |
Returns true if the argument type has a plural style part sequence and semantics, for example UMSGPAT_ARG_TYPE_PLURAL and UMSGPAT_ARG_TYPE_SELECTORDINAL. | |
#define | UMSGPAT_NO_NUMERIC_VALUE ((double)(-123456789)) |
Special value that is returned by getNumericValue(Part) when no numeric value is defined for a part. | |
Typedefs | |
typedef enum UMessagePatternApostropheMode | UMessagePatternApostropheMode |
typedef enum UMessagePatternPartType | UMessagePatternPartType |
typedef enum UMessagePatternArgType | UMessagePatternArgType |
C++ API: MessagePattern class: Parses and represents ICU MessageFormat patterns.
Definition in file messagepattern.h.
#define UMSGPAT_ARG_TYPE_HAS_PLURAL_STYLE | ( | argType | ) | ((argType)==UMSGPAT_ARG_TYPE_PLURAL || (argType)==UMSGPAT_ARG_TYPE_SELECTORDINAL) |
Returns true if the argument type has a plural style part sequence and semantics, for example UMSGPAT_ARG_TYPE_PLURAL and UMSGPAT_ARG_TYPE_SELECTORDINAL.
Definition at line 272 of file messagepattern.h.
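The check performed by this macro can be sketched without the ICU headers as follows. The `ArgKind` enum and `hasPluralStyle` function are hypothetical stand-ins for `UMessagePatternArgType` and the macro; they only mirror the enumerators listed on this page and are not part of ICU.

```cpp
#include <cassert>

// Hypothetical stand-in for UMessagePatternArgType (illustration only).
enum class ArgKind { None, Simple, Choice, Plural, Select, SelectOrdinal };

// Mirrors the macro body: true only for the two plural-style argument types.
constexpr bool hasPluralStyle(ArgKind kind) {
    return kind == ArgKind::Plural || kind == ArgKind::SelectOrdinal;
}
```

Code that walks the parts of a plural or selectordinal argument can use one branch for both types, since they share the same style part sequence.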
#define UMSGPAT_NO_NUMERIC_VALUE ((double)(-123456789)) |
Special value that is returned by getNumericValue(Part) when no numeric value is defined for a part.
Definition at line 299 of file messagepattern.h.
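A caller would typically compare the returned value against this sentinel before using it. The sketch below assumes nothing beyond the macro value shown above; `kNoNumericValue`, `hasNumericValue`, and `numericValueOr` are hypothetical helper names, not ICU API.

```cpp
#include <cassert>

// Local copy of the sentinel value documented for UMSGPAT_NO_NUMERIC_VALUE.
constexpr double kNoNumericValue = static_cast<double>(-123456789);

// True if `raw` carries a real numeric value rather than the sentinel.
constexpr bool hasNumericValue(double raw) {
    return raw != kNoNumericValue;
}

// Returns `raw`, or `fallback` when `raw` is the "no value" sentinel.
constexpr double numericValueOr(double raw, double fallback) {
    return hasNumericValue(raw) ? raw : fallback;
}
```

Because the sentinel is an ordinary double, a pattern that genuinely contains the number -123456789 cannot be told apart from "no value" by this comparison alone.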
typedef enum UMessagePatternApostropheMode UMessagePatternApostropheMode |
Definition at line 1 of file messagepattern.h.
typedef enum UMessagePatternArgType UMessagePatternArgType |
Definition at line 1 of file messagepattern.h.
typedef enum UMessagePatternPartType UMessagePatternPartType |
Definition at line 1 of file messagepattern.h.
anonymous enum |
Enumerator | |
---|---|
UMSGPAT_ARG_NAME_NOT_NUMBER | Return value from MessagePattern.validateArgumentName() for when the string is a valid "pattern identifier" but not a number.
|
UMSGPAT_ARG_NAME_NOT_VALID | Return value from MessagePattern.validateArgumentName() for when the string is invalid. It might not be a valid "pattern identifier", or it might have only ASCII digits but with a leading zero or a number that is too large.
|
Definition at line 275 of file messagepattern.h.
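The numeric half of the rule described for MessagePattern.validateArgumentName() can be sketched as below. This is a simplified illustration, not the ICU implementation: the "pattern identifier" check is omitted, the constants `kNotNumber` and `kNotValid` are hypothetical stand-ins whose values are assumptions, and "too large" is assumed to mean "does not fit in a signed 32-bit integer".

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the two return codes; the values are assumptions.
constexpr int32_t kNotNumber = -1;
constexpr int32_t kNotValid = -2;

// Classifies an argument name: returns the argument number for a well-formed
// ASCII-digit name, kNotNumber for any name containing a non-digit (the
// identifier check is NOT performed here), and kNotValid for an empty name,
// a leading zero, or a number that overflows int32_t.
int32_t classifyArgName(const std::string& name) {
    if (name.empty()) return kNotValid;
    for (char c : name) {
        if (c < '0' || c > '9') return kNotNumber;
    }
    if (name.size() > 1 && name[0] == '0') return kNotValid;  // leading zero
    int64_t value = 0;
    for (char c : name) {
        value = value * 10 + (c - '0');
        if (value > INT32_MAX) return kNotValid;  // too large
    }
    return static_cast<int32_t>(value);
}
```

A non-negative result is the argument number itself, so callers can distinguish the three outcomes with a single integer comparison.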
enum UMessagePatternApostropheMode |
Mode for when an apostrophe starts quoted literal text for MessageFormat output.
The default is DOUBLE_OPTIONAL unless overridden via uconfig.h (UCONFIG_MSGPAT_DEFAULT_APOSTROPHE_MODE).
A pair of adjacent apostrophes always results in a single apostrophe in the output, even when the pair is between two single, text-quoting apostrophes.
The following table shows examples of desired MessageFormat.format() output with the pattern strings that yield that output.
Desired output | DOUBLE_OPTIONAL | DOUBLE_REQUIRED |
---|---|---|
I see {many} | I see '{many}' | (same) |
I said {'Wow!'} | I said '{''Wow!''}' | (same) |
I don't know | I don't know OR I don''t know | I don''t know |
Enumerator | |
---|---|
UMSGPAT_APOS_DOUBLE_OPTIONAL | A literal apostrophe is represented by either a single or a double apostrophe pattern character. Within a MessageFormat pattern, a single apostrophe only starts quoted literal text if it immediately precedes a curly brace {}, or a pipe symbol | if inside a choice format, or a pound symbol # if inside a plural format. This is the default behavior starting with ICU 4.8.
|
UMSGPAT_APOS_DOUBLE_REQUIRED | A literal apostrophe must be represented by a double apostrophe pattern character. A single apostrophe always starts quoted literal text. This is the behavior of ICU 4.6 and earlier, and of the JDK.
|
Definition at line 70 of file messagepattern.h.
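The two apostrophe modes can be illustrated with a small standalone sketch that turns a pattern's literal text into output text. It is a simplification under stated assumptions: it handles only top-level text, so the `|` and `#` cases for choice and plural styles are ignored, unquoted braces are copied through rather than parsed as arguments, and `ApostropheMode` and `unquoteLiteralText` are hypothetical names that do not exist in ICU.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for UMessagePatternApostropheMode.
enum class ApostropheMode { DoubleOptional, DoubleRequired };

// Resolves apostrophe quoting in top-level literal text (no argument parsing).
std::string unquoteLiteralText(const std::string& pattern, ApostropheMode mode) {
    std::string out;
    const std::size_t n = pattern.size();
    std::size_t i = 0;
    while (i < n) {
        const char c = pattern[i++];
        if (c != '\'') { out += c; continue; }
        if (i == n) { out += '\''; break; }  // trailing apostrophe stays literal
        const char next = pattern[i];
        if (next == '\'') { out += '\''; ++i; continue; }  // '' -> '
        const bool startsQuote = mode == ApostropheMode::DoubleRequired ||
                                 next == '{' || next == '}';
        if (!startsQuote) { out += '\''; continue; }  // literal apostrophe
        // Inside quoted text: copy until the closing apostrophe or end of input.
        while (i < n) {
            const char q = pattern[i++];
            if (q != '\'') { out += q; continue; }
            if (i < n && pattern[i] == '\'') { out += '\''; ++i; continue; }
            break;  // closing apostrophe
        }
    }
    return out;
}
```

Applied to the table rows above, `I see '{many}'` and `I said '{''Wow!''}'` produce the same text in both modes, while a lone apostrophe in `I don't know` survives only in the DOUBLE_OPTIONAL mode.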
enum UMessagePatternArgType |
Argument type constants.
Returned by Part.getArgType() for ARG_START and ARG_LIMIT parts.
Messages nested inside an argument are each delimited by MSG_START and MSG_LIMIT, with a nesting level one greater than the surrounding message.
Enumerator | |
---|---|
UMSGPAT_ARG_TYPE_NONE | The argument has no specified type.
|
UMSGPAT_ARG_TYPE_SIMPLE | The argument has a "simple" type which is provided by the ARG_TYPE part. An ARG_STYLE part might follow that.
|
UMSGPAT_ARG_TYPE_CHOICE | The argument is a ChoiceFormat with one or more ((ARG_INT | ARG_DOUBLE), ARG_SELECTOR, message) tuples.
|
UMSGPAT_ARG_TYPE_PLURAL | The argument is a cardinal-number PluralFormat with an optional ARG_INT or ARG_DOUBLE offset (e.g., offset:1) and one or more (ARG_SELECTOR [explicit-value] message) tuples. If the selector has an explicit value (e.g., =2), then that value is provided by the ARG_INT or ARG_DOUBLE part preceding the message. Otherwise the message immediately follows the ARG_SELECTOR.
|
UMSGPAT_ARG_TYPE_SELECT | The argument is a SelectFormat with one or more (ARG_SELECTOR, message) pairs.
|
UMSGPAT_ARG_TYPE_SELECTORDINAL | The argument is an ordinal-number PluralFormat with the same style parts sequence and semantics as UMSGPAT_ARG_TYPE_PLURAL.
|
Definition at line 221 of file messagepattern.h.
enum UMessagePatternPartType |
MessagePattern::Part type constants.
Enumerator | |
---|---|
UMSGPAT_PART_TYPE_MSG_START | Start of a message pattern (main or nested). The length is 0 for the top-level message and for a choice argument sub-message, otherwise 1 for the '{'. The value indicates the nesting level, starting with 0 for the main message. There is always a later MSG_LIMIT part.
|
UMSGPAT_PART_TYPE_MSG_LIMIT | End of a message pattern (main or nested). The length is 0 for the top-level message and the last sub-message of a choice argument, otherwise 1 for the '}' or (in a choice argument style) the '|'. The value indicates the nesting level, starting with 0 for the main message.
|
UMSGPAT_PART_TYPE_SKIP_SYNTAX | Indicates a substring of the pattern string which is to be skipped when formatting. For example, an apostrophe that begins or ends quoted text would be indicated with such a part. The value is undefined and currently always 0.
|
UMSGPAT_PART_TYPE_INSERT_CHAR | Indicates that a syntax character needs to be inserted for auto-quoting. The length is 0. The value is the character code of the insertion character. (U+0027=APOSTROPHE)
|
UMSGPAT_PART_TYPE_REPLACE_NUMBER | Indicates a syntactic (non-escaped) # symbol in a plural variant. When formatting, replace this part's substring with the (value-offset) for the plural argument value. The value is undefined and currently always 0.
|
UMSGPAT_PART_TYPE_ARG_START | Start of an argument. The length is 1 for the '{'. The value is the ordinal value of the ArgType. Use getArgType(). This part is followed by either an ARG_NUMBER or ARG_NAME, followed by optional argument sub-parts (see UMessagePatternArgType constants) and finally an ARG_LIMIT part.
|
UMSGPAT_PART_TYPE_ARG_LIMIT | End of an argument. The length is 1 for the '}'. The value is the ordinal value of the ArgType. Use getArgType().
|
UMSGPAT_PART_TYPE_ARG_NUMBER | The argument number, provided by the value.
|
UMSGPAT_PART_TYPE_ARG_NAME | The argument name. The value is undefined and currently always 0.
|
UMSGPAT_PART_TYPE_ARG_TYPE | The argument type. The value is undefined and currently always 0.
|
UMSGPAT_PART_TYPE_ARG_STYLE | The argument style text. The value is undefined and currently always 0.
|
UMSGPAT_PART_TYPE_ARG_SELECTOR | A selector substring in a "complex" argument style. The value is undefined and currently always 0.
|
UMSGPAT_PART_TYPE_ARG_INT | An integer value, for example the offset or an explicit selector value in a PluralFormat style. The part value is the integer value.
|
UMSGPAT_PART_TYPE_ARG_DOUBLE | A numeric value, for example the offset or an explicit selector value in a PluralFormat style. The part value is an index into an internal array of numeric values; use getNumericValue().
|
Definition at line 102 of file messagepattern.h.
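To show how a part sequence drives formatting, the sketch below walks a hand-written list of parts for a pattern with simple numbered arguments. It does not call ICU: `PartKind`, `PartSketch`, and `formatWithParts` are hypothetical names, only six of the part types are modeled, and the part lists passed in are written by hand on the assumption that they match what a parser would report (index and length of each part's substring, plus its value).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical subset of the part types listed above.
enum class PartKind { MsgStart, MsgLimit, SkipSyntax, ArgStart, ArgLimit, ArgNumber };

// Hypothetical stand-in for MessagePattern::Part.
struct PartSketch {
    PartKind kind;
    std::size_t index;   // offset of the part's substring in the pattern
    std::size_t length;  // length of that substring
    int value;           // nesting level or argument number, depending on kind
};

// Copies literal text between parts and substitutes numbered arguments.
std::string formatWithParts(const std::string& pattern,
                            const std::vector<PartSketch>& parts,
                            const std::vector<std::string>& args) {
    std::string out;
    std::size_t prev = 0;  // end of the previously visited part
    int argNumber = -1;
    for (const PartSketch& p : parts) {
        switch (p.kind) {
        case PartKind::MsgStart:
            break;
        case PartKind::SkipSyntax:
        case PartKind::ArgStart:
        case PartKind::MsgLimit:
            out.append(pattern, prev, p.index - prev);  // literal text before the part
            break;
        case PartKind::ArgNumber:
            argNumber = p.value;
            break;
        case PartKind::ArgLimit:
            if (argNumber >= 0 && static_cast<std::size_t>(argNumber) < args.size()) {
                out += args[static_cast<std::size_t>(argNumber)];
            }
            argNumber = -1;
            break;
        }
        prev = p.index + p.length;  // skip over the part's own substring
    }
    return out;
}
```

For the pattern `Hi {0}!`, the assumed sequence is MSG_START, ARG_START, ARG_NUMBER, ARG_LIMIT, MSG_LIMIT; the text before ARG_START and before MSG_LIMIT is copied verbatim, and the argument value is emitted at ARG_LIMIT.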