pub enum WordBreak {
CR,
LF,
Newline,
Extend,
ZWJ,
RegionalIndicator,
Format,
Katakana,
HebrewLetter,
ALetter,
SingleQuote,
DoubleQuote,
MidNumLet,
MidLetter,
MidNum,
Numeric,
ExtendNumLet,
EBase,
EModifier,
GlueAfterZwj,
EBaseGAZ,
Other,
}
Represents the Unicode character
Word_Break
property.
U+000D CARRIAGE RETURN (CR)
U+000B LINE TABULATION
U+000C FORM FEED (FF)
U+0085 NEXT LINE (NEL)
U+2028 LINE SEPARATOR
U+2029 PARAGRAPH SEPARATOR
Grapheme_Extend = Yes, or
General_Category = Spacing_Mark
and not U+200D ZERO WIDTH JOINER (ZWJ)
Regional_Indicator = Yes
This consists of the range:
U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A
..U+1F1FF REGIONAL INDICATOR SYMBOL LETTER Z
General_Category = Format
and not U+200B ZERO WIDTH SPACE (ZWSP)
and not U+200C ZERO WIDTH NON-JOINER (ZWNJ)
and not U+200D ZERO WIDTH JOINER (ZWJ)
Script = KATAKANA, or
any of the following:
U+3031 ( 〱 ) VERTICAL KANA REPEAT MARK
U+3032 ( 〲 ) VERTICAL KANA REPEAT WITH VOICED SOUND MARK
U+3033 ( 〳 ) VERTICAL KANA REPEAT MARK UPPER HALF
U+3034 ( 〴 ) VERTICAL KANA REPEAT WITH VOICED SOUND MARK UPPER HALF
U+3035 ( 〵 ) VERTICAL KANA REPEAT MARK LOWER HALF
U+309B ( ゛ ) KATAKANA-HIRAGANA VOICED SOUND MARK
U+309C ( ゜ ) KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
U+30A0 ( ゠ ) KATAKANA-HIRAGANA DOUBLE HYPHEN
U+30FC ( ー ) KATAKANA-HIRAGANA PROLONGED SOUND MARK
U+FF70 ( ー ) HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK
Script = Hebrew
and General_Category = Other_Letter
Alphabetic = Yes, or
any of the following 36 characters:
U+02C2 ( ˂ ) MODIFIER LETTER LEFT ARROWHEAD
..U+02C5 ( ˅ ) MODIFIER LETTER DOWN ARROWHEAD
U+02D2 ( ˒ ) MODIFIER LETTER CENTRED RIGHT HALF RING
..U+02D7 ( ˗ ) MODIFIER LETTER MINUS SIGN
U+02DE ( ˞ ) MODIFIER LETTER RHOTIC HOOK
U+02DF ( ˟ ) MODIFIER LETTER CROSS ACCENT
U+02ED ( ˭ ) MODIFIER LETTER UNASPIRATED
U+02EF ( ˯ ) MODIFIER LETTER LOW DOWN ARROWHEAD
..U+02FF ( ˿ ) MODIFIER LETTER LOW LEFT ARROW
U+05F3 ( ׳ ) HEBREW PUNCTUATION GERESH
U+A720 ( ꜠ ) MODIFIER LETTER STRESS AND HIGH TONE
U+A721 ( ꜡ ) MODIFIER LETTER STRESS AND LOW TONE
U+A789 ( ꞉ ) MODIFIER LETTER COLON
U+A78A ( ꞊ ) MODIFIER LETTER SHORT EQUALS SIGN
U+AB5B ( ꭛ ) MODIFIER BREVE WITH INVERTED BREVE
and Ideographic = No
and Word_Break ≠ Katakana
and Line_Break ≠ Complex_Context (SA)
and Script ≠ Hiragana
and Word_Break ≠ Extend
and Word_Break ≠ Hebrew_Letter
U+0022 ( " ) QUOTATION MARK
U+002E ( . ) FULL STOP
U+2018 ( ‘ ) LEFT SINGLE QUOTATION MARK
U+2019 ( ’ ) RIGHT SINGLE QUOTATION MARK
U+2024 ( ․ ) ONE DOT LEADER
U+FE52 ( ﹒ ) SMALL FULL STOP
U+FF07 ( ' ) FULLWIDTH APOSTROPHE
U+FF0E ( . ) FULLWIDTH FULL STOP
U+00B7 ( · ) MIDDLE DOT
U+0387 ( · ) GREEK ANO TELEIA
U+05F4 ( ״ ) HEBREW PUNCTUATION GERSHAYIM
U+2027 ( ‧ ) HYPHENATION POINT
U+003A ( : ) COLON (used in Swedish)
U+FE13 ( ︓ ) PRESENTATION FORM FOR VERTICAL COLON
U+FE55 ( ﹕ ) SMALL COLON
U+FF1A ( : ) FULLWIDTH COLON
Line_Break = Infix_Numeric, or
any of the following:
U+066C ( ٬ ) ARABIC THOUSANDS SEPARATOR
U+FE50 ( ﹐ ) SMALL COMMA
U+FE54 ( ﹔ ) SMALL SEMICOLON
U+FF0C ( , ) FULLWIDTH COMMA
U+FF1B ( ; ) FULLWIDTH SEMICOLON
and not U+003A ( : ) COLON
and not U+FE13 ( ︓ ) PRESENTATION FORM FOR VERTICAL COLON
and not U+002E ( . ) FULL STOP
Line_Break = Numeric
and not U+066C ( ٬ ) ARABIC THOUSANDS SEPARATOR
General_Category = Connector_Punctuation, or
U+202F NARROW NO-BREAK SPACE (NNBSP)
Emoji characters that do not break from a previous ZWJ in a defined emoji ZWJ sequence,
and are not listed as Emoji_Modifier_Base=Yes
in emoji-data.txt
.
See https://www.unicode.org/reports/tr51/.
Find the character Word_Break property value.
Performs copy-assignment from source
. Read more
Formats the value using the given formatter. Read more
This method tests for self
and other
values to be equal, and is used by ==
. Read more
#[must_use]
fn ne(&self, other: &Rhs) -> bool
| 1.0.0 [src] |
This method tests for !=
.
Feeds this value into the given [Hasher
]. Read more
Feeds a slice of this type into the given [Hasher
]. Read more
type Err = ()
The associated error which can be returned from parsing.
Parses a string s
to return a value of this type. Read more
The abbreviated name of the property.
The long name of the property.
The human-readable name of the property.
Formats the value using the given formatter. Read more
Exhaustive list of all property values.
The abbreviated name of the property value.
The long name of the property value.
The human-readable name of the property value.
The property value for the character.
Returns the "default value" for a type. Read more