Enum unic_ucd_segment::word_break::WordBreak

pub enum WordBreak {
    CR,
    LF,
    Newline,
    Extend,
    ZWJ,
    RegionalIndicator,
    Format,
    Katakana,
    HebrewLetter,
    ALetter,
    SingleQuote,
    DoubleQuote,
    MidNumLet,
    MidLetter,
    MidNum,
    Numeric,
    ExtendNumLet,
    EBase,
    EModifier,
    GlueAfterZwj,
    EBaseGAZ,
    Other,
}

Represents the Unicode character Word_Break property.

References

Variants

CR
U+000D CARRIAGE RETURN (CR)

LF
U+000A LINE FEED (LF)

Newline
U+000B LINE TABULATION
U+000C FORM FEED (FF)
U+0085 NEXT LINE (NEL)
U+2028 LINE SEPARATOR
U+2029 PARAGRAPH SEPARATOR

Extend
Grapheme_Extend = Yes, or
General_Category = Spacing_Mark
and not U+200D ZERO WIDTH JOINER (ZWJ)

ZWJ
U+200D ZERO WIDTH JOINER

RegionalIndicator
Regional_Indicator = Yes

This consists of the range:

U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A
..U+1F1FF REGIONAL INDICATOR SYMBOL LETTER Z

Format
General_Category = Format
and not U+200B ZERO WIDTH SPACE (ZWSP)
and not U+200C ZERO WIDTH NON-JOINER (ZWNJ)
and not U+200D ZERO WIDTH JOINER (ZWJ)

Katakana
Script = KATAKANA, or
any of the following:
U+3031 ( 〱 ) VERTICAL KANA REPEAT MARK
U+3032 ( 〲 ) VERTICAL KANA REPEAT WITH VOICED SOUND MARK
U+3033 ( 〳 ) VERTICAL KANA REPEAT MARK UPPER HALF
U+3034 ( 〴 ) VERTICAL KANA REPEAT WITH VOICED SOUND MARK UPPER HALF
U+3035 ( 〵 ) VERTICAL KANA REPEAT MARK LOWER HALF
U+309B ( ゛ ) KATAKANA-HIRAGANA VOICED SOUND MARK
U+309C ( ゜ ) KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
U+30A0 ( ゠ ) KATAKANA-HIRAGANA DOUBLE HYPHEN
U+30FC ( ー ) KATAKANA-HIRAGANA PROLONGED SOUND MARK
U+FF70 ( ｰ ) HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK

HebrewLetter
Script = Hebrew
and General_Category = Other_Letter

ALetter
Alphabetic = Yes, or
any of the following 36 characters:
U+02C2 ( ˂ ) MODIFIER LETTER LEFT ARROWHEAD
..U+02C5 ( ˅ ) MODIFIER LETTER DOWN ARROWHEAD
U+02D2 ( ˒ ) MODIFIER LETTER CENTRED RIGHT HALF RING
..U+02D7 ( ˗ ) MODIFIER LETTER MINUS SIGN
U+02DE ( ˞ ) MODIFIER LETTER RHOTIC HOOK
U+02DF ( ˟ ) MODIFIER LETTER CROSS ACCENT
U+02ED ( ˭ ) MODIFIER LETTER UNASPIRATED
U+02EF ( ˯ ) MODIFIER LETTER LOW DOWN ARROWHEAD
..U+02FF ( ˿ ) MODIFIER LETTER LOW LEFT ARROW
U+05F3 ( ׳ ) HEBREW PUNCTUATION GERESH
U+A720 ( ꜠ ) MODIFIER LETTER STRESS AND HIGH TONE
U+A721 ( ꜡ ) MODIFIER LETTER STRESS AND LOW TONE
U+A789 ( ꞉ ) MODIFIER LETTER COLON
U+A78A ( ꞊ ) MODIFIER LETTER SHORT EQUALS SIGN
U+AB5B ( ꭛ ) MODIFIER BREVE WITH INVERTED BREVE
and Ideographic = No
and Word_Break ≠ Katakana
and Line_Break ≠ Complex_Context (SA)
and Script ≠ Hiragana
and Word_Break ≠ Extend
and Word_Break ≠ Hebrew_Letter

SingleQuote
U+0027 ( ' ) APOSTROPHE

DoubleQuote
U+0022 ( " ) QUOTATION MARK

MidNumLet
U+002E ( . ) FULL STOP
U+2018 ( ‘ ) LEFT SINGLE QUOTATION MARK
U+2019 ( ’ ) RIGHT SINGLE QUOTATION MARK
U+2024 ( ․ ) ONE DOT LEADER
U+FE52 ( ﹒ ) SMALL FULL STOP
U+FF07 ( ＇ ) FULLWIDTH APOSTROPHE
U+FF0E ( ． ) FULLWIDTH FULL STOP

MidLetter
U+00B7 ( · ) MIDDLE DOT
U+0387 ( · ) GREEK ANO TELEIA
U+05F4 ( ״ ) HEBREW PUNCTUATION GERSHAYIM
U+2027 ( ‧ ) HYPHENATION POINT
U+003A ( : ) COLON (used in Swedish)
U+FE13 ( ︓ ) PRESENTATION FORM FOR VERTICAL COLON
U+FE55 ( ﹕ ) SMALL COLON
U+FF1A ( ： ) FULLWIDTH COLON

MidNum
Line_Break = Infix_Numeric, or
any of the following:
U+066C ( ٬ ) ARABIC THOUSANDS SEPARATOR
U+FE50 ( ﹐ ) SMALL COMMA
U+FE54 ( ﹔ ) SMALL SEMICOLON
U+FF0C ( ， ) FULLWIDTH COMMA
U+FF1B ( ； ) FULLWIDTH SEMICOLON
and not U+003A ( : ) COLON
and not U+FE13 ( ︓ ) PRESENTATION FORM FOR VERTICAL COLON
and not U+002E ( . ) FULL STOP

Numeric
Line_Break = Numeric
and not U+066C ( ٬ ) ARABIC THOUSANDS SEPARATOR

ExtendNumLet
General_Category = Connector_Punctuation, or
U+202F NARROW NO-BREAK SPACE (NNBSP)

EBase
Emoji characters listed as Emoji_Modifier_Base=Yes in emoji-data.txt that do not occur after ZWJ in emoji-zwj-sequences.txt.

See https://www.unicode.org/reports/tr51/.

EModifier
Emoji characters listed as Emoji_Modifier=Yes in emoji-data.txt.

See https://www.unicode.org/reports/tr51/.

GlueAfterZwj
Emoji characters that do not break from a previous ZWJ in a defined emoji ZWJ sequence, and are not listed as Emoji_Modifier_Base=Yes in emoji-data.txt.

See https://www.unicode.org/reports/tr51/.

EBaseGAZ
Emoji characters listed as Emoji_Modifier_Base=Yes in emoji-data.txt that also occur after ZWJ in emoji-zwj-sequences.txt.

See https://www.unicode.org/reports/tr51/.

Other
All other characters
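
Several of the values above are defined purely by explicit code points, so they can be illustrated without any Unicode data tables. The following is a minimal standalone sketch, not the crate's implementation: `SketchWordBreak` and `sketch_word_break` are hypothetical names, and only the CR, LF, Newline, ZWJ, RegionalIndicator, SingleQuote, DoubleQuote, MidNumLet, and MidLetter code points listed on this page are modeled. Values that depend on other properties (Alphabetic, Line_Break, Script, and so on) are lumped into `Unmodeled`.

```rust
/// Hypothetical stand-in for a few `WordBreak` values; not the crate's enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchWordBreak {
    CR,
    LF,
    Newline,
    ZWJ,
    RegionalIndicator,
    SingleQuote,
    DoubleQuote,
    MidNumLet,
    MidLetter,
    /// Everything this sketch does not model (this is NOT the same as `Other`).
    Unmodeled,
}

/// Classify `ch` using only the code points listed explicitly on this page.
fn sketch_word_break(ch: char) -> SketchWordBreak {
    use SketchWordBreak::*;
    match ch {
        '\u{000D}' => CR,
        '\u{000A}' => LF,
        '\u{000B}' | '\u{000C}' | '\u{0085}' | '\u{2028}' | '\u{2029}' => Newline,
        '\u{200D}' => ZWJ,
        // REGIONAL INDICATOR SYMBOL LETTER A..Z
        '\u{1F1E6}'..='\u{1F1FF}' => RegionalIndicator,
        '\u{0027}' => SingleQuote,
        '\u{0022}' => DoubleQuote,
        '\u{002E}' | '\u{2018}' | '\u{2019}' | '\u{2024}' | '\u{FE52}' | '\u{FF07}'
        | '\u{FF0E}' => MidNumLet,
        '\u{00B7}' | '\u{0387}' | '\u{05F4}' | '\u{2027}' | '\u{003A}' | '\u{FE13}'
        | '\u{FE55}' | '\u{FF1A}' => MidLetter,
        _ => Unmodeled,
    }
}

fn main() {
    // Print the sketch classification for a handful of sample characters.
    for ch in ['\r', '\n', '.', ':', '\u{0027}', 'a'] {
        println!("U+{:04X} -> {:?}", ch as u32, sketch_word_break(ch));
    }
}
```

A real lookup would cover every code point and consult the Unicode Character Database, so this sketch is only useful for seeing how the explicit lists partition a few punctuation and control characters.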

Methods

impl WordBreak

Find the character Word_Break property value.

Trait Implementations

impl Copy for WordBreak

impl Clone for WordBreak

Returns a copy of the value.

Performs copy-assignment from source.

impl Debug for WordBreak

Formats the value using the given formatter.

impl Eq for WordBreak

impl PartialEq for WordBreak

This method tests for self and other values to be equal, and is used by ==.

This method tests for !=.

impl Hash for WordBreak

Feeds this value into the given Hasher.

Feeds a slice of this type into the given Hasher.

impl FromStr for WordBreak

The associated error which can be returned from parsing.

Parses a string s to return a value of this type.

impl CharProperty for WordBreak

The abbreviated name of the property.

The long name of the property.

The human-readable name of the property.

impl Display for WordBreak

Formats the value using the given formatter.

impl EnumeratedCharProperty for WordBreak

Exhaustive list of all property values.

The abbreviated name of the property value.

The long name of the property value.

The human-readable name of the property value.
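
The name-related implementations above (FromStr, plus the abbreviated and long value names) can be pictured with a small standalone sketch. It is an illustration only and does not use the crate: `SketchValue` and its helper methods are hypothetical, only three values are modeled, and the name pairs (NL / Newline, HL / Hebrew_Letter, MB / MidNumLet) are the Word_Break aliases from the Unicode Character Database. Whether the crate's parser accepts both forms, or ignores case, is an assumption not confirmed by this page.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for three `WordBreak` values; not the crate's enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchValue {
    Newline,
    HebrewLetter,
    MidNumLet,
}

impl SketchValue {
    /// All values modeled by this sketch.
    const ALL: [SketchValue; 3] = [
        SketchValue::Newline,
        SketchValue::HebrewLetter,
        SketchValue::MidNumLet,
    ];

    /// Abbreviated value name, as in the UCD value aliases.
    fn abbr_name(self) -> &'static str {
        match self {
            SketchValue::Newline => "NL",
            SketchValue::HebrewLetter => "HL",
            SketchValue::MidNumLet => "MB",
        }
    }

    /// Long value name, as in the UCD value aliases.
    fn long_name(self) -> &'static str {
        match self {
            SketchValue::Newline => "Newline",
            SketchValue::HebrewLetter => "Hebrew_Letter",
            SketchValue::MidNumLet => "MidNumLet",
        }
    }
}

impl FromStr for SketchValue {
    type Err = ();

    /// Accept either the abbreviated or the long name (exact match only;
    /// this matching rule is an assumption of the sketch).
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        SketchValue::ALL
            .iter()
            .copied()
            .find(|v| s == v.abbr_name() || s == v.long_name())
            .ok_or(())
    }
}

fn main() {
    let value: SketchValue = "HL".parse().unwrap();
    println!("{:?}: {} / {}", value, value.abbr_name(), value.long_name());
}
```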

impl TotalCharProperty for WordBreak

The property value for the character.

impl Default for WordBreak

Returns the "default value" for a type.

Auto Trait Implementations

impl Send for WordBreak

impl Sync for WordBreak