Module word

Structs

ShyBreak 🔒
Result of splitting a Word at one of its SHY break offsets. prefix.text already includes a trailing U+002D HYPHEN-MINUS, and its width_pt is the post-shape advance sum (including the hyphen). suffix.text carries the remaining bytes, with shy_break_offsets rebased to the suffix's local indexing and boundary offsets (0 / len) dropped.
Word 🔒

Enums

WordItem 🔒
Inline item emitted by collect_words. The greedy line-breaker (and, later, the Knuth-Plass breaker) walks the stream and emits page geometry; HardBreak is a sentinel that forces a flush of the in-progress line without contributing any glyphs.
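The way a greedy breaker might consume this stream can be sketched as below. This is only an illustration under stated assumptions: the Word variant name, the two-field Word struct, the greedy_lines function and its space_pt parameter are all hypothetical, and the sketch returns lines of text rather than the page geometry the real breaker emits. Only HardBreak and its flush behaviour come from the description above.

```rust
/// Hypothetical, simplified stand-in for the module's `Word`.
#[derive(Debug, Clone)]
struct Word {
    text: String,
    width_pt: f32,
}

/// Assumed shape of the enum: only `HardBreak` is named in the docs.
#[derive(Debug, Clone)]
enum WordItem {
    Word(Word),
    HardBreak,
}

/// Greedy first-fit breaking: keep adding words until the next one would
/// overflow `max_width_pt`, then start a new line.
fn greedy_lines(items: &[WordItem], max_width_pt: f32, space_pt: f32) -> Vec<Vec<String>> {
    let mut lines = Vec::new();
    let mut current: Vec<String> = Vec::new();
    let mut used = 0.0_f32;
    for item in items {
        match item {
            WordItem::HardBreak => {
                // Sentinel: flush the in-progress line, contribute no glyphs.
                // (Assumption: an empty in-progress line is flushed as a blank line.)
                lines.push(std::mem::take(&mut current));
                used = 0.0;
            }
            WordItem::Word(w) => {
                let needed = if current.is_empty() {
                    w.width_pt
                } else {
                    used + space_pt + w.width_pt
                };
                if !current.is_empty() && needed > max_width_pt {
                    // The word does not fit: close the line and start a new one.
                    lines.push(std::mem::take(&mut current));
                    used = w.width_pt;
                } else {
                    used = needed;
                }
                current.push(w.text.clone());
            }
        }
    }
    if !current.is_empty() {
        lines.push(current);
    }
    lines
}
```

A word wider than the whole line is still placed alone on its own line here; a real breaker would presumably try a soft-hyphen split first.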

Functions

glyphs_advance_pt 🔒
split_soft_hyphens 🔒
Strip U+00AD (soft hyphen) codepoints from text and return the stripped string plus the byte offsets in the stripped output where each SHY originally sat. The offsets mark the codepoint boundary after the preceding cluster: a break taken at offset o leaves bytes [0..o) on the previous line and [o..) on the next.
try_shy_break 🔒
Try to break word at the latest SHY offset whose prefix-plus-visible-hyphen fits in max_prefix_width. Returns None if no valid offset fits. Offsets equal to 0 or word.text.len() (leading / trailing SHY) are ignored, matching the rule that a break must produce a non-empty visible prefix and a non-empty suffix. Consecutive duplicate offsets (e.g. a\u{AD}\u{AD}b → [1, 1]) are deduped on the fly.
word_clusters 🔒
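The soft-hyphen handling described for split_soft_hyphens and try_shy_break can be sketched as follows. These are not the module's actual implementations: the items are private, so the signatures, the three-field Word struct, and the measure closure (a stand-in for real shaping that returns a string's advance in points) are assumptions made for illustration. The behaviour follows the descriptions above: offsets are byte positions in the stripped text, the latest fitting offset wins, boundary and duplicate offsets are skipped, and the suffix offsets are rebased.

```rust
/// Hypothetical, simplified stand-in for the module's `Word`.
#[derive(Debug, Clone, PartialEq)]
struct Word {
    text: String,
    width_pt: f32,
    shy_break_offsets: Vec<usize>,
}

/// Hypothetical stand-in for `ShyBreak`.
#[derive(Debug, Clone, PartialEq)]
struct ShyBreak {
    prefix: Word,
    suffix: Word,
}

/// Strip U+00AD and record, as byte offsets into the stripped output,
/// where each soft hyphen originally sat.
fn split_soft_hyphens(text: &str) -> (String, Vec<usize>) {
    let mut stripped = String::with_capacity(text.len());
    let mut offsets = Vec::new();
    for ch in text.chars() {
        if ch == '\u{AD}' {
            offsets.push(stripped.len());
        } else {
            stripped.push(ch);
        }
    }
    (stripped, offsets)
}

/// Break at the latest offset whose prefix plus a visible hyphen fits.
/// `measure` is an assumed stand-in for shaping.
fn try_shy_break(
    word: &Word,
    max_prefix_width: f32,
    measure: impl Fn(&str) -> f32,
) -> Option<ShyBreak> {
    let len = word.text.len();
    let mut previous: Option<usize> = None;
    // Walk from the latest offset to the earliest so the first fit wins.
    for &offset in word.shy_break_offsets.iter().rev() {
        // Skip consecutive duplicates and leading / trailing offsets.
        if previous == Some(offset) || offset == 0 || offset >= len {
            continue;
        }
        previous = Some(offset);
        // Defensive guard: never slice inside a multi-byte codepoint.
        if !word.text.is_char_boundary(offset) {
            continue;
        }
        let prefix_text = format!("{}-", &word.text[..offset]);
        let prefix_width = measure(&prefix_text);
        if prefix_width > max_prefix_width {
            continue;
        }
        let suffix_text = &word.text[offset..];
        // Rebase later offsets to the suffix; 0 and len fall away here.
        let suffix_offsets: Vec<usize> = word
            .shy_break_offsets
            .iter()
            .filter(|&&o| o > offset && o < len)
            .map(|&o| o - offset)
            .collect();
        return Some(ShyBreak {
            prefix: Word {
                text: prefix_text,
                width_pt: prefix_width,
                shy_break_offsets: Vec::new(),
            },
            suffix: Word {
                text: suffix_text.to_string(),
                width_pt: measure(suffix_text),
                shy_break_offsets: suffix_offsets,
            },
        });
    }
    None
}
```

With a toy measure of 5 pt per character, "hy\u{AD}phen\u{AD}ation" strips to "hyphenation" with offsets [2, 6]. A 40 pt budget breaks it as "hyphen-" / "ation", a 20 pt budget falls back to "hy-" / "phenation" (suffix offset 4), and a 10 pt budget yields None.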