Appearance
Unicode
Unicode text handling covers East Asian ambiguous character width, wide-character wrapping at line boundaries, and tab stop behavior with mixed-width text. TUI applications depend on it to keep cursor alignment and text layout correct across scripts and character sets. Width handling is arguably the hardest problem in terminal emulation: the wcwidth() function predates emoji entirely, different terminals build their width tables from different Unicode versions, and there is no standard for grapheme cluster width.
Terminal Unicode handling has three hard problems. Width calculation: is a character 1 or 2 columns wide? UAX #11 provides the East_Asian_Width property, but ambiguous-width characters (such as many Greek and Cyrillic letters) are measured as 1 column by some terminals and 2 by others. Grapheme clustering: a flag emoji like U+1F1F3 U+1F1F4 (two regional indicators) should display as one 2-column glyph, not as two separate characters. Variation selectors: U+FE0E requests text presentation (1 column) and U+FE0F requests emoji presentation (2 columns), so the same code point can have different widths depending on the code point that follows it.
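The width calculation and variation selector rules above can be sketched with Python's standard unicodedata module. This is a minimal illustration, not how any particular terminal does it: the function names and the ambiguous_wide flag are hypothetical, results depend on the Unicode version bundled with the Python build, and the sketch lets a variation selector resize any preceding character, whereas a real implementation would consult the list of valid emoji variation sequences.

```python
import unicodedata

VS15 = "\ufe0e"  # variation selector requesting text presentation
VS16 = "\ufe0f"  # variation selector requesting emoji presentation


def codepoint_width(ch: str, ambiguous_wide: bool = False) -> int:
    """Naive width of one code point, based on East_Asian_Width (UAX #11)."""
    # Combining marks and format characters (including ZWJ) take no column.
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2
    if eaw == "A":
        # Ambiguous: the answer is a policy choice, which is why terminals differ.
        return 2 if ambiguous_wide else 1
    return 1


def string_width(text: str, ambiguous_wide: bool = False) -> int:
    """Sum per-code-point widths, letting VS15/VS16 resize the previous character."""
    total = 0
    prev = 0  # width of the most recent character that occupied columns
    for ch in text:
        if ch == VS16 and prev == 1:
            total += 1  # narrow base promoted to 2 columns
            prev = 2
        elif ch == VS15 and prev == 2:
            total -= 1  # wide base demoted to 1 column
            prev = 1
        else:
            w = codepoint_width(ch, ambiguous_wide)
            total += w
            if w:
                prev = w
    return total
```

Calling string_width("\u03b1") with and without ambiguous_wide=True shows how a single policy flag changes the measured width of a Greek letter, which is the disagreement between terminals in miniature.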
The most treacherous case is the zero-width joiner (ZWJ, U+200D). A ZWJ sequence like woman + ZWJ + laptop should render as a single emoji glyph if the terminal's font supports it; if it doesn't, the component emoji appear side by side, with the invisible joiner between them. The terminal must either trust the font's ligature tables or maintain its own ZWJ sequence database, and that database changes with every Unicode release.
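To make the two outcomes concrete, the sketch below groups code points around each ZWJ and measures the resulting cluster both ways. It is a deliberately crude stand-in for real grapheme segmentation (UAX #29) and for a real ZWJ sequence database; all function names are hypothetical, and the 2-column width for a joined sequence is an assumption about how a supporting terminal would render it.

```python
import unicodedata

ZWJ = "\u200d"


def split_on_zwj(text: str) -> list[str]:
    """Crudely group code points so each ZWJ glues its two neighbours together."""
    clusters: list[str] = []
    join_next = False
    for ch in text:
        if ch == ZWJ and clusters:
            clusters[-1] += ch      # attach the joiner to the previous cluster
            join_next = True
        elif join_next:
            clusters[-1] += ch      # attach the character after the joiner
            join_next = False
        else:
            clusters.append(ch)
    return clusters


def simple_width(ch: str) -> int:
    """Per-code-point width: 0 for ZWJ and marks, 2 for wide, otherwise 1."""
    if ch == ZWJ or unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def width_joined(cluster: str) -> int:
    """Assumed width when the sequence renders as one emoji glyph."""
    if ZWJ in cluster:
        return 2
    return sum(simple_width(ch) for ch in cluster)


def width_unjoined(cluster: str) -> int:
    """Width when the components render separately (the fallback case)."""
    return sum(simple_width(ch) for ch in cluster)


# woman + ZWJ + laptop, the sequence discussed above
WOMAN_TECHNOLOGIST = "\U0001F469\u200d\U0001F4BB"
```

For WOMAN_TECHNOLOGIST the two measurements differ by two columns, so an application and a terminal that disagree about ZWJ support will disagree about where the cursor is after printing it.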
For developers, the practical test is simple: does the cursor end up in the right place after printing a string? If a terminal calculates "hello" + flag_emoji as 9 columns (counting each regional indicator as 2) but the font renders the flag as a single 2-column glyph, for 7 columns in total, every subsequent character on that line will be offset by 2. This breaks table alignment, progress bars, box drawing, and any TUI that relies on precise cursor positioning. The wcwidth() function and its many implementations are the battleground where these disagreements play out.
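The offset can be reproduced without a terminal by comparing two width models on the same string. Both models below are hypothetical simplifications written for this example (one counts every regional indicator as 2 columns, the other treats a pair as one 2-column flag and everything else as 1 column); neither is the algorithm of a specific terminal or wcwidth() implementation.

```python
FLAG = "\U0001F1F3\U0001F1F4"  # two regional indicators forming one flag


def is_regional_indicator(ch: str) -> bool:
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def per_codepoint_width(text: str) -> int:
    """Hypothetical model A: each regional indicator counts as 2 columns."""
    return sum(2 if is_regional_indicator(ch) else 1 for ch in text)


def paired_flag_width(text: str) -> int:
    """Hypothetical model B: a pair of regional indicators is one 2-column flag."""
    width = 0
    i = 0
    while i < len(text):
        pair = (
            is_regional_indicator(text[i])
            and i + 1 < len(text)
            and is_regional_indicator(text[i + 1])
        )
        if pair:
            width += 2
            i += 2
        else:
            width += 1  # everything else, including a lone indicator, is 1 here
            i += 1
    return width


def pad_cell(text: str, columns: int, width_of) -> str:
    """Pad with spaces so the text fills `columns` cells according to width_of."""
    return text + " " * max(0, columns - width_of(text))


def drift(text: str, assumed_width_of, rendered_width_of) -> int:
    """Columns by which the real cursor differs from the assumed position."""
    return rendered_width_of(text) - assumed_width_of(text)
```

Padding "hello" + FLAG to a 12-column cell with model A adds only 3 spaces; if the glyphs actually occupy the widths of model B, the cell ends 2 columns short and every later column in that row shifts left.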
Analysis, 2026-04-06
Terminal Applications
Headless Backends
Parser correctness tested via Termless. A ✓ means the parser accepts the sequence, not that it renders correctly.