fast.unicode

Functions for working with text encoded in a Unicode Transformation Format (UTF).

Grapheme clusters: A grapheme cluster is, roughly speaking, what a user perceives as the smallest unit of a writing system. The number of grapheme clusters in a text can be thought of as the number of caret positions in a text editor. In particular, at the grapheme cluster level the difference between normalization forms (NFC, NFD) becomes transparent: a composed and a decomposed spelling of the same character each count as one cluster. The default definition used here is independent of the user's locale.
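
The following is a minimal sketch of the normalization point above. It deliberately uses Phobos (`std.uni.byGrapheme`, `std.uni.normalize`) rather than this module, and the helper name `graphemeCount` is hypothetical. It illustrates the concept only and makes no claim about how fast.unicode is implemented.

```d
import std.range : walkLength;
import std.uni : byGrapheme, normalize, NFC, NFD;

// Hypothetical helper: counts grapheme clusters using Phobos.
size_t graphemeCount(string s)
{
    return s.byGrapheme.walkLength;
}

void main()
{
    // "é" written as a base letter followed by a combining acute accent (U+0301).
    string source = "e\u0301";
    string nfc = normalize!NFC(source); // composed form: U+00E9
    string nfd = normalize!NFD(source); // decomposed form: U+0065 U+0301

    // The UTF-8 byte lengths differ between the two forms ...
    assert(nfc.length == 2);
    assert(nfd.length == 3);

    // ... but both are a single grapheme cluster.
    assert(graphemeCount(nfc) == 1);
    assert(graphemeCount(nfd) == 1);
}
```

Counting bytes or code points would give different answers for the two spellings, whereas the grapheme cluster count is the same for both.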

Members

Aliases

GeneralCategory
alias GeneralCategory = DerivedGeneralCategory.Enum

Enumeration for the Unicode "General Category" used to roughly classify codepoints into letters, punctuation etc.

Functions

countGraphemes
size_t countGraphemes(const(char)[] str)

Counts the number of grapheme clusters (the user-perceived character count) in a UTF-8 string.

Properties

generalCategory
CodePointInfo!GeneralCategory generalCategory [@property getter]

Retrieves the "General Category" of the first code point in a UTF-8 string. For broken UTF-8, the property is set to GeneralCategory.__ (0).

Structs

CodePointInfo
struct CodePointInfo(Enum)

A customizable structure providing information on a code point. It consists of a Unicode property in the form of an enum (e.g. GeneralCategory) and the length in bytes of the code point's UTF-8 encoding.
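
To make the shape of such a structure concrete, here is a rough sketch built only on Phobos. Everything in it is an assumption made for illustration: the names `CodePointInfoSketch`, `SimpleCategory` and `classifyFirst`, the field names `property` and `length`, and the choice to report a length of 0 for broken input. The real `CodePointInfo` and `GeneralCategory` may differ in all of these respects.

```d
import std.uni : isAlpha, isNumber;
import std.utf : decode, UTFException;

// Hypothetical stand-in for a Unicode property enum; 0 marks broken input.
enum SimpleCategory : ubyte { invalid = 0, letter, number, other }

// Hypothetical analogue of a struct pairing a property with a UTF-8 length.
struct CodePointInfoSketch(Enum)
{
    Enum property;  // assumed field name
    size_t length;  // bytes occupied by the code point in UTF-8
}

// Classifies the first code point of a UTF-8 string.
CodePointInfoSketch!SimpleCategory classifyFirst(const(char)[] str)
{
    alias Info = CodePointInfoSketch!SimpleCategory;
    if (str.length == 0)
        return Info(SimpleCategory.invalid, 0);

    size_t index = 0;
    dchar c;
    try
    {
        // decode advances index past the first code point.
        c = decode(str, index);
    }
    catch (UTFException e)
    {
        // Assumption: broken input is reported with a length of 0.
        return Info(SimpleCategory.invalid, 0);
    }

    if (isAlpha(c))
        return Info(SimpleCategory.letter, index);
    if (isNumber(c))
        return Info(SimpleCategory.number, index);
    return Info(SimpleCategory.other, index);
}

void main()
{
    auto info = classifyFirst("\u00E9!");
    assert(info.property == SimpleCategory.letter);
    assert(info.length == 2); // U+00E9 takes two bytes in UTF-8

    assert(classifyFirst("\xFF").property == SimpleCategory.invalid);
}
```

Returning the byte length together with the property lets a caller advance through a string one code point at a time without decoding it a second time.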

Meta