class ICU::Normalizer

Overview

Normalization

Normalizes strings to the standard Unicode normalization forms (NFC, NFD, NFKC, NFKD) and to NFKC_Casefold.

Usage

str = "A\u0300" # "À" written as "A" followed by U+0300 COMBINING GRAVE ACCENT
str.bytes # => [65, 204, 128]
norm = ICU::NFCNormalizer.new
norm.normalized?(str)       # => false
norm.normalized_quick?(str) # => Maybe
norm.normalize(str).bytes   # => [195, 128]

See also

Direct Known Subclasses

Defined in:

icu/normalizer.cr

Constant Summary

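TYPES = {
  NFC:    {name: "nfc", mode: Mode::Compose},
  NFD:    {name: "nfc", mode: Mode::Decompose},
  NFKC:   {name: "nfkc", mode: Mode::Compose},
  NFKD:   {name: "nfkc", mode: Mode::Decompose},
  NFKCCF: {name: "nfkc_cf", mode: Mode::Compose},
}

NFD and NFKD reuse the "nfc" and "nfkc" data on purpose: ICU selects decomposition through the mode, not through a separate data name.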

Constructors

Instance Method Summary

Constructor Detail

def self.new(type : Symbol)

Creates a new normalizer of the given type, which must be one of the keys of TYPES: NFC, NFD, NFKC, NFKD or NFKCCF.
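The generic constructor can be sketched as follows. This is a minimal example under two assumptions that this page does not confirm: that the shard is loaded with `require "icu"`, and that the type is passed as a symbol spelled like the TYPES key (here `:NFD`).

```crystal
require "icu" # assumed entry point of the shard

# Assumption: the symbol matches one of the TYPES keys
# (:NFC, :NFD, :NFKC, :NFKD, :NFKCCF).
nfd = ICU::Normalizer.new(:NFD)

composed = "\u00C0"                       # precomposed "À" (U+00C0)
puts composed.bytes                       # [195, 128]
puts nfd.normalize(composed).bytes        # [65, 204, 128]
```

The subclass form used elsewhere on this page (`ICU::NFCNormalizer.new`) is presumably shorthand for the same call with a fixed type.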

Instance Method Detail

def decomposition(chr : Char) : String

Gets the decomposition mapping of a character.

chr = '\u00C0' # precomposed "À"
ICU::NFCNormalizer.new.decomposition(chr).bytes # => [65, 204, 128]

def finalize

def inert?(chr : Char) : Bool

Tests if the character is normalization-inert.

norm = ICU::NFCNormalizer.new
norm.inert?('À') # => false
norm.inert?('A') # => true

def normalize(text : String) : String

Normalizes the given text and returns the result.

str = "A\u0300"                             # "À" written as "A" + U+0300
str.bytes                                   # => [65, 204, 128]
ICU::NFCNormalizer.new.normalize(str).bytes # => [195, 128]

def normalized?(text : String) : Bool

Tests if the string is normalized.

ICU::NFCNormalizer.new.normalized?("A\u0300") # => false

(see also: #normalized_quick?)


def normalized_quick?(text : String) : CheckResult

Tests if the string is normalized. This is faster but less accurate than #normalized?: the result may be Maybe when the quick check is inconclusive.

ICU::NFCNormalizer.new.normalized_quick?("A\u0300") # => Maybe

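One way to combine the two checks is to run the quick test first and fall back to #normalized? only when it is inconclusive. The sketch below assumes the shard is loaded with `require "icu"` and that CheckResult is an enum whose inconclusive member prints as "Maybe", as the sample output suggests. It compares by name because the full enum path is not shown on this page.

```crystal
require "icu" # assumed entry point of the shard

norm = ICU::NFCNormalizer.new
text = "A\u0300" # "A" followed by U+0300 COMBINING GRAVE ACCENT

quick = norm.normalized_quick?(text)

# Assumption: the enum member's name is "Maybe"; comparing the string form
# avoids guessing the enum's fully qualified path.
if quick.to_s == "Maybe"
  # The quick check could not decide, so run the exact (slower) test.
  puts "exact check: #{norm.normalized?(text)}"
else
  puts "quick check was conclusive: #{quick}"
end
```

For input that is mostly plain ASCII the quick check is usually conclusive, so the exact test runs only for the few strings that contain combining marks.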