class ICU::Normalizer
- ICU::Normalizer
- Reference
- Object
Overview
Normalization
Converts text to, and checks text against, the standard Unicode normalization forms (NFC, NFD, NFKC, NFKD) and the NFKC case-folded form (NFKCCF).
Usage
str = "A\u0300" # "À" in decomposed form: "A" followed by a combining grave accent (U+0300)
str.bytes # => [65, 204, 128]
norm = ICU::NFCNormalizer.new
norm.normalized?(str) # => false
norm.normalized_quick?(str) # => Maybe
norm.normalize(str).bytes # => [195, 128]
See also
Direct Known Subclasses
Defined in:
icu/normalizer.cr
Constant Summary
-
TYPES =
{NFC: {name: "nfc", mode: Mode::Compose}, NFD: {name: "nfc", mode: Mode::Decompose}, NFKC: {name: "nfkc", mode: Mode::Compose}, NFKD: {name: "nfkc", mode: Mode::Decompose}, NFKCCF: {name: "nfkc_cf", mode: Mode::Compose}}
Constructors
-
.new(type : Symbol)
Create a new normalizer that will use the specified mode (NFC, NFD, NFKC, NFKD, NFKCCF)
Instance Method Summary
-
#decomposition(chr : Char) : String
Gets the decomposition mapping of a character
- #finalize
-
#inert?(chr : Char) : Bool
Tests if the character is normalization-inert
-
#normalize(text : String) : String
Normalize some text
-
#normalized?(text : String) : Bool
Tests if the string is normalized
-
#normalized_quick?(text : String) : CheckResult
Tests if the string is normalized (faster but less accurate than #normalized?)
Constructor Detail
def self.new(type : Symbol)
Create a new normalizer that will use the specified mode (NFC, NFD, NFKC, NFKD, NFKCCF)
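As a rough sketch of the generic constructor, the snippet below builds an NFD normalizer from a symbol. It assumes the shard is loaded with require "icu" and that the accepted symbols are the keys of TYPES; neither is stated explicitly on this page.

```crystal
require "icu" # assumption: the shard's require path

# Symbols are assumed to match the keys of TYPES (:NFC, :NFD, :NFKC, :NFKD, :NFKCCF).
nfd = ICU::Normalizer.new(:NFD)

precomposed = "\u00C0" # "À" as a single code point (U+00C0)
precomposed.bytes                # => [195, 128]
nfd.normalize(precomposed).bytes # => [65, 204, 128]
```

The result is the same byte sequence that the Usage example starts from, that is, "A" followed by U+0300.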
Instance Method Detail
def decomposition(chr : Char) : String
Gets the decomposition mapping of a character
ICU::NFCNormalizer.new.decomposition('\u00C0').bytes # => [65, 204, 128] ('\u00C0' is the precomposed "À")
def inert?(chr : Char) : Bool
Tests if the character is normalization-inert
norm = ICU::NFCNormalizer.new
norm.inert?('\u00C0') # => false ('\u00C0' is the precomposed "À")
norm.inert?('A') # => true
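def normalize(text : String) : String
Normalizes the given text
str = "A\u0300" # "À" in decomposed form: "A" followed by a combining grave accent (U+0300)
str.bytes # => [65, 204, 128]
ICU::NFCNormalizer.new.normalize(str).bytes # => [195, 128]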
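One common reason to normalize is to compare strings that look identical but are encoded differently. The following is a minimal sketch of that idea, assuming the shard is loaded with require "icu"; the variable names are illustrative only.

```crystal
require "icu" # assumption: the shard's require path

norm = ICU::NFCNormalizer.new

composed   = "\u00C0"  # "À" as one code point
decomposed = "A\u0300" # "A" followed by a combining grave accent

# String equality compares the underlying bytes, so the raw strings differ.
composed == decomposed                                 # => false
# After normalizing both sides to the same form they compare equal.
norm.normalize(composed) == norm.normalize(decomposed) # => true
```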
def normalized?(text : String) : Bool
Tests if the string is normalized
ICU::NFCNormalizer.new.normalized?("A\u0300") # => false
(see also: #normalized_quick?)
def normalized_quick?(text : String) : CheckResult
Tests if the string is normalized (faster but less accurate than #normalized?)
ICU::NFCNormalizer.new.normalized_quick?("A\u0300") # => Maybe
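Because the quick check can be inconclusive, it can be paired with #normalized? as a fallback. The sketch below is one possible way to do that. It assumes CheckResult is an enum with a Yes member alongside the Maybe value shown above (mirroring ICU's yes/no/maybe check result), and nfc? is a hypothetical helper, not part of the shard.

```crystal
require "icu" # assumption: the shard's require path

# Hypothetical helper: true if text is already normalized for the given normalizer.
def nfc?(norm : ICU::Normalizer, text : String) : Bool
  result = norm.normalized_quick?(text)
  # Maybe is inconclusive, so only then pay for the full check.
  # `yes?` assumes the enum also defines a Yes member.
  result.maybe? ? norm.normalized?(text) : result.yes?
end

nfc?(ICU::NFCNormalizer.new, "A\u0300") # => false
```

For the decomposed input the quick check returns Maybe, so the helper falls through to #normalized?, which reports false as in the example above.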