class ICU::USet
- ICU::USet
- Reference
- Object
Overview
Sets of Unicode Code Points and Strings
ICU::USet is a mutable set of Unicode code points (characters) and
multi-character strings. It is used internally by ICU APIs such as
ICU::Collator and ICU::Transliterator, and is exposed here as an
Enumerable(Char) so that callers can work with ordinary Crystal types.
Sets can be constructed from a character range, from a UnicodeSet pattern
string (e.g. "[\\p{L}]", "[a-zA-Z0-9]"), or built up programmatically
via #add / #add_range.
For simple inspection you can call #to_set to obtain a plain Set(Char),
or iterate directly with #each.
Usage
s = ICU::USet.new('a', 'z') # characters a–z
s.includes?('m') # => true
s.size # => 26
s.to_set # => Set{'a', 'b', ..., 'z'}
vowels = ICU::USet.new("[aeiouAEIOU]")
vowels.includes?('e') # => true
Included Modules
- Enumerable(Char)
Defined in:
icu/uset.cr
Constructors
-
.new(from : Char, to : Char)
Creates a set containing all characters in the range from..to (inclusive).
-
.new(pattern : String)
Creates a set from a UnicodeSet pattern.
-
.new
Creates an empty set.
Instance Method Summary
-
#add(char : Char) : self
Adds char to this set.
-
#add(string : String) : self
Adds a multi-character string element to this set.
-
#add_all(other : USet) : self
Adds all members of other to this set (union in place).
-
#add_range(from : Char, to : Char) : self
Adds all characters in the range from..to (inclusive) to this set.
-
#clear : self
Removes all members, leaving an empty set.
-
#complement : self
Complements this set: every character previously absent is added, and every character previously present is removed.
-
#disjoint?(other : USet) : Bool
Returns true if this set and other have no characters in common.
-
#each(& : Char -> ) : Nil
Yields each character in this set (string elements are skipped).
-
#empty? : Bool
Returns true if the set contains no characters or strings.
-
#finalize
-
#includes?(char : Char) : Bool
Returns true if char is a member of this set.
-
#includes?(string : String) : Bool
Returns true if string (as a multi-character string element) is in this set.
-
#includes_all_of?(string : String) : Bool
Returns true if every character of string is individually in this set.
-
#intersects?(other : USet) : Bool
Returns true if this set and other have at least one character in common.
-
#remove(char : Char) : self
Removes char from this set.
-
#remove_all(other : USet) : self
Removes all members of other from this set (set difference in place).
-
#remove_range(from : Char, to : Char) : self
Removes all characters in the range from..to from this set.
-
#retain_all(other : USet) : self
Retains only those members of this set that are also in other (intersection in place).
-
#size : Int32
Returns the number of characters (and strings) in this set.
-
#superset?(other : USet) : Bool
Returns true if every member of other is also in this set.
-
#to_pattern(escape_unprintable : Bool = false) : String
Returns the UnicodeSet pattern string representing this set.
-
#to_set : Set(Char)
Returns a Crystal Set(Char) with all characters from this set.
-
#to_unsafe : LibICU::USet
Constructor Detail
Creates a set containing all characters in the range from..to (inclusive).
ICU::USet.new('a', 'z').size # => 26
Creates a set from a UnicodeSet pattern.
Raises ICU::Error if the pattern is invalid.
ICU::USet.new("[\\p{Lu}]") # all uppercase letters
ICU::USet.new("[aeiou]") # vowels
Instance Method Detail
Adds char to this set.
s = ICU::USet.new
s.add('x')
s.includes?('x') # => true
Adds all members of other to this set (union in place).
a = ICU::USet.new('a', 'c')
b = ICU::USet.new('d', 'f')
a.add_all(b).size # => 6
Adds all characters in the range from..to (inclusive) to this set.
s = ICU::USet.new
s.add_range('a', 'f')
s.size # => 6
Complements this set: every character previously absent is added, and every character previously present is removed.
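A minimal sketch of the in-place complement described above. The require path is an assumption (this page only names icu/uset.cr); the expected values follow from the documented semantics rather than from a recorded run.

```crystal
require "icu" # assumed entry point for the shard

s = ICU::USet.new('a', 'z')
s.complement           # mutates s and returns self

s.includes?('m')       # => false (was present, now removed)
s.includes?('A')       # => true  (was absent, now added)
```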
Returns true if this set and other have no characters in common.
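As an illustration, two non-overlapping ranges should be disjoint, while ranges sharing a character should not. This is a sketch assuming the shard is loaded with require "icu".

```crystal
require "icu" # assumed entry point for the shard

a = ICU::USet.new('a', 'c')
b = ICU::USet.new('x', 'z')
c = ICU::USet.new('c', 'e') # shares 'c' with a

a.disjoint?(b) # => true
a.disjoint?(c) # => false
```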
Yields each character in this set (string elements are skipped).
ICU::USet.new('a', 'c').each { |c| print c } # => abc
Returns true if the set contains no characters or strings.
ICU::USet.new.empty? # => true
Returns true if char is a member of this set.
ICU::USet.new('a', 'z').includes?('m') # => true
ICU::USet.new('a', 'z').includes?('M') # => false
Returns true if string (as a multi-character string element) is in
this set.
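The sketch below shows how a string element differs from its individual characters, using only the #add(string) and #includes? overloads listed on this page. The require path is an assumption.

```crystal
require "icu" # assumed entry point for the shard

s = ICU::USet.new
s.add("ch")           # adds the two-character string as a single element

s.includes?("ch")     # => true
s.includes?('c')      # => false (the characters themselves were not added)
s.to_set.empty?       # => true  (string elements are omitted from #to_set)
```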
Returns true if every character of string is individually in this set.
ICU::USet.new('a', 'z').includes_all_of?("hello") # => true
ICU::USet.new('a', 'z').includes_all_of?("Hello") # => false
Returns true if this set and other have at least one character in common.
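A short sketch of the overlap check, assuming the shard is loaded with require "icu". The two ranges share 'd' through 'f'.

```crystal
require "icu" # assumed entry point for the shard

a = ICU::USet.new('a', 'f')
b = ICU::USet.new('d', 'z')
c = ICU::USet.new('x', 'z')

a.intersects?(b) # => true
a.intersects?(c) # => false
```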
Removes all members of other from this set (set difference in place).
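Mirroring the #retain_all example, a sketch of in-place set difference might look like this (the require path is an assumption):

```crystal
require "icu" # assumed entry point for the shard

a = ICU::USet.new('a', 'f')
b = ICU::USet.new('d', 'z')

# Everything in b is dropped from a; b itself is left unchanged.
a.remove_all(b).to_set # => Set{'a', 'b', 'c'}
```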
Removes all characters in the range from..to from this set.
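A hedged sketch of removing a sub-range. The checks deliberately avoid the range endpoints, since this page does not state whether the range is inclusive for this method. The require path is an assumption.

```crystal
require "icu" # assumed entry point for the shard

s = ICU::USet.new('a', 'z')
s.remove_range('b', 'y')

s.includes?('m') # => false (inside the removed range)
s.includes?('a') # => true  (outside the removed range)
```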
Retains only those members of this set that are also in other (intersection in place).
a = ICU::USet.new('a', 'f')
b = ICU::USet.new('d', 'z')
a.retain_all(b).to_set # => Set{'d', 'e', 'f'}
Returns the number of characters (and strings) in this set.
ICU::USet.new('a', 'z').size # => 26
Returns true if every member of other is also in this set.
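For illustration, a larger range should be a superset of a range it fully contains, but not the other way around. This sketch assumes the shard is loaded with require "icu".

```crystal
require "icu" # assumed entry point for the shard

letters = ICU::USet.new('a', 'z')
subset  = ICU::USet.new('d', 'f')

letters.superset?(subset) # => true
subset.superset?(letters) # => false
```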
Returns the UnicodeSet pattern string representing this set.
ICU::USet.new('a', 'c').to_pattern # => "[a-c]"
Returns a Crystal Set(Char) with all characters from this set.
Multi-character string elements are omitted.
ICU::USet.new('a', 'c').to_set # => Set{'a', 'b', 'c'}