How to print all UTF-8 glyphs using Swift 3?

Writing each glyph out by hand like this would be too slow and time-consuming:

let G = "\u{0047}" // "G"

Is there a shorter, more elegant way of doing so?

ielyamani

2 Answers

You can use the UnicodeScalar type to create a string from a numeric value, and iterate over the range of values you are interested in. According to the Swift String documentation, Unicode scalars are defined for the ranges U+0000 to U+D7FF and U+E000 to U+10FFFF. See: https://developer.apple.com/library/content/documentation/Swift/Conceptual/Swift_Programming_Language/StringsAndCharacters.html

NOTE

A Unicode scalar is any Unicode code point in the range U+0000 to U+D7FF inclusive or U+E000 to U+10FFFF inclusive. Unicode scalars do not include the Unicode surrogate pair code points, which are the code points in the range U+D800 to U+DFFF inclusive.

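// Hex literals avoid parsing strings and force-unwrapping the result.
let range1From = 0x0000
let range1To = 0xD7FF

print("Code points from U+0000 to U+D7FF")

// stride(from:through:by:) includes the upper bound; stride(from:to:by:) stops one short of it.
for code in stride(from: range1From, through: range1To, by: 1) {
    // UnicodeScalar(_:) is failable, so unwrap before printing.
    if let scalar = UnicodeScalar(code) {
        let string = "\(scalar)"
        print(string)
    }
}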

print("Code points from U+E000 to U+10FFFF")

let range2From = 0xE000
let range2To = 0x10FFFF

// stride(from:through:by:) includes the upper bound, so U+10FFFF is visited too.
for code in stride(from: range2From, through: range2To, by: 1) {
    if let scalar = UnicodeScalar(code) {
        let string = "\(scalar)"
        print(string)
    }
}

Note that most of the code points are unassigned and some may not be displayable on your console. To get a quick overview, you may want to increase the by value of the stride in the second loop:

for code in stride(from: range2From, through: range2To, by: 100) {

This displays the full range of Unicode code points available. Depending on your needs, you may only be interested in the U+0000 to U+D7FF range (or even a range within it); in that case, just change the values of the range1From and range1To constants to the bounds of the range you are interested in.
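As a minimal sketch of the "pick your own range" idea, the loop can be wrapped in a small function. The name glyphs(from:through:) is hypothetical and not part of the original answer; it relies only on the failable UnicodeScalar(_:) initializer already used above, which returns nil for surrogate code points and for values above U+10FFFF.

```swift
// Hypothetical helper (an illustration, not from the original answer):
// collects every valid Unicode scalar in lower...upper into one string.
func glyphs(from lower: Int, through upper: Int) -> String {
    var result = ""
    for code in stride(from: lower, through: upper, by: 1) {
        // Surrogates (U+D800...U+DFFF) and out-of-range values yield nil
        // and are skipped.
        if let scalar = UnicodeScalar(code) {
            result.append(Character(scalar))
        }
    }
    return result
}

print(glyphs(from: 0x41, through: 0x5A))    // ABCDEFGHIJKLMNOPQRSTUVWXYZ
print(glyphs(from: 0x3B1, through: 0x3C9))  // Greek lowercase letters
```

Returning a string instead of printing inside the loop keeps console output in one place, but for very large ranges the result can become big, so printing in chunks may be preferable.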

xpereta
  • This prints `Optional("\u{01}") Optional("\u{02}") Optional("\u{03}") ... ` which is not the desired result – ielyamani Oct 03 '16 at 11:29
  • I fixed the unwrapping, my console was apparently unwrapping it automatically. I also added the display of the full range of Unicode code points available. – xpereta Oct 03 '16 at 16:00

Try this:

let n = 1000

for i in 1...n {
    if let scalar = UnicodeScalar(i) {
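        // Character(_:) is the public way to turn a scalar into text;
        // String(stringInterpolationSegment:) is intended for the compiler's use.
        let str = String(Character(scalar))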
        print(str)
    }
}

Unicode currently defines 17 planes of 65,536 code points each (1,114,112 in total), but only about 10% of those are allocated. You can also combine multiple code points to make a single character (more technically, a grapheme cluster). The following is a single character despite using 2 scalars:

let char = "a\u{33c}"
print(char)                   // a̼
print(char.characters.count)  // 1
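To see the two scalars behind that single character, one option is to walk the string's unicodeScalars view. This is a small sketch added for illustration; the variable name cluster is an assumption, not from the answer.

```swift
// The same grapheme cluster as above: "a" followed by U+033C.
let cluster = "a\u{33c}"

// unicodeScalars exposes the individual code points inside the cluster.
for scalar in cluster.unicodeScalars {
    print("U+" + String(scalar.value, radix: 16, uppercase: true))
}
// prints:
// U+61
// U+33C

print(cluster.unicodeScalars.count)  // 2
```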

Unicode is a very strange beast!

Code Different