I tried to print the contents of CharacterSet.decimalDigits with:

print(CharacterSet.decimalDigits)

output: CFCharacterSet Predefined DecimalDigit Set

But my expectation was something like this:

[1, 2, 3, 4 ...]

So my question is: how can I print the contents of CharacterSet.decimalDigits?

Honey
Blazej SLEBODA
  • Possible duplicate of [NSArray from NSCharacterSet](https://stackoverflow.com/q/15741631/1187415) – which has solutions also for Swift and CharacterSet. – Martin R Dec 21 '19 at 14:52

2 Answers

This is not easy. Character sets are not designed to be iterated; they are designed to test whether a character is a member or not. They don't store the characters themselves, and the underlying ranges cannot be accessed.

The only thing you can do is to iterate over all characters and check every one of them against the character set, e.g.:

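import Foundation

let set = CharacterSet.decimalDigits
// Every UInt32 value is tried, although valid scalars end at 0x10FFFF.
let allCharacters = UInt32.min ... UInt32.max

allCharacters
    .lazy
    .compactMap { UnicodeScalar($0) }  // nil for values that are not valid scalars
    .filter { set.contains($0) }       // keep only members of the set
    .map { String($0) }
    .forEach { print($0) }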

However, note that scanning the whole range like this takes significant time, so it shouldn't be used inside a production application.
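
As a cheaper variant of the same brute-force idea, the scan can be limited to the Unicode planes in which the set reports any members at all. The following is only a sketch: `scalars(in:)` is a hypothetical helper name, not a Foundation API, and it assumes `CharacterSet.hasMember(inPlane:)` behaves as documented on your platform.

```swift
import Foundation

// Hypothetical helper (not part of Foundation): collects the members of a
// character set by scanning only the planes that contain at least one member.
func scalars(in set: CharacterSet) -> [Unicode.Scalar] {
    var result: [Unicode.Scalar] = []
    // Unicode has 17 planes (0 through 16), each covering 0x10000 code points.
    for plane in UInt8(0)...16 where set.hasMember(inPlane: plane) {
        let base = UInt32(plane) << 16
        for value in base ..< base + 0x10000 {
            // The initializer returns nil for surrogate code points,
            // which are not valid Unicode scalars.
            if let scalar = Unicode.Scalar(value), set.contains(scalar) {
                result.append(scalar)
            }
        }
    }
    return result
}

let digits = scalars(in: .decimalDigits)
print(digits.map { String($0) })
```

This still tests every code point of each non-empty plane, so it is at most about 1.1 million membership checks rather than 2^32 iterations.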

Sulthan
  • 1
    Wow output ... ۷, ۹, ߀, ߁, ߂, ߃, ߄, ߅, ߆, ߇, ߈, ߉, ०, १, २, ३, ४, ५, ६, ७, ८, ९, ০, ১, ২, ৩, ৪, ৫, ৬, ৭, ৮, ৯, ੦, ੧, ੨, ੩, ੪, ੫, ੬, ੭, ੮, ੯, ૦, ૧, ૨, ૩, ૪, ૫, ૬, ૭, ૮, ૯, ୦, ୧, ୨, ୩, ୪, ୫, ୬, ୭, ୮, ୯, ௦, ௧, ௨, ௩, ௪, ௫, ௬, ௭, ௮, ௯, ౦, ౧, ౨, ౩, ౪, ౫, ౬, ౭, ౮, ౯, ೦, ೧, ೨, ೩, ೪, ೫, ೬, ೭, ೮, ೯, ൦, ൧, ൨, ൩, ൪, ൫, ൬, ൭, ൮, ൯, ෦, ෧, ෨, ෩, ෪, ෫, ෬, ෭, ෮, ෯, ๐, ๑, ๒, ๓, ๔, ๕, ๖, ๗, ๘, ๙, ໐, ໑, ໒, ໓, ໔, ໕, ໖, ໗, ໘, ໙, ༠, ༡, ༢, ༣, ༤, ༥, ༦, ༧, ༨, ༩, ၀, ၁, ၂, ၃, ၄, ၅, ၆, ၇, ၈, ၉, ႐, ႑, ႒, ႓, ႔, ႕, ႖, ႗, ႘, ႙, ០, ១, ២, ៣, ៤, ៥, ៦, ៧, ៨, ៩, ᠐, ᠑, ᠒, ᠓, ᠔, ᠕, ᠖, ᠗, ᠘, ᠙, ᥆, ᥇, ᥈, ᥉, ᱘, ᱙, ꘠, ꘡, ꘢, ꘣, ꘤, ꘥ ... – Blazej SLEBODA Dec 21 '19 at 14:04
  • @Adobels In general, it should be https://www.fileformat.info/info/unicode/category/Nd/list.htm – Sulthan Dec 21 '19 at 14:10
  • 2
    Instead of iterating over 2^32 characters you can check for which planes the set contains characters at all, compare https://stackoverflow.com/a/15742659/1187415. – Martin R Dec 21 '19 at 14:54

I don't think you can do that, at least not directly. If you look at the output of

import Foundation

let data = CharacterSet.decimalDigits.bitmapRepresentation

// Print each byte of the bitmap as two hex digits, one per line.
for byte in data {
    print(String(format: "%02x", byte))
}

you'll see that the set is represented as a bitmap: a bit is set at each code position where a decimal digit sits.
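
To make that concrete, here is a rough sketch that decodes the bitmap back into characters. It assumes the layout described in the NSCharacterSet documentation for `bitmapRepresentation`: the first 8192 bytes cover the basic multilingual plane, and code point n maps to bit `n & 7` of byte `n >> 3`. Members outside the BMP are ignored here, and `bmpDigits` is just an illustrative name.

```swift
import Foundation

let bitmap = [UInt8](CharacterSet.decimalDigits.bitmapRepresentation)

// Assumption: the first 8192 bytes describe the BMP, one bit per code point.
var bmpDigits: [Unicode.Scalar] = []
for (byteIndex, byte) in bitmap.prefix(8192).enumerated() where byte != 0 {
    for bit in 0..<8 where byte & (UInt8(1) << bit) != 0 {
        // Bit `bit` of byte `byteIndex` stands for code point byteIndex * 8 + bit.
        let value = UInt32(byteIndex * 8 + bit)
        if let scalar = Unicode.Scalar(value) {
            bmpDigits.append(scalar)
        }
    }
}

// Bytes 6 and 7 hold the bits for U+0030 through U+0039, the ASCII digits.
print(String(format: "%02x %02x", bitmap[6], bitmap[7]))
print(bmpDigits.map { String($0) })
```

Decoding further planes would require parsing the extra blocks that follow the first 8192 bytes, which this sketch does not attempt.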

Gereon
  • ... 00 ff 03 00 00 00 00 00 00 00 00 00 00 ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 – Blazej SLEBODA Dec 21 '19 at 14:00
  • Why are there only two hex values, ff and 03? Why are those values so far apart? I thought the digits lay next to each other. – Blazej SLEBODA Dec 21 '19 at 14:02
  • Look at e.g. only the first 20 lines or so of output. You'll see that the 7th and 8th byte are 0xFF and 0x03, respectively. That's 10 bits, and in exactly the places where the digits "0"-"9" sit in the ASCII encoding (run `man 7 ascii` in a terminal and look at the hex table for reference). – Gereon Dec 21 '19 at 14:06
  • That is true only for the first 2^16 characters of the Unicode character set, i.e. for the basic multilingual plane (BMP). The exact format is described here: https://developer.apple.com/documentation/foundation/nscharacterset/1417719-bitmaprepresentation. – Martin R Dec 21 '19 at 14:51