I'm having trouble getting NSRegularExpression to match patterns on strings that contain wider (?) Unicode characters. The problem appears to be the range parameter: Swift counts individual Unicode characters, while Objective-C treats strings as sequences of UTF-16 code units.
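To make the counting mismatch concrete, here is a minimal sketch that compares the two views of one string. It is written in current Swift syntax rather than the older `countElements` / `utf16Count` spellings used in my snippets, and the string `sample` with its two emoji escapes is only an illustrative assumption:

```swift
import Foundation

// Hypothetical sample: two emoji that lie outside the Basic Multilingual Plane,
// so each one occupies two UTF-16 code units (a surrogate pair).
let sample = "dog\u{1F436}\u{1F42E}cow"

// Swift's view: number of Characters (grapheme clusters).
let characterCount = sample.count

// UTF-16 view: number of code units, which is what NSRange is measured in.
let utf16Count = sample.utf16.count

// NSString reports its length in UTF-16 code units as well.
let nsLength = NSString(string: sample).length

print(characterCount, utf16Count, nsLength) // 8 10 10
```

A range whose length is the character count therefore stops two code units short of the end of the string from NSRegularExpression's point of view.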
Here is my test string and two regular expressions:
let str = "dog🐶🐮cow"
let dogRegex = NSRegularExpression(pattern: "d.g", options: nil, error: nil)!
let cowRegex = NSRegularExpression(pattern: "c.w", options: nil, error: nil)!
I can match the first regex with no problems:
let dogMatch = dogRegex.firstMatchInString(str, options: nil,
range: NSRange(location: 0, length: countElements(str)))
println(dogMatch?.range) // (0, 3)
But the second fails with the same parameters, because the range I pass (0...7) isn't long enough to cover the whole string as far as NSRegularExpression is concerned:
let cowMatch = cowRegex.firstMatchInString(str, options: nil,
range: NSRange(location: 0, length: countElements(str)))
println(cowMatch?.range) // nil
If I use a different range I can make the match succeed:
let cowMatch2 = cowRegex.firstMatchInString(str, options: nil,
range: NSRange(location: 0, length: str.utf16Count))
println(cowMatch2?.range) // (7, 3)
but then I don't know how to extract the matched text out of the string, since that range falls outside the range of the Swift string.