I'm having trouble getting NSRegularExpression to match patterns on strings with wider (?) Unicode characters in them. It looks like the problem is the range parameter -- Swift counts individual Unicode characters, while Objective-C treats strings as if they're made up of UTF-16 code units.
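To make the mismatch concrete, here is a minimal sketch in current Swift syntax, where `count` and `utf16.count` are the later spellings of `countElements(_:)` and `utf16Count`. The sample string is an assumption chosen for illustration: two emoji outside the Basic Multilingual Plane, written as escapes so that the code units are easy to count.

```swift
import Foundation

// Hypothetical sample: "dog", two emoji (U+1F436, U+1F42E), then "cow".
// Each emoji is one Swift Character but two UTF-16 code units.
let sample = "dog\u{1F436}\u{1F42E}cow"

let characterCount = sample.count            // counts Swift Characters
let utf16Length = sample.utf16.count         // counts UTF-16 code units
let nsLength = (sample as NSString).length   // NSString also counts UTF-16 code units

print(characterCount, utf16Length, nsLength) // prints "8 10 10"
```

The two-unit gap between the counts is exactly what leaves the end of the string outside a range built from the character count.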

Here is my test string and two regular expressions:

let str = "dog🐶🐮cow"
let dogRegex = NSRegularExpression(pattern: "d.g", options: nil, error: nil)!
let cowRegex = NSRegularExpression(pattern: "c.w", options: nil, error: nil)!

I can match the first regex with no problems:

let dogMatch = dogRegex.firstMatchInString(str, options: nil, 
                   range: NSRange(location: 0, length: countElements(str)))
println(dogMatch?.range)  // (0, 3)

But the second fails with the same parameters, because the range I send it (0...7) isn't long enough to cover the whole string as far as NSRegularExpression is concerned:

let cowMatch = cowRegex.firstMatchInString(str, options: nil, 
                   range: NSRange(location: 0, length: countElements(str)))
println(cowMatch?.range)  // nil

If I use a different range I can make the match succeed:

let cowMatch2 = cowRegex.firstMatchInString(str, options: nil, 
                    range: NSRange(location: 0, length: str.utf16Count))
println(cowMatch2?.range)  // (7, 3)

but then I don't know how to extract the matched text out of the string, since that range falls outside the range of the Swift string.

Nate Cook

1 Answer

Turns out you can fight fire with fire. Using the Swift-native string's utf16Count property and the substringWithRange: method of NSString -- not String -- gets the right result. Here's the full working code:

let str = "dog🐶🐮cow"
let cowRegex = NSRegularExpression(pattern: "c.w", options: nil, error: nil)!

if let cowMatch = cowRegex.firstMatchInString(str, options: nil,
                      range: NSRange(location: 0, length: str.utf16Count)) {
    println((str as NSString).substringWithRange(cowMatch.range))
    // prints "cow"
}

(I figured this out in the process of writing the question; score one for rubber duck debugging.)
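For readers on later Swift versions, the same idea can probably be written without the `NSString` cast. This is only a sketch, not part of the original answer: it assumes a sample string containing two emoji, and it relies on Foundation's `NSRange(_:in:)` and `Range(_:in:)` initializers to convert between Swift string indices and UTF-16 offsets.

```swift
import Foundation

// Assumed sample string: "dog", two emoji (U+1F436, U+1F42E), then "cow".
let text = "dog\u{1F436}\u{1F42E}cow"
let regex = try! NSRegularExpression(pattern: "c.w")

// Build the search range in UTF-16 units from Swift string indices.
let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)

var matchedText: String? = nil
if let match = regex.firstMatch(in: text, options: [], range: fullRange),
   // Convert the UTF-16-based NSRange back into a Range<String.Index>.
   let swiftRange = Range(match.range, in: text) {
    matchedText = String(text[swiftRange])
}

print(matchedText ?? "no match") // prints "cow"
```

The conversion goes both ways here: into UTF-16 offsets for the search, then back to Swift indices to slice the string.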

Nate Cook
    If you convert `let nsstr = str as NSString` first then you can simply use `length: [nsstr length]` as you would in ObjC. – Martin R Sep 17 '14 at 05:11
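
In Swift the Objective-C expression `[nsstr length]` is spelled `nsstr.length`. A rough sketch of that suggestion in current Swift syntax follows. The sample string and variable names are assumptions for illustration.

```swift
import Foundation

// Assumed sample string: "dog", two emoji (U+1F436, U+1F42E), then "cow".
let text = "dog\u{1F436}\u{1F42E}cow"
let nsText = text as NSString   // bridge once, then use NSString APIs throughout
let regex = try! NSRegularExpression(pattern: "c.w")

// NSString.length is measured in UTF-16 code units, which is what NSRange expects.
let searchRange = NSRange(location: 0, length: nsText.length)

var found = ""
if let match = regex.firstMatch(in: text, options: [], range: searchRange) {
    found = nsText.substring(with: match.range)
}

print(found) // prints "cow"
```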