186

I want to extract substrings from a string that match a regex pattern.

So I'm looking for something like this:

func matchesForRegexInText(regex: String!, text: String!) -> [String] {
   ???
}

So this is what I have:

func matchesForRegexInText(regex: String!, text: String!) -> [String] {

    var regex = NSRegularExpression(pattern: regex, 
        options: nil, error: nil)

    var results = regex.matchesInString(text, 
        options: nil, range: NSMakeRange(0, countElements(text))) 
            as Array<NSTextCheckingResult>

    /// ???

    return ...
}

The problem is that matchesInString returns an array of NSTextCheckingResult, where NSTextCheckingResult.range is of type NSRange.

NSRange is incompatible with Range<String.Index>, so I cannot use text.substringWithRange(...).

Any idea how to achieve this simple thing in Swift without too many lines of code?

mitchkman

12 Answers

337

Even though the matchesInString() method takes a String as its first argument, it works internally with NSString, and the range parameter must be given using the NSString length (the number of UTF-16 code units), not the Swift string length. Otherwise it will fail for "extended grapheme clusters" such as "flags".
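To make the length difference concrete, here is a minimal sketch (the variable names are only illustrative, not part of the answer) that compares the Swift character count with the NSString length for a single flag:

```swift
import Foundation

// A flag is one Character built from two regional-indicator scalars,
// each of which needs a surrogate pair in UTF-16.
let flag = "\u{1F1E9}\u{1F1EA}"

let characterCount = flag.count               // extended grapheme clusters
let utf16Count = flag.utf16.count             // UTF-16 code units
let nsLength = (flag as NSString).length      // the unit NSRange is measured in
let fullRange = NSRange(flag.startIndex..., in: flag)

print(characterCount, utf16Count, nsLength, fullRange.length)
// prints "1 4 4 4"
```

A range built from the character count would therefore cover only the first quarter of this string.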

As of Swift 4 (Xcode 9), the Swift standard library provides functions to convert between Range<String.Index> and NSRange.

func matches(for regex: String, in text: String) -> [String] {

    do {
        let regex = try NSRegularExpression(pattern: regex)
        let results = regex.matches(in: text,
                                    range: NSRange(text.startIndex..., in: text))
        return results.map {
            String(text[Range($0.range, in: text)!])
        }
    } catch let error {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

Example:

let string = "€4€9"
let matched = matches(for: "[0-9]", in: string)
print(matched)
// ["4", "9"]

Note: The forced unwrap Range($0.range, in: text)! is safe because the NSRange refers to a substring of the given string text. However, if you want to avoid it, then use

        return results.compactMap {
            Range($0.range, in: text).map { String(text[$0]) }
        }

instead (compactMap requires Swift 4.1 or later; in Swift 4.0 the same call is spelled flatMap).


(Older answer for Swift 3 and earlier:)

So you should convert the given Swift string to an NSString and then extract the ranges. The result will be converted to a Swift string array automatically.

(The code for Swift 1.2 can be found in the edit history.)

Swift 2 (Xcode 7.3.1) :

func matchesForRegexInText(regex: String, text: String) -> [String] {

    do {
        let regex = try NSRegularExpression(pattern: regex, options: [])
        let nsString = text as NSString
        let results = regex.matchesInString(text,
                                            options: [], range: NSMakeRange(0, nsString.length))
        return results.map { nsString.substringWithRange($0.range)}
    } catch let error as NSError {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

Example:

let string = "€4€9"
let matches = matchesForRegexInText("[0-9]", text: string)
print(matches)
// ["4", "9"]

Swift 3 (Xcode 8)

func matches(for regex: String, in text: String) -> [String] {

    do {
        let regex = try NSRegularExpression(pattern: regex)
        let nsString = text as NSString
        let results = regex.matches(in: text, range: NSRange(location: 0, length: nsString.length))
        return results.map { nsString.substring(with: $0.range)}
    } catch let error {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

Example:

let string = "€4€9"
let matched = matches(for: "[0-9]", in: string)
print(matched)
// ["4", "9"]
Martin R
  • 9
    You saved me from becoming insane. Not kidding. Thank you so much! – mitchkman Jan 10 '15 at 20:27
  • 1
    @MathijsSegers: I have updated the code for Swift 1.2/Xcode 6.3. Thanks for letting me know! – Martin R Apr 16 '15 at 13:01
  • 1
    but what if i want to search for strings between a tag? I need the same result (match information) like: https://regex101.com/r/cU6jX8/2. which regex pattern would you suggest? – Peter Kreinz Aug 18 '15 at 21:09
  • The update is for Swift 1.2, not Swift 2. The code doesn't compile with Swift 2. – PatrickNLT Sep 13 '15 at 19:17
  • @pnollet: Strange. I have just double-checked that the "Swift 2" version compiles and runs with Xcode 7 GM, and it worked as expected. What error message do you get? – Martin R Sep 13 '15 at 19:20
  • 1
    Thanks! What if you only want to extract what's actually between () in the regex? For example, in "[0-9]{3}([0-9]{6})" I'd only want to get the last 6 numbers. – p4bloch Sep 23 '15 at 23:01
  • If I try using this method to extract text between parentheses I get compile errors (using Swift 2) - the reg ex I am passing in is `"\((.*?)\)"`can anyone help? – Kevin Mann Nov 13 '15 at 14:25
  • @KevinMann Double backslash the special characters. You need to escape the string and the special character – Declan McKenna Dec 14 '15 at 10:56
  • The Swift 3 example here does not compile. The try statement gives a compile error "Errors thrown from here are not handled" – Fuad Kamal Sep 26 '16 at 19:57
  • @FuadKamal: That is strange. I have double-checked it with Xcode 8, and it compiles and runs as expected. – Martin R Sep 26 '16 at 20:13
  • @MartinR it works for me now as well. Didn't even have to do a clean...Xcode strangeness I guess ¯\_(ツ)_/¯ Thanks for checking! – Fuad Kamal Sep 27 '16 at 01:27
  • @MartinR shouldn't you check for nil? – Vyachaslav Gerchicov Jul 06 '17 at 13:07
  • @VyachaslavGerchicov: `$0.range` is an NSRange returned from `regex.matches(...)` as the range in the string which matches the given pattern. Therefore I think it is safe to assume that it can be converted back to a `Range`. – Martin R Jul 06 '17 at 13:11
  • In Swift 4, the line `text.substring(with: Range($0.range, in: text)!)` generates a deprecated warning. I looked here https://developer.apple.com/documentation/foundation/nsstring/1418469-substring# and I don't see anything about deprecation. This answer has a solution to a slightly different problem, but I can't apply it to silence the warning. https://stackoverflow.com/questions/45562662/how-can-i-use-string-slicing-subscripts-in-swift-4 – Adrian Oct 02 '17 at 13:48
  • @Adrian: You are right. The new version should work without warnings. Thanks for the notice! – Martin R Oct 02 '17 at 14:20
  • It would be great if you could provide a solution without to force-unwrap the range! – ixany Nov 21 '17 at 20:07
  • @ixany: Have a look! – Martin R Nov 21 '17 at 20:18
  • So much hassle just to extract a substring with a regex. It should be possible to do this with a single line of code. Unbelievable! Thanks! – Leszek Szary Jul 14 '18 at 16:50
  • I _like_ the `NSRange(text.startIndex..., in: text)` open range. +1 – Stan Apr 18 '19 at 00:08
  • Not working if I am trying to find multiple quoted words in a string: https://stackoverflow.com/questions/57852915/find-multiple-quoted-words-in-a-string-with-regex – nr5 Sep 10 '19 at 10:46
  • You can also remove the explicit unwrap by doing `return results.map { (text as NSString).substring(with: $0.range) }` I submitted an edit to the answer. – jangelsb Dec 14 '19 at 01:41
  • Still saving a lot of lives! :D – Codetard Sep 15 '20 at 11:58
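On the escaping and capture-group questions raised in the comments above, a minimal sketch may help. It assumes Swift 5 (for the raw string literal), and `firstGroups(for:in:)` is a hypothetical helper, not part of the answer. It returns only what capture group 1 matched, here the text between parentheses:

```swift
import Foundation

// Both literals describe the same pattern: \((.*?)\)
let escaped = "\\((.*?)\\)"     // every backslash doubled for the string literal
let raw = #"\((.*?)\)"#         // raw string literal, available from Swift 5

// Hypothetical helper: the contents of capture group 1 for every match.
func firstGroups(for pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match in
        // range(at: 1) is the first parenthesised group; range(at: 0) is the whole match.
        Range(match.range(at: 1), in: text).map { String(text[$0]) }
    }
}

let inside = firstGroups(for: raw, in: "call (one) and (two)")
print(escaped == raw, inside)
// prints "true ["one", "two"]"
```

Writing the pattern with single backslashes in an ordinary string literal is what produces the compile error, since `\(` starts string interpolation.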
66

My answer builds on the given answers but makes regex matching more robust by adding the following:

  • Returns not only the matches but also all capturing groups for each match (see examples below)
  • Supports optional matches instead of returning an empty array for them
  • Avoids do/catch, and the console printing that came with it, by using the guard construct
  • Adds matchingStrings as an extension to String

Swift 4.2

//: Playground - noun: a place where people can play

import Foundation

extension String {
    func matchingStrings(regex: String) -> [[String]] {
        guard let regex = try? NSRegularExpression(pattern: regex, options: []) else { return [] }
        let nsString = self as NSString
        let results  = regex.matches(in: self, options: [], range: NSMakeRange(0, nsString.length))
        return results.map { result in
            (0..<result.numberOfRanges).map {
                result.range(at: $0).location != NSNotFound
                    ? nsString.substring(with: result.range(at: $0))
                    : ""
            }
        }
    }
}

"prefix12 aaa3 prefix45".matchingStrings(regex: "fix([0-9])([0-9])")
// Prints: [["fix12", "1", "2"], ["fix45", "4", "5"]]

"prefix12".matchingStrings(regex: "(?:prefix)?([0-9]+)")
// Prints: [["prefix12", "12"]]

"12".matchingStrings(regex: "(?:prefix)?([0-9]+)")
// Prints: [["12", "12"]], other answers return an empty array here

// Safely accessing the capture of the first match (if any):
let number = "prefix12suffix".matchingStrings(regex: "fix([0-9]+)su").first?[1]
// Prints: Optional("12")

Swift 3

//: Playground - noun: a place where people can play

import Foundation

extension String {
    func matchingStrings(regex: String) -> [[String]] {
        guard let regex = try? NSRegularExpression(pattern: regex, options: []) else { return [] }
        let nsString = self as NSString
        let results  = regex.matches(in: self, options: [], range: NSMakeRange(0, nsString.length))
        return results.map { result in
            (0..<result.numberOfRanges).map {
                result.rangeAt($0).location != NSNotFound
                    ? nsString.substring(with: result.rangeAt($0))
                    : ""
            }
        }
    }
}

"prefix12 aaa3 prefix45".matchingStrings(regex: "fix([0-9])([0-9])")
// Prints: [["fix12", "1", "2"], ["fix45", "4", "5"]]

"prefix12".matchingStrings(regex: "(?:prefix)?([0-9]+)")
// Prints: [["prefix12", "12"]]

"12".matchingStrings(regex: "(?:prefix)?([0-9]+)")
// Prints: [["12", "12"]], other answers return an empty array here

// Safely accessing the capture of the first match (if any):
let number = "prefix12suffix".matchingStrings(regex: "fix([0-9]+)su").first?[1]
// Prints: Optional("12")

Swift 2

extension String {
    func matchingStrings(regex: String) -> [[String]] {
        guard let regex = try? NSRegularExpression(pattern: regex, options: []) else { return [] }
        let nsString = self as NSString
        let results  = regex.matchesInString(self, options: [], range: NSMakeRange(0, nsString.length))
        return results.map { result in
            (0..<result.numberOfRanges).map {
                result.rangeAtIndex($0).location != NSNotFound
                    ? nsString.substringWithRange(result.rangeAtIndex($0))
                    : ""
            }
        }
    }
}
Lars Blumberg
  • 1
    Good idea about the capture groups. But why is "guard" Swiftier than "do/catch"?? – Martin R Oct 17 '16 at 05:27
  • I agree with people such as http://nshipster.com/guard-and-defer/ who say _Swift 2.0 certainly seems to be encouraging a style of early return [...] rather than nested if statements_. The same holds true for nested do/catch statements IMHO. – Lars Blumberg Oct 17 '16 at 08:47
  • try/catch is the native error handling in Swift. `try?` can be used if you are only interested in the outcome of the call, not in a possible error message. So yes, `guard try? ..` is fine, but if you want to print the error then you need a do-block. Both ways are Swifty. – Martin R Oct 17 '16 at 08:58
  • I agree that you need do/catch in your example if you want to see the error in the console. Since I want to provide a function that can be reused in production code without modifying it (`print` is an unwanted side effect for me), `guard try?` becomes a little Swiftier (if you don't need the side effect) – modified the answer to clarify. Thanks! – Lars Blumberg Oct 17 '16 at 13:18
  • 3
    I have added unittests to your nice snippet, https://gist.github.com/neoneye/03cbb26778539ba5eb609d16200e4522 – neoneye Nov 28 '16 at 13:22
  • 2
    Was about to write my own based on the @MartinR answer until i saw this. Thanks! – Oritm May 11 '17 at 22:30
  • Don't you think it is time to add Swift 4 version? – Ahmad Mar 09 '18 at 22:56
  • If I use this on a regex which can find quoted words in a sentence/string like this: `"hi \"how\", \"are\", you"`. The result is this: `[["\"how\", \"are\"", "how\", \"are"]]`. here is the regex which I am using: `"\"(.*)\""` – nr5 Sep 09 '19 at 14:06
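The result described in the last comment looks like the effect of the greedy quantifier rather than of the extension itself: `.*` runs to the last quote in the string, while the lazy `.*?` stops at the next one. A small sketch, assuming Swift 4.1 or later and using a hypothetical helper `firstGroups(of:in:)`:

```swift
import Foundation

// Hypothetical helper: the contents of capture group 1 for every match.
func firstGroups(of pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match in
        Range(match.range(at: 1), in: text).map { String(text[$0]) }
    }
}

let text = "hi \"how\", \"are\", you"

// Greedy: one match spanning from the first quote to the last quote.
let greedy = firstGroups(of: "\"(.*)\"", in: text)

// Lazy: one match per quoted word.
let lazy = firstGroups(of: "\"(.*?)\"", in: text)

print(greedy.count, lazy)
// prints "1 ["how", "are"]"
```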
16

The fastest way to return all matches and capture groups in Swift 5

extension String {
    func match(_ regex: String) -> [[String]] {
        let nsString = self as NSString
        return (try? NSRegularExpression(pattern: regex, options: []))?.matches(in: self, options: [], range: NSMakeRange(0, nsString.length)).map { match in
            (0..<match.numberOfRanges).map { match.range(at: $0).location == NSNotFound ? "" : nsString.substring(with: match.range(at: $0)) }
        } ?? []
    }
}

Returns a two-dimensional array of strings:

"prefix12suffix fix1su".match("fix([0-9]+)su")

returns...

[["fix12su", "12"], ["fix1su", "1"]]

// First element of sub-array is the match
// All subsequent elements are the capture groups
Ken Mueller
14

If you want to extract the actual substrings from a String (including emoji), not just their positions, then the following may be a simpler solution.

extension String {
  func regex (pattern: String) -> [String] {
    do {
      let regex = try NSRegularExpression(pattern: pattern, options: NSRegularExpressionOptions(rawValue: 0))
      let nsstr = self as NSString
      let all = NSRange(location: 0, length: nsstr.length)
      var matches : [String] = [String]()
      regex.enumerateMatchesInString(self, options: NSMatchingOptions(rawValue: 0), range: all) {
        (result : NSTextCheckingResult?, _, _) in
        if let r = result {
          let result = nsstr.substringWithRange(r.range) as String
          matches.append(result)
        }
      }
      return matches
    } catch {
      return [String]()
    }
  }
} 

Example Usage:

"someText ⚽️ pig".regex("⚽️")

Will return the following:

["⚽️"]

Note that using "\w+" may produce an unexpected, seemingly empty "" element:

"someText ⚽️ pig".regex("\\w+")

Will return this String array

["someText", "️", "pig"]
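The middle element is not actually empty. As far as I can tell, the emoji in the example is U+26BD followed by the invisible variation selector U+FE0F, and `\w` matches the selector because it is a combining mark. The sketch below assumes exactly that scalar sequence and uses current Swift API names rather than the Swift 2 ones above:

```swift
import Foundation

// Soccer ball followed by variation selector-16 (assumed to be what the example contains).
let text = "someText \u{26BD}\u{FE0F} pig"

// The pattern is a fixed literal, so try! cannot fail here.
let regex = try! NSRegularExpression(pattern: "\\w+")
let nsText = text as NSString
let words = regex.matches(in: text, range: NSRange(location: 0, length: nsText.length))
    .map { nsText.substring(with: $0.range) }

// List the Unicode scalars of each match to reveal the invisible one.
for word in words {
    let scalars = word.unicodeScalars.map { "U+" + String($0.value, radix: 16, uppercase: true) }
    print(scalars.joined(separator: " "))
}
```

The second line of output should show a single scalar, U+FE0F, which is why the element looks like an empty string when printed.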
Mike Chirico
9

I found that the accepted answer's solution unfortunately does not compile on Swift 3 for Linux. Here's a modified version, then, that does:

import Foundation

func matches(for regex: String, in text: String) -> [String] {
    do {
        let regex = try RegularExpression(pattern: regex, options: [])
        let nsString = NSString(string: text)
        let results = regex.matches(in: text, options: [], range: NSRange(location: 0, length: nsString.length))
        return results.map { nsString.substring(with: $0.range) }
    } catch let error {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

The main differences are:

  1. Swift on Linux seems to require dropping the NS prefix on Foundation objects for which there is no Swift-native equivalent. (See Swift evolution proposal #86.)

  2. Swift on Linux also requires specifying the options arguments for both the RegularExpression initialization and the matches method.

  3. For some reason, coercing a String into an NSString doesn't work in Swift on Linux but initializing a new NSString with a String as the source does work.

This version also works with Swift 3 on macOS / Xcode with the sole exception that you must use the name NSRegularExpression instead of RegularExpression.

Rob Mecham
6

Swift 4 without NSString.

extension String {
    func matches(regex: String) -> [String] {
        guard let regex = try? NSRegularExpression(pattern: regex, options: [.caseInsensitive]) else { return [] }
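        // NSRange counts UTF-16 code units. The original NSMakeRange(0, self.count)
        // counts Characters and is too short for strings containing emoji and the like.
        let matches  = regex.matches(in: self, options: [], range: NSRange(self.startIndex..., in: self))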
        return matches.map { match in
            return String(self[Range(match.range, in: self)!])
        }
    }
}
shiami
  • 3
    Be careful with above solution: `NSMakeRange(0, self.count)` is not correct, because `self` is a `String` (=UTF8) and not an `NSString` (=UTF16). So the `self.count` is not necessarily the same as `nsString.length` (as used in other solutions). You can replace the range calculation with `NSRange(self.startIndex..., in: self)` – pd95 Jun 29 '20 at 22:27
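A short sketch of what the comment describes, with illustrative names only: when the range is built from the character count, the search stops before the end of the string and later matches are silently lost.

```swift
import Foundation

let text = "\u{1F1E9}\u{1F1EA} 42"   // a flag, a space and two digits
let regex = try! NSRegularExpression(pattern: "[0-9]+")   // fixed literal, cannot fail

// Hypothetical helper: all matches found inside the given NSRange.
func strings(in range: NSRange) -> [String] {
    return regex.matches(in: text, range: range).compactMap {
        Range($0.range, in: text).map { String(text[$0]) }
    }
}

// text.count is 4, but the flag alone already occupies 4 UTF-16 code units.
let tooShort = strings(in: NSMakeRange(0, text.count))

// The converted range has length 7 and covers the whole string.
let complete = strings(in: NSRange(text.startIndex..., in: text))

print(tooShort, complete)
// prints "[] ["42"]"
```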
5

@p4bloch if you want to capture results from a series of capture parentheses, then you need to use the rangeAtIndex(index) method of NSTextCheckingResult instead of range. Here's @MartinR's method for Swift 2 from above, adapted for capture parentheses. In the array that is returned, the first result [0] is the entire match, and the individual capture groups begin at [1]. I commented out the map operation (so it's easier to see what I changed) and replaced it with nested loops.

func matches(for regex: String!, in text: String!) -> [String] {

    do {
        let regex = try NSRegularExpression(pattern: regex, options: [])
        let nsString = text as NSString
        let results = regex.matchesInString(text, options: [], range: NSMakeRange(0, nsString.length))
        var match = [String]()
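        for result in results {
            for i in 0..<result.numberOfRanges {
                let range = result.rangeAtIndex(i)
                // An optional group that did not take part in the match has location NSNotFound;
                // calling substringWithRange on it would raise an exception.
                match.append(range.location != NSNotFound ? nsString.substringWithRange(range) : "")
            }
        }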
        return match
        //return results.map { nsString.substringWithRange( $0.range )} //rangeAtIndex(0)
    } catch let error as NSError {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

An example use case: say you want to split a string consisting of a title and a year, e.g. "Finding Dory 2016". You could do this:

print ( matches(for: "^(.+)\\s(\\d{4})" , in: "Finding Dory 2016"))
// ["Finding Dory 2016", "Finding Dory", "2016"]
OliverD
  • This answer made my day. I spent 2 hours searching for a solution that can satisfy regualr expression with the additional capturing of groups. – Ahmad Mar 09 '18 at 22:55
  • This works but it will crash if any range is not found. I modified this code so that the function returns `[String?]` and in the `for i in 0.. – stef Jun 16 '18 at 03:19
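The second comment is cut off, but the idea of returning optional strings can be sketched as follows. This is only one possible reading of it, written against the Swift 4.1+ API, and `captureGroups(of:)` is a hypothetical name. Groups that did not take part in a match come back as nil, so they can be told apart from groups that matched an empty string:

```swift
import Foundation

extension String {
    // Hypothetical helper: one inner array per match; index 0 is the whole match,
    // the following indices are the capture groups (nil when a group did not match).
    func captureGroups(of pattern: String) -> [[String?]] {
        guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
        let fullRange = NSRange(startIndex..., in: self)
        return regex.matches(in: self, range: fullRange).map { match in
            (0..<match.numberOfRanges).map { index -> String? in
                let nsRange = match.range(at: index)
                guard nsRange.location != NSNotFound,
                      let range = Range(nsRange, in: self) else { return nil }
                return String(self[range])
            }
        }
    }
}

// The optional group (prefix)? does not take part in this match.
let groups = "12".captureGroups(of: "(prefix)?([0-9]+)")
// groups holds one match: the whole match "12", nil for (prefix)?, and "12" for ([0-9]+)
```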
3

Most of the solutions above only give the full match as a result, ignoring the capture groups, e.g. ^\d+\s+(\d+)

To get the capture group matches as expected, you need something like this (Swift 4):

public extension String {
    public func capturedGroups(withRegex pattern: String) -> [String] {
        var results = [String]()

        var regex: NSRegularExpression
        do {
            regex = try NSRegularExpression(pattern: pattern, options: [])
        } catch {
            return results
        }
        // NSRange counts UTF-16 code units, so convert the full String range instead of using self.count.
        let matches = regex.matches(in: self, options: [], range: NSRange(self.startIndex..., in: self))

        guard let match = matches.first else { return results }

        let lastRangeIndex = match.numberOfRanges - 1
        guard lastRangeIndex >= 1 else { return results }

        for i in 1...lastRangeIndex {
            let capturedGroupIndex = match.range(at: i)
            let matchedString = (self as NSString).substring(with: capturedGroupIndex)
            results.append(matchedString)
        }

        return results
    }
}
valexa
  • This is great if you're wanting just the first result, to get each result it needs `for index in 0.. – Geoff Feb 20 '18 at 16:41
  • the for clause should look like this: `for i in 1...lastRangeIndex { let capturedGroupIndex = match.range(at: i) if capturedGroupIndex.location != NSNotFound { let matchedString = (self as NSString).substring(with: capturedGroupIndex) results.append(matchedString.trimmingCharacters(in: .whitespaces)) } }` – CRE8IT Sep 17 '18 at 11:36
2

This is how I did it. I hope it gives a new perspective on how this works in Swift.

In the example below I extract every string enclosed in [].

var sample = "this is an [hello] amazing [world]"

var regex = NSRegularExpression(pattern: "\\[.+?\\]"
, options: NSRegularExpressionOptions.CaseInsensitive 
, error: nil)

// The range is measured in UTF-16 code units, so take the length from NSString.
var matches = regex?.matchesInString(sample, options: nil
, range: NSMakeRange(0, (sample as NSString).length)) as Array<NSTextCheckingResult>

for match in matches {
   let r = (sample as NSString).substringWithRange(match.range)//cast to NSString is required to match range format.
    println("found= \(r)")
}
Dalorzo
2

This is a very simple solution that returns an array of strings with the matches.

Swift 3.

// The method uses self, so it has to live in an extension of String.
extension String {
    internal func stringsMatching(regularExpressionPattern: String, options: NSRegularExpression.Options = []) -> [String] {
        guard let regex = try? NSRegularExpression(pattern: regularExpressionPattern, options: options) else {
            return []
        }

        let nsString = self as NSString
        let results = regex.matches(in: self, options: [], range: NSMakeRange(0, nsString.length))

        return results.map {
            nsString.substring(with: $0.range)
        }
    }
}
0

Big thanks to Lars Blumberg for his answer on capturing groups and full matches with Swift 4, which helped me out a lot. I also made an addition to it for people who do want an error.localizedDescription response when their regex is invalid:

extension String {
    func matchingStrings(regex: String) -> [[String]] {
        do {
            let regex = try NSRegularExpression(pattern: regex)
            let nsString = self as NSString
            let results  = regex.matches(in: self, options: [], range: NSMakeRange(0, nsString.length))
            return results.map { result in
                (0..<result.numberOfRanges).map {
                    result.range(at: $0).location != NSNotFound
                        ? nsString.substring(with: result.range(at: $0))
                        : ""
                }
            }
        } catch let error {
            print("invalid regex: \(error.localizedDescription)")
            return []
        }
    }
}

For me, having the localizedDescription of the error helped me understand what went wrong with escaping, since it displays the final regex that Swift tries to compile.

Vasco
0

An update of @Mike Chirico's answer to Swift 5:

extension String {
    func regex(pattern: String) -> [String]? {
        do {
            let regex = try NSRegularExpression(pattern: pattern, options: [])
            let nsstr = self as NSString
            // NSRange counts UTF-16 code units, so use the NSString length rather than count.
            let all = NSRange(location: 0, length: nsstr.length)
            var matches = [String]()
            regex.enumerateMatches(in: self, options: [], range: all) { result, _, _ in
                if let r = result {
                    matches.append(nsstr.substring(with: r.range))
                }
            }
            return matches
        } catch {
            // The pattern could not be compiled.
            return nil
        }
    }
}
dengST30