
I've seen this question asked and answered for JavaScript regex, and the answer was long and very ugly. I'm curious whether anyone has a cleaner way to implement it in Ruby.

Here's what I'm trying to achieve:

Test String: "foo bar baz"
Regex: /.*(foo).*(bar).*/
Expected Return: [[0,2],[4,6]]

My goal is a method that takes the test string and a regex and returns the indices at which each capture group matched. The expected return includes both the starting and ending index of each capture group, with the ending index inclusive. I'll be working on this and adding my own potential solutions here along the way. And of course, if there's a way other than regex that would be cleaner or easier, that's a good answer too.

Jeff Escalante

2 Answers

m = "foo bar baz".match(/.*(foo).*(bar).*/)
[1, 2].map{|i| [m.begin(i), m.end(i) - 1]}
# => [[0, 2], [4, 6]]
sawa
    This is awesome - great answer and so quick! The only thing that bothers me about this is the array at the beginning of the map which would have to be manually set to match the number of capture groups. Maybe something like this would solve that? `1.upto(m.size-1).to_a.map{|i| [m.begin(i), m.end(i) - 1]}` – Jeff Escalante Jul 18 '13 at 17:28
    You can do that, but you don't need `to_a` in between. – sawa Jul 18 '13 at 17:36
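
The variant from the comments can be sketched as follows. `Integer#upto` called without a block returns an Enumerator, so `map` can be chained onto it directly. The variable name `indices` is only illustrative.

```ruby
m = "foo bar baz".match(/.*(foo).*(bar).*/)

# m.size counts the whole match plus every capture group,
# so the capture groups are numbered 1 through m.size - 1.
indices = 1.upto(m.size - 1).map { |i| [m.begin(i), m.end(i) - 1] }

p indices
# => [[0, 2], [4, 6]]
```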

Something like this should work for any number of capture groups.

def match_indexes(string, regex)
  matches = string.match(regex)

  (1...matches.length).map do |index|
    [matches.begin(index), matches.end(index) - 1]
  end
end

string = "foo bar baz"

match_indexes(string, /.*(foo).*/)
# => [[0, 2]]
match_indexes(string, /.*(foo).*(bar).*/)
# => [[0, 2], [4, 6]]
match_indexes(string, /.*(foo).*(bar).*(baz).*/)
# => [[0, 2], [4, 6], [8, 10]]

You can have a look at the (kind of strange) MatchData class for how this works. http://www.ruby-doc.org/core-1.9.3/MatchData.html
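
A slightly more defensive sketch, assuming you want an empty array when the regex does not match at all and `nil` for an optional group that did not take part in the match. The helper name `capture_ranges` is hypothetical. It relies on `MatchData#offset`, which returns the start and the exclusive end of a group in one call.

```ruby
# Hypothetical helper: returns inclusive [start, end] pairs, one per capture group.
def capture_ranges(string, regex)
  m = regex.match(string)
  return [] if m.nil?            # the regex did not match at all

  (1...m.size).map do |i|
    first, last = m.offset(i)    # [start, end) where end is exclusive
    first && [first, last - 1]   # nil when group i did not participate
  end
end

p capture_ranges("foo bar baz", /(foo).*(bar)/)
# => [[0, 2], [4, 6]]
p capture_ranges("foo baz", /(foo)|(bar)/)
# => [[0, 2], nil]
p capture_ranges("qux", /(foo)/)
# => []
```

Note that with the inclusive-end convention used in this thread, a capture group that matches the empty string produces an end index one less than its start.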

Tal