1

In Ruby, how do you split a string on runs of two or more whitespace characters, or on a single tab character? I tried this:

2.4.0 :003 > a = "b\tc\td"
 => "b\tc\td" 
2.4.0 :005 > a.strip.split(/([[:space:]][[:space:]]+|\t)/)
 => ["b", "\t", "c", "\t", "d"]

But the tabs themselves are returned as tokens, which is not what I want. The above should return:

["b", "c", "d"]
Dave

3 Answers

2

This happens because the group you used is a capturing one. See the String#split reference:

If pattern contains groups, the respective matches will be returned in the array as well.

Use a non-capturing group (used only for grouping patterns) to avoid adding matched strings into the resulting array:

a.strip.split(/(?:[[:space:]][[:space:]]+|\t)/)
                ^^
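
To make the difference concrete, here is a minimal sketch comparing the two patterns. The sample string `line` and the variable names are assumptions chosen for illustration; they are not from the question.

```ruby
# Hypothetical sample: a tab, a run of two spaces, and a single space
# that should NOT act as a separator.
line = "b\tc  d e"

# Capturing group: each matched separator is returned as a token too.
with_capture = line.split(/([[:space:]][[:space:]]+|\t)/)
# => ["b", "\t", "c", "  ", "d e"]

# Non-capturing group: only the fields come back.
without_capture = line.split(/(?:[[:space:]][[:space:]]+|\t)/)
# => ["b", "c", "d e"]

# The alternation already spans the whole pattern, so the group
# can also be dropped entirely.
no_group = line.split(/[[:space:]]{2,}|\t/)
# => ["b", "c", "d e"]
```

Note that the single space in "d e" is preserved, because one whitespace character that is not a tab matches neither alternative.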
Graham
Wiktor Stribiżew
0

In this instance you can use a character class that includes both spaces and tabs in your regular expression:

"b\tc\td".split(/[ \t]+/)

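If you want to split on any whitespace, you can also use \s+, which matches runs of ASCII whitespace characters (space, tab, newline, carriage return, form feed, vertical tab). Note that both patterns also split on a single space.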
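
As a rough illustration of how these character classes behave, here is a short sketch. The sample strings are assumptions made up for this example, and the non-breaking-space case is included only to show where \s and [[:space:]] differ on a UTF-8 string.

```ruby
# Hypothetical sample with a tab, a double space, and a single space.
line = "b\tc  d e"

# Spaces and tabs only; one or more of them act as a separator.
by_blank = line.split(/[ \t]+/)
# => ["b", "c", "d", "e"]

# Any ASCII whitespace; same result for this input.
by_ws = line.split(/\s+/)
# => ["b", "c", "d", "e"]

# \s is ASCII-only, while [[:space:]] also matches Unicode whitespace
# such as the non-breaking space U+00A0.
nbsp = "a\u00A0\u00A0b"
ascii_only = nbsp.split(/\s+/)
# => ["a\u00A0\u00A0b"]  (no split)
unicode_aware = nbsp.split(/[[:space:]]+/)
# => ["a", "b"]
```

Unlike the pattern in the question, every variant here treats a single space as a separator, so "d e" becomes two fields.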

coreyward
0

There are some easier approaches than the accepted solution:

a.strip.split("\s")

or

a.split("\s")

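In a double-quoted string, "\s" is an escape for a single space, and split with a single-space separator splits on runs of any whitespace while ignoring leading and trailing whitespace, so the strip call is optional. Unlike the regex in the question, this also splits on a single space.

For the case above, you can simply use: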

a = "b\tc\td" 
a.split("\t")    #=> ["b", "c", "d"]

or, for a combination of multiple spaces and tabs (the gsub is optional here, since split("\s") already treats tabs as separators):

a.gsub("\t", " ").split("\s")     #=> ["b", "c", "d"]
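
The behavior of the "\s" separator can be checked with a small sketch like the one below. The padded sample string and the variable names are assumptions for illustration only.

```ruby
# In double quotes, "\s" is just an escape for one space character.
space_escape = "\s"   # same as " "
# In single quotes, '\s' is two characters: a backslash and the letter s.
literal = '\s'

# Hypothetical sample with leading and trailing spaces.
line = " b\tc  d e "

# A single-space separator triggers awk-style splitting: runs of
# whitespace separate fields, and leading whitespace is ignored.
awk_style = line.split("\s")
# => ["b", "c", "d", "e"]

# Calling split with no argument behaves the same way by default.
default_split = line.split
# => ["b", "c", "d", "e"]

# A regex does not get this special treatment, so the leading space
# produces an empty first field.
regex_split = line.split(/\s+/)
# => ["", "b", "c", "d", "e"]

# The single-quoted form looks for a literal backslash-s and finds none.
single_quoted = line.split('\s')
# => [" b\tc  d e "]
```

This is why the double-quoted form appears to "handle" multiple whitespace characters: it is the special single-space case of split, not a whitespace pattern.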
chitresh