
Given the string str = "Be there @ six."

Why does this work:

str.gsub! /\bsix\b/i, "seven"

But trying to replace the @ sign doesn't match:

str.gsub! /\b@\b/i, "at"

Escaping it doesn't seem to work either:

str.gsub! /\b\@\b/i, "at"
user3188544
  • @aliteralmind That post doesn't appear to mention '@'. – Chuck Apr 09 '14 at 01:36
  • @aliteralmind I tried escaping it as follows and it still seemed to fail to match: `/\b\@\b/i` – user3188544 Apr 09 '14 at 01:36
  • Actually, it's a word-boundary issue. Not an escaping issue. Unless there's a word ending immediately before it, or starting immediately after it, it won't match. Relevant question in the [StackOverflow Regular Expression FAQ](http://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean/22944075#22937618): [\b:word boundary, and \B:non-word boundary](http://stackoverflow.com/a/6664167), listed under category "Anchors", about a 1/4 way down. – aliteralmind Apr 09 '14 at 01:40

2 Answers


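This comes down to how \b is interpreted. \b is a "word boundary": a zero-length match at a position that has a word character on one side and a non-word character (or the start or end of the string) on the other. Word characters are essentially [A-Za-z0-9_], and @ is not one of them. In "Be there @ six." the @ has a space on each side, so the positions just before and just after it sit between two non-word characters, and \b cannot match there. The boundaries are at the end of "there" and the start of "six", not next to the @.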
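To see where \b can match, one can list every boundary offset in the question's string. This is only a small sketch, and the variable name `boundaries` is illustrative.

```ruby
str = "Be there @ six."

# Record the offset of every zero-width \b match.
boundaries = []
str.scan(/\b/) { boundaries << Regexp.last_match.begin(0) }

p boundaries
# => [0, 2, 3, 8, 11, 14]

# The @ sits at offset 9. Neither 9 (space|@) nor 10 (@|space) is in the
# list, so /\b@\b/ has no boundary to anchor on.
p str.index("@")
# => 9
```

The boundaries fall only at the edges of "Be", "there" and "six".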

More about word boundaries...

If you need to replace an @ together with its surrounding whitespace, keep the \b anchors on the neighboring words, capture the whitespace between each boundary and the @, and restore it with backreferences. Here \s* captures zero or more whitespace characters on each side.

str.gsub! /\b(\s*)@(\s*)\b/i, "\\1at\\2"
=> "Be there at six."

Or to insist upon whitespace, use \s+ instead of \s*.

str = "Be there @ six."
str.gsub! /\b(\s+)@(\s+)\b/i, "\\1at\\2"
=> "Be there at six."

# No match without whitespace...
str = "Be there@six."
str.gsub! /\b(\s+)@(\s+)\b/i, "\\1at\\2"
=> nil

At this point, forcing the use of \b starts to introduce redundancy. It could just as easily be done with /(\w+\s+)@(\s+\w+)/, forgoing the \b match in favor of \w word characters followed by \s whitespace.
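A quick sketch of that \b-free pattern follows. It uses the non-destructive gsub so both results can be printed, and the variable names are only illustrative.

```ruby
pattern = /(\w+\s+)@(\s+\w+)/

spaced = "Be there @ six.".gsub(pattern, '\1at\2')
p spaced
# => "Be there at six."

# Without whitespace around the @ nothing matches, so gsub returns an
# unchanged copy (gsub! would return nil instead).
unspaced = "Be there@six.".gsub(pattern, '\1at\2')
p unspaced
# => "Be there@six."

# Caveat: the neighboring words are consumed by the match, so a second @
# that shares a word with the first one is skipped.
chained = "a @ b @ c".gsub(pattern, '\1at\2')
p chained
# => "a at b @ c"
```

If consecutive replacements like the last case matter, a pattern that does not consume the neighboring words is a better fit.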

Update after comments:

If you want to treat @ like a "word" that may appear at the beginning, at the end, or in the middle bounded by whitespace, you can use \W to match "non-word" characters and combine it with the ^ and $ anchors through the alternation pipe |:

# Replace @ at the start, middle, before punctuation
str = "@ Be there @ six @."
str.gsub! /(^|\W+)@(\W+|$)/, '\\1at\\2'
=> "at Be there at six at."

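(^|\W+) matches either ^, the start of a line, or a run of non-word characters (such as whitespace or punctuation). (\W+|$) is the mirror image and can match $, the end of a line. Note that in Ruby ^ and $ anchor at line boundaries; use \A and \z to anchor at the start and end of the whole string.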
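As a sketch, the same idea can be wrapped in a small helper. The name `replace_at_word` is hypothetical, and \A and \z are swapped in for ^ and $ on the assumption that the input should be treated as a single string, not line by line.

```ruby
# Hypothetical helper: replace a standalone @ with "at", keeping the
# surrounding non-word characters via the two capture groups.
def replace_at_word(text)
  text.gsub(/(\A|\W+)@(\W+|\z)/, '\1at\2')
end

p replace_at_word("@ Be there @ six @.")
# => "at Be there at six at."

# An @ glued to word characters is left alone.
p replace_at_word("user@example.com")
# => "user@example.com"
```

Because gsub is used in place of gsub!, the helper returns a new string and leaves its argument untouched.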

Michael Berkowski

\b matches a word boundary, which is a position where a word character is next to a non-word character. In your string the @ has a space on each side, and neither @ nor a space is a word character, so there is no boundary next to the @ and no match.

Compare:

'be there @ six'.gsub /\b@\b/, 'at'

produces

'be there @ six'

(i.e. no changes)

but

'be there@six'.gsub /\b@\b/, 'at' # no spaces around @

produces

"be thereatsix"

Also

'be there @ six'.gsub /@/, 'at' # no word boundaries in regex

produces

"be there at six"
matt
  • @matt The problem is that `'be there@six'.gsub /@/, 'at'` would also match. Is there a way to get the same result as using a word boundary? – user3188544 Apr 09 '14 at 01:42
  • @user3188544 the simplest way is probably just to match on whitespace around the `@`. – matt Apr 09 '14 at 01:47
  • @user3188544 something like `/(?:\A|(?<=\W))@(?:(?=\W)|\z)/` might be a kind of “word boundary for `@`” but it really depends on exactly what you’re trying to do. – matt Apr 09 '14 at 02:01
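The lookaround pattern from the last comment can be tried out as follows. This is only a sketch; the constant name `AT_WORD` is made up for the example.

```ruby
# "Word boundary for @": the @ must be preceded by the start of the string
# or a non-word character, and followed by a non-word character or the end.
AT_WORD = /(?:\A|(?<=\W))@(?:(?=\W)|\z)/

p "be there @ six".gsub(AT_WORD, "at")
# => "be there at six"

# An @ between word characters does not match.
p "be there@six".gsub(AT_WORD, "at")
# => "be there@six"

# The lookarounds consume nothing, so neighboring @ signs can share the
# whitespace between them.
p "@ @ @".gsub(AT_WORD, "at")
# => "at at at"
```

Since only the @ itself is part of the match, the replacement needs no capture groups or backreferences.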