
I am looking to remove any words containing "oil". I thought `\b` would grab any word containing "oil", but it seems to only replace the word "oil" itself:

String str = "foil boil oil toil hello";
str = str.replaceAll("\\boil\\b", "");

Output:

foil boil toil hello

Desired output:

hello

Aprillion
co_starr
  • `\boil\b` grabs the word "oil" itself, when it's nested within word boundaries. Please see [Reference: What does this regex mean?](http://stackoverflow.com/questions/22937618/) – Unihedron Aug 02 '14 at 19:07

2 Answers


Simply prefix and suffix the pattern with `[a-z]*` so that it matches the whole word containing "oil".

Match (and replace):

/ ?[a-z]*oil[a-z]* ?/

View an online regex demo.

Unihedron
  • For matching this suffices, but for replacing you need to get rid of the spaces. hwnd's `.trim()` is a bit sneaky - it works, but only because there is only one word remaining. – Jongware Aug 02 '14 at 20:07
  • Ah -- that's better. But (I didn't try it) will it not remove the spaces on *either* side now? Will `xx boil yy` become `xxyy`? – Jongware Aug 03 '14 at 14:24
  • @Jongware For replacing you would have to match my regex and replace with `" "`. For every iteration of `replaceAll()` this would keep a space left in place. – Unihedron Aug 03 '14 at 14:25
  • Got it -- followed by a `.trim()`, so the output will not start or end with a space. That may be easier than expanding the regex beyond readability. – Jongware Aug 03 '14 at 14:28
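The replace-with-a-space approach discussed in the comments above could be sketched roughly as follows. This is only an illustration: the class name `OilSpaceReplace` and the method `removeOilWords` are made up for the example, and the regex is the one from the answer with the surrounding `/` delimiters dropped.

```java
public class OilSpaceReplace {
    // Replace each word containing "oil" (plus at most one adjacent space on
    // either side) with a single space, then trim the ends of the result.
    static String removeOilWords(String s) {
        return s.replaceAll(" ?[a-z]*oil[a-z]* ?", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(removeOilWords("xx boil yy"));               // xx yy
        System.out.println(removeOilWords("foil boil oil toil hello")); // hello
    }
}
```

Note that when several matching words sit next to each other in the middle of a sentence, each match leaves its own space behind, so the result may contain runs of spaces that need collapsing separately.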

A word boundary asserts that on one side there is a word character, and on the other side there is not.
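To make that concrete, here is a small sketch that lists where each pattern matches in the string from the question. The class `WordBoundaryDemo` and the helper `matchStarts` are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Collect the start index of every match of `regex` in `input`.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String input = "foil boil oil toil hello";
        // Only the standalone "oil" has a boundary on both sides.
        System.out.println(matchStarts("\\boil\\b", input));   // [10]
        // \w* on both sides lets the match grow to the whole word.
        System.out.println(matchStarts("\\w*oil\\w*", input)); // [0, 5, 10, 14]
    }
}
```

In "foil", the "o" is preceded by the word character "f", so there is no boundary at that position and `\boil\b` cannot match there.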

You can use the following regex:

String s = "foil boil oil toil hello";
s = s.replaceAll("\\w*oil\\w*", "").trim();
System.out.println(s); //=> "hello"

Or, if you want to be strict and match only letters:

String s = "foil boil oil toil hello";
s = s.replaceAll("(?i)[a-z]*oil[a-z]*", "").trim();
System.out.println(s); //=> "hello"
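The two patterns above behave differently on some inputs, which the following sketch tries to show. The class and method names are invented for the example, and the sample inputs are assumptions, not taken from the question.

```java
public class OilWordClasses {
    // \w also covers digits and underscores; this pattern is case-sensitive.
    static String removeWithWordChars(String s) {
        return s.replaceAll("\\w*oil\\w*", "").trim();
    }

    // Letters only, with (?i) making the match case-insensitive.
    static String removeWithLetters(String s) {
        return s.replaceAll("(?i)[a-z]*oil[a-z]*", "").trim();
    }

    public static void main(String[] args) {
        // A digit inside the word: \w* swallows it, [a-z]* stops before it.
        System.out.println(removeWithWordChars("boil3r hello")); // hello
        System.out.println(removeWithLetters("boil3r hello"));   // 3r hello
        // Upper case: only the (?i) version matches "Oil".
        System.out.println(removeWithWordChars("Oil hello"));    // Oil hello
        System.out.println(removeWithLetters("Oil hello"));      // hello
    }
}
```

Which variant fits depends on whether tokens with digits or underscores should count as part of the word, and on whether case matters.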
hwnd
  • @co_starr: "Java"? Mention that in your post or tags (preferably both). "An" error? Mention which one, it may help. – Jongware Aug 02 '14 at 19:09
  • Okay, to add on to the regex: if I have something like http://t.co/6lyIr3mY2u http://t.co/ how can I remove that whole sequence? (It contains http: at the front; it just got hyperlinked when I posted it.) – co_starr Aug 02 '14 at 19:13
  • @co_starr Put that in another question. Don't ask new questions in comments or answers. – Unihedron Aug 02 '14 at 19:19
  • @Unihedron will do ... in 90 minutes – co_starr Aug 02 '14 at 19:28