I have a string:

0000000000<table blalba>blaalb<tr>gfdg<td>kgdfkg</td></tr>fkkkkk</table>5555

I want to replace everything from <table to </table> with an empty string, so that only 00000000005555 is displayed.

When it is on one line, it works:

chaineHtml = chaineHtml.replaceFirst("[^<title>](.*)[</title>$", "");

But the same approach fails for table.


6 Answers


This regex should work:

html = html.replaceAll("(?is)<table.+?/table>", "");

Here (?is) enables two flags: i makes the match case-insensitive, and s (DOTALL) lets . match line terminators, so the pattern can span multiple lines.

However, I suggest not manipulating HTML with regex, as it is error-prone.
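As a minimal sketch of the replacement above applied to input that spans several lines (the class name TableStripper and the method name stripTables are made up for this illustration, and the sample string is the one from the question with line breaks and an uppercase tag added):

```java
public class TableStripper {
    // (?i) = case-insensitive, (?s) = DOTALL, so '.' also matches line terminators.
    // The lazy quantifier '+?' stops at the first "/table>" it can reach.
    static String stripTables(String html) {
        return html.replaceAll("(?is)<table.+?/table>", "");
    }

    public static void main(String[] args) {
        String html = "0000000000<TABLE blalba>blaalb\n"
                + "<tr>gfdg<td>kgdfkg</td></tr>\n"
                + "fkkkkk</table>5555";
        System.out.println(stripTables(html)); // prints 00000000005555
    }
}
```

This only handles flat, non-nested tables; nested tables are one of the cases where a regex starts to break down.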


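Try this (note that .+ is greedy and . does not match line breaks unless DOTALL is enabled, so this only handles a single table on one line):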

s = s.replaceAll("<table.+/table>", "");
Evgeniy Dorofeev

I don't think that means what you think it means.

[^<table>] does not mean "a string not equal to <table>". Rather, it means "a single character that is not <, t, a, b, l, e, or >". [^...] is called a negated character class.
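To see the character-class behaviour in isolation, here is a small sketch (the class and method names are invented for the example). It replaces every character matched by [^<table>] with an underscore, so only the characters <, t, a, b, l, e and > survive:

```java
public class NegatedClassDemo {
    // [^<table>] matches ONE character that is not '<', 't', 'a', 'b', 'l', 'e' or '>'.
    static String maskNonTableChars(String input) {
        return input.replaceAll("[^<table>]", "_");
    }

    public static void main(String[] args) {
        // 'x', 'y', 'z' and '/' are not in the class, so each is replaced individually.
        System.out.println(maskNonTableChars("<table>xyz</table>")); // prints <table>___<_table>
    }
}
```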

Change your regex to


replace it with


and it will give you the result you wish.

Please consider bookmarking The Stack Overflow Regular Expression FAQ for future reference. The bottom section contains a list of online regex testers where you can try things out yourself. You may also want to check out the sections named "Character Classes" and, as mentioned by @anubhava: "General Information > Do not use regex to parse HTML"

String resultString = subjectString.replaceAll("<table.*?table>", "");


Match the characters “<table” literally «<table»
Match any single character that is not a line break character «.»
   Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the characters “table>” literally «table>»
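The difference the lazy quantifier makes shows up once the input contains more than one table. A short sketch, assuming a single-line input with two tables (the class and method names are only for illustration):

```java
public class LazyVsGreedy {
    // Lazy: '.*?' expands only as far as needed, so each table is matched separately.
    static String stripLazy(String s) {
        return s.replaceAll("<table.*?table>", "");
    }

    // Greedy: '.*' runs to the end of the line and backtracks to the LAST "table>".
    static String stripGreedy(String s) {
        return s.replaceAll("<table.*table>", "");
    }

    public static void main(String[] args) {
        String s = "A<table>1</table>B<table>2</table>C";
        System.out.println(stripLazy(s));   // prints ABC
        System.out.println(stripGreedy(s)); // prints AC (the B between the tables is lost)
    }
}
```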
Pedro Lobito
  • One line answers are not good style. Please consider adding explanation, particularly so other readers will understand your answer. – david.pfx Apr 16 '14 at 14:48

Don't use regex if you are not familiar with its concepts!

There is a simple plain java solution for your problem:

String begin = "<table";
String end = "</table>";
String s = "0000000001<table blalba>blaalb<tr>gfdg<td>kgdfkg</td></tr>fkkkkk</table>4555";
int tableIndex = s.indexOf(begin);
int tableEndIndex = s.indexOf(end, tableIndex);

// Remove each <table ...>...</table> block; stop if a closing tag is missing.
while (tableIndex > -1 && tableEndIndex > -1) {
    s = s.substring(0, tableIndex) + s.substring(tableEndIndex + end.length());
    tableIndex = s.indexOf(begin);
    tableEndIndex = s.indexOf(end, tableIndex);
}

Here is a brilliant solution I found somewhere: use the regex

[\s\S]

to match any character, including newlines, because it matches any whitespace or non-whitespace character. So in your case that would give:

s = s.replaceAll("<table[\\s\\S]+/table>", "");

The double backslashes are needed to escape the backslash inside the Java string literal.

Another possibility is

(.|\n)

which matches any character (except newline) or a newline, which gives:

s = s.replaceAll("<table(.|\n)+/table>", "");

For some reason, on my computer, in certain combinations, when I use (.|\n)+ regex runs into a weird loop and goes into a stackoverflow:

Exception in thread "main" java.lang.StackOverflowError
    at java.lang.Character.codePointAt(Character.java:4668)
    at java.util.regex.Pattern$CharProperty.match(Pattern.java:3693)

which doesn't happen when I use [\s\S]+ instead. I have no idea why though.
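For comparison, here is a rough sketch of the [\s\S] approach next to the standard Pattern.DOTALL flag, which makes a plain . match line terminators too. The class and method names are invented for the example, and unlike the snippets above it uses the lazy +? and the full </table> closing tag so that several tables in one input are removed separately:

```java
import java.util.regex.Pattern;

public class AnyCharDemo {
    // [\s\S] = whitespace OR non-whitespace, i.e. any character including newlines.
    static String stripWithCharClass(String s) {
        return s.replaceAll("<table[\\s\\S]+?</table>", "");
    }

    // Same idea using the DOTALL flag instead of the [\s\S] trick.
    private static final Pattern TABLE =
            Pattern.compile("<table.+?</table>", Pattern.DOTALL);

    static String stripWithDotAll(String s) {
        return TABLE.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String s = "0000000000<table blalba>blaalb\n<tr>gfdg</tr>\nfkkkkk</table>5555";
        System.out.println(stripWithCharClass(s)); // prints 00000000005555
        System.out.println(stripWithDotAll(s));    // prints 00000000005555
    }
}
```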
