I am attempting to write a spoiler identification system so that any spoilers in a string are replaced with a specified spoiler character.
I want to match a string surrounded by square brackets, such that the contents within the square brackets are capture group 1, and the whole string including the surrounding brackets is the match.
I am currently using \[(.*?]*)\], a slight modification of the expression found in this answer, as I also want nested square brackets to be a part of capture group 1.
That expression works for the following inputs:
Input: Jim ate a [sandwich]
Match: [sandwich]
Group 1: sandwich

Input: Jim ate a [sandwich with [pickles and onions]]
Match: [sandwich with [pickles and onions]]
Group 1: sandwich with [pickles and onions]

Input: [[[[]
Match: [[[[]
Group 1: [[[

Input: []]]]
Match: []]]]
Group 1: ]]]
However, if I want to match the following, it does not work as expected:
Input: Jim ate a [sandwich with [pickles] and [onions]]
Match 1: [sandwich with [pickles]
Group 1: sandwich with [pickles
Match 2: [onions]]
Group 1: onions]
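For reference, the behaviour described above can be reproduced with a minimal sketch like the one below. The class and method names (SpoilerRegexDemo, groupOnes) are made up for illustration; only the pattern itself comes from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpoilerRegexDemo {
    // The expression from the question, escaped for a Java string literal.
    static final Pattern SPOILER = Pattern.compile("\\[(.*?]*)\\]");

    // Collects capture group 1 of every match, in order of appearance.
    // (Hypothetical helper, not part of any library.)
    static List<String> groupOnes(String input) {
        List<String> groups = new ArrayList<>();
        Matcher m = SPOILER.matcher(input);
        while (m.find()) {
            groups.add(m.group(1));
        }
        return groups;
    }

    public static void main(String[] args) {
        // Works: a single nested pair ends up inside group 1.
        for (String g : groupOnes("Jim ate a [sandwich with [pickles and onions]]")) {
            System.out.println("group 1: " + g);
        }
        // Fails: two sibling pairs inside the outer brackets split the match in two.
        for (String g : groupOnes("Jim ate a [sandwich with [pickles] and [onions]]")) {
            System.out.println("group 1: " + g);
        }
    }
}
```

The lazy .*? stops at the first position where a closing bracket can follow, so the first inner ] ends the match before the outer pair is closed.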
What expression should I use so that it matches [sandwich with [pickles] and [onions]], with sandwich with [pickles] and [onions] as group 1?
EDIT:
As it seems impossible to achieve this in Java using regex, is there an alternative solution?
EDIT 2:
I also want to be able to split the string at each match found, so an alternative to regular expressions would be less convenient than String.split(regex). Here's an example:
Input: Jim ate a [sandwich] with [pickles] and [dried [onions]]
Match 1: [sandwich]
Group 1: sandwich
Match 2: [pickles]
Group 1: pickles
Match 3: [dried [onions]]
Group 1: dried [onions]
And the split sentence should look like:
Jim ate a
with
and
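One possible non-regex direction, as a rough sketch only: scan the string once while counting bracket depth, and record both the contents of each outermost pair and the text between them. The names BracketScanner, Result, and scan are hypothetical. The sketch assumes balanced brackets, so unbalanced input such as [[[[] is left as plain text, which differs from what the current expression does. It is also not an exact replacement for String.split, which drops all trailing empty strings.

```java
import java.util.ArrayList;
import java.util.List;

class BracketScanner {
    // Hypothetical holder for the two outputs of one scan.
    static final class Result {
        final List<String> groups = new ArrayList<>(); // contents of each outermost [...] pair
        final List<String> pieces = new ArrayList<>(); // text outside the outermost pairs
    }

    static Result scan(String input) {
        Result r = new Result();
        int depth = 0;       // current bracket nesting depth
        int open = -1;       // index of the '[' that opened the current outermost pair
        int pieceStart = 0;  // index where the current outside-text piece begins
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '[') {
                if (depth == 0) {
                    open = i;
                }
                depth++;
            } else if (c == ']' && depth > 0) {
                depth--;
                if (depth == 0) {
                    // An outermost pair just closed: emit the text before it and its contents.
                    r.pieces.add(input.substring(pieceStart, open));
                    r.groups.add(input.substring(open + 1, i));
                    pieceStart = i + 1;
                }
            }
        }
        // Keep any non-empty text after the last closed pair (or the whole input if none closed).
        if (pieceStart < input.length()) {
            r.pieces.add(input.substring(pieceStart));
        }
        return r;
    }

    public static void main(String[] args) {
        Result r = scan("Jim ate a [sandwich] with [pickles] and [dried [onions]]");
        System.out.println(r.groups);
        System.out.println(r.pieces);
    }
}
```

Because the depth only returns to zero when the outermost pair closes, sibling pairs such as [pickles] and [onions] inside one outer pair stay within a single group.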