Suppose I have a large file of strings and a pattern that matches part of some of those strings, say a substring. How can I display those strings with everything after the matched pattern removed? Is this possible with regular expressions?

Example: "this is one nasty string nobody likes"

My pattern: "nasty string"

Expected result: "this is one nasty string"

Vladislavs Dovgalecs

3 Answers

You can use a capturing group and a backreference.

For example, in Javascript:

"this is one nasty string nobody likes".replace(/(nasty string).*$/, '$1')
// => "this is one nasty string"

Alternatively, you can use a positive lookbehind assertion if your regular expression engine supports it.

>>> # Python
>>> import re
>>> re.sub('(?<=nasty string).*$', '', "this is one nasty string nobody likes")
'this is one nasty string'
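
Note that Python's `re` module only accepts fixed-width patterns inside a lookbehind, so for a general pattern the capturing-group approach is the safer one. A minimal sketch of it in Python follows, assuming the strings are processed one line at a time; the helper name `truncate_after` and the sample lines are made up for illustration.

```python
import re

def truncate_after(pattern, line):
    # Wrap the pattern in a capturing group and drop the rest of the line.
    # Lines that do not match are returned unchanged.
    return re.sub('(' + pattern + ').*$', r'\g<1>', line, count=1)

# Hypothetical sample input standing in for lines read from the file.
lines = [
    "this is one nasty string nobody likes",
    "a line without the pattern",
]
truncated = [truncate_after(r"nasty string", line) for line in lines]
```

For a real file, the same helper could be applied to each line while iterating over the open file object.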
falsetru

This uses Perl syntax; adapt it to your own language.

If you want to remove everything after the first match of your substring, then you can use non-greedy matching:

s/(^.*?substring).*$/$1/

If you want to remove everything after the last match, then the usual greedy matching will do:

s/(^.*substring).*$/$1/

Just replace "substring" with your own expression.
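
As a rough Python translation of the two substitutions above, the following sketch shows how the non-greedy and greedy forms differ when the substring occurs more than once. The sample line is invented for the example.

```python
import re

# Hypothetical line containing the substring twice.
line = "a substring b substring c"

# Non-greedy .*? stops at the first occurrence of the substring.
first = re.sub(r'^(.*?substring).*$', r'\1', line)

# Greedy .* backtracks only as far as the last occurrence.
last = re.sub(r'^(.*substring).*$', r'\1', line)

print(first)
print(last)
```

When the substring occurs only once on a line, both forms give the same result.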

Vasiliy

Regular expressions do not remove anything; they only match. What you then do with the match depends on the language and on the methods available on the match object. A regular expression that matches what you want is

/\A.*nasty string/

Then again, it depends on the language. In Ruby it can be

/\A.*nasty string/.match(candidate)[0]

(a non-destructive construction that returns the needed string without changing the original one)
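
The same match-only idea can be sketched in Python, where `re.match` anchors at the start of the string much like `\A` does. The function name `keep_through_match` is hypothetical, and returning `None` for a line without the pattern is just one possible choice.

```python
import re

def keep_through_match(candidate):
    # re.match only matches at the start of the string, like \A in Ruby.
    # The greedy .* extends to the last "nasty string" in the candidate.
    m = re.match(r'.*nasty string', candidate)
    # Return None when the pattern is absent instead of raising an error.
    return m.group(0) if m else None

original = "this is one nasty string nobody likes"
result = keep_through_match(original)
```

As in the Ruby version, `original` is left untouched and `result` holds the shortened string.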

rewritten