0

I need a regex that finds values other than !#! in a string.

If, for example, the string were:

Text1!#!Text2!#!Text#3!#!

it should return Text1, Text2, Text#3

[^!#!] would match the # in Text#3 as well, which I don't want.

abestrad
  • 801
  • 2
  • 12
  • 21
Prakash S
  • 3
  • 4
  • 1
    Does this answer your question? [Regex: match everything but specific pattern](https://stackoverflow.com/questions/1687620/regex-match-everything-but-specific-pattern) – Poul Bak Jun 25 '20 at 05:31
  • 1
    @Prakash S - Which regex flavor is used? – Armali Jun 25 '20 at 05:42
  • What's to become of `**`? If you split the string on `!#!`, for example, you get an array containing `Text1**`, `Text2`, `Text#3` and `**`. – Cary Swoveland Jun 25 '20 at 06:02
  • @CarySwoveland I suspect that the `**` were intended to be _markup_ for bold text, i.e. they are not part of the actual data. – Tim Biegeleisen Jun 25 '20 at 06:05
  • @TimBiegeleisen That's correct, Have edited the data correctly now – Prakash S Jun 25 '20 at 06:07
  • @PoulBak I tried that , but that does seem to be working for me – Prakash S Jun 25 '20 at 06:08
  • @PrakashS Are you using a programming language? – Tim Biegeleisen Jun 25 '20 at 06:09
  • @Armali am trying to use it with oracle REGEXP_SUBSTR – Prakash S Jun 25 '20 at 06:10
  • @TimBiegeleiseno am using it with oracle REGEXP_SUBSTR – Prakash S Jun 25 '20 at 06:11
  • 1
    What is your language from which you run your regex? Why use a regex at all and not simply split the string on your fixed pattern `!#!`, then loop over each element in the resulting array? – knittl Jun 25 '20 at 06:12
  • This type of problem is not well suited to Oracle (nor to most SQL databases). You should probably handle this requirement outside of Oracle. – Tim Biegeleisen Jun 25 '20 at 06:12
  • Prakash, note that `[^!#!]` is the same as `[^!#]` and matches any single character other than `!` and `#`. – Cary Swoveland Jun 25 '20 at 06:15
  • Previous developer has written it in regex way in one of the stored procedure and I'm trying to fix a bug here and don't have much experience in Oracle, so thought just changing the regex would solve the issue – Prakash S Jun 25 '20 at 06:19
  • @CarySwoveland Got it , is there any way to look for this specific pattern '!#!' in regex. So i have a many values which stored as a string and separated by '!#!' by this pattern, If i have '#' in one of the values then i won't get the correct data with [^!#!] – Prakash S Jun 25 '20 at 06:20
  • Would this column always have exactly three components to it? Or, could there sometimes be greater or fewer than 3? – Tim Biegeleisen Jun 25 '20 at 06:27
  • @TimBiegeleisen It could be more than 3 – Prakash S Jun 25 '20 at 06:28

2 Answers2

0

You could solve this problem using positive lookbehind and lookahead.

Simply:

  • match anyhting (.+?)
  • being at the beginning of the text or having the pattern in front (?<=^|!#!)
  • and being at the end of the text or having the pattern behind (?=!#!|$)

See your example here: https://rubular.com/r/f6BDr9CxeaQTIz using (?<=^|!#!)(.+?)(?=!#!|$)

Itchy
  • 1,847
  • 19
  • 33
  • Note that Oracle's regex engine does _not_ support lookarounds. While this might be a general solution, it will not work for Oracle. – Tim Biegeleisen Jun 25 '20 at 06:36
  • @TimBiegeleisen so we can't fix this issue with regex in Oracle then? – Prakash S Jun 25 '20 at 06:38
  • 1
    If your Oracle tables contain this type of data, then your data is pretty far from being normalized. You should fix your data model. – Tim Biegeleisen Jun 25 '20 at 06:41
  • Sorry, I don't know the limitations of Oracles regex engine. But maybe you can just match the expression with a simple pattern (e.g. `(.+?!#!)`) and then remove the pattern in a second step? – Itchy Jun 25 '20 at 06:42
0

You can use REGEXP_SUBSTR in conjunction with CONNECT_BY to split the string into words separated by !#!. We use the regex:

(.*?)(!#!|$)

which matches some number of characters lazily until it encounters either !#! or the end of string ($).

For example:

SELECT REGEXP_SUBSTR ('Text1!#!Text2!#!Text#3!#!',
                      '(.*?)(!#!|$)',
                      1,
                      LEVEL,
                      '',
                      1)
              AS VAL
FROM DUAL
CONNECT BY REGEXP_SUBSTR ('Text1!#!Text2!#!Text#3!#!',
                          '(.*?)(!#!|$)',
                          1,
                          LEVEL,
                          '',
                          1)
IS NOT NULL

Output:

VAL
Text1
Text2
Text#3

Demo on dbfiddle

Nick
  • 118,076
  • 20
  • 42
  • 73