
I have a similar question to this. I am also using the String Manipulation node.

Right now I have the following strings (in a column):

Order[NN(STTS)]
523:10[CARD(STTS)]
Euro12[NN(STTS)]

I want to have the output:

[NN(STTS)]
[CARD(STTS)]
[NN(STTS)]

How can I use stringManipulation to do so? Right now I am using:

regexReplace($List(Term)$, "/(.*?)\[" , "[")

The output I get currently is:

?
?
?

If I check it online with the Java regex flavor at https://regex101.com/r/z6eOHv/1, the output looks fine.

What is my mistake?

PV8
  • Looks like you are looking to create a regex, but do not know where to get started. Please check [Reference - What does this regex mean](https://stackoverflow.com/questions/22937618) resource, it has plenty of hints. Also, refer to [Learning Regular Expressions](https://stackoverflow.com/a/2759417/3832970) post for some basic regex info. Once you get some expression ready and still have issues with the solution, please edit the question with the latest details and we'll be glad to help you fix the problem. – Wiktor Stribiżew Nov 08 '19 at 08:14
  • I added some more details, regex is my natural enemy – PV8 Nov 08 '19 at 08:49
  • Why `/` at the start? `regexReplace($List(Term)$, ".*?\[" , "[")`. I'd add `^` to match from the start only: `regexReplace($List(Term)$, "^.*?\[" , "[")`. Or even use `^[^\[]+` to replace with an empty string. Not sure the backslash should be doubled. – Wiktor Stribiżew Nov 08 '19 at 08:51
  • I am still getting ? back. The error is: unclosed character class near index 4, with a pointer on [. – PV8 Nov 08 '19 at 08:57
  • Double the backslashes. `regexReplace($List(Term)$, "^[^\\[]+" , "")` – Wiktor Stribiżew Nov 08 '19 at 08:59

1 Answer

A "quick fix" is regexReplace($List(Term)$, "(.*?)\\[" , "["). The / looks like a remnant of the regex literal notation used by online regex testing services; you do not need it here, because Java regexps are defined with plain string literals. The last [ must also be escaped with a doubled backslash inside the string literal.

However, you may just use

regexReplace($List(Term)$, "^[^\\[]+" , "")

The regex itself is ^[^\[]+ (see the regex demo). It matches:

  • ^ - start of string
  • [^\[]+ - 1 or more (+ quantifier matches 1 or more occurrences) characters other than [ (the [^...] is a negated character class matching all chars other than specified in the class).

Since string literals support escape sequences (like a tab, \t, or a newline, \n), each backslash must be doubled to introduce a single literal backslash.
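
To check the pattern outside KNIME, here is a minimal sketch in plain Java. It assumes the node's regexReplace behaves like java.util.regex for this pattern; the class name StripPrefixDemo and the helpers stripPrefix and compiles are made up for illustration. It applies ^[^\[]+ to the three sample strings from the question and shows that the unescaped variant, which is what the regex engine receives when the backslash is lost, does not compile.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class StripPrefixDemo {
    // Java source "^[^\\[]+" yields the regex ^[^\[]+ :
    // from the start of the string, one or more characters other than '['.
    private static final Pattern PREFIX = Pattern.compile("^[^\\[]+");

    // Hypothetical helper: drops everything before the first '['.
    static String stripPrefix(String term) {
        return PREFIX.matcher(term).replaceFirst("");
    }

    // Hypothetical helper: reports whether a regex string compiles at all.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String term : List.of("Order[NN(STTS)]", "523:10[CARD(STTS)]", "Euro12[NN(STTS)]")) {
            System.out.println(stripPrefix(term));
        }
        // The regex ^.*?[ (no backslash before '[') opens a character class that is never closed.
        System.out.println(compiles("^.*?["));
        // With the doubled backslash in the string literal, the regex is ^.*?\[ and compiles.
        System.out.println(compiles("^.*?\\["));
    }
}
```

The anchored, negated character class needs no capture group or lazy quantifier, and a string without any [ is returned with its leading text removed entirely, so it is worth checking that every row actually contains a bracketed tag.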

Wiktor Stribiżew