I'm trying to implement some kind of dynamic filtering.

Let's say I have a collection of objects. Each object has the same keys but different values.

Ex :

{
    "state":"time out",
    "displayState": "error"
}

I want to filter and categorize them following a pattern extracted from a string.

Ex (no meaning at all, just extrapolated) :

"displayState=error&(state!=aborted|(state=cancelled&state=timed out))"

I think the best way to parse this string would be a regular expression that can capture groups, operands and operators.

Here's what I have for now :

([^|&!()=<>]*)([=!<>]{1,2})([^|&!()=<>]*)(?:([|&])\(?([^|!&()=<>]*)([=!<>]{1,2})([^|&!()=<>]*)\)?)?

It's basic and linear, and my knowledge of regexp is limited, so it's not doing what I need.

Basically I'm trying to group by () first and then by [.*][=><!][.*].

Catching groups, operands and operators in the same process.

- EDIT -

Thanks to Aniket's answer I was able to get a little further.

As stated in these answers, regular expressions can't do recursion, at least not in JavaScript.

So groups delimited by () cannot be isolated by regexp alone, and need some extra logic.
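A rough sketch of what that extra logic could look like: a regexp only splits the string into tokens, and a small recursive-descent parser handles the `()` nesting. Everything here is illustrative rather than a finished solution. The names `tokenize`, `parse` and `evaluate` are made up, `&` is assumed to bind tighter than `|` (the question doesn't define precedence), and only `=` and `!=` are evaluated, as plain string comparisons.

```javascript
// Split the filter into tokens: parentheses, & and |, comparison operators, operands.
function tokenize(input) {
  return input.match(/[()&|]|[!=<>]+|[^()&|!=<>]+/g) || [];
}

const isOperand = (t) => typeof t === 'string' && /^[^()&|!=<>]+$/.test(t);
const isOperator = (t) => typeof t === 'string' && /^[!=<>]+$/.test(t);

// Build a tree from the tokens. Assumption: & binds tighter than |.
function parse(input) {
  const tokens = tokenize(input);
  let pos = 0;

  function parseOr() {
    let node = parseAnd();
    while (tokens[pos] === '|') {
      pos++;
      node = { type: '|', left: node, right: parseAnd() };
    }
    return node;
  }

  function parseAnd() {
    let node = parsePrimary();
    while (tokens[pos] === '&') {
      pos++;
      node = { type: '&', left: node, right: parsePrimary() };
    }
    return node;
  }

  function parsePrimary() {
    if (tokens[pos] === '(') {
      pos++; // consume "(" and parse the nested group recursively
      const node = parseOr();
      if (tokens[pos] !== ')') throw new SyntaxError('Expected ")" at token ' + pos);
      pos++;
      return node;
    }
    const [operand1, operator, operand2] = tokens.slice(pos, pos + 3);
    if (!isOperand(operand1) || !isOperator(operator) || !isOperand(operand2)) {
      throw new SyntaxError('Expected a comparison at token ' + pos);
    }
    pos += 3;
    return { type: 'comparison', operand1, operator, operand2 };
  }

  const tree = parseOr();
  if (pos !== tokens.length) throw new SyntaxError('Unexpected token at ' + pos);
  return tree;
}

// Apply a parsed tree to one object. Only = and != are handled in this sketch.
function evaluate(node, item) {
  if (node.type === '&') return evaluate(node.left, item) && evaluate(node.right, item);
  if (node.type === '|') return evaluate(node.left, item) || evaluate(node.right, item);
  const actual = item[node.operand1];
  switch (node.operator) {
    case '=': return actual === node.operand2;
    case '!=': return actual !== node.operand2;
    default: throw new Error('Unsupported operator ' + node.operator);
  }
}

const filterTree = parse('displayState=error&(state!=aborted|(state=cancelled&state=timed out))');
const items = [
  { state: 'time out', displayState: 'error' },
  { state: 'aborted', displayState: 'error' },
  { state: 'cancelled', displayState: 'ok' }
];
const matching = items.filter((item) => evaluate(filterTree, item)); // keeps only the first item
```

Parsing once and reusing the tree for every object keeps the filtering step cheap; the `<` and `>` operators would need their own semantics before they could be added to `evaluate`.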

I've reviewed Aniket's regexp to clean catches

/([&|])?\(*(([a-zA-Z0-9 ]*)([!=<>]+)([a-zA-Z0-9 ]*))\)*/g

which will return

0 : {
    expression : displayState=error
    type : undefined
    operand1 : displayState
    operator : =
    operand2 : error
},
1 : {
    expression : &(state!=aborted
    type : &
    operand1 : state
    operator : !=
    operand2 : aborted
},
2 : {
    expression : |(state=cancelled
    type : |
    operand1 : state
    operator : =
    operand2 : cancelled
},
3 : {
    expression : &state=timed out))
    type : &
    operand1 : state
    operator : =
    operand2 : timed out
}
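
For reference, a minimal sketch of how objects shaped like the ones above could be built from that regexp. The function name `extractClauses` is made up for illustration, and `String.prototype.matchAll` assumes an ES2020+ environment; the property names mirror the output shown above.

```javascript
// Regexp from the edit above; the wrapper function is illustrative only.
const CLAUSE = /([&|])?\(*(([a-zA-Z0-9 ]*)([!=<>]+)([a-zA-Z0-9 ]*))\)*/g;

function extractClauses(input) {
  const clauses = [];
  // matchAll works on a copy of the regexp, so CLAUSE.lastIndex is left untouched.
  for (const m of input.matchAll(CLAUSE)) {
    clauses.push({
      expression: m[0], // whole match, including any parentheses
      type: m[1],       // "&" or "|", undefined for the first clause
      operand1: m[3],
      operator: m[4],
      operand2: m[5]
    });
  }
  return clauses;
}

const clauses = extractClauses(
  'displayState=error&(state!=aborted|(state=cancelled&state=timed out))'
);
```

Note that the parentheses only show up inside `expression`, so the nesting still has to be reconstructed separately.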

I'm working on isolating groups with JavaScript, and have a complete jsFiddle workflow.

I'll post my solution once it's working properly.

Yoann
  • `state=timed out` should it not be quoted like `state='timed out'`? – anubhava Oct 11 '14 at 04:46
  • It wouldn't change anything. I'm the one compiling the string, so the space isn't a reserved character nor to be escaped in this context. – Yoann Oct 11 '14 at 22:14
  • Difficult to understand what output you need from above expression. Your own regex is not doing it right by ignoring `(` and `)` and thus changing the meaning of whole expression. – anubhava Oct 12 '14 at 06:55
  • The output that @Aniket achieved is almost exactly what I needed. And, indeed my current RegExp isn't doing the job... otherwise I wouldn't be here in the first place. – Yoann Oct 12 '14 at 13:17
  • I had actually looked at his answer as well, and it lumps `|(` and `&(` into one token; moreover, it fails to capture `))`, which will again change the evaluation of the whole expression. – anubhava Oct 12 '14 at 14:52
  • That's right, but we won't be able to isolate groups delimited by `()` with a regexp alone, at least not in JavaScript, as it doesn't support recursive regular expressions. So ideally, we will isolate operands (`[a-zA-Z0-9 ]`), operators (`[!=<>]`) and associations (`[&|]`) with a regexp, but grouping will have to be done programmatically. – Yoann Oct 12 '14 at 18:31

1 Answer

This is what I have been able to come up with:

([\&\|\(]+)?([a-zA-Z0-9 ]*)([!=<>]+)([a-zA-Z0-9 ]*)

http://www.regexr.com/39mfq

This will give you the groups, operands and operators. There is a caveat: it only captures the opening parentheses, so you will have to add the closing ones yourself by checking the constructed groups.

Assuming I have this string

"displayState=error&(state!=aborted|(state=cancelled&state=timed out))"

From the regex given above, I will get the following groups:

displayState
=
error

&(
state
!=
aborted

|(
state
=
cancelled

&
state
=
timed out

It is fairly simple to compute this: start by checking for an opening ( and, if you find one, you know that the expression that follows it is enclosed in it.
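
A minimal sketch of that idea, under a few assumptions: the function name `termsWithDepth` and the `depth` and `connector` properties are made up for illustration, `matchAll` needs an ES2020+ environment, and closing parentheses are counted by looking at the input right after each match, since the regex itself doesn't capture them.

```javascript
// Regex from above, with the global flag added so every term is visited.
const TERM = /([\&\|\(]+)?([a-zA-Z0-9 ]*)([!=<>]+)([a-zA-Z0-9 ]*)/g;

function termsWithDepth(input) {
  const terms = [];
  let depth = 0; // number of currently open parentheses
  for (const m of input.matchAll(TERM)) {
    const prefix = m[1] || '';
    // Every "(" in the prefix opens a group that contains this term.
    depth += (prefix.match(/\(/g) || []).length;
    terms.push({
      connector: prefix.replace(/\(/g, '') || undefined, // "&", "|" or nothing
      operand1: m[2],
      operator: m[3],
      operand2: m[4],
      depth
    });
    // Closing parentheses sit right after the match; each one closes a group.
    const rest = input.slice(m.index + m[0].length);
    depth -= rest.match(/^\)*/)[0].length;
  }
  return terms;
}

const terms = termsWithDepth(
  'displayState=error&(state!=aborted|(state=cancelled&state=timed out))'
);
```

Terms sharing the same `depth` between an opening and its matching closing parenthesis belong to the same group, which is enough to rebuild the nesting afterwards.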

I know this isn't a great solution but it might help.

Aniket