I have a String that represents a condition:

weight ne 10

I would like to split this String into three parts and analyze them separately.

I used the Java split function which worked fine:

String[] parts = conditionString.split(" ");

However, the input value could possibly be something like this:

location eq "New York"

My code stopped working because of the space inside "New York".

I then tried Regex with this:

"^(\\w)+?\\s(\\w)+?\\s([\\w\\s\"])+$"

and it did not work.

Can someone please suggest a way to deal with this?

ZZZ
  • Instead of split use `"[^"]*"|\S+` for matching – anubhava May 03 '21 at 10:36
  • You can provide a limit to `String.split()` telling it the maximum number of array elements you want, e.g. `conditionString.split(" ", 3);`. Provided that the first 2 parts don't contain spaces you should be good. – Thomas May 03 '21 at 10:36
  • [Tokenizing a String but ignoring delimiters within quotes](https://stackoverflow.com/q/3366281) – Pshemo May 03 '21 at 10:50
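
The limit-based split suggested in the comments could look roughly like the sketch below. The class name `LimitSplitDemo` and the helper `splitCondition` are made up for illustration. The approach assumes the first two parts never contain spaces and that the parts are separated by exactly one space, so it would not handle a quoted first operand such as `"day of week" eq monday`.

```java
import java.util.Arrays;

class LimitSplitDemo {
    // Hypothetical helper: split on single spaces into at most three parts,
    // so everything after the second space stays together in the last part.
    static String[] splitCondition(String condition) {
        return condition.split(" ", 3);
    }

    public static void main(String[] args) {
        // prints [weight, ne, 10]
        System.out.println(Arrays.toString(splitCondition("weight ne 10")));
        // prints [location, eq, "New York"]
        System.out.println(Arrays.toString(splitCondition("location eq \"New York\"")));
    }
}
```

The quotes stay part of the third element, so they still need to be stripped if only the value is wanted.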

2 Answers

As suggested by anubhava, instead of splitting you can match the tokens with the regex `"[^"]*"|\S+`. Note that `Matcher.results()`, used in the code below, requires Java 9 or later.

import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        //Test
        System.out.println(getTokens("weight ne 10"));
        System.out.println(getTokens("location eq \"New York\""));
    }
    static List<String> getTokens(String str){
        return Pattern.compile("\"[^\"]*\"|\\S+")
                .matcher(str)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }
}

Output:

[weight, ne, 10]
[location, eq, "New York"]

Explanation of the regex:

  • " matches "
  • [^"]* matches non-" characters zero or more times
  • " matches "
  • | OR
  • \S+ matches non-whitespace one or more times
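
If the surrounding quotes should not be part of the returned value, one possible variant is to add capture groups to the same regex and read whichever group matched. This is only a sketch: the class name `UnquotedTokensDemo` and the method `getUnquotedTokens` are hypothetical, and it uses a plain `find()` loop so it does not depend on `Matcher.results()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnquotedTokensDemo {
    // Group 1 captures the text inside a quoted token; group 2 captures a bare token.
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    // Hypothetical helper: returns the tokens with surrounding quotes removed.
    static List<String> getUnquotedTokens(String str) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(str);
        while (m.find()) {
            // Exactly one of the two groups participates in each match.
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [location, eq, New York]
        System.out.println(getUnquotedTokens("location eq \"New York\""));
        // prints [day of week, eq, monday]
        System.out.println(getUnquotedTokens("\"day of week\" eq monday"));
    }
}
```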
Arvind Kumar Avinash

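You can split on this regex, which matches a whitespace character only when it is followed by an even number of double quotes, i.e. when it lies outside a quoted section (this assumes the quotes are balanced):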

"\\s(?=([^\"]*\"[^\"]*\")*[^\"]*$)"

Example:

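import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        String[] conditions = {"weight ne 10",
                               "location eq \"New York\"",
                               "\"day of week\" eq monday"};

        // Split on whitespace that is followed by an even number of quotes,
        // i.e. whitespace outside any quoted section.
        for (String condition : conditions) {
            String[] parts = condition.split("\\s(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
            System.out.println(Arrays.toString(parts));
        }
    }
}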
Eritrean