0

I have a string. What is the best way to put the things between the $ signs into a list in Java?

    String temp = "$abc$and$xyz$";

How can I get all the variables within the $ signs as a list in Java, e.g. [abc, xyz]?

I can do it using StringTokenizer, but I want to avoid it if possible. Thanks.

Shah

8 Answers

9

Maybe you could think about calling String.split(String regex) ...
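A minimal sketch of that approach, assuming empty tokens (such as the one before the leading $) should be dropped. The class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class SplitOnDollar {
    // Hypothetical helper: splits on a literal '$' and drops empty tokens.
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<String>();
        // "$" is a regex metacharacter, so it is escaped as "\\$"
        for (String token : text.split("\\$")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("$abc$and$xyz$")); // [abc, and, xyz]
    }
}
```

Note that split returns every piece between two $ signs, so "and" is part of the result here.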

Riduidel
  • It's the recommended way to go! StringTokenizer is there just for backwards compatibility. – Tom Aug 30 '10 at 15:28
4

The pattern is simple enough that String.split should work here, but in the more general case, one alternative for StringTokenizer is the much more powerful java.util.Scanner.

    String text = "$abc$and$xyz$";
    Scanner sc = new Scanner(text);

    while (sc.findInLine("\\$([^$]*)\\$") != null) {
        System.out.println(sc.match().group(1));
    } // abc, xyz

The pattern to find is:

\$([^$]*)\$
  \_____/     i.e. literal $, a sequence of anything but $ (captured in group 1)
     1                 and another literal $

The […] is a character class. Something like [aeiou] matches one of any of the lowercase vowels. [^…] is a negated character class. [^aeiou] matches one of anything but the lowercase vowels.

(…) is used for grouping. (pattern) is a capturing group and creates a backreference.

The backslash preceding the $ (outside of a character class definition) is used to escape the $, which otherwise has a special meaning as the end-of-line anchor. That backslash is doubled in a String literal: "\\" is a String of length one containing a backslash.

This is not a typical usage of Scanner (usually the delimiter pattern is set, and tokens are extracted using next), but it does show how you'd use findInLine to find an arbitrary pattern (ignoring delimiters), and then use match() to access the MatchResult, from which you can get individual group captures.
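For comparison, the more typical delimiter-based usage might look like the sketch below. It assumes one or more $ signs act as the delimiter; the class and method names are illustrative only:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerDelimiterDemo {
    // Hypothetical helper: treats runs of '$' as the delimiter and
    // collects whatever tokens lie between them.
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<String>();
        Scanner sc = new Scanner(text).useDelimiter("\\$+");
        while (sc.hasNext()) {
            result.add(sc.next());
        }
        sc.close();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("$abc$and$xyz$")); // [abc, and, xyz]
    }
}
```

Unlike the findInLine version, this returns every token between delimiters, including "and".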

You can also use this Pattern in a Matcher find() loop directly.

    Matcher m = Pattern.compile("\\$([^$]*)\\$").matcher(text);
    while (m.find()) {
        System.out.println(m.group(1));
    } // abc, xyz
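Since the question asks for a list, the same loop can collect the captures instead of printing them. This is a sketch; the class and method names are assumptions, not part of any library:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarMatcher {
    // Compiled once: literal $, anything but $ (group 1), literal $
    private static final Pattern VARIABLE = Pattern.compile("\\$([^$]*)\\$");

    // Hypothetical helper: returns the contents of each $...$ pair.
    static List<String> variables(String text) {
        List<String> result = new ArrayList<String>();
        Matcher m = VARIABLE.matcher(text);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(variables("$abc$and$xyz$")); // [abc, xyz]
    }
}
```

Each match consumes both of its $ signs, which is why "and" is not captured.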

polygenelubricants
  • See example of a typical way to match `"quoted"` contents like `'this'` and `"o'my"` with regex ( http://stackoverflow.com/questions/3561353/matching-quote-contents/3561377#3561377 ) - you can do this with `Matcher` or `Scanner` as well. – polygenelubricants Aug 30 '10 at 15:41
1

Just try this one: temp.split("\\$");

khotyn
1

I would go for a regex myself, like Riduidel said.

This special case is, however, simple enough that you can just treat the String as a character sequence, iterate over it char by char, detect the $ signs, and grab the strings in between yourself.

On a side note, I would try to go for different demarcation characters, to make it more readable to humans. Use $ as start-of-sequence and something else as end-of-sequence, for instance. Or something like what I think the Bash shell uses: ${some_value}. As said, the computer doesn't care, but you debugging your string just might :)

As for an appropriate regex, something like \\$([^$]*)\\$ should do; a greedy .* between the two $ signs would run on to the last $ in the string. I'm no expert on regexes, though (see http://www.regular-expressions.info for nice info on regexes).
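To illustrate the ${some_value} style of delimiter suggested above, here is a minimal sketch. The class name, method name, and sample template are invented for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderFinder {
    // Literal "${", anything but "}" (group 1), literal "}"
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]*)\\}");

    // Hypothetical helper: returns the names inside each ${...} placeholder.
    static List<String> names(String template) {
        List<String> result = new ArrayList<String>();
        Matcher m = PLACEHOLDER.matcher(template);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(names("Hello ${first} and ${last}")); // [first, last]
    }
}
```

With distinct start and end markers, text between two placeholders can never be mistaken for a variable.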

extraneon
  • Whether or not human-readable delimiters matter depends on whether humans will ever read these strings! If you're asking a user to type these in, then yes, it's a curious delimiter. If this is something used internally or passed between modules, then it doesn't matter if it's human-readable. – Jay Aug 30 '10 at 15:57
  • @Jay a developer is also human. If it is a template and it needs change it better be readable, just like other code. – extraneon Aug 30 '10 at 16:20
1

Basically, I'd second khotyn's answer as the easiest solution. I see from your comment on it that you don't want zero-length tokens at the beginning and end.

That brings up the question: What happens if the string does not begin and end with $'s? Is that an error, or are they optional?

If it's an error, then just start with:

    if (!text.startsWith("$") || !text.endsWith("$"))
        return "Missing $'s"; // or whatever you do on error

If that passes, fall into the split.

If the $'s are optional, I'd just strip them out before splitting. i.e.:

    if (text.startsWith("$"))
        text = text.substring(1);
    if (text.endsWith("$"))
        text = text.substring(0, text.length() - 1);

Then do the split.

Sure, you could make more sophisticated regexes or use StringTokenizer or no doubt come up with dozens of other complicated solutions. But why bother? When there's a simple solution, use it.

PS There's also the question of what result you want to see if there are two $'s in a row, e.g. "$foo$$bar$". Should that give ["foo","bar"], or ["foo","","bar"]? Khotyn's split will give the second result, with zero-length strings. If you want the first result, you should split("\\$+").
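A small sketch of the strip-then-split idea, comparing the two regexes on the doubled-$ input. The class and method names are hypothetical, and the $ is escaped as "\\$" because it is written in a Java string literal:

```java
import java.util.Arrays;

class StripAndSplit {
    // Hypothetical helper: removes one optional leading and trailing '$',
    // then splits the remainder on the given regex.
    static String[] split(String text, String regex) {
        if (text.startsWith("$")) {
            text = text.substring(1);
        }
        if (text.endsWith("$")) {
            text = text.substring(0, text.length() - 1);
        }
        return text.split(regex);
    }

    public static void main(String[] args) {
        // Single '$' as delimiter keeps the zero-length token
        System.out.println(Arrays.toString(split("$foo$$bar$", "\\$")));  // [foo, , bar]
        // One or more '$' as delimiter collapses it
        System.out.println(Arrays.toString(split("$foo$$bar$", "\\$+"))); // [foo, bar]
    }
}
```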

Jay
0

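You can use

    // needs java.util.ArrayList, java.util.List and java.util.regex.Pattern
    String temp = "$abc$and$xyz$";
    // Pattern.quote makes "$" a literal instead of the end-of-line anchor
    String[] array = temp.split(Pattern.quote("$"));
    List<String> list = new ArrayList<String>();
    for (int i = 0; i < array.length; i++) {
        if (!array[i].isEmpty()) { // skip the empty token before the first $
            list.add(array[i]);
        }
    }

Now the list holds every non-empty token: [abc, and, xyz].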

Alexander Vogt
Santosh
0

If you want a simple split function, use Apache Commons Lang, which has StringUtils.split. The Java one uses a regex, which can be overkill or confusing.

Mike Q
0

You can do it in a simple manner by writing your own code. The following code will do the job for you:

    import java.util.ArrayList;
    import java.util.List;

    public class MyStringTokenizer {

        public static void main(String[] args) {
            List<String> result = getTokenizedStringsList("$abc$efg$hij$");

            for (String token : result) {
                System.out.println(token);
            }
        }

        private static List<String> getTokenizedStringsList(String string) {
            List<String> tokenList = new ArrayList<String>();
            char[] in = string.toCharArray();
            int stringLength = in.length;

            for (int i = 0; i < stringLength;) {
                // skip ahead to the next '$'
                while (i < stringLength && in[i] != '$') {
                    i++;
                }
                i++; // step past the '$'

                // collect characters up to the following '$' (or end of input)
                StringBuilder myBuilder = new StringBuilder();
                while (i < stringLength && in[i] != '$') {
                    myBuilder.append(in[i]);
                    i++;
                }
                // skip empty tokens, e.g. the one after the final '$'
                if (myBuilder.length() > 0) {
                    tokenList.add(myBuilder.toString());
                }
            }
            return tokenList;
        }
    }

Saurabh