0

I'm trying to use the bucket sort algorithm to sort strings. The task says the runtime should be about 0.05 seconds, but mine takes over 9. Why is mine taking so long, and how can I make it faster. It has about 90000 names in the file. Am I even doing the bucket sort properly?

public static void bucketSortByLength() {
      String[] bucket = new String[14];
      String[] insideBucket;
      int index = 0;
      for(int i = 0; i <= 13; i++)
        bucket[i] = "";
      for(int i = 0; i < numNames; i++)
        bucket[names[i].length()] += names[i] + " ";
      for(int i = 0; i <= 13; i++){
        insideBucket = bucket[i].split("\\s+");
        for(String s : insideBucket)
          names[index++] = s;
      }
    }
Fianite
  • 309
  • 2
  • 8
  • Repeated concatenation kills performance. Use `StringBuilder`. Also, depending on the task, the desired result might not be 14 concatenated strings, but maybe a `String[][]` (or better, `ArrayList>`). – Amadan Nov 09 '15 at 01:59
  • I can't use StringBuilder for this particular case (limitations on what I can/can't use), but good suggestion. I don't understand how I'd implement a String[][] given that I'm only storing one set of strings per bucket, and the bucket id is just it's index? Can you elaborate? – Fianite Nov 09 '15 at 02:03
  • Your bucket is a string (`"A B C"`), not a set of strings. With `String[][]`, it could actually be a set of strings (`{"A", "B", "C"}`). – Amadan Nov 09 '15 at 02:05
  • This code doesn't appear to sort the strings, unless your definition of "sort" is different from mine. Are you supposed to just sort them by length without caring about putting them into alphabetical (lexicographic) order? If so, please edit your question and clarify. If you're supposed to put the strings in order, then your approach is totally wrong. Also, if there are any restrictions on what you can or can't use, please specify those in the question. – ajb Nov 09 '15 at 02:07
  • It worked! Got it done to 0.04! Thanks guys! – Fianite Nov 09 '15 at 02:22

2 Answers2

0

Your code is likely slow because it's doing something that can be really inefficient in Java: String values are immutable, so concatenation creates a whole new string each time.

You should use a data type that supports efficient appends for your buckets. Two options that come to mind are StringBuilder and LinkedList<String>. (The latter would have been better when I last used Java everyday, and probably still is.)

Will Angley
  • 1,382
  • 6
  • 11
0

As others have stated, this is inefficient because of the string concatenation.

A good rule of thumb:
Whenever you find your self concatenating strings inside a loop, it's best to use a StringBuilder.
See: When to use StringBuilder in Java

This code should give you the performance you are expecting:

public static void bucketSortByLength() {
    StringBuilder[] bucket = new StringBuilder[14];
    int index = 0;
    for(int i = 0; i <= 13; i++)
        bucket[i] = new StringBuilder();
    for(int i = 0; i < numNames; i++)
        bucket[names[i].length()].append(names[i]).append(' ');

    for(int i = 0; i <= 13; i++){
        String[] insideBucket = bucket[i].toString().split("\\s+");
        for(String s : insideBucket)
            names[index++] = s;
    }
}
Community
  • 1
  • 1
Andy Guibert
  • 34,857
  • 7
  • 32
  • 54