0
function binarySearch(value)
{
    var startIndex = 0,
        stopIndex = words.length - 1,
        middle = Math.floor((stopIndex + startIndex) / 2);

    while (words[middle] != value && startIndex < stopIndex) {
        // adjust search area
        if (value < words[middle]) {
            stopIndex = middle - 1;
        } else if (value > words[middle]) {
            startIndex = middle + 1;
        }

        // recalculate middle
        middle = Math.floor((stopIndex + startIndex) / 2);
    }
}

I am making a large list of words in the format of an array:

e.g. ["a","ab","abc","b"]

In alphabetical order. The problem I'm having is modifying my binary search algorithm to add the word in the correct place and then update?

Whats the best way performance-wise to add a word into an ordered array? And why is it the best way to do it?

Sean Bright
  • 109,632
  • 17
  • 131
  • 138
Lemex
  • 3,646
  • 14
  • 47
  • 82

2 Answers2

5

For an efficient binary search insertion, you'll want to have your binary search return something that indicates where the string would belong in the array if it is not found.

The accepted method of doing this in other languages is to return the bitwise complement of the index where the string belongs. The bitwise complement of 0 is -1, the bitwise complement of 1 is -2, 2 is -3, and so on. To get the bitwise complement of a number in JavaScript, use the ~ operator.

Example code:

/* 
    target: the object to search for in the array
    comparator: (optional) a method for comparing the target object type
    return value: index of a matching item in the array if one exists, otherwise the bitwise complement of the index where the item belongs
*/
Array.prototype.binarySearch = function (target, comparator) {
    var l = 0,
        h = this.length - 1,
        m, comparison;
    comparator = comparator || function (a, b) {
        return (a < b ? -1 : (a > b ? 1 : 0)); /* default comparison method if one was not provided */
    };
    while (l <= h) {
        m = (l + h) >>> 1; /* equivalent to Math.floor((l + h) / 2) but faster */
        comparison = comparator(this[m], target);
        if (comparison < 0) {
            l = m + 1;
        } else if (comparison > 0) {
            h = m - 1;
        } else {
            return m;
        }
    }
    return~l;
};

You can then use the binarySearch method to write your own binaryInsert function:

/*
    target: the object to insert into the array
    duplicate: (optional) whether to insert the object into the array even if a matching object already exists in the array (false by default)
    comparator: (optional) a method for comparing the target object type
    return value: the index where the object was inserted into the array, or the index of a matching object in the array if a match was found and the duplicate parameter was false 
*/
Array.prototype.binaryInsert = function (target, duplicate, comparator) {
    var i = this.binarySearch(target, comparator);
    if (i >= 0) { /* if the binarySearch return value was zero or positive, a matching object was found */
        if (!duplicate) {
            return i;
        }
    } else { /* if the return value was negative, the bitwise complement of the return value is the correct index for this object */
        i = ~i;
    }
    this.splice(i, 0, target);
    return i;
};

Once these methods are prototyped onto the array object, you can use them directly like so:

var arr = [];
arr.binaryInsert("Zebra");
arr.binaryInsert("Aardvark");
arr.binaryInsert("Mongoose");
alert(arr);
/* [ "Aardvark", "Mongoose", "Zebra" ] */

As the number of items increases, this will be signficantly faster than calling Array.sort()

Array Property Key Pollution

Note that prototyping methods onto the Array object as in the above code causes the methods to appear as enumerable properties of your arrays, which could interfere with any logic where you're enumerating all properties in a for(var i in arr) loop. Loops written in the format of for(var i=0; i<arr.length; i++) will still work as designed.

If you don't need to support Internet Explorer 8 or below, you can avoid calling Array.prototype directly and instead use Object.defineProperty as in the below examples.

Object.defineProperty(Array.prototype, "binarySearch", {
value: function (target, comparator) {
    var l = 0,
        h = this.length - 1,
        m, comparison;
    comparator = comparator || function (a, b) {
        return (a < b ? -1 : (a > b ? 1 : 0));
    };
    while (l <= h) {
        m = (l + h) >>> 1;
        comparison = comparator(this[m], target);
        if (comparison < 0) {
            l = m + 1;
        } else if (comparison > 0) {
            h = m - 1;
        } else {
            return m;
        }
    }
    return~l;
}
});

Object.defineProperty(Array.prototype, "binaryInsert", {
value: function (target, duplicate, comparator) {
    var i = this.binarySearch(target, comparator);
    if (i >= 0) {
        if (!duplicate) {
            return i;
        }
    } else {
        i = ~i;
    }
    this.splice(i, 0, target);
    return i;
}
});

This approach will avoid polluting the enumerable keys, so for(var i in arr) loops will still work as expected.

Thriggle
  • 6,799
  • 2
  • 24
  • 36
1

The best way is to use trees instead, since for an array such an operation is likely to have linear algorithmic complexity.

If you'd like to stick with the arrays, I suggest that you use the splice method for insertion.

Qnan
  • 3,534
  • 16
  • 15
  • Could you point me anywhere that will convert an array to a tree? Its a diconary bascily? over 600,000 words.. – Lemex Sep 11 '12 at 12:45
  • And i know how to insert, if you read the question its about insertying it in the right place as its in alphatecial order – Lemex Sep 11 '12 at 12:46
  • @LiamMcCann what are you actually trying to do with those words? – Qnan Sep 11 '12 at 12:46
  • Custom spell checker , will be moving to a tree format in the next month or so, but for now array has to be used – Lemex Sep 11 '12 at 12:47
  • @LiamMcCann why not use an associative array instead? – Qnan Sep 11 '12 at 12:48
  • Just a simple binary search insertion required for now – Lemex Sep 11 '12 at 13:00
  • @LiamMcCann If you want a quick solution, stuff the words in an array and sort it. I'm pretty sure this will work faster than the binary-search-and-insert approach. – Qnan Sep 11 '12 at 13:06
  • For 600,000 words + that wont be quick – Lemex Sep 11 '12 at 13:23
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/16554/discussion-between-qnan-and-liam-mccann) – Qnan Sep 11 '12 at 16:25