1

Array.prototype.sort()

In compareFunction(a, b), only when we need to exchange a and b's position, we return a positive value.

If negative if-statement in compareFunction is omitted, the Array.prototype.sort() still works, so why should developers write if-statement which returns a negative value?

var list = [4, 5, 3, 5, 6, 9, 1, 4, 2];
list = list.sort(function(a, b) {
  if (a > b) {
    return 1;
  }
});
console.log(list); // correct result
Serge K.
  • 4,923
  • 18
  • 27
Ssuching Yu
  • 165
  • 2
  • 12
  • The idea is, especially for numbers (or the ASCII code for characters), that an if statement is not necessary. You would just return `a - b`. Comparing numbers is an arithmetic operation. If the difference is negative then we know the first operand is less than the second. If it's zero then they are the same value. If greater than zero then the first operand is greater than the second – tiguchi Aug 19 '17 at 16:02
  • The idea of a comparison function is : {f, a, b, X | a∈X ∧ b∈X ∧ f(a,b)∈ℤ ∧ (ab ⇒ f(a,b)>0)} – Vivick Aug 19 '17 at 17:26
  • When you only define the case for `a>b` you only partially define a comparison function (because your set of input only partially defines an order relation) – Vivick Aug 19 '17 at 17:31

4 Answers4

3

The main problem here is that you've invented your own definition of the comparison function and are basing your question off of that:

In compareFunction(a, b), only when we need to exchange a and b's position, we return a positive value.

This is incorrect. "When we need to exchange a and b's position" is an implementation detail, and you are confusing implementation with interface.

The compareFunction is not responsible for indicating when two elements should be swapped. It is responsible for accurately conveying the relationship of two elements. What the sort algorithm does with that information is up to the implementer. If you only return the correct value some of the time, then you can't expect a correct result all of the time.

For example, a sort implementer could implement the sort like this (based off the example at https://www.nczonline.net/blog/2012/09/17/computer-science-in-javascript-insertion-sort/). If I run it with a valid comparison function, it produces the correct result:

function insertionSort(items, compare) {

  var len = items.length, // number of items in the array
    value, // the value currently being compared
    i, // index into unsorted section
    j; // index into sorted section

  for (i = 0; i < len; i++) {

    // store the current value because it may shift later
    value = items[i];

    for (j = i - 1; j > -1 && compare(value, items[j]) < 0; j--) {
      items[j + 1] = items[j];
    }

    items[j + 1] = value;
  }

  return items;
}

console.log(insertionSort([4,2,6,1,7,2], (l, r) => l - r));

If I instead run it with your comparison function, it does nothing:

function insertionSort(items, compare) {

  var len = items.length, // number of items in the array
    value, // the value currently being compared
    i, // index into unsorted section
    j; // index into sorted section

  for (i = 0; i < len; i++) {

    // store the current value because it may shift later
    value = items[i];

    for (j = i - 1; j > -1 && compare(value, items[j]) < 0; j--) {
      items[j + 1] = items[j];
    }

    items[j + 1] = value;
  }

  return items;
}

console.log(insertionSort([4,2,6,1,7,2], function(a, b) {
    if (a > b) {
        return 1;
    }
}));
JLRishe
  • 90,548
  • 14
  • 117
  • 150
1

This works in your case because you didn't test all the possibilities. But, if you look inside the implementation you'll see that the engine doesn't use the same algorithm on short arrays (ie. length <= 10) than on longer arrays. In fact, insertion sort is used on short arrays while QuickSort is being used on long arrays.

Since your implementation must define which number is higher, beneath or equal to another, it'll fail in your case when it comes to longer arrays because you forgot to implement the 'beneath' case (and equal case is implied, because your function will return undefined when b >= a which will be interpreted as 0), so QuickSort will fail to correctly sort your array because it cannot know when a number is less than another while the insertion sort will work thanks to it's algorithm which relies on the 'more than' comparison if I understood it correctly.

See my examples below :

var shortList = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
    list = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0];
    
console.log('Works : ', shortList.sort(function(a, b) {
  if (a > b) {
    return 1;
  }
})); // You're being lucky on this one. Insertion sort.

console.log('Doesnt work : ', list.sort(function(a, b) {
  if (a > b) {
    return 1;
  }
})); // QuickSort

console.log('Works : ', list.sort(function(a, b) {
  if (a > b) {
    return 1;
  } else if (a < b) {
    return -1;
  }
  
  return a - b; // Can be reduced to 'return a - b';
})); // QuickSort
Serge K.
  • 4,923
  • 18
  • 27
0

If you don't follow the specs, then you'll most probably see inconsistencies across engines, as browsers (for example) don't know how to handle it. Chrome, Firefox and Node.js seem to be kind enough to sort the array as you expect, but Safari does not sort it for example:

[4, 5, 3, 5, 6, 9, 1, 4, 2]

I wish all these browsers simply fail when specs are not fulfilled, like "Error: RTM".

Funk Forty Niner
  • 73,764
  • 15
  • 63
  • 131
sp00m
  • 44,266
  • 23
  • 127
  • 230
0

Having all three cases (a < b, a = b, a > b) allows you to have a Total Ordering. But if you specify just one case – just a < b, you end up with a Weak Ordering. The difference is rooted with Mathematics – but to simplify things: with former scheme, you do care about how to order elements that tie with each other, and with the latter scheme, you don't necessarily care.

var list = [
    { age: 65, name: 'Tony'},
    { age: 24, name: 'Joe'},
    { age: 24, name: 'Susan' } // Joe and Susan are tied,
    { age: 5, name: 'Alice'},
];

Suppose we sort the above list of employees by their age. With a total order, we are guaranteed to have the order: Alice, Joe, Susan, Tony. So although Joe and Susan have the same age, their relative order is preserved after the sort. With a Weak ordering however, we will have Alice first and Tony last, but the order of Joe and Susan is left open for choosing. These employees have the same age and so are tied (equal). That's bad for weak orderings as weak orderings don't specify how to order ties in the result – it's ambiguous! So with a weak ordering: we may end up with the result: Alice, Susan, Joe, Tony. When a sorting algorithm preserves the order of ties, we say it is a stable sort.

If your sorting function is like Arrays.prototype.sort() and expects a total ordering, you should always provide all three cases! If you don't do this: 1. tied elements may not be sorted correctly, 2. the browser may be confused and sorting algorithm may fail to correctly sort all the elements (even when there aren't any ties!).

// (*) Array.prototype.sort expects total ordering 
// ... so three cases needed
sort(list, function(a, b) { // (*)
    if (a.age < b.age) return -1;
    if (a.age > b.age) return  1;
    return 0;
});

You would only provide one case if the sorting function you're using is expecting a weak ordering. The C++ Standard Library gives a perfect example of a function that expects a weak ordering.

// (*) C++ STL uses weak orderings ...
// ... so only one case needed
struct Employee { int age; string name };
vector<Employee> employees = {
    { 65, "Tony" },
    { 24, "Joe" },
    { 24, "Susan" },
    { 5, "Alice" }
};
struct Sorter {
    bool operator()(const Employee &e1, const Employee &e2) const {
        return e1.age < e2.age; // (*)
    }
};
sort(employees.begin(), employees.end(), Sorter());

Here, you just need to provide just one case a < b in the sorting function. When you sort the elements and there is a tie, it is left open as to how the tied elements should be arranged. From C++ Reference for sort:

comp ... The value returned indicates whether the element passed as first argument is considered to go before the second in the specific strict weak ordering it defines ... Equivalent elements are not guaranteed to keep their original relative order

It turns out the original designer of the C++ Standard Library, Alexander Stepanov, perhaps did not want to use weak orderings and would've preferred to use total orderings (like Javascript!) – probably in order to prevent these ambiguities with ties. In fact many other languages (including Java and Python) use total orderings for their sorting functions. Total orderings are great because they remove ambiguity, so you should make the effect to provide all three cases.

James Lawson
  • 6,643
  • 39
  • 40