24

I've been trying to calculate the median, but I can't get the correct value and can't figure out why; I suspect a mathematical error somewhere. Here's the code:

class StatsCollector {

    constructor() {
        this.inputNumber = 0;
        this.average = 0;

        this.timeout = 19000;

        this.frequencies = new Map();
        for (let i of Array(this.timeout).keys()) {
            this.frequencies.set(i, 0);
        }
    }

    pushValue(responseTimeMs) {
        let req = responseTimeMs;
        if (req > this.timeout) {
            req = this.timeout;
        }

        this.average = (this.average * this.inputNumber + req) / (this.inputNumber + 1);

        console.log(responseTimeMs / 1000)
        let groupIndex = Math.floor(responseTimeMs / 1000);
        this.frequencies.set(groupIndex, this.frequencies.get(groupIndex) + 1);

        this.inputNumber += 1;
    }

    getMedian() {
        let medianElement = 0;
        if (this.inputNumber <= 0) {
            return 0;
        }
        if (this.inputNumber == 1) {
            return this.average
        }
        if (this.inputNumber == 2) {
            return this.average
        }
        if (this.inputNumber > 2) {
            medianElement = this.inputNumber / 2;
        }

        let minCumulativeFreq = 0;
        let maxCumulativeFreq = 0;
        let cumulativeFreq = 0;
        let freqGroup = 0;
        for (let i of Array(20).keys()) {
            if (medianElement <= cumulativeFreq + this.frequencies.get(i)) {
                minCumulativeFreq = cumulativeFreq;
                maxCumulativeFreq = cumulativeFreq + this.frequencies.get(i);
                freqGroup = i;
                break;
            }
            cumulativeFreq += this.frequencies.get(i);
        }

        return (((medianElement - minCumulativeFreq) / (maxCumulativeFreq - minCumulativeFreq)) + (freqGroup)) * 1000;
    }

    getAverage() {
        return this.average;
    }

}

Here's a snapshot of the results when I enter the values

342,654,987,1093,2234,6243,7087,20123

[screenshot of the results; image not available]

The correct result should be:

Median: 1663.5

msanford
Muhammed Bayram
  • maybe look [here](https://stackoverflow.com/questions/25305640/find-median-values-from-array-in-javascript-8-values-or-9-values) – Radek Hofman Jul 25 '17 at 16:57
  • 3
    To calculate the median, you have to sort the values and pick the middle one. – Pointy Jul 25 '17 at 16:58
  • 2
    That's not a median. The median should be in the set. – jmargolisvt Jul 25 '17 at 16:58
  • My first guess would be that you have a rounding error. – victor Jul 25 '17 at 16:58
  • 1
    The median is the middle number of the sorted list if there is an odd number of values; if there is an even number, the median is the midpoint (average) of the central two values. – Mark B Jul 25 '17 at 17:06
  • 1
    Possible duplicate of [find median values from array in javascript (8 values or 9 values)](https://stackoverflow.com/questions/25305640/find-median-values-from-array-in-javascript-8-values-or-9-values) – str Jul 25 '17 at 17:09
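For context, the `getMedian` in the question reads like the grouped-data (histogram) median estimate with 1000 ms buckets: it interpolates inside the bucket that contains rank n/2 instead of looking at the actual values. The sketch below re-creates that calculation in a standalone function; `bucketedMedian` and its `bucketWidthMs` parameter are hypothetical names used only for illustration, and the sketch skips empty buckets rather than looping over a fixed range. For the eight values in the question, three fall in bucket 0 and one in bucket 1, so rank 4 lands at the top of bucket 1 and the estimate is 2000 rather than 1663.5.

```javascript
// Illustrative re-creation of the question's bucketed estimate (hypothetical helper).
function bucketedMedian(values, bucketWidthMs = 1000) {
  // Count how many values fall into each bucket of width bucketWidthMs.
  const freq = new Map();
  for (const v of values) {
    const group = Math.floor(v / bucketWidthMs);
    freq.set(group, (freq.get(group) || 0) + 1);
  }

  const target = values.length / 2; // rank of the median element
  let cumulative = 0;
  for (const group of [...freq.keys()].sort((a, b) => a - b)) {
    const f = freq.get(group);
    if (target <= cumulative + f) {
      // Interpolate linearly inside the bucket that contains the target rank.
      return (group + (target - cumulative) / f) * bucketWidthMs;
    }
    cumulative += f;
  }
  return undefined; // empty input
}

console.log(bucketedMedian([342, 654, 987, 1093, 2234, 6243, 7087, 20123])); // 2000
```

An estimate of this kind can only be as precise as the bucket width, which is why the comments above point towards sorting the raw values instead.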

10 Answers

55

Change your median method to this:

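function median(values){
  if(values.length === 0) return 0;

  // Sort numerically, ascending (note: this sorts the caller's array in place)
  values.sort(function(a,b){
    return a-b;
  });

  var half = Math.floor(values.length / 2);

  // Odd count: the middle element
  if (values.length % 2)
    return values[half];

  // Even count: the mean of the two middle elements
  return (values[half - 1] + values[half]) / 2.0;
}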

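One way this could plug into the question's class is sketched below, under the assumption that it is acceptable to keep every sample in memory. `ExactStatsCollector` is a hypothetical name, not the asker's class, and the sketch leaves out the average and the timeout clamping.

```javascript
// Hypothetical variant of the question's StatsCollector that keeps every
// sample, so the median is exact rather than estimated from 1000 ms buckets.
class ExactStatsCollector {
  constructor() {
    this.values = [];
  }

  pushValue(responseTimeMs) {
    this.values.push(responseTimeMs);
  }

  getMedian() {
    if (this.values.length === 0) return 0; // same convention as the question's code
    // Sort a copy so the insertion order of the samples is preserved.
    const sorted = this.values.slice().sort((a, b) => a - b);
    const half = Math.floor(sorted.length / 2);
    return sorted.length % 2
      ? sorted[half]
      : (sorted[half - 1] + sorted[half]) / 2;
  }
}

const stats = new ExactStatsCollector();
for (const ms of [342, 654, 987, 1093, 2234, 6243, 7087, 20123]) {
  stats.pushValue(ms);
}
console.log(stats.getMedian()); // 1663.5
```

The two middle values of the sorted input are 1093 and 2234, whose mean is the 1663.5 the question expects.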

Azim Saiyed
jdmdevdotnet
6

The solutions above (sort, then take the middle) are fine, but slow on large data sets: sorting the data first has a complexity of O(n log n).

There is a faster median algorithm (quickselect), which partitions the array in two around a pivot, then recurses into whichever part contains the median. Here is some JavaScript code; a more detailed explanation is in Russel Cohen's post, credited below.

// Try it on a small array
console.log(quickselect_median([7, 3, 5])); // 5

function quickselect_median(arr) {
   // Use an integer index: L / 2 alone would be fractional for odd lengths
   const L = arr.length, halfL = Math.floor(L / 2);
   if (L % 2 == 1)
      return quickselect(arr, halfL);
   else
      return 0.5 * (quickselect(arr, halfL - 1) + quickselect(arr, halfL));
}

function quickselect(arr, k) {
   // Select the kth element in arr
   // arr: List of numerics
   // k: Index
   // return: The kth element (in numerical order) of arr
   if (arr.length == 1)
      return arr[0];
   else {
      const pivot = arr[0];
      const lows = arr.filter((e)=>(e<pivot));
      const highs = arr.filter((e)=>(e>pivot));
      const pivots = arr.filter((e)=>(e==pivot));
      if (k < lows.length) // the pivot is too high
         return quickselect(lows, k);
      else if (k < lows.length + pivots.length)// We got lucky and guessed the median
         return pivot;
      else // the pivot is too low
         return quickselect(highs, k - lows.length - pivots.length);
   }
}

Astute readers will notice a few things:

  1. I simply transliterated Russel Cohen's Python solution into JS, so all kudos to him.
  2. There are several small optimisations worth doing, and parallelisation is worth doing too; the code as it stands is easy to adapt into either a quicker single-threaded or a quicker multi-threaded version.
  3. This is the average-case linear-time algorithm. There is also a deterministic linear-time version with a better worst case; see Russel's post for details, including performance data.

ADDITION 19 Sept. 2019:

One comment asks whether this is worth doing in JavaScript. I ran the code in JSPerf and it gives interesting results.

  • if the array has an odd number of elements (one figure to find), sorting is 20% slower than this "fast median" proposition.

  • if there is an even number of elements, the "fast" algorithm is 40% slower, because it filters through the data twice, to find elements number k and k+1 to average. It is possible to write a version of fast median that doesn't do this.

The test used rather small arrays (29 elements in the jsperf test). The effect appears to be more pronounced as arrays get larger. More generally, it shows that these kinds of optimisations are worth doing in JavaScript: an awful lot of computation is done in JS, including with large amounts of data (think of dashboards, spreadsheets, data visualisations) and in systems with limited resources (think of mobile and embedded computing).
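The remark above, that a fast median need not run the selection twice for even lengths, can be sketched as follows. This is one possible approach rather than the version benchmarked in the answer: it assumes an in-place quickselect with a random pivot (Lomuto partitioning), after which everything to the left of index k is no larger than the selected element, so the lower middle value is simply the maximum of that left part. `selectInPlace` and `fastMedian` are hypothetical names.

```javascript
// In-place quickselect: after the call, a[k] holds the k-th smallest value
// and every element before index k is <= a[k].
function selectInPlace(a, k) {
  let lo = 0, hi = a.length - 1;
  while (lo < hi) {
    // Pick a random pivot and move it to the end (Lomuto partition scheme).
    const p = lo + Math.floor(Math.random() * (hi - lo + 1));
    [a[p], a[hi]] = [a[hi], a[p]];
    const pivot = a[hi];
    let store = lo;
    for (let i = lo; i < hi; i++) {
      if (a[i] < pivot) {
        [a[i], a[store]] = [a[store], a[i]];
        store++;
      }
    }
    [a[store], a[hi]] = [a[hi], a[store]]; // pivot lands at its final position
    if (k === store) return a[k];
    if (k < store) hi = store - 1;
    else lo = store + 1;
  }
  return a[k];
}

function fastMedian(values) {
  if (values.length === 0) return undefined;
  const a = values.slice(); // work on a copy; the input is left untouched
  const k = Math.floor(a.length / 2);
  const upper = selectInPlace(a, k);
  if (a.length % 2 === 1) return upper;
  // Even length: the lower middle value is the largest element left of index k.
  let lower = a[0];
  for (let i = 1; i < k; i++) {
    if (a[i] > lower) lower = a[i];
  }
  return (lower + upper) / 2;
}

console.log(fastMedian([342, 654, 987, 1093, 2234, 6243, 7087, 20123])); // 1663.5
```

The extra scan for the lower middle value is a single linear pass, so the even case costs one selection plus one pass instead of two selections. Like the code above, this remains average-case linear; arrays with many repeated values can degrade this simple partition scheme.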

boisvert
  • I'm trying to understand whether it makes sense for JavaScript. This quickselect algorithm seems to be trying to implement a Quicksort algorithm by hand. In JavaScript, the sort algorithm used depends on the size of the array and on the browser too. When you use Array.sort(), an optimized sort algorithm is chosen in the background. I could be mistaken, of course; what do you think about it? – Thiago C. S Ventura Jul 26 '19 at 13:22
  • My answer is probably a product of my bent as a computing *educator* more than a *practitioner*: I teach the thing, so there's a fantastic lesson in this. Whether the algorithm above is a good idea depends on why you're doing this, as always. How big are the arrays? Are there lots of them? Will you need the data sorted at some point, or need other stats like quartiles? Do you use other libraries that change the tool selection? Is time performance a big factor for you? Are resources a big factor for you? – boisvert Aug 16 '19 at 17:44
  • 1
    @ThiagoC.SVentura your comment prompted me to test whether the difference would be visible in JSPerf. I added the results to the answer. – boisvert Sep 19 '19 at 13:30
5


var arr = {  
  max: function(array) {
    return Math.max.apply(null, array);
  },

  min: function(array) {
    return Math.min.apply(null, array);
  },

  range: function(array) {
    return arr.max(array) - arr.min(array);
  },

  midrange: function(array) {
    // The midrange is the mean of the largest and smallest values
    return (arr.max(array) + arr.min(array)) / 2;
  },

  sum: function(array) {
    var num = 0;
    for (var i = 0, l = array.length; i < l; i++) num += array[i];
    return num;
  },

  mean: function(array) {
    return arr.sum(array) / array.length;
  },

  median: function(array) {
    array.sort(function(a, b) {
      return a - b;
    });
    var mid = array.length / 2;
    return mid % 1 ? array[mid - 0.5] : (array[mid - 1] + array[mid]) / 2;
  },

  modes: function(array) {
    if (!array.length) return [];
    // Start empty so the first value is not pushed twice
    var modeMap = {},
      maxCount = 0,
      modes = [];

    array.forEach(function(val) {
      if (!modeMap[val]) modeMap[val] = 1;
      else modeMap[val]++;

      if (modeMap[val] > maxCount) {
        modes = [val];
        maxCount = modeMap[val];
      }
      else if (modeMap[val] === maxCount) {
        modes.push(val);
        maxCount = modeMap[val];
      }
    });
    return modes;
  },

  variance: function(array) {
    var mean = arr.mean(array);
    return arr.mean(array.map(function(num) {
      return Math.pow(num - mean, 2);
    }));
  },

  standardDeviation: function(array) {
    return Math.sqrt(arr.variance(array));
  },

  meanAbsoluteDeviation: function(array) {
    var mean = arr.mean(array);
    return arr.mean(array.map(function(num) {
      return Math.abs(num - mean);
    }));
  },

  zScores: function(array) {
    var mean = arr.mean(array);
    var standardDeviation = arr.standardDeviation(array);
    return array.map(function(num) {
      return (num - mean) / standardDeviation;
    });
  }
};


Dps
5

Here's another solution:

function median(numbers) {
    const sorted = numbers.slice().sort((a, b) => a - b);
    const middle = Math.floor(sorted.length / 2);

    if (sorted.length % 2 === 0) {
        return (sorted[middle - 1] + sorted[middle]) / 2;
    }

    return sorted[middle];
}

console.log(median([4, 5, 7, 1, 33]));
JBallin
1

TypeScript Answer 2020:

// Calculate Median 
const calculateMedian = (array: Array<number>) => {
  // Check If Data Exists
  if (array.length >= 1) {
    // Sort Array
    array = array.sort((a: number, b: number) => {
      return a - b;
    });

    // Array Length: Even
    if (array.length % 2 === 0) {
      // Average Of Two Middle Numbers
      return (array[(array.length / 2) - 1] + array[array.length / 2]) / 2;
    }
    // Array Length: Odd
    else {
      // Middle Number
      return array[(array.length - 1) / 2];
    }
  }
  else {
    // Error
    console.error('Error: Empty Array (calculateMedian)');
  }
};

jefelewis
0

For better time complexity on a stream of values, keep a max-heap for the lower half and a min-heap for the upper half; the median can then be read from the tops of the two heaps.
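A minimal sketch of that two-heap idea follows. JavaScript has no built-in heap, so the small `Heap` class here is a hand-rolled assumption, and `RunningMedian` is a hypothetical name. Each insertion costs O(log n) and reading the median is O(1).

```javascript
// Minimal binary heap; less(a, b) decides which element sits closer to the top.
class Heap {
  constructor(less) { this.items = []; this.less = less; }
  get size() { return this.items.length; }
  peek() { return this.items[0]; }
  push(value) {
    const h = this.items;
    h.push(value);
    let i = h.length - 1;
    while (i > 0) { // sift up
      const parent = (i - 1) >> 1;
      if (!this.less(h[i], h[parent])) break;
      [h[i], h[parent]] = [h[parent], h[i]];
      i = parent;
    }
  }
  pop() {
    const h = this.items;
    const top = h[0];
    const last = h.pop();
    if (h.length > 0) {
      h[0] = last;
      let i = 0;
      for (;;) { // sift down
        const l = 2 * i + 1, r = l + 1;
        let best = i;
        if (l < h.length && this.less(h[l], h[best])) best = l;
        if (r < h.length && this.less(h[r], h[best])) best = r;
        if (best === i) break;
        [h[i], h[best]] = [h[best], h[i]];
        i = best;
      }
    }
    return top;
  }
}

class RunningMedian {
  constructor() {
    this.low = new Heap((a, b) => a > b);  // max-heap holding the smaller half
    this.high = new Heap((a, b) => a < b); // min-heap holding the larger half
  }
  push(value) {
    if (this.low.size === 0 || value <= this.low.peek()) this.low.push(value);
    else this.high.push(value);
    // Rebalance so that low holds as many values as high, or exactly one more.
    if (this.low.size > this.high.size + 1) this.high.push(this.low.pop());
    else if (this.high.size > this.low.size) this.low.push(this.high.pop());
  }
  median() {
    if (this.low.size === 0) return undefined;
    if (this.low.size > this.high.size) return this.low.peek();
    return (this.low.peek() + this.high.peek()) / 2;
  }
}

const running = new RunningMedian();
for (const ms of [342, 654, 987, 1093, 2234, 6243, 7087, 20123]) running.push(ms);
console.log(running.median()); // 1663.5
```

This suits the streaming shape of the question's `pushValue`, at the cost of keeping every value in memory.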

hien711
0

Simpler & more efficient

const median = dataSet => {
  if (dataSet.length === 1) return dataSet[0]
  // Copy before sorting, and sort numerically (the default sort compares as strings)
  const sorted = [ ...dataSet ].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  // Odd length: the middle element; even length: mean of the two middle elements
  if (sorted.length % 2 === 1) return sorted[mid]
  return ((sorted[mid - 1] + sorted[mid]) / 2)
}
0

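Short and sweet. Note that for an even number of values this returns the upper of the two middle values rather than their average.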

Array.prototype.median = function () {
  return this.slice().sort((a, b) => a - b)[Math.floor(this.length / 2)]; 
};

Usage

[4, 5, 7, 1, 33].median()

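Works with strings as well, as long as they are already in order (the numeric comparator a - b cannot order non-numeric strings)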

["a","a","b","b","c","d","e"].median()
user1949536
0

Simple solution:

function calcMedian(array) {
  const {
    length
  } = array;

  if (length < 1)
    return 0;

  //sort array asc
  array.sort((a, b) => a - b);

  if (length % 2) {
    //length of array is odd
    return array[(length + 1) / 2 - 1];
  } else {
    //length of array is even
    return 0.5 * (array[length / 2 - 1] + array[length / 2]);
  }
}

console.log(2, calcMedian([1, 2, 2, 5, 6]));
console.log(3.5, calcMedian([1, 2, 2, 5, 6, 7]));
console.log(9, calcMedian([13, 9, 8, 15, 7]));
console.log(3.5, calcMedian([1, 4, 6, 3]));
console.log(5, calcMedian([5, 1, 11, 2, 8]));
Lars Flieger
0

Simpler, more efficient, and easy to read

  1. clone the data to avoid altering the original.
  2. sort the list of values.
  3. get the middle point.
  4. get the median from the list.
  5. return the median.

function getMedian(data) {
    const values = [...data];
    const v   = values.sort( (a, b) => a - b);
    const mid = Math.floor( v.length / 2);
    const median = (v.length % 2 !== 0) ? v[mid] : (v[mid - 1] + v[mid]) / 2; 
    return median;
}
  • 1
    While your answer may solve the question, [including an explanation](https://meta.stackexchange.com/q/114762) of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. You can edit your answer to add explanations and give an indication of what limitations and assumptions apply. - [From Review](https://stackoverflow.com/review/late-answers/28443910) – Adam Marshall Mar 02 '21 at 16:08
  • You write "simpler & more efficient", but it would be good to know what you compare with, since many have already answered with similar code. Essentially the `sort` operation determines the execution time, and most answers have used that. – trincot Mar 03 '21 at 16:16