I'm trying to brainstorm ways to calculate a solution (in js) to the following scenario:

Say I'm given a standard timeseries with a decent amount of variance (e.g. https://codepen.io/quirkules/pen/wvBYarM). These timeseries will change based on the data set.

Example data:

var data = [
  {
    "date": "30/04/2012", // DD/MM/YYYY
    "close": 14
  },
  {
    "date": "1/05/2012",
    "close": 2
  },
  {
    "date": "2/05/2012",
    "close": 14
  },
  {
    "date": "3/05/2012",
    "close": 5
  },
  {
    "date": "4/05/2012",
    "close": 14
  }
];

I want to be able to identify the Y value that is exceeded X% of the time. For example, if the date range covers 50 days and I want the value that is exceeded 50% of the time, that Y value would be exceeded on 25 of those days.

Note: depending on the nature of the dataset, this might need to be the 'closest value' to the desired Y value. Taking the previous example again, the constraints of a graph might mean we can only find a Y value that is exceeded 40% of the time through best efforts.
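To make the 'closest value' idea concrete, here is a minimal sketch in plain JS. It assumes that only values actually present in the series are candidates for Y, and that "exceeded" means strictly greater than. The name `closestExceedance` is hypothetical, and the brute-force loop is O(n²), so it only suits small series.

```javascript
// Hypothetical helper (not from the original post): for every distinct value
// in the series, measure the fraction of points strictly above it and keep
// the candidate whose fraction is closest to the target.
function closestExceedance(values, targetFraction) {
  var candidates = Array.from(new Set(values));
  var best = null;
  candidates.forEach(function (candidate) {
    var above = values.filter(function (v) { return v > candidate; }).length;
    var fraction = above / values.length;
    var error = Math.abs(fraction - targetFraction);
    if (best === null || error < best.error) {
      best = { value: candidate, fraction: fraction, error: error };
    }
  });
  return best;
}

// The "close" values from the example data above.
var sampleCloses = [14, 2, 14, 5, 14];
var closest = closestExceedance(sampleCloses, 0.5);
console.log(closest.value, closest.fraction); // 5 0.6
```

With the five example points no value is exceeded exactly 50% of the time, so the sketch falls back to 5, which is exceeded by three of the five points (60%).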

I'm assuming this is a linear-regression type of problem, but I haven't used regression in this way before. I'm also not sure whether this can be solved in pure JS or needs a library (e.g. tensorflow), but I'm open to ideas.

Any assistance would be appreciated.

Quirk
  • There may be faster ways, but the one that comes to mind: sort the array in descending order by value. Then `sortedArray[Math.floor(targetPct * sortedArray.length)]` (0-based index) is the value that is exceeded `targetPct` of the time, because exactly that many entries sit before it in the sorted array. There may be edge cases if a bunch of entries share the same value – Brandon Jan 17 '20 at 21:09
  • Have a look at this: https://stackoverflow.com/questions/20811131/javascript-remove-outlier-from-an-array. You're looking for outliers. You don't need to use regression; regression is a way to find a best-fit line. – Happy Machine Jan 17 '20 at 21:19
  • thanks @Brandon & Happy Machine, I figured there was an easier way – Quirk Jan 17 '20 at 21:24
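
The sort-based idea from the comments can be sketched roughly as follows. This is an illustration rather than anyone's posted code: the name `valueExceededSorted` is hypothetical, `targetFraction` is assumed to be between 0 and 1, and the index is 0-based. The second count is there because ties can make the achieved fraction differ from the requested one.

```javascript
// Hypothetical helper: sort a copy in descending order, then index into it.
function valueExceededSorted(values, targetFraction) {
  var sorted = values.slice().sort(function (a, b) { return b - a; });
  // Exactly `index` entries come before sorted[index], so (absent ties)
  // that many values exceed it. Clamp so targetFraction = 1 stays in range.
  var index = Math.min(
    Math.floor(targetFraction * sorted.length),
    sorted.length - 1
  );
  var value = sorted[index];
  // Re-count strictly greater values so that ties are reported honestly.
  var above = sorted.filter(function (v) { return v > value; }).length;
  return { value: value, fraction: above / sorted.length };
}

// No ties: 20 is exceeded by 2 of 4 values.
console.log(valueExceededSorted([10, 40, 20, 30], 0.5)); // { value: 20, fraction: 0.5 }
// Ties: the index lands on 14, which nothing exceeds.
console.log(valueExceededSorted([14, 2, 14, 5, 14], 0.5)); // { value: 14, fraction: 0 }
```

The second call shows the tie edge case: the returned value is exceeded 0% of the time, not 50%, so the caller may want to compare neighbouring candidates before accepting it.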

1 Answer

If the data is not huge, a good old for loop will do it easily enough: https://codepen.io/Alexander9111/pen/rNaqwrg

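document.querySelector("#search").addEventListener('click', function (e) {
    // Read the threshold from the input and convert it to a number.
    var search_val = Number(document.querySelector("#input").value);
    var times_over_val = 0;
    // Count the data points whose close value is over the threshold.
    for (var i = 0; i < data.length; i++) {
        if (data[i].close > search_val) {
            times_over_val++;
        }
    }
    // Math.round takes a single argument, so this rounds to a whole percent.
    var pct = Math.round(100 * times_over_val / data.length);
    document.querySelector("#output").innerText = "The data was over the value of: " + search_val + ", " + pct + "% of the time (" + times_over_val + " out of " + data.length + ")";
});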
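
For checking the count outside the browser, the loop can also be pulled into a pure function. This is only a sketch: `fractionOver` and `sampleSeries` are hypothetical names, the objects are assumed to have the `{ date, close }` shape from the question, and "over" is taken to mean strictly greater than.

```javascript
// Hypothetical helper: counts the points whose `close` is strictly above
// `threshold` and returns that count plus its share of the whole series.
function fractionOver(series, threshold) {
  var count = 0;
  for (var i = 0; i < series.length; i++) {
    if (series[i].close > threshold) {
      count++;
    }
  }
  // Guard against an empty series to avoid dividing by zero.
  var fraction = series.length > 0 ? count / series.length : 0;
  return { count: count, fraction: fraction };
}

// The five points from the question's example data.
var sampleSeries = [
  { date: "30/04/2012", close: 14 },
  { date: "1/05/2012", close: 2 },
  { date: "2/05/2012", close: 14 },
  { date: "3/05/2012", close: 5 },
  { date: "4/05/2012", close: 14 }
];
console.log(fractionOver(sampleSeries, 5)); // { count: 3, fraction: 0.6 }
```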

Note that I modified your HTML:

<div id="search-container">
    <input id="input" value=""/>
    <button id="search">Search</button>
    <span id="output">Enter a value in the input then click search</span>
</div>
<div id="container">
</div>

And one line of your d3 code:

var svg = d3.select("#container").append("svg")
Alex L