0

I'm trying to find which percentile of the finishing position of a runner in a race, eventually I want to use it in a query like this:

Percentile(runners/placing) = "Percentile of the runner who came in 3rd place out of 10 runners = 27" 
UPDATE Scores SET ScoreFinal = (IIF(Percentile >= 50, Score * 2, Score * 3))

I'm currently using a peice of VB code to find the percentile, not sure ho to do it in SQL:

    Public Shared Function Get_Percentile(ByVal runners As Integer, position As Integer) As Double

    Dim smplTstScores(runners) As Integer

    For i = 1 To runners
        smplTstScores(i) = i
    Next

    Dim smplData As IEnumerable(Of Integer) = smplTstScores.OrderBy(Function(n) n)
    Dim scoresLess As Integer = smplData.Count(Function(n) n < position)
    Dim p As Double = Math.Round(scoresLess / smplTstScores.Length * 100) 'the percentile

    Return p

End Function
user2096512
  • 439
  • 6
  • 13
  • Not really sure what you are trying to do here. Are you saying you have 10 runners and the person in question came in 3rd place? If so what are you trying to calculate? – Sean Lange Mar 11 '20 at 20:18
  • I'm trying to calculate what percentile the runner came, out of the 10 runners. My VB function reports it as 27, which sounds about right, I.E it came in the top 30% of runners. – user2096512 Mar 11 '20 at 20:25
  • percentile is usually reported the opposite of how you are doing it. The runner in first place would be in the 100th percentile. Runner in last place would be in the 0th percentile. Runner in third would be in 70th percentile – Kevin Mar 11 '20 at 20:36
  • Actually your right, I find the percentile by subtracting the value the VB function returns from 100, so in this case it would be 73. – user2096512 Mar 12 '20 at 14:46

2 Answers2

1

So basically 3 / 10? In sql you have to be careful as this will always return 0 because of integer division. Simply multiple either value by 1.0

See this example to see what I mean.

declare @Position int = 3
    , @Runners int = 10

select @Position / @Runners --integer math will use whole numbers
    , @Position / (@Runners * 1.0) --multiplying by 1.0 forces this to decimal division.

I would think that 3rd place should be 70%. So to accomplish that simply use:

1 - (@Position / (@Runners * 1.0))
Sean Lange
  • 30,535
  • 3
  • 21
  • 37
  • No I dont think its as simple as dividing one by the other, that returns 0.3, the anser should be 27. – user2096512 Mar 11 '20 at 20:31
  • 1
    I don't follow. How is third place 27 with 10 runners? – Sean Lange Mar 11 '20 at 20:36
  • I'm not sure, I copied the VB code from another post, I just need some SQL that does the same thing! – user2096512 Mar 12 '20 at 14:41
  • There is no chance that VB code returns 27 when you pass in 3 as the position and 10 as the number of runners. It is rather overly complicated way to get there but that logic would simply by 7 / 10. And no matter how you slice that it will be .7 – Sean Lange Mar 12 '20 at 14:55
0

Edited to add: I understand the goal here is to execute an UPDATE statement that awards greater points to competitors above the 50th percentile, that is, more or less, above the median, than to those below it. If you actually want the percentiles in your final result, indeed you'll need to do something more.

I originally didn't see the tag for SQL Server. SQL Server definitely isn't my first language; but I hope you find these pointers useful. SQL Server offers the TOP clause with the fine additions PERCENT and WITH TIES. These could be really handy for you! https://docs.microsoft.com/en-us/sql/t-sql/queries/top-transact-sql?view=sql-server-ver15 If you really do require percentiles, PERCENT_RANK looks like the way to go! https://docs.microsoft.com/en-us/sql/t-sql/functions/percent-rank-transact-sql?view=sql-server-ver15 This past answer looks like the simplest median: Function to Calculate Median in SQL Server


Original answer: To solve your problem as stated, you don't need the percentile, just the median. A previous answer gives you ways to get that: Simple way to calculate median with MySQL

Depending on where you're coming from and going, some SQL's SELECT ... LIMIT clauses also allow you to specify an offset, meaning you could pick a single record out of the middle. https://dev.mysql.com/doc/refman/8.0/en/select.html (you'll have to search for "LIMIT", sorry.)

In most cases, I'd argue the nice distinction about calculating median for an even number of values by averaging the middle two is not worth the effort. Maybe your use case is scoring a larger competition where fairness demands exactitude. In that case, though, I think the question you have to answer is what happens if there's a tie at the 50pctl. For example, ten runners in the race and all cross the finish line abreast!