
Time of execution: foo(1) >>> foo(2) >> foo(3)

roughly: 1427349 >>> 14757 >> 1362

foo(3) is the most optimized algorithm of the three, so I'm not surprised it's the fastest. What surprises me is that foo(2) is so much faster than foo(1). My impression is that foo(2) sorts, while foo(1) operates similarly to foo(3). What causes the slowdown in foo(1)? Please show me what's under the hood. Thanks!

void Main()
{   
    Random r = new Random();
    for(int i = 0; i < array.Length; i++)
    {
        array[i] = new A(r.Next(int.MaxValue));
    }   

    foo(1); 
    foo(2);
    foo(3); 
}

A[] array = new A[10000];
static Stopwatch sw = new Stopwatch();

public void foo(int s)
{
    sw.Reset();
    sw.Start();

    switch(s)
    {
        case 1:
            array.First(x => (x.value == array.Max(y => y.value))).Dump();
            break;
        case 2:
            array.OrderBy(x => x.value)
            .Last()
            .Dump();    
            break;
        case 3:
            {           
                int max = array[0].value;
                int index = 0;
                int i = 0;
                for(; i < array.Length; i++)
                {
                    if(array[i].value >= max)
                    {
                        max = array[i].value;
                        index = i;
                    }
                }
                array[index].Dump();
            }
            break;
    }

    sw.Stop();
    sw.Dump();
}
class A
{
    public int value;
    public A(int value)
    {
        this.value = value;
    }
}

The code was tested in LINQPad, so you can ignore the .Dump() calls.

blizpasta
  • You might be interested in http://stackoverflow.com/questions/1101841/linq-how-to-perform-max-on-a-property-of-all-objects-in-a-collection-and-retu – Karl Knechtel Dec 02 '10 at 10:33
  • Your methodology is poor (although your conclusions are probably roughly correct). You need to be timing thousands or millions of iterations of each approach, not just one. – LukeH Dec 02 '10 at 10:36
  • @LukeH: I agree. Just wanted to get some rough order of magnitude for the timing out, which I verified is most likely to be correct by testing the different permutations of foo(1);foo(2);foo(3). There's more to testing that I need to learn. – blizpasta Dec 02 '10 at 14:23

1 Answer


The first is O(N²), because `array.Max(...)` sits inside the predicate passed to `First`, so the whole array is scanned again for every element that `First` examines. The second is O(N log N), because you are sorting first. The last is O(N), because you iterate over the array in a single pass with no loop inside each iteration.

Try this:

        case 1:
            var max = array.Max(x => x.value);
            array.First(x => x.value == max).Dump();
            break;

It should now be comparable with the third case, though not quite as fast, since you have to traverse the array 1.5 times on average (assuming only one element has the max value).
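To make the difference concrete, here is a minimal sketch that counts how often the `Max` selector runs in each version. The class and method names (`MaxCostDemo`, `NestedSelectorCalls`, `HoistedSelectorCalls`) are made up for illustration, and it uses a plain `int[]` in ascending order so that the maximum is the last element, which is the worst case for the nested form.

```csharp
using System;
using System.Linq;

public static class MaxCostDemo
{
    // Max is nested inside the First predicate, as in the original case 1:
    // the selector runs once per element for every element First examines.
    public static long NestedSelectorCalls(int[] values)
    {
        long calls = 0;
        values.First(x => x == values.Max(y => { calls++; return y; }));
        return calls;
    }

    // Max is computed once up front, as in the rewritten case 1:
    // the selector runs exactly once per element.
    public static long HoistedSelectorCalls(int[] values)
    {
        long calls = 0;
        int max = values.Max(y => { calls++; return y; });
        values.First(x => x == max);
        return calls;
    }

    public static void Main()
    {
        // Ascending input: the maximum is the last element (worst case).
        int[] ascending = Enumerable.Range(0, 1000).ToArray();
        Console.WriteLine(NestedSelectorCalls(ascending));  // 1000000
        Console.WriteLine(HoistedSelectorCalls(ascending)); // 1000
    }
}
```

With random data the nested count depends on where the maximum happens to sit, but it still grows as N times the number of elements `First` has to look at.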

Marcelo Cantos
  • +1, although you can do it in a single O(n) pass with LINQ: `array.Aggregate((a, x) => x.value > a.value ? x : a);` – LukeH Dec 02 '10 at 10:44
  • @LukeH: Good answer! Why not make it one? – Marcelo Cantos Dec 02 '10 at 10:45
  • I made a wrong assumption that array.Max is cached though there's no good reason for it to be. Perhaps it could or might be an optimization in the future for certain situations though. I particularly like your answer. – blizpasta Dec 02 '10 at 14:56
  • @blizpasta: Indeed, your assumption was incorrect. The maximum value *cannot* be cached because the maximum value might be *changing*. Someone could be changing the contents of the array on another thread, for instance, changing the maximum. The compiler has no reason to believe that any two calls to Max will return the same value; if you have knowledge that the array is not changing then *you* should write code to take advantage of that knowledge; the compiler does not have that knowledge. – Eric Lippert Dec 02 '10 at 15:57
  • @blizpasta: Yes, the compiler could optimise this if a range of assumptions proved valid. The array must not be accessible via another thread; the values must not be modified during iteration (imagine calling a function from inside First's lambda instead of just comparing two values); the semantics of the lambda and of `Max` must undergo a fairly sophisticated data-flow analysis to confirm that Max will always return the same value. This would be insanely difficult to implement in a JIT compiler that has stringent time constraints. – Marcelo Cantos Dec 02 '10 at 21:34
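
The single-pass `Aggregate` approach from LukeH's comment can be sketched as a small standalone program. The wrapper names (`SinglePassMax`, `MaxByValue`) are hypothetical; class `A` mirrors the one in the question. Two things to be aware of: `Aggregate` without a seed throws `InvalidOperationException` on an empty sequence, and the strict `>` keeps the first element on ties, whereas the `>=` in the question's case 3 keeps the last.

```csharp
using System;
using System.Linq;

public class A
{
    public int value;
    public A(int value) { this.value = value; }
}

public static class SinglePassMax
{
    // Carries the best element seen so far through one pass over the array.
    // Strict '>' means the earliest element wins when values tie.
    public static A MaxByValue(A[] items)
    {
        return items.Aggregate((best, x) => x.value > best.value ? x : best);
    }

    public static void Main()
    {
        A[] items = { new A(3), new A(42), new A(7) };
        Console.WriteLine(MaxByValue(items).value); // 42
    }
}
```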