10

I took the plunge this afternoon and began studying LINQ, so far just mucking around with LINQ on collections. One of the first things I tried was to implement QSort.

Now -- ignoring the fact that I could just use an ORDERBY and that this is a very silly qsort implementation -- what I came up with was this:

public class lqsort
{
    public static List<int> QSLinq(List<int> _items)
    {
        if (_items.Count <= 1)
            return _items;

        int _pivot = _items[0];

        List<int> _less = (from _item in _items where _item < _pivot select _item).ToList();
        List<int> _same = (from _item in _items where _item == _pivot select _item).ToList();
        List<int> _greater = (from _item in _items where _item > _pivot select _item).ToList();

        return (QSLinq(_less).Concat(_same.Concat(QSLinq(_greater)))).ToList();
    }
}

The only thing that really bugs me is all of the casting involved. Are there any LINQ tricks I might use? Or am I just using LINQ for things it wasn't intended for?

Dana
  • 28,425
  • 17
  • 58
  • 72

6 Answers6

9

Just change the type of the parameter to IEnumerable and use the var construct instead of your List<int> for your local variables.

This will make your QSLinq method better because it will accept more types of parameters, for example int[], as well as List<int>.

See the new method:

    public static IEnumerable<int> QSLinq(IEnumerable<int> _items)
    {
        if (_items.Count() <= 1)
            return _items;

        var _pivot = _items.First();

        var _less = from _item in _items where _item < _pivot select _item;
        var _same = from _item in _items where _item == _pivot select _item;
        var _greater = from _item in _items where _item > _pivot select _item;

        return QSLinq(_less).Concat(QSLinq(_same)).Concat(QSLinq(_greater));
    }

Hope this helps.

Alfred B. Thordarson
  • 4,280
  • 8
  • 36
  • 37
  • This is the worst thing - while I was typing the answer Panos completed and posted his answer - exactly the same as mine :-( – Alfred B. Thordarson Oct 08 '08 at 22:21
  • Is there any difference other than syntax between this explicit LINQ representation as opposed to the Lambda expressions in Panos' version? Is either more efficient than the other? – Skeolan Oct 08 '08 at 22:45
  • I'm going to accept this answer, 'cause it's what it's basically what I wanted. But the other answers were really cool! – Dana Oct 09 '08 at 15:30
9

Fun Question! This doesn't outperform OrderBy, but it does limit the repeated enumerations some.

    public static IEnumerable<int> QSort2(IEnumerable<int> source)
    {
        if (!source.Any())
            return source;
        int first = source.First();
        return source
            .GroupBy(i => i.CompareTo(first))
            .OrderBy(g => g.Key)
            .SelectMany(g => g.Key == 0 ? g : QSort2(g));
    }

I inadvertently generated a stackoverflow during development, as I QSorted when the Key == 0.


Just for fun I tested these solutions. I commited the cardinal performance testing sin (testing in debug mode), but I don't think that affects the big O effect we'll see. Here is the input (reversed input is worst case for quicksort)

IEnumerable<int> source = Enumerable.Range(0, 1000).Reverse().ToList();
  • The triple concat-where solution took 71000 milliseconds.
  • My solution took 330 milliseconds
  • OrderBy.ToArray took 15 milliseconds (the timer's resolution, so actual time is probably less)
Amy B
  • 100,846
  • 20
  • 127
  • 174
3

How about this? (If I understand well you don't like the .ToList calls)

public static IEnumerable<int> QSLinq(IEnumerable<int> items)
{
    if (items.Count() <= 1)
        return items;

    int pivot = items.First();

    return QSLinq(items.Where(i => i < pivot))
        .Concat(items.Where(i => i == pivot))
        .Concat(QSLinq(items.Where(i => i > pivot)));
}

Disclaimer based on Jon answer: Do NOT use this algorithm in a real problem. It is very inefficient.

Community
  • 1
  • 1
Panos
  • 18,335
  • 6
  • 43
  • 54
  • Unfortunately it's still very inefficient. Take the "less than" recursive call to QSLinq - it will have to iterate through its parameter three times, once for each Where call. Iterating through those means iterating through the original list. The same is true of the "more than" recursive call. – Jon Skeet Oct 08 '08 at 22:24
  • (Continued) Now the recursive "less than" call within the second level of recursion needs to iterate through the results of the first level of recursion, etc. Basically it ends up iterating through the list a hideous number of times. I'm too sleepy to work out the complexity now, but it's nasty. – Jon Skeet Oct 08 '08 at 22:26
  • I agree Jon, but the problem here is not the efficiency. No one is going to use this algorithm. We can always use .OrderBy() (which I hope does not use this :) – Panos Oct 08 '08 at 22:29
  • If efficiency is of no concern then I have no problems with it :) I think it's worth considering *why* efficiency is so hard to come by when looking at sequences though. – Jon Skeet Oct 08 '08 at 22:32
  • "Do NOT use this algorithm in a real problem. It is very inefficient." Very inefficient is lightly said, it's O(n^3*log(n)) complexity :) – Pop Catalin Oct 09 '08 at 01:12
  • 1
    Since items is IEnumerable, it would be much better to use Any() instead of Count(). – Amy B Oct 09 '08 at 02:30
2

All the casting involved? I don't see any casting. What I do see is calls to "ToList" which are hideously inefficient. Basically LINQ is designed to work over sequences, which intrinsically don't allow you to work in place (or in a duplicate space) in the way that quicksort tends to. Basically you've got an awful lot of data copying going on here :(

Jon Skeet
  • 1,261,211
  • 792
  • 8,724
  • 8,929
  • Yeah, I should have said type conversion instead of casting. Thanks for the tips. I knew I was bastardizing linq, but I'm mainly playing around with the syntax right now. – Dana Oct 08 '08 at 23:37
2

Here's another solution using Aggregate. I call it: Group and Go Fish. This one takes 170 ms by my 1000 reversed elements test.

    public static IEnumerable<int> QSort3(IEnumerable<int> source)
    {
        if (!source.Any())
            return source;
        int first = source.First();

        QSort3Helper myHelper = 
          source.GroupBy(i => i.CompareTo(first))
          .Aggregate(new QSort3Helper(), (a, g) =>
              {
                  if (g.Key == 0)
                      a.Same = g;
                  else if (g.Key == -1)
                      a.Less = g;
                  else if (g.Key == 1)
                      a.More = g;
                  return a;
              });
        IEnumerable<int> myResult = Enumerable.Empty<int>();
        if (myHelper.Less != null)
            myResult = myResult.Concat(QSort3(myHelper.Less));
        if (myHelper.Same != null)
            myResult = myResult.Concat(myHelper.Same);
        if (myHelper.More != null)
            myResult = myResult.Concat(QSort3(myHelper.More));

        return myResult;
    }

    public class QSort3Helper
    {
        public IEnumerable<int> Less;
        public IEnumerable<int> Same;
        public IEnumerable<int> More;
    }

Why is this faster than my solution using OrderBy? I guess it's a little more resistent to the worst case.

Amy B
  • 100,846
  • 20
  • 127
  • 174
0

The chosen answer is broken because it includes QSLinq(_same) instead of just _same in the returned collection and results in a StackOverflowException. I'll be using the fixed version as the control. If the solution can use copying then the speed can be drastically increased. Usage of threads instead of parallel actually decreases performance slightly for copying variants. Usage of threads for non-copying variants increases performance slightly. Parallel and non-copying performance difference from the control is negligent.

all variants

copying variants

fastest copying:

private static List<int> quickie7(List<int> ites)
{
    if (ites.Count <= 1)
        return ites;
    var piv = ites[0];
    List<int> les = new List<int>();
    List<int> sam = new List<int>();
    List<int> mor = new List<int>();
    Enumerable.Range(0, 3).AsParallel().ForAll(i =>
    {
        switch (i)
        {
            case 0: les = (from _item in ites where _item < piv select _item).ToList(); break;
            case 1: sam = (from _item in ites where _item == piv select _item).ToList(); break;
            case 2: mor = (from _item in ites where _item > piv select _item).ToList(); break;
        }
    });
    var _les = new List<int>();
    var _mor = new List<int>();
    Enumerable.Range(0, 2).AsParallel().ForAll(i =>
    {
        switch (i)
        {
            case 0: _les = quickie7(les); break;
            case 1: _mor = quickie7(mor); break;
        }
    });
    List<int> allofem = new List<int>();
    allofem.AddRange(_les);
    allofem.AddRange(sam);
    allofem.AddRange(_mor);
    return allofem;
}

fastest non copying:

public static IEnumerable<int> QSLinq3(IEnumerable<int> _items)
{
    if (_items.Count() <= 1)
        return _items;
    var _pivot = _items.First();
    IEnumerable<int> _less = null;
    IEnumerable<int> _same = null;
    IEnumerable<int> _greater = null;
    ConcurrentBag<ManualResetEvent> finishes = new ConcurrentBag<ManualResetEvent>();
    Enumerable.Range(0, 3).AsParallel().ForAll(i =>
    {
        var fin = new ManualResetEvent(false);
        finishes.Add(fin);
        (new Thread(new ThreadStart(() =>
        {
            if (i == 0)
                _less = from _item in _items where _item < _pivot select _item;
            else if (i == 1)
                _same = from _item in _items where _item == _pivot select _item;
            else if (i == 2)
                _greater = from _item in _items where _item > _pivot select _item;
            fin.Set();
        }))).Start();
    });
    finishes.ToList().ForEach(k => k.WaitOne());
    return QSLinq(_less).Concat(_same).Concat(QSLinq(_greater));
}
fartwhif
  • 192
  • 1
  • 10