I have been looking for a performance benchmark comparing the Contains, Exists and Any methods available on List<T>. I wanted this mostly out of curiosity, since I was always confused about which one to use. Many questions on SO only describe the definitions of these methods, such as:

  1. LINQ Ring: Any() vs Contains() for Huge Collections
  2. Linq .Any VS .Exists - Whats the difference?
  3. LINQ extension methods - Any() vs. Where() vs. Exists()

So I decided to run the benchmark myself, and I am adding it as an answer. Any further insight into the results is most welcome. I also ran the benchmark on arrays to compare the results.

harshit

3 Answers

According to the documentation:

List<T>.Exists (instance method)

Determines whether the List(T) contains elements that match the conditions defined by the specified predicate.

Enumerable.Any (extension method on IEnumerable<T>)

Determines whether any element of a sequence satisfies a condition.

List<T>.Contains (instance method)

Determines whether an element is in the List.
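
To make the three definitions concrete, here is a minimal sketch of the same membership test written all three ways. The class name `MembershipDemo` and its helper methods are made up for illustration; only `List<T>.Contains`, `List<T>.Exists` and `Enumerable.Any` come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names here are illustrative, not from the original post.
public static class MembershipDemo
{
    // List<T>.Contains: compares each element to the value with the default equality comparer.
    public static bool ViaContains(List<int> list, int value) => list.Contains(value);

    // List<T>.Exists: instance method on List<T> that takes a Predicate<T>.
    public static bool ViaExists(List<int> list, int value) => list.Exists(x => x == value);

    // Enumerable.Any: LINQ extension method that takes a Func<T, bool> and works on any IEnumerable<T>.
    public static bool ViaAny(IEnumerable<int> items, int value) => items.Any(x => x == value);

    public static void Main()
    {
        var list = new List<int> { 3, 7, 42 };
        Console.WriteLine(ViaContains(list, 42)); // True
        Console.WriteLine(ViaExists(list, 42));   // True
        Console.WriteLine(ViaAny(list, 5));       // False
    }
}
```

All three scan the list linearly in the worst case; what differs is whether the comparison is a direct equality check (Contains) or a delegate call per element (Exists, Any).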

Benchmarking:

CODE:

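    // Requires: using System; using System.Collections.Generic;
    //           using System.Diagnostics; using System.Linq;
    // The methods below are meant to live inside a class such as Program.
    static void Main(string[] args)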
    {
        ContainsExistsAnyShort();

        ContainsExistsAny();
    }
    
    private static void ContainsExistsAny()
    {
        Console.WriteLine("***************************************");
        Console.WriteLine("********* ContainsExistsAny ***********");
        Console.WriteLine("***************************************");

        List<int> list = new List<int>(6000000);
        Random random = new Random();
        for (int i = 0; i < 6000000; i++)
        {
            list.Add(random.Next(6000000));
        }
        int[] arr = list.ToArray();

        find(list, arr);
    }

    private static void ContainsExistsAnyShort()
    {
        Console.WriteLine("***************************************");
        Console.WriteLine("***** ContainsExistsAnyShortRange *****");
        Console.WriteLine("***************************************");

        List<int> list = new List<int>(2000);
        Random random = new Random();
        for (int i = 0; i < 2000; i++)
        {
            list.Add(random.Next(6000000));
        }
        int[] arr = list.ToArray();

        find(list, arr);
    }

    private static void find(List<int> list, int[] arr)
    {
        Random random = new Random();
        int[] find = new int[10000];
        for (int i = 0; i < 10000; i++)
        {
            find[i] = random.Next(6000000);
        }

        Stopwatch watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 10000; rpt++)
        {
            list.Contains(find[rpt]);
        }
        watch.Stop();
        Console.WriteLine("List/Contains: {0:N0}ms", watch.ElapsedMilliseconds);

        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 10000; rpt++)
        {
            list.Exists(a => a == find[rpt]);
        }
        watch.Stop();
        Console.WriteLine("List/Exists: {0:N0}ms", watch.ElapsedMilliseconds);

        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 10000; rpt++)
        {
            list.Any(a => a == find[rpt]);
        }
        watch.Stop();
        Console.WriteLine("List/Any: {0:N0}ms", watch.ElapsedMilliseconds);

        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 10000; rpt++)
        {
            arr.Contains(find[rpt]);
        }
        watch.Stop();
        Console.WriteLine("Array/Contains: {0:N0}ms", watch.ElapsedMilliseconds);

        // Note: arrays do offer the static Array.Exists(arr, predicate); it is simply not measured here.
        Console.WriteLine("Arrays do not have Exists");

        watch = Stopwatch.StartNew();
        for (int rpt = 0; rpt < 10000; rpt++)
        {
            arr.Any(a => a == find[rpt]);
        }
        watch.Stop();
        Console.WriteLine("Array/Any: {0:N0}ms", watch.ElapsedMilliseconds);
    }

RESULTS

***************************************
***** ContainsExistsAnyShortRange *****
***************************************
List/Contains: 96ms
List/Exists: 146ms
List/Any: 381ms
Array/Contains: 34ms
Arrays do not have Exists
Array/Any: 410ms
***************************************
********* ContainsExistsAny ***********
***************************************
List/Contains: 257,996ms
List/Exists: 379,951ms
List/Any: 884,853ms
Array/Contains: 72,486ms
Arrays do not have Exists
Array/Any: 1,013,303ms
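
As the comments on this answer point out, arrays do have an Exists equivalent: the static `Array.Exists` method. It was not part of the measured run, so no timing is reported for it. Below is a rough sketch of how it could be called and timed in the same style as the loops above; `ArrayExistsDemo` and `TimeArrayExists` are hypothetical names, and the sizes in `Main` are arbitrary.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper class; not part of the original benchmark.
public static class ArrayExistsDemo
{
    // Array.Exists is a static method on System.Array that takes the array and a Predicate<T>.
    public static bool ExistsInArray(int[] arr, int value) => Array.Exists(arr, x => x == value);

    // Times one Array.Exists call per needle, mirroring the Stopwatch loops in the benchmark.
    public static long TimeArrayExists(int[] arr, int[] needles)
    {
        Stopwatch watch = Stopwatch.StartNew();
        foreach (int needle in needles)
        {
            Array.Exists(arr, x => x == needle);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        int[] arr = { 10, 20, 30 };
        Console.WriteLine(ExistsInArray(arr, 20)); // True
        Console.WriteLine(ExistsInArray(arr, 25)); // False
        Console.WriteLine("Array/Exists: {0:N0}ms", TimeArrayExists(arr, new[] { 10, 25, 30 }));
    }
}
```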
harshit
  • Just keep in mind that though Contains seems to be the fastest, LINQ 2 SQL has a limitation of ~2100 objects in the list, so it would be good for shorter lists. – Giannis Paraskevopoulos Sep 06 '13 at 07:20
  • @jyparask Even for the larger lists Contains seems good. However, I have updated the code and timings for the shorter list too. The result is as you predicted. – harshit Sep 06 '13 at 08:56
  • I ran your benchmark and found that List.Exists was actually slightly faster than List.Contains, 45ms vs 55ms. The rest seemed consistent with your results. Tested on .NET 4.5 using Visual Studio 2013 in 32-bit in Release mode with optimizations. – Asik Sep 15 '14 at 20:26
  • What do you mean: "Arrays do not have Exists"? I think it has Exists(): https://stackoverflow.com/a/22928748/4608491 – 123iamking Sep 03 '18 at 08:38
  • "Arrays do not have Exists" ? `Array.Exists(` https://docs.microsoft.com/en-us/dotnet/api/system.array.exists?view=netframework-4.7.2 – Mark Schultheiss Mar 09 '21 at 16:52

The fastest way is to use a HashSet. Contains on a HashSet is O(1) on average.

I took your code and added a benchmark for HashSet<int>. The one-time cost of building the set with HashSet<int> set = new HashSet<int>(list); is negligible next to the lookup times below.

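// LINQPad-style entry point. Outside LINQPad, place these methods in a class and import
// System, System.Collections.Generic, System.Diagnostics and System.Linq.
void Main()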
{
    ContainsExistsAnyShort();

    ContainsExistsAny();
}

private static void ContainsExistsAny()
{
    Console.WriteLine("***************************************");
    Console.WriteLine("********* ContainsExistsAny ***********");
    Console.WriteLine("***************************************");

    List<int> list = new List<int>(6000000);
    Random random = new Random();
    for (int i = 0; i < 6000000; i++)
    {
        list.Add(random.Next(6000000));
    }
    int[] arr = list.ToArray();
    HashSet<int> set = new HashSet<int>(list);

    find(list, arr, set);

}

private static void ContainsExistsAnyShort()
{
    Console.WriteLine("***************************************");
    Console.WriteLine("***** ContainsExistsAnyShortRange *****");
    Console.WriteLine("***************************************");

    List<int> list = new List<int>(2000);
    Random random = new Random();
    for (int i = 0; i < 2000; i++)
    {
        list.Add(random.Next(6000000));
    }
    int[] arr = list.ToArray();
    HashSet<int> set = new HashSet<int>(list);

    find(list, arr, set);

}

private static void find(List<int> list, int[] arr, HashSet<int> set)
{
    Random random = new Random();
    int[] find = new int[10000];
    for (int i = 0; i < 10000; i++)
    {
        find[i] = random.Next(6000000);
    }

    Stopwatch watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        list.Contains(find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("List/Contains: {0}ms", watch.ElapsedMilliseconds);

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        list.Exists(a => a == find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("List/Exists: {0}ms", watch.ElapsedMilliseconds);

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        list.Any(a => a == find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("List/Any: {0}ms", watch.ElapsedMilliseconds);

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        arr.Contains(find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("Array/Contains: {0}ms", watch.ElapsedMilliseconds);

    // Note: arrays do offer the static Array.Exists(arr, predicate); it is simply not measured here.
    Console.WriteLine("Arrays do not have Exists");

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        arr.Any(a => a == find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("Array/Any: {0}ms", watch.ElapsedMilliseconds);

    watch = Stopwatch.StartNew();
    for (int rpt = 0; rpt < 10000; rpt++)
    {
        set.Contains(find[rpt]);
    }
    watch.Stop();
    Console.WriteLine("HashSet/Contains: {0}ms", watch.ElapsedMilliseconds);
}

RESULTS

***************************************
***** ContainsExistsAnyShortRange *****
***************************************
List/Contains: 65ms
List/Exists: 106ms
List/Any: 222ms
Array/Contains: 20ms
Arrays do not have Exists
Array/Any: 281ms
HashSet/Contains: 0ms
***************************************
********* ContainsExistsAny ***********
***************************************
List/Contains: 120522ms
List/Exists: 250445ms
List/Any: 653530ms
Array/Contains: 40801ms
Arrays do not have Exists
Array/Any: 522371ms
HashSet/Contains: 3ms
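
Stripped of the benchmark scaffolding, the pattern is: build the set once, then run every lookup against it. The following is only a sketch under assumed inputs; `HashSetLookupDemo`, `BuildSet` and `CountHits` are illustrative names, and the one-million-element list is an arbitrary size rather than the one used above.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical demo class; names and sizes are assumptions for illustration.
public static class HashSetLookupDemo
{
    // One-time O(n) construction; duplicate values in the source collapse to one entry.
    public static HashSet<int> BuildSet(IEnumerable<int> source) => new HashSet<int>(source);

    // Each Contains call is a hash lookup, so the cost does not grow with the set size on average.
    public static int CountHits(HashSet<int> set, int[] needles)
    {
        int hits = 0;
        foreach (int needle in needles)
        {
            if (set.Contains(needle)) hits++;
        }
        return hits;
    }

    public static void Main()
    {
        var list = new List<int>();
        for (int i = 0; i < 1000000; i++) list.Add(i * 2); // even numbers 0..1,999,998

        Stopwatch watch = Stopwatch.StartNew();
        HashSet<int> set = BuildSet(list);
        watch.Stop();
        Console.WriteLine("Build: {0}ms", watch.ElapsedMilliseconds);

        // 0, 2 and 1999998 are present; 1 and 3 are not.
        Console.WriteLine(CountHits(set, new[] { 0, 1, 2, 3, 1999998 })); // 3
    }
}
```

Timing the construction separately, as in `Main`, is one way to check whether the build cost is worth paying for a given number of lookups.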
Hossein Narimani Rad
wertzui
  • I was looking for exactly such a thing. I was like "Holy Molly" when I saw the performance on the HashSet/Contains. Definitely going to try that out in my environment of 1000 .. 5000 items. – Matthis Kohli May 19 '16 at 07:45
  • Would have been nice to see the performance of a HashMap as well here. – Matthis Kohli May 19 '16 at 08:05
  • Do you know if the Any for a HashSet is also O(1)? – David Létourneau May 21 '18 at 21:15
  • Any does not exist on `HashSet` but is an extension method for `IEnumerable` and is O(1) without a predicate and O(n) with a predicate. Here is the source: https://github.com/Microsoft/referencesource/blob/master/System.Core/System/Linq/Enumerable.cs#L1288 – wertzui May 23 '18 at 05:28
  • "Arrays do not have Exists" ? `Array.Exists(` https://docs.microsoft.com/en-us/dotnet/api/system.array.exists?view=netframework-4.7.2 – Mark Schultheiss Mar 09 '21 at 16:51
  • Just a quick point, having order O(1) doesn't necessarily mean that an algorithm is fast, it just means that its speed doesn't depend on the number of elements over which it is applied. It might still take a very long time to complete. – DrMcCleod May 16 '21 at 07:34

It is worth mentioning that this comparison is a bit unfair, since the Array class does not have its own Contains() method. The call goes through an extension method for IEnumerable<T> via a sequential enumerator, hence it is not optimized for Array instances. HashSet<T>, on the other hand, has its own implementation, optimized for all sizes.

For a fairer comparison you could use the static method Array.IndexOf(), which returns an int and is implemented specifically for arrays, although it still searches with a for loop that is only slightly more efficient than an enumerator.

With that fairer comparison, HashSet<T>.Contains() performs about the same as Array.IndexOf() for small sets of up to 5 elements, but it is much more efficient for larger sets.
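
A minimal sketch of the two lookups being compared, with no timing results implied. `IndexOfDemo` and its methods are hypothetical names; `Array.IndexOf` and `HashSet<T>.Contains` are the framework calls under discussion.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class for the comparison described above.
public static class IndexOfDemo
{
    // Array.IndexOf returns the index of the first match, or -1 for a zero-based array when absent.
    public static bool ArrayContains(int[] arr, int value) => Array.IndexOf(arr, value) >= 0;

    // HashSet<T>.Contains performs a hash lookup instead of scanning.
    public static bool SetContains(HashSet<int> set, int value) => set.Contains(value);

    public static void Main()
    {
        int[] arr = { 5, 8, 13 };
        var set = new HashSet<int>(arr);

        Console.WriteLine(ArrayContains(arr, 8));  // True
        Console.WriteLine(SetContains(set, 8));    // True
        Console.WriteLine(ArrayContains(arr, 9));  // False
        Console.WriteLine(SetContains(set, 9));    // False
    }
}
```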

Theodor Zoulias
Lucky Brain