C# match arrays

Question

I have two lists of arrays (arrays contains always pair of int):

List<int[]> a= new List<int[]>
{ 
    new int[2] {0, 1}, 
    new int[2] {5, 3}, 
    new int[2] {1, 3}, 
    new int[2] {5, 0},
};


List<int[]> b= new List<int[]>
{ 
    new int[2] {0, 1}, 
    new int[2] {5, 3},
};

What I would like to get are ids of b list elements in list a.

Is there a better way than looping through list a , checking array elements with if statement and if true add those elements?

Another issue is that condition is true if pair is 0 1 flipped. So 0;1 and 0;1 is true and 0;1 and 1;0 is true;

Welcome to Stack Overflow. This is not a good way to ask a question here. Did you try anything so far to solve your problem? Show your effort first so people might show theirs. Please read [FAQ](http://stackoverflow.com/tour), [How to Ask](http://stackoverflow.com/help/how-to-ask) and [help center](http://stackoverflow.com/help) as a start. — Nahuel Ianni, Jun 26 '17 at 14:24
As a comment says I tried looping through array a and check if it contains b elements. But this operation takes time for larger loops. I thought maybe there is in-built something for this kind of situations. Desired outcome would be array of b.Length. So in this situation -> ids {0,1} — Petras Vestartas, Jun 26 '17 at 14:30

fharreau · Answer 1 · 2017-06-26T15:35:40.780

First, let's say I am absolutely not a math specialist. I just trust Wikipedia :D.

List<int[]> a = new List<int[]>
{
    new int[2] {0, 1},
    new int[2] {5, 3},
    new int[2] {1, 3},
    new int[2] {5, 0},
};

List<int[]> b = new List<int[]>
{
    new int[2] {0, 1},
    new int[2] {5, 3},
};

var aIds = new Dictionary<double, int>(a.Count);

for (int i = 0; i < a.Count; i++)
{
    var pair = a[i];

    // Cantor pairing id for (pair[0]; pair[1])
    var id1 = 0.5 * (pair[0] + pair[1]) * (pair[0] + pair[1] + 1) + pair[1];
    // Cantor pairing id for the flipped pair (pair[1]; pair[0])
    var id2 = 0.5 * (pair[1] + pair[0]) * (pair[1] + pair[0] + 1) + pair[0];

    aIds[id1] = i;
    aIds[id2] = i;
}

var intersection = new List<int>();

foreach (var pair in b)
{
    int id;
    if (aIds.TryGetValue(0.5 * (pair[0] + pair[1]) * (pair[0] + pair[1] + 1) + pair[1], out id))
    {
        intersection.Add(id);
    }
}

Reminder from the MSDN:

The Dictionary<TKey, TValue> generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast, close to O(1), because the Dictionary<TKey, TValue> class is implemented as a hash table.

This is often a way to get a huge performance gain when dealing with large sets.

If Wikipedia is right, this produces a unique identifier for each pair (two identifiers per pair of a, one for each order, so that the order does not matter). The whole solution then runs in O(na) + O(nb), I guess: one loop to hash the pairs of a, and another one to find the intersection.
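For reference, the formula used above is the Cantor pairing function. Below is a minimal sketch of it as a standalone helper, assuming both values are non-negative (the function is only defined there). The names `Pairing`, `CantorPair` and `UnorderedId` are made up for this illustration and are not part of the code above; the sketch also uses `long` instead of `double` keys, which is a substitution on my side.

```csharp
using System;

public static class Pairing
{
    // Cantor pairing function: maps a pair of non-negative ints to a unique long.
    // Assumption: x >= 0 and y >= 0.
    public static long CantorPair(int x, int y)
    {
        long s = (long)x + y; // widen first so the sum cannot overflow int
        // s * (s + 1) is always even; halve the even factor before multiplying
        // so the intermediate product stays within the range of long.
        long triangle = (s % 2 == 0) ? (s / 2) * (s + 1) : s * ((s + 1) / 2);
        return triangle + y;
    }

    // Order-insensitive id: order the pair first so (0;1) and (1;0) share one key.
    public static long UnorderedId(int x, int y)
    {
        return CantorPair(Math.Min(x, y), Math.Max(x, y));
    }
}
```

With this, `Pairing.CantorPair(0, 1)` and `Pairing.CantorPair(1, 0)` give different ids, while `Pairing.UnorderedId` gives the same id for both, so only one dictionary entry per pair would be needed.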

I got this solution from this answer. There are probably some good tips to learn in that thread.

I think the OP wants the output to be the indexes of the arrays, not the arrays themselves. — adam0101, Jun 26 '17 at 14:57
@adam0101 this time, it will get the index in the list a of element in list b. That's what I understand from the question. Am I correct? — fharreau, Jun 26 '17 at 15:13
@PetrasVestartas Just added support for your last condition: `Another issue is that condition is true if pair is 0 1 flipped. So 0;1 and 0;1 is true and 0;1 and 1;0 is true;` — fharreau, Jun 26 '17 at 15:18
It is very nice approach, I did not expect that it is possible to use mathematical approach like this. — Petras Vestartas, Jun 26 '17 at 16:02
@PetrasVestartas Just to feed my curiosity, how is the gain using this approach? Also, if this answers your question, you can accept it by clicking the check mark. — fharreau, Jun 26 '17 at 16:08

Codor · Answer 2 · 2017-06-26T19:32:09.530


The desired result is a bit difficult to tell from the question; however, apparently the individual arrays (which have exactly two elements each) are considered to be equal if they have the same entries regardless of sequence. Such a predicate can be implemented in a rather elementary fashion as follows.

Func<int[], int[], bool> ArrayEqual = (x, y) =>
    x.Distinct().OrderBy( z => z ).SequenceEqual( y.Distinct().OrderBy( z => z ) );

The arrays from a which also occur in b (using the notion of equality above) can be determined using Linq as follows.

var Result = a.Where( iA => b.Any( iB => ArrayEqual( iA, iB ) ) );

Edit

If the goal is to reduce the complexity from quadratic time to something better, I suggest the following approach. First, sort every individual array; this takes linear time in total, since each individual array has a constant number (2) of elements. Next, sort both a and b lexicographically, which takes O( n log n ) time, since comparing two individual arrays again takes constant time. The desired output can then be generated by walking both lists with one index each: in each step, a constant-time comparison decides whether the two current arrays are equal, so the element of a is either accepted or rejected for the output, and one of the two indices is increased. The generation of the output itself therefore takes linear time. In total, this yields an O( n log n ) runtime bound.
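The steps above could be sketched roughly as follows. This is only an illustration under some assumptions: it uses C# 7 value tuples, the names `SortedMatch` and `IndicesOfMatches` are hypothetical, and it returns the indices in a (rather than the arrays themselves), which is what the question seems to ask for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedMatch
{
    // Returns the indices in a of every pair that also occurs in b,
    // treating (x;y) and (y;x) as the same pair.
    public static List<int> IndicesOfMatches(List<int[]> a, List<int[]> b)
    {
        // Step 1 and 2: order each pair as (Lo, Hi), remember its original
        // index in a, then sort both lists lexicographically.
        var sortedA = a
            .Select((p, i) => (Lo: Math.Min(p[0], p[1]), Hi: Math.Max(p[0], p[1]), Index: i))
            .OrderBy(t => t.Lo).ThenBy(t => t.Hi)
            .ToList();
        var sortedB = b
            .Select(p => (Lo: Math.Min(p[0], p[1]), Hi: Math.Max(p[0], p[1])))
            .OrderBy(t => t.Lo).ThenBy(t => t.Hi)
            .ToList();

        // Step 3: walk both sorted lists with one index each.
        var result = new List<int>();
        int ia = 0, ib = 0;
        while (ia < sortedA.Count && ib < sortedB.Count)
        {
            int cmp = sortedA[ia].Lo != sortedB[ib].Lo
                ? sortedA[ia].Lo.CompareTo(sortedB[ib].Lo)
                : sortedA[ia].Hi.CompareTo(sortedB[ib].Hi);

            if (cmp == 0) { result.Add(sortedA[ia].Index); ia++; } // match: b's index stays, a may hold duplicates
            else if (cmp < 0) ia++;  // a's pair is smaller: it cannot occur in b
            else ib++;               // b's pair is smaller: move on in b
        }

        result.Sort(); // report indices in ascending order
        return result;
    }
}
```

For the lists from the question this yields the indices 0 and 1.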

edited Jun 26 '17 at 19:32

answered Jun 26 '17 at 14:32


As far as I know (I am not very familiar with how Linq works under the hood), I don't think this solution will reduce the complexity of the operation. You just shortened the produced code. – fharreau Jun 26 '17 at 14:37
This is going to perform *dramatically worse* than the OP's described solution, which they said was too slow. – Servy Jun 26 '17 at 14:37
@fharreau Not only is this not faster, but by sorting each array N^2 times it's actually notably slower. – Servy Jun 26 '17 at 14:38
Checking if two arrays are equal is not a constant time operation. – Servy Jun 26 '17 at 14:47
Please fix the error associated with the first statement: "A local or parameter named 'a' cannot be declared in this scope because that name is used in an enclosing local scope to define a local or parameter" – usefulBee Jun 26 '17 at 14:48
@Servy Checking if two arrays are equal is indeed a constant time operation, provided that the length of the arrays is bounded by a constant. – Codor Jun 26 '17 at 14:48
@Codor 1) You didn't state that assumption. 2) That isn't an assumption you can make. – Servy Jun 26 '17 at 14:50
@Servy I included _as each individual array has 2 (which is a constant) number of elements_ and although this assumption does not hold in general, the single example given in the original question has this property. The informal description of the notion of equality also suggests that the individual arrays have exactly two elements. – Codor Jun 26 '17 at 14:53
@Servy: OP mentioned the arrays will **always** have two elements. Assuming his code sample, he is using a literal `2` to denote the array length. That seems like a valid assumption then. – Flater Jun 26 '17 at 14:53

score 0 · Answer 3 · answered Jun 26 '17 at 14:49

Some comments

What I would like to get are ids of b list elements in list a.

Nowhere in your question are there IDs mentioned, also not in the data example. I am going to assume you want the indexes (i.e. the position in list a of the matched element).

arrays contains always pair of int

I suggest using a Tuple<int,int>, which I will use in a code example for the sake of easier processing. But I'll make it so that you can start off with your initial arrays, in order to maximize compatibility with your code.

Another issue is that condition is true if pair is 0 1 flipped. So 0;1 and 0;1 is true and 0;1 and 1;0 is true;

The solution therefore is to order each element pair (not the list of elements itself), so that we match on integer values regardless of their position.

If you need to preserve the order of the numbers (for other purposes), then you'll have to make a copy of the numbers (for the purpose of this matching algorithm), and only sort that copy.
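A minimal sketch of that idea, assuming the input stays a `List<int[]>` as in the question. The names `PairLookup`, `Normalize` and `IndicesInA` are hypothetical, and the dictionary lookup is borrowed from the hashing idea in the other answer rather than being part of this one. `Tuple<int,int>` compares by value, so it can serve directly as a dictionary key; the original arrays are not modified.

```csharp
using System;
using System.Collections.Generic;

public static class PairLookup
{
    // Build an ordered copy of a pair so that {1, 0} and {0, 1} yield the same key.
    // The input array itself is left untouched.
    public static Tuple<int, int> Normalize(int[] pair)
    {
        return Tuple.Create(Math.Min(pair[0], pair[1]), Math.Max(pair[0], pair[1]));
    }

    // For each pair in b, returns the index of a matching pair in a, or -1 if there is none.
    public static List<int> IndicesInA(List<int[]> a, List<int[]> b)
    {
        var indexByPair = new Dictionary<Tuple<int, int>, int>(a.Count);
        for (int i = 0; i < a.Count; i++)
        {
            var key = Normalize(a[i]);
            if (!indexByPair.ContainsKey(key))
                indexByPair[key] = i; // keep the first occurrence when a has duplicates
        }

        var result = new List<int>(b.Count);
        foreach (var pair in b)
        {
            int index;
            result.Add(indexByPair.TryGetValue(Normalize(pair), out index) ? index : -1);
        }
        return result;
    }
}
```

The result has one entry per element of b, which matches the "array of b.Length" the asker describes in the comments.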

The actual question

Is there a better way than looping through list a , checking array elements with if statement and if true add those elements?

Not really. You are going to have to match every element of a against every element of b.
However, if you are talking about large data sets, you might gain a bit of performance by doing some preprocessing (mainly ordering of entries etc.).

It somewhat depends on what you mean by "a better way". Are you trying to:

Maximize the speed of processing large data sets?
Improve the readability of the code?
Shorten the code itself?

Depending on what your focus is, you will need a different approach. And that is not clear from your question as it is currently phrased.

I simplified my problem to this one: I have an index pair, or more simply two integers int a and int b. and I have a list of pairs arrays: List idPairs. What would be an approach to find indices of pair a-b in idPairs? I tried idPairs.Contains(new int[2]{a,b}) but this only returns bool value. Is there a method to find indices in List collection? First I want to find something that works only then maximize the speed of code. — Petras Vestartas, Jun 26 '17 at 15:46
Instead of `idPairs.Contains(new int[2]{a,b})`, try using `idPairs.IndexOf(new int[2]{a,b})`. This returns an integer with the index position (if it does not exist, it returns -1). You might need to cast your list to an array; I'm not sure. If that is the case: `idPairs.ToArray().IndexOf(new int[2]{a,b})` (although I would suggest casting it **once** and then using that array for all iterations; but it's too verbose to put in a comment here on SO :)) — Flater, Jun 26 '17 at 15:55
for now I am doing this like this. Is there a faster way to find index, not linq operation that looks faster but actually faster code or approach: for(int i = 0; i < ngonPairs.Count; i++) { if ((ngonPairs[i][0] == a || ngonPairs[i][0] == b) && (ngonPairs[i][1] == a || ngonPairs[i][1] == b)) { edgesID.Add(i); break; } } — Petras Vestartas, Jun 26 '17 at 15:59
@PetrasVestartas: Instead of checking each combination, it might be better to order both pairs by size (so that [1,5] stays [1,5] and [5,1] turns into [1,5]) and then you only match the first elements and the second elements (no cross checking). But I have a suspicion that the optimization here is negligible. I also suggest asking these things in new questions, as comment threads are not the intention on SO, and they are not particularly easy for reading code samples :) — Flater, Jun 26 '17 at 16:07
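
A note on the lookup discussed in these comments: `List<T>.IndexOf` called with a freshly created `int[]` compares array references, not contents, so it returns -1 even when an equal pair is in the list. A possible sketch using `List<T>.FindIndex` with a predicate instead is shown below; the names `PairSearch` and `IndexOfPair` are made up for this illustration. Unlike the loop condition in the comment above, this predicate does not accept the pairs (a;a) or (b;b) when a and b differ.

```csharp
using System.Collections.Generic;

public static class PairSearch
{
    // Returns the index of the first pair equal to (a;b) or (b;a), or -1 if there is none.
    public static int IndexOfPair(List<int[]> idPairs, int a, int b)
    {
        return idPairs.FindIndex(p =>
            (p[0] == a && p[1] == b) || (p[0] == b && p[1] == a));
    }
}
```

This is still a linear scan per lookup, so for many lookups against the same list a dictionary built once is likely the faster option.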
