4

NOTE

The question has been edited following the good advice from @Kaddath to highlight the fact that the ordering doesn't have to be alphabetical but depends on the position of items inside the arrays.


I have an array of arrays where each array is based on a given ordering, but they can differ a bit.

For example, the base ordering is X -> D -> H -> B and here is my array of arrays:

const arrays = [
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B']
]

I would like to merge all arrays into a single one and remove duplicates while keeping the ordering. In my example the result would be ['X', 'M', 'D', 'K', 'Z', 'H', 'T', 'B', 'A'].

In the example we can see that M is between X and D inside the third array, so it is placed between X and D in the final output.

I know conflicts may arise, so here are the rules:

  • Every item should appear in the final output.
  • If an item appears in multiple arrays at different positions, the first appearance is the right one (skip the others).

What I've done so far is merging all of these arrays into a single one by using

const merged = [].concat.apply([], arrays);

(cf. https://stackoverflow.com/a/10865042/3520621).

And then getting unique values by using this code snippet from https://stackoverflow.com/a/1584377/3520621 :

Array.prototype.unique = function() {
    var a = this.concat();
    for(var i=0; i<a.length; ++i) {
        for(var j=i+1; j<a.length; ++j) {
            if(a[i] === a[j])
                a.splice(j--, 1);
        }
    }

    return a;
}; 
const finalArray = merged.unique();

But my result is this:

[
  "X",
  "D",
  "H",
  "B",
  "K",
  "Z",
  "A",
  "M",
  "T"
]

Any help is welcome!

Thanks.

MHogge
  • 1
    Can't you order the array after it has been merged? – Jerodev Dec 11 '18 at 09:23
  • 2
    You can sort it i.e. `finalArray.sort()` – Satpal Dec 11 '18 at 09:23
  • I don't see how you could do any other way than sorting them afterwards. If you think a little, "keep ordering" in your case leads to conflicts, do you want to keep the first array ordering, or the second if they have different orderings? the third one? which criteria must apply? – Kaddath Dec 11 '18 at 09:26
  • The data is not "sortable". There is a base array, in this example `['A', 'B', 'C', 'D']`, but it could be `['X', '1', 'D', 'EE']` (or anything else), and the result should keep the order of the base array while adding items between existing ones (like `A-bis`, which is added between `A` and `B`, not because of its alphabetical sorting but because it appears between those 2 items inside one of the following arrays). – MHogge Dec 11 '18 at 09:44
  • 2
    Then I think you should edit your post so that it doesn't look like an alphabetical ordering, and specify that the order that must apply depends on the order of the array of arrays (apply the first array's order, then the second's, etc.), if that is the case. You have to be aware that the orderings of later arrays can conflict with ones already applied, and specify whether a conflicting ordering must be ignored or must override the existing one when that happens. – Kaddath Dec 11 '18 at 10:08

10 Answers

4

const arrays = [
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B']
];
const result = [];
arrays.forEach(array => {
  array.forEach((item, idx) => {
    // check if the item has already been added, if not, try to add
    if(!~result.indexOf(item)) {
      // if item is not the first item, find the position of its left sibling in the result array
      if(idx) {
        const result_idx = result.indexOf(array[idx - 1]);
        // add item after left sibling position
        result.splice(result_idx + 1, 0, item);
        return;
      }
      result.push(item);
    }
  });
});
console.log('expected result', ['X', 'M', 'D', 'K', 'Z', 'H', 'T', 'B', 'A'].join(','));
console.log(' current result',result.join(','));
ponury-kostek
3

Flatten, remove duplicates and sort could be simpler:

const arrays = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D'],
];
console.log(
  arrays
    .flat()
    .filter((u, i, all) => all.indexOf(u) === i)
    .sort((a, b) => a.localeCompare(b)),
);

Or even simpler, according to Mohammad Usman's now deleted post:

const arrays = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D'],
];
console.log(
  [...new Set([].concat(...arrays))].sort((a, b) =>
    a.localeCompare(b),
  ),
);
HMR
  • It seems to be what I'm searching for except that my sorting shouldn't be alphabetical but more like "You have a base array ['A', 'B', 'C', 'D'], keep that order and add items between existing ones, depending on following arrays". Not sure I make myself understood. But I suppose I need to change the `sort` function in order to achieve my goal. – MHogge Dec 11 '18 at 09:41
  • @MHogge So take from arrays[0] the first, if it is 'A' then take from others everything starting with 'A', then take next from arrays[0] and do the same. – HMR Dec 11 '18 at 09:50
  • It does not depend on alphabetical sorting: `A-bis` could have been renamed `XXX` and it would still have to be placed between `A` and `B`, because that is where it was found in one of the following arrays. See my comment on the original post, maybe it's clearer. – MHogge Dec 11 '18 at 09:53
2

You can use .concat() with Set to get the resultant array of unique values:

const data = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D']
];

const result = [...new Set([].concat(...data))].sort((a, b) => a.localeCompare(b));

console.log(result);
Mohammad Usman
  • The result is not what I expect, see in my question: I would like to get `['A', 'A-bis', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis', 'E']` as a result. – MHogge Dec 11 '18 at 09:25
2

Create a single array using array#concat, get the unique values from it using Set, then sort the result.

const arrays = [ ['A', 'B', 'C', 'D'], ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'], ['A', 'A-bis', 'B', 'C', 'D'], ['A', 'C', 'E'], ['A', 'B', 'C', 'D'] ],
      result = [...new Set([].concat(...arrays))].sort();
console.log(result);
Hassan Imam
1
  1. merge: [].concat.apply([], arrays)
  2. find unique values: [...new Set(merged)]
  3. sort: .sort()

const arrays = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D']
];


let merged = [].concat.apply([], arrays);  // merge array

let sort = [...new Set(merged)].sort(); // find uniq then sort

console.log(sort);
Shiv Kumar Baghel
1

Fun problem to solve; I think I only partly succeeded.

  • I ignored the "underspecified" example of B -> A -> T vs T -> B -> A
  • It's very inefficient

I'm still posting it because I think it might help you get things right. Here's my approach:

Step 1: create a naive index

We're creating an object that, for each unique element in the nested arrays, tracks which elements it has succeeded or preceded:

{
  "X": { prev: Set({}), next: Set({ "D", "H", "B", "K", "Z", "A", "M", "T" }) },
  "M": { prev: Set({ "X" }), next: Set({ "D", "H", "B" }) },
  // etc.
}

I named it "naive" because these Sets only contain information one level deep.

I.e.: they only report relations between elements that were in the same array. They cannot see that M comes before K, because the two were never in the same array.
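To make that limitation concrete, here is a minimal sketch of such a one-level index. `buildNaiveIndex`, `exampleArrays` and `naiveIdx` are names made up for this illustration (they are not the functions used in the full listing further down), and the index is assumed to be a plain object of `{ prev, next }` Sets:

```javascript
// Hypothetical helper for illustration: builds the one-level index as a
// plain object whose values hold a `prev` Set and a `next` Set.
function buildNaiveIndex(arrays) {
  const idx = {};
  for (const xs of arrays) {
    xs.forEach((x, i) => {
      if (!idx[x]) idx[x] = { prev: new Set(), next: new Set() };
      // everything left of x in this array precedes it, everything right follows it
      xs.slice(0, i).forEach((p) => idx[x].prev.add(p));
      xs.slice(i + 1).forEach((n) => idx[x].next.add(n));
    });
  }
  return idx;
}

const exampleArrays = [
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B'],
];
const naiveIdx = buildNaiveIndex(exampleArrays);

console.log(naiveIdx.M.next.has('D')); // true: M and D share the third array
console.log(naiveIdx.M.next.has('K')); // false: M and K never share an array
console.log(naiveIdx.K.prev.has('M')); // false for the same reason
```

The M/K relation only becomes visible once the index entries are joined across arrays.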

Step 2: join the indexes recursively

This is where I ignored all big-O concerns one might have. I merge the index recursively: the next of M is a join of the next of D, H and B. Recurse until you find an element that has no next, i.e. T or A.
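As a side note, the same join can be computed without revisiting elements. The following is only a sketch under assumptions: `closure` and `smallIdx` are hypothetical names, the index has the `{ prev, next }` shape described in step 1, and a visited set is swapped in for the plain recursion so that each element is expanded at most once:

```javascript
// Hypothetical alternative to the recursive join: collect every element
// reachable from `k` by following `dir` ("prev" or "next") links.
function closure(dir, idx, k) {
  const seen = new Set();
  const stack = [...idx[k][dir]];
  while (stack.length > 0) {
    const cur = stack.pop();
    if (seen.has(cur)) continue; // already expanded, skip it
    seen.add(cur);
    stack.push(...idx[cur][dir]);
  }
  return seen;
}

// Tiny hand-written index: X -> M -> D -> K, plus X -> D
const smallIdx = {
  X: { prev: new Set(), next: new Set(['M', 'D']) },
  M: { prev: new Set(['X']), next: new Set(['D']) },
  D: { prev: new Set(['X', 'M']), next: new Set(['K']) },
  K: { prev: new Set(['D']), next: new Set() },
};

console.log([...closure('next', smallIdx, 'M')]); // [ 'D', 'K' ]
```

Here K shows up in the joined `next` of M even though M never lists K directly.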

Step 3: create a sorter that respects the sort index:

const indexSorter = idx => (a, b) => 
    idx[a].next.has(b) || idx[b].prev.has(a) ? -1 :
    idx[a].prev.has(b) || idx[b].next.has(a) ?  1 :
                                                0 ;

This function creates a sort method that uses the generated index to look up the sort order between any two elements.

Bringing it all together:

(function() {


  const naiveSortIndex = xss => xss
    .map(xs =>
      // [ prev, cur, next ]
      xs.map((x, i, xs) => [
        xs.slice(0, i), x, xs.slice(i + 1)
      ])
    )

    // flatten
    .reduce((xs, ys) => xs.concat(ys), [])

    // add to index
    .reduce(
      (idx, [prev, cur, next]) => {
        if (!idx[cur])
          idx[cur] = {
            prev: new Set(),
            next: new Set()
          };

        prev.forEach(p => {
          idx[cur].prev.add(p);
        });

        next.forEach(n => {
          idx[cur].next.add(n);
        });

        return idx;
      }, {}
    );

  const expensiveSortIndex = xss => {
    const naive = naiveSortIndex(xss);

    return Object
      .keys(naive)
      .reduce(
        (idx, k) => Object.assign(idx, {
          [k]: {
            prev: mergeDir("prev", naive, k),
            next: mergeDir("next", naive, k)
          }
        }), {}
      )
  }

  const mergeDir = (dir, idx, k, s = new Set()) =>
    idx[k][dir].size === 0 
      ? s 
      : Array.from(idx[k][dir])
          .reduce(
            (s, k2) => mergeDir(dir, idx, k2, s),
            new Set([...s, ...idx[k][dir]])
          );

  // Generate a recursive sort method based on an index of { key: { prev, next } }
  const indexSorter = idx => (a, b) =>
    idx[a].next.has(b) || idx[b].prev.has(a) ? -1 :
    idx[a].prev.has(b) || idx[b].next.has(a) ? 1 :
    0;

  const uniques = xs => Array.from(new Set(xs));


  // App:
  const arrays = [
    ['X', 'D', 'H', 'B'],
    ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
    ['X', 'M', 'D', 'H', 'B'],
    ['X', 'H', 'T'],
    ['X', 'D', 'H', 'B']
  ];

  const sortIndex = expensiveSortIndex(arrays);
  const sorter = indexSorter(sortIndex);

  console.log(JSON.stringify(
    uniques(arrays.flat()).sort(sorter)
  ))

}())

Recommendations

I suppose the elegant solution to the problem might be able to skip all the merging of Sets by using a linked list / tree-like structure and injecting elements at the right indexes by traversing until an element of its prev/next is found.
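A rough sketch of that linked-list idea, under assumptions: `mergeByNeighbour` is a hypothetical name, the list is stored as a Map from each item to its successor, and the rule applied is the question's "first appearance wins" (a new item is placed right after its left neighbour, or at the end if it has none). It does not use the prev/next index from the steps above:

```javascript
// Hypothetical sketch: a singly linked list kept in a Map (item -> next item),
// so finding the left neighbour and inserting after it need no array scans.
function mergeByNeighbour(arrays) {
  const HEAD = Symbol('head'); // sentinel before the first item
  const END = Symbol('end');   // sentinel after the last item
  const next = new Map([[HEAD, END]]);
  let tail = HEAD;             // current last node of the list

  for (const array of arrays) {
    array.forEach((item, idx) => {
      if (next.has(item)) return; // already placed: the first appearance wins
      // Insert after the left sibling, or at the end when there is none.
      const left = idx > 0 ? array[idx - 1] : tail;
      next.set(item, next.get(left));
      next.set(left, item);
      if (left === tail) tail = item;
    });
  }

  // Walk the list from the head sentinel to collect the merged order.
  const result = [];
  for (let cur = next.get(HEAD); cur !== END; cur = next.get(cur)) {
    result.push(cur);
  }
  return result;
}

console.log(mergeByNeighbour([
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B'],
]).join(',')); // X,M,D,K,Z,H,T,B,A
```

Whether this is noticeably faster on large inputs has not been measured here; it only avoids building and merging the Sets.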

user3297291
  • In fact this is great and it's a good start but even if the "unspecified" item is not really an issue to me, the efficiency might be one. I might have to apply this mechanism to large arrays and it's already taking ~2sec for small ones. Unfortunately I'm not really familiar with javascript but I will have a look at your recommendation and see if I manage to get something more efficient. Thanks for your help anyway! – MHogge Dec 11 '18 at 15:04
  • Let me know if you manage to improve it! Very curious. I might dig in a bit deeper if I have some more time later today/this week. – user3297291 Dec 11 '18 at 15:06
  • You can check the solution of @ponury-kostek. It is working fine to me. – MHogge Dec 20 '18 at 08:49
1

Every array is in fact a set of rules that defines the relative order between its elements. The final list should contain all elements while respecting the relative order defined by all rules.

Some solutions have solved the initial request, and some didn't even solve that (all those that suggest using sort rather missed the point of the question). Nevertheless, none proposed a generic solution.

The problem

If we look at the problem asked in the OP, this is how the rules define what is the relative position between elements:

   M    K -> Z    T
  ^ \  ^      \  ^
 /   v/        v/
X -> D ------> H -> B -> A

So it is easy to see that our array starts with X. The next element could be either D or M, but D requires M to already be in the array. That is why we put M as the next element, and then D. Next, D points to both K and H. Since H has other predecessors that have not been collected yet, and K has none (it has D, but D is already in the list), we put K and Z first, and only then H.

H points to both T and B. It actually doesn't matter which one we put first. So, last three elements can be in any of the following three orders:

  • T, B, A
  • B, A, T
  • B, T, A
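A candidate order can be checked against the rules with a small helper. This is only a sketch: `respectsAllRules` is a name introduced here for illustration, and it compares adjacent pairs within each rule, which is enough because positions in the result are totally ordered:

```javascript
// Hypothetical checker: true when `result` contains every element of every
// rule and keeps each rule's elements in the same relative order.
function respectsAllRules(result, arrays) {
  const position = new Map(result.map((item, i) => [item, i]));
  return arrays.every((list) =>
    list.every((item, i) =>
      position.has(item) &&
      (i === 0 || position.get(list[i - 1]) < position.get(item))
    )
  );
}

const rules = [
  ['X', 'D', 'H', 'B'],
  ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
  ['X', 'M', 'D', 'H', 'B'],
  ['X', 'H', 'T'],
  ['X', 'D', 'H', 'B'],
];

// All three endings listed above are acceptable
console.log(respectsAllRules(['X', 'M', 'D', 'K', 'Z', 'H', 'T', 'B', 'A'], rules)); // true
console.log(respectsAllRules(['X', 'M', 'D', 'K', 'Z', 'H', 'B', 'A', 'T'], rules)); // true
console.log(respectsAllRules(['X', 'M', 'D', 'K', 'Z', 'H', 'B', 'T', 'A'], rules)); // true
// The concat + unique output from the question puts K after H, so it fails
console.log(respectsAllRules(['X', 'D', 'H', 'B', 'K', 'Z', 'A', 'M', 'T'], rules)); // false
```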

Let us also consider a slightly more complex case. Here are the rules:

['10', '11', '12', '1', '2'],
['11', '12', '13', '2'],
['9', '13'],
['9', '10'],

If we draw the graph using those rules, we get the following:

   --------------> 13 ----
  /                ^      \
 /                /        v
9 -> 10 -> 11 -> 12 > 1 -> 2

What is specific about this case? Two things:

  • Only in the last rule we "find out" that the number 9 is the beginning of the array
  • There are two indirect paths from 12 to 2 (one via the number 1, the other via the number 13).

Solution

My idea is to create a node for each element and use that node to keep track of all its immediate successors and immediate predecessors. After that, we find all elements that have no predecessors and start "collecting" results from there. If we come to a node that has multiple predecessors, some of which have not been collected yet, we stop the recursion there. It can also happen that a successor has already been collected via some other path; we skip that successor.

function mergeAndMaintainRelativeOrder(arrays/*: string[][]*/)/*: string[]*/ {
    /*
    interface NodeElement {
        value: string;
        predecessor: Set<NodeElement>;
        successor: Set<NodeElement>;
        collected: boolean;
    }
    */
    const elements/*: { [key: string]: NodeElement }*/ = {};
    // For every element in all rules create NodeElement that will
    // be used to keep track of immediate predecessors and successors
    arrays.flat().forEach(
        (value) =>
            (elements[value] = {
                value,
                predecessor: new Set/*<NodeElement>*/(),
                successor: new Set/*<NodeElement>*/(),
                // Used when we form the final array of results to indicate
                // that this node has already been collected in the final array
                collected: false,
            }),
    );

    arrays.forEach((list) => {
        for (let i = 0; i < list.length - 1; i += 1) {
            const node = elements[list[i]];
            const nextNode = elements[list[i + 1]];

            node.successor.add(nextNode);
            nextNode.predecessor.add(node);
        }
    });

    function addElementsInArray(head/*: NodeElement*/, array/*: string[]*/) {
        let areAllPredecessorsCollected = true;
        head.predecessor.forEach((element) => {
            if (!element.collected) {
                areAllPredecessorsCollected = false;
            }
        });
        if (!areAllPredecessorsCollected) {
            return;
        }
        array.push(head.value);
        head.collected = true;
        head.successor.forEach((element) => {
            if (!element.collected) {
                addElementsInArray(element, array);
            }
        });
    }

    const results/*: string[]*/ = [];
    Object.values(elements)
        .filter((element) => element.predecessor.size === 0)
        .forEach((head) => {
            addElementsInArray(head, results);
        });
    return results;
}

console.log(mergeAndMaintainRelativeOrder([
    ['X', 'D', 'H', 'B'],
    ['X', 'D', 'K', 'Z', 'H', 'B', 'A'],
    ['X', 'M', 'D', 'H', 'B'],
    ['X', 'H', 'T'],
    ['X', 'D', 'H', 'B'],
]));


console.log(mergeAndMaintainRelativeOrder([
    ['10', '11', '12', '1', '2'],
    ['11', '12', '13', '2'],
    ['9', '13'],
    ['9', '10'],
]));

Big O

If n is the number of rules and m is the number of elements in each rule, the complexity of this algorithm is O(n*m). This assumes that Set operations in JS are close to O(1).
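For comparison, the same graph can also be walked iteratively with Kahn's algorithm, a standard topological sort that is swapped in here and is not the code above. It visits every node and edge once, and it reports conflicting rules (cycles) instead of silently dropping elements. `topoMerge` is a hypothetical name for this sketch:

```javascript
// Sketch only: Kahn's algorithm over the "immediate successor" edges.
function topoMerge(arrays) {
  const successors = new Map(); // item -> Set of immediate successors
  const inDegree = new Map();   // item -> number of distinct immediate predecessors

  for (const list of arrays) {
    for (const item of list) {
      if (!successors.has(item)) {
        successors.set(item, new Set());
        inDegree.set(item, 0);
      }
    }
    for (let i = 0; i < list.length - 1; i += 1) {
      const from = list[i];
      const to = list[i + 1];
      // count each distinct edge only once
      if (from !== to && !successors.get(from).has(to)) {
        successors.get(from).add(to);
        inDegree.set(to, inDegree.get(to) + 1);
      }
    }
  }

  // Start with every item that has no predecessor.
  const queue = [...inDegree.keys()].filter((item) => inDegree.get(item) === 0);
  const result = [];
  for (let head = 0; head < queue.length; head += 1) {
    const item = queue[head];
    result.push(item);
    for (const next of successors.get(item)) {
      inDegree.set(next, inDegree.get(next) - 1);
      // an item becomes available once all its predecessors are collected
      if (inDegree.get(next) === 0) queue.push(next);
    }
  }

  if (result.length !== inDegree.size) {
    throw new Error('Conflicting rules: the arrays contain a cycle');
  }
  return result;
}

console.log(topoMerge([
  ['10', '11', '12', '1', '2'],
  ['11', '12', '13', '2'],
  ['9', '13'],
  ['9', '10'],
]).join(',')); // 9,10,11,12,1,13,2
```

Where several orders are valid (such as T, B, A in the first example), which one comes out depends on the order in which the edges were first seen.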

dugokontov
0

Use a BST for this. Add all elements to the BST and then traverse it in order.

// Placeholder (left undefined in the original answer): maps an item to a
// comparable sort key. Identity is used here only so that the code runs.
const someOrderFunction = key => key;

function BST() {
  this.key = null;
  this.val = null;
  this.left = null;
  this.right = null;

  this.add = function(val) {
    // strip whitespace before computing the sort key
    const key = someOrderFunction(val.replace(/\s/g, ''));
    if (this.key == null) {
      this.key = key;
      this.val = val;
    } else if (key < this.key) {
      if (!this.left) this.left = new BST();
      this.left.add(val);
    } else if (key > this.key) {
      if (!this.right) this.right = new BST();
      this.right.add(val);
    }
    // equal keys are duplicates and are skipped
  };

  this.inOrder = function() {
    // an empty tree (nothing added yet) yields an empty array
    if (this.key == null) return [];
    const leftNodeOrder = this.left ? this.left.inOrder() : [],
          rightNodeOrder = this.right ? this.right.inOrder() : [];
    return leftNodeOrder.concat(this.val).concat(rightNodeOrder);
  };
}

// MergeArrays uses a BST to insert all elements of all arrays
// and then fetches them sorted in order
function mergeArrays(arrays) {
    const bst = new BST();
    arrays.forEach(array => array.forEach( e => bst.add(e)));
    return bst.inOrder();
}
0

I would just flatten the arrays, map them as keys to an object (thus removing the duplicates), and then sort the final result:

const arrays = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D']
];

const final = Object.keys( arrays.flat().reduce( (aggregate, entry) => {
  aggregate[entry] = '';
  return aggregate;
}, {} ) ).sort( (x1, x2) => x1.localeCompare(x2) );

console.log( final );
Icepickle
0

In your code, after the merge you need to remove the duplicates so that you get an array of unique values.

Then use array.sort to sort the array.

I hope this solves the issue.

const arrays = [
  ['A', 'B', 'C', 'D'],
  ['A', 'B', 'B-bis', 'B-ter', 'C', 'D', 'D-bis'],
  ['A', 'A-bis', 'B', 'C', 'D'],
  ['A', 'C', 'E'],
  ['A', 'B', 'C', 'D']
]

const merged = [].concat.apply([], arrays);

const unique = Array.from(new Set(merged))


const sorted = unique.sort()

console.log("sorted Array", sorted)

// Single line
const result = [...new Set([].concat(...arrays))].sort();

console.log("sorted Array single line", result)
DILEEP THOMAS