I'd like to write a tool that works on some tree-structured data. (In fact it will work on a tree-like subset of a git revision DAG, but that's not important for this question). In particular I want an algorithm that reconstructs a subset of the tree consisting of all the "join points" of a given input set.

Specifically what I think I want is

  • We have some type H with a "lowest common ancestor" function, lca, defined on it. This gives H a tree-like structure.

  • The algorithm takes some subset S of H as input.

  • The output should be a multi-way tree t with nodes labelled by values of H.

  • t should satisfy the properties

    • All s in S label some node of t

    • The leaves of t can only be labelled by elements of S

    • Any element h of H labels no more than one node of t

    • If h1 labels n1 and h2 labels n2 then lca(h1, h2) labels the lowest common ancestor of n1 and n2 in t.
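
To make the four properties concrete, here is a minimal Python sketch of a checker. It rests on a toy model that is an assumption for illustration, not part of the problem: elements of H are tuples read as root-to-node paths, so `lca` is the longest common prefix, and t is passed as a hypothetical `parent` map from each label to its parent's label (`None` for the root). Keying nodes by their label means the "no more than one node" property holds by construction.

```python
from itertools import combinations


def lca(a, b):
    """Toy lca (assumption): H = tuples read as paths, lca = longest common prefix."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]


def tree_lca(parent, x, y):
    """Lowest common ancestor of nodes x and y in t, or None if they share none."""
    ancestors = set()
    while x is not None:
        ancestors.add(x)
        x = parent[x]
    while y is not None and y not in ancestors:
        y = parent[y]
    return y


def satisfies_spec(parent, S):
    """Check the listed properties for a tree given as {label: parent label}."""
    S = set(S)
    labels = set(parent)
    internal = {p for p in parent.values() if p is not None}
    leaves = labels - internal
    if not S <= labels:   # every s in S labels some node
        return False
    if not leaves <= S:   # leaves are labelled only by elements of S
        return False
    # lca in H must agree with the lowest common ancestor in t
    return all(
        lca(a, b) in labels and tree_lca(parent, a, b) == lca(a, b)
        for a, b in combinations(labels, 2)
    )
```

With S = {(0, 0), (0, 1), (1,)}, a tree that hangs all three elements directly under the root () fails the last property, because the join point (0,) of (0, 0) and (0, 1) is missing from t.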

My question is: is this a known problem with known algorithms? I suspect it is. It seems quite similar to a topological sort. I have an idea for an algorithm based on merge sort, but if known algorithms already exist there's no reason to come up with my own.

Tom Ellis
  • Speaking in terms of _least common ancestor_, would the root of the tree be _an ancestor_ of all of its non-root nodes? I believe the structure you describe is also called a [semilattice](https://en.wikipedia.org/wiki/Semilattice). I think the desired output would be termed the _semilattice hull_ of the input. – Codor Nov 10 '17 at 14:20
  • Yes, the structure is a semilattice. "Semilattice hull" is nice nomenclature! – Tom Ellis Nov 13 '17 at 13:13
  • You would really save my day if you could tell me whether _semilattical_ is an actual word. – Codor Nov 13 '17 at 13:24

1 Answer

I don't know what this is called, but I'd first compare all pairs of elements (using lca) to construct the partial order for the tree, then do a topological sort, then construct the tree from that. (The point of sorting is that you then know the first element is the root, and each subsequent element can be attached in turn as a leaf of the tree built so far.)
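
One way to read those steps, as a hedged Python sketch rather than a definitive implementation. Several pieces are assumptions of mine: the toy `lca` (tuples as root-to-node paths, longest common prefix), the hypothetical names `lca_closure` and `join_tree`, and the step that first adds the pairwise lca values to the input so that the join points are available as nodes. The ordering uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+), and the result is a map from each label to its parent's label.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from itertools import combinations


def lca(a, b):
    """Toy lca (assumption): H = tuples read as paths, lca = longest common prefix."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]


def lca_closure(S):
    """Add pairwise lca values until nothing new appears (the join points)."""
    closed = set(S)
    while True:
        extra = {lca(a, b) for a, b in combinations(closed, 2)} - closed
        if not extra:
            return closed
        closed |= extra


def join_tree(S):
    """Return {label: parent label}, with None as the parent of the root."""
    nodes = lca_closure(S)
    # Compare all pairs: a is a strict ancestor of h iff a != h and lca(a, h) == a.
    ancestors = {
        h: {a for a in nodes if a != h and lca(a, h) == a} for h in nodes
    }
    # Topological sort: every element comes after all of its ancestors.
    order = list(TopologicalSorter(ancestors).static_order())
    position = {h: i for i, h in enumerate(order)}
    parent = {}
    for h in order:
        # Attach h as a new leaf under its most recently placed ancestor.
        # In a tree-like H the ancestors of h form a chain, so that is the
        # deepest one.
        parent[h] = max(ancestors[h], key=position.__getitem__, default=None)
    return parent
```

For example, `join_tree({(0, 0), (0, 1), (1,)})` adds the join points `(0,)` and `()` and hangs `(0, 0)` and `(0, 1)` under `(0,)`. The pairwise comparison makes this quadratic in the number of nodes.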

The subject reminded me of cladistics algorithms, http://bio.slu.edu/mayden/systematics/bsc420520lect12.html. However, those are both easier and harder. Easier because it is easy to tell upon inspection which forms are close to one another. Harder because the challenge is that you don't know the LCA. So pursuing that might be an interesting side track, but it is probably not very helpful.

btilly
  • This is a nice simple algorithm. I was originally hoping for an O(n log n) algorithm but I suspect you can't determine whether or not the graph is completely disconnected in less than O(n^2) operations. – Tom Ellis Nov 13 '17 at 13:17
  • I don't understand how the topological order could be generated by evaluation of the least common ancestor function; please clarify. – Codor Nov 13 '17 at 13:23
  • @Codor `a <= b` iff `lca(a, b) = a` defines a partial order. That's all that you need for a topological sort. – btilly Nov 13 '17 at 16:50