
Is there an algorithm to find a spanning tree of an undirected graph which minimizes the number of vertices connected to more than one edge?

For example, given a 4 x 4 grid graph, we want to find a spanning tree like that on the left (which has 7 vertices connected to more than one edge) rather than that on the right (which has 12):

[Figure: two spanning trees of the 4 x 4 grid graph]

Edit: Would this problem be simpler if we consider only planar graphs (or even only grid graphs)?

David Eisenstat
user200783
  • Do you care about minimising the average degree of the whole spanning tree too, or just minimising the number of non-leaf nodes? – aportr Jul 27 '15 at 07:58
  • @jabolotai - Could you please define what you mean by "the average degree of the whole spanning tree"? – user200783 Jul 27 '15 at 08:39
  • If you compare two spanning trees and one has a vertex of degree 5 while the other is identical except instead has a vertex of degree 4, would you treat one as a better solution, or would you consider them equal? – aportr Jul 27 '15 at 09:05
  • @jabolotai - I need only to minimize the number of vertices connected to more than one edge (i.e. to minimize the number of non-leaf nodes). – user200783 Jul 27 '15 at 09:26
  • This is the Maximum Leaf Spanning Tree problem. It is NP-hard. Googling for an exact solution immediately gives [this pdf](http://tcs.rwth-aachen.de/~langer/pub/maxleaf-iwpec09.pdf). – Evgeny Kluev Jul 27 '15 at 10:13
  • I think that @EvgenyKluev has mostly answered the question. The name of the problem should let you look up the known techniques. Given that it is NP-hard, I don't suppose you'll get an answer with a definitively good solution. – stgatilov Jul 27 '15 at 12:00
  • Is the graph planar by any chance? – David Eisenstat Jul 27 '15 at 23:53
  • @Evgeny Kluev, please post this as an answer. It provides a current high-quality answer (via the link), gives a specific reason why no better answer is to be expected (NP-hard), and provides a well-known problem name so that future searchers can find improved answers (in case one is discovered). – Speed8ump Jul 28 '15 at 21:47
  • @DavidEisenstat - Yes, in fact, all of the graphs which I am interested in are planar. – user200783 Jul 31 '15 at 12:13

3 Answers

As Evgeny notes in the comments, this is known as the maximum leaf spanning tree problem. It is very closely related to the connected dominating set problem (see the Wikipedia article of that name), which asks for a minimum set of vertices that (i) induces a connected subgraph and (ii) dominates the graph, meaning that every vertex v outside the set is adjacent to some vertex in the set. The two problems can be shown to be solution-equivalent: given a spanning tree, we can construct a connected dominating set by dropping the leaves (vertices with exactly one incident edge), and given a connected dominating set, we can extract a spanning tree of the induced subgraph and attach the other vertices as leaves.
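
The two directions of this equivalence can be sketched in a few lines of Python. This is only an illustration: the function names and the adjacency-dict representation (`adj` maps each vertex to a set of neighbours) are assumptions made for the example, and the graph is assumed to have at least three vertices so that a spanning tree has at least one non-leaf.

```python
from collections import deque

def cds_from_tree(tree_edges):
    """Drop the leaves of a spanning tree; the rest is a connected dominating set."""
    degree = {}
    for u, v in tree_edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return {v for v, d in degree.items() if d > 1}

def tree_from_cds(adj, cds):
    """Build a spanning tree whose non-leaves all lie in the set `cds`.

    adj: dict mapping each vertex to the set of its neighbours.
    cds: a connected dominating set of that graph.
    """
    root = next(iter(cds))
    seen = {root}
    edges = []
    queue = deque([root])
    while queue:  # BFS restricted to the subgraph induced by cds
        u = queue.popleft()
        for w in adj[u]:
            if w in cds and w not in seen:
                seen.add(w)
                edges.append((u, w))
                queue.append(w)
    for v in adj:  # attach every remaining vertex as a leaf
        if v not in cds:
            parent = next(w for w in adj[v] if w in cds)
            edges.append((parent, v))
    return edges
```

Note that if the connected dominating set is not minimal, some of its vertices may still end up as leaves of the resulting tree, so the number of non-leaves is at most the size of the set.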

Unfortunately, both problems are NP-hard, and they stay NP-hard under a restriction to planar graphs. I'm not familiar with the literature on connected dominating set in particular, but my guess is that there are three angles.

  1. Provably "fast" exponential-time exact algorithms / approximation algorithms.
  2. Exact algorithms that are not provably fast (e.g., integer programming) but good in practice.
  3. Heuristics.

#1 may look like a strange grouping, but what tends to happen in the planar graph literature is that the exact algorithms get used as a subroutine inside the approximation algorithms, often via a technique due to Brenda Baker known as shifting. One of the properties of planar graphs is that a parameter called treewidth is bounded by O(sqrt(n)) instead of n, and there are dynamic programs whose running time exponent is a function of the much smaller treewidth. (E.g., on grids, you can run a DP row by row. The tree-decomposition machinery generalizes this to arbitrary planar graphs.)

It's hard to advise you on the best course without knowing what the instances look like and maybe even without experimenting with them. I'd probably go with door #2, but I'm not sure what a good formulation would look like. The good news is that most of the algorithmic complexity is abstracted into the solver library that you'll be using. Here's a formulation of unknown quality.

For all vertices v, let x_v be 1 if v is a non-leaf and 0 if v is a leaf. The dominating set part is easy.

minimize sum_v x_v
subject to
for all v, sum_{w such that w = v or w ~ v} x_w >= 1
for all v, x_v in {0, 1}

Here I'm using ~ to mean "is adjacent to". Enforcing the connectivity constraint is trickier. The simplest approach that I can think of is to solve the integer program as is, then look for two vertices s and t that are both chosen but not connected to each other through chosen vertices. If such a pair exists, compute a minimum vertex separator U between s and t, restricted to separators that contain no chosen vertex, add the constraint

(1 - x_s) + (1 - x_t) + sum_{v in U} x_v >= 1

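and then solve again, repeating until the chosen vertices are connected. (The constraint says that if s and t are both chosen, then at least one vertex of U must be chosen as well.)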
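
For very small instances, the two requirements (domination and connectivity) can also be checked by exhaustive search, which may be handy for validating a solver-based implementation. The sketch below is not the integer program described above; it is plain brute force over vertex subsets in order of increasing size, and every name in it (`is_dominating`, `is_connected`, `min_connected_dominating_set`, `grid_graph`) is made up for this example.

```python
from itertools import combinations

def grid_graph(rows, cols):
    """Adjacency dict of a rows x cols grid; vertices are (row, col) pairs."""
    adj = {(r, c): set() for r in range(rows) for c in range(cols)}
    for r, c in adj:
        for dr, dc in ((1, 0), (0, 1)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].add(nb)
                adj[nb].add((r, c))
    return adj

def is_dominating(adj, chosen):
    # Every vertex is chosen or adjacent to a chosen vertex.
    return all(v in chosen or adj[v] & chosen for v in adj)

def is_connected(adj, chosen):
    # The chosen vertices induce a connected subgraph.
    if not chosen:
        return False
    start = next(iter(chosen))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & chosen:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == chosen

def min_connected_dominating_set(adj):
    """Smallest connected dominating set, or None if the graph is disconnected.

    Exponential time: only meant for graphs with a couple of dozen vertices.
    """
    vertices = list(adj)
    for k in range(1, len(vertices) + 1):
        for combo in combinations(vertices, k):
            chosen = set(combo)
            if is_dominating(adj, chosen) and is_connected(adj, chosen):
                return chosen
    return None
```

For example, `min_connected_dominating_set(grid_graph(4, 4))` returns a set of non-leaf vertices for the 4 x 4 grid from the question, and the number of leaves is 16 minus the size of that set.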

I'd be more hopeful about an approach that uses exponentially many variables, but it may be significantly harder to implement (row and column generation). Choose a vertex r that will be forced as a non-leaf (guess or try all possibilities). There is one variable y_P for each simple path P with r as an endpoint.

minimize sum_v x_v
subject to
for all v, for all P having v as an interior vertex,
  x_v >= y_P
for all v, sum_{P having v as an endpoint} y_P >= 1
for all v, x_v in {0, 1}
for all P, y_P in {0, 1}
David Eisenstat

Not that I know of.

You could use a breadth-first search approach, adding all unvisited vertices to a queue and visiting the next vertex in the queue. Meanwhile, you'd add vertices and their edges to a priority queue, ordered by the number of possible edges branching off the connecting vertex. Then go through the priority queue recursively, adding the best edge every time and discarding any edges that lead to vertices already in the tree. Finally, check whether there were any higher-priority edges at the last vertex and, if so, backtrack.

It's an ugly concept and might be worse in implementation though.
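
One simple way to make a greedy idea in this spirit concrete is sketched below. It is a swapped-in variant, not the exact procedure described above (there is no backtracking), and it carries no optimality guarantee: it starts from a highest-degree vertex and repeatedly expands the tree vertex that has the most neighbours still outside the tree, attaching all of them at once. The names `greedy_leafy_tree`, `count_leaves`, and `grid_graph` are hypothetical helpers for this example.

```python
from collections import defaultdict

def grid_graph(rows, cols):
    """Adjacency dict of a rows x cols grid; vertices are (row, col) pairs."""
    adj = {(r, c): set() for r in range(rows) for c in range(cols)}
    for r, c in adj:
        for dr, dc in ((1, 0), (0, 1)):
            nb = (r + dr, c + dc)
            if nb in adj:
                adj[(r, c)].add(nb)
                adj[nb].add((r, c))
    return adj

def greedy_leafy_tree(adj):
    """Greedy spanning tree that tries to keep the number of non-leaves small."""
    root = max(adj, key=lambda v: len(adj[v]))
    in_tree = {root}
    edges = []
    while len(in_tree) < len(adj):
        # Expand the tree vertex with the most neighbours outside the tree.
        best = max(in_tree, key=lambda v: len(adj[v] - in_tree))
        new = adj[best] - in_tree
        if not new:
            raise ValueError("graph is not connected")
        for w in new:
            edges.append((best, w))
        in_tree |= new
    return edges

def count_leaves(edges):
    """Number of vertices that appear in exactly one edge."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(1 for d in degree.values() if d == 1)
```

Calling `count_leaves(greedy_leafy_tree(grid_graph(4, 4)))` gives the number of leaves the heuristic achieves on the 4 x 4 grid, which can then be compared against an exact method on small instances.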

nickRise

For a 4x4 I would only need 7 vertices connected to more than 1 edge, which would give me 9 leaf nodes.

x-o-x x
  |   |
x-o-x o
  |   |
o-o-o-o
| | | |
x x x x

As the dimensions get larger you have to expand on the above pattern.

For a 10x10 you'd have 59 leaf nodes

x-o-x x-o-x x-o-x x
  |     |     |   |
x-o-x x-o-x x-o-x o
  |     |     |   |
x-o-x x-o-x x-o-x o
  |     |     |   |
x-o-x x-o-x x-o-x o
  |     |     |   |
x-o-x x-o-x x-o-x o
  |     |     |   |
x-o-x x-o-x x-o-x o
  |     |     |   |
x-o-x x-o-x x-o-x o
  |     |     |   |
x-o-x x-o-x x-o-x o
  |     |     |   |
o-o-o-o-o-o-o-o-o-o
| | | | | | | | | |
x x x x x x x x x x

For grids where the number of rows differs from the number of columns, you'd have to try the pattern in both orientations to see which yields more leaf nodes.
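
As a rough check of the counts above, the pattern can be generated programmatically. The sketch below is one reading of the diagrams, not a definitive generalization: a horizontal backbone in the second-to-last row, the last row hanging off it as leaves, and the columns above grouped in threes around a vertical spine (a leftover group of one or two columns gets its own spine). The function names are invented for the example, and at least two rows are assumed.

```python
from collections import defaultdict

def comb_tree(rows, cols):
    """Edges of the comb-shaped spanning tree on a rows x cols grid (rows >= 2)."""
    if rows < 2:
        raise ValueError("this sketch assumes at least 2 rows")
    base = rows - 2  # row index of the horizontal backbone
    edges = []
    for c in range(cols):
        edges.append(((base, c), (base + 1, c)))      # bottom-row leaf
        if c + 1 < cols:
            edges.append(((base, c), (base, c + 1)))  # backbone edge
    for start in range(0, cols, 3):
        width = min(3, cols - start)
        # Full groups of three use the middle column as the spine.
        spine = start + 1 if width == 3 else start
        for r in range(base):
            edges.append(((r, spine), (r + 1, spine)))  # vertical spine edge
            for side in range(start, start + width):
                if side != spine:
                    edges.append(((r, side), (r, spine)))  # side leaf
    return edges

def count_leaves(edges):
    """Number of vertices that appear in exactly one edge."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sum(1 for d in degree.values() if d == 1)

print(count_leaves(comb_tree(4, 4)))    # 9
print(count_leaves(comb_tree(10, 10)))  # 59
```

For a non-square grid, comparing `count_leaves(comb_tree(r, c))` with `count_leaves(comb_tree(c, r))` covers both orientations.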

Louis Ricci