20

I have a DAG (with costs/weights per edge) and want to find the longest path between two sets of nodes. The two sets of start and target nodes are disjoint and small in size compared to the total number of nodes in the graph.

I know how to do this efficiently between one start node and one target node. With multiple nodes, I can list all paths from every start node to every target node and pick the longest one, but that takes a quadratic number of single-path searches. Is there a better way?

Hossein Narimani Rad
starmole

1 Answer

17

I assume that you want the longest path possible that starts in any of the nodes from the first set and ends in any of the nodes in the second set. Then you can add two virtual nodes:

  • The first node has no predecessors and its successors are the nodes from the first set.

  • The second node has no successors and its predecessors are the nodes from the second set.

All the newly added edges should have zero weight.

The graph would still be a DAG. Now if you use the standard algorithm to find the longest path in the DAG between the two new nodes, you’ll get the longest path that starts in the first set and ends in the second set, except that there will be an extra zero-weighted edge at the beginning and an extra zero-weighted edge at the end.
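As a minimal sketch of the construction above, assuming the graph is given as a list of `(u, v, weight)` tuples: the function name `longest_path_between_sets` and the sentinels `SRC` and `SINK` are hypothetical, and the "standard algorithm" is taken here to be a topological sort (Kahn's algorithm) followed by edge relaxation in that order.

```python
from collections import defaultdict, deque


def longest_path_between_sets(edges, starts, targets):
    """Return (length, path) of the longest start-set -> target-set path, or None."""
    # Hypothetical virtual nodes; unique objects cannot clash with real node names.
    SRC, SINK = object(), object()

    all_edges = list(edges)
    all_edges += [(SRC, s, 0) for s in starts]    # zero-weight edges into the first set
    all_edges += [(t, SINK, 0) for t in targets]  # zero-weight edges out of the second set

    graph = defaultdict(list)
    indegree = defaultdict(int)
    for u, v, w in all_edges:
        graph[u].append((v, w))
        indegree[v] += 1
        indegree.setdefault(u, 0)

    # Kahn's algorithm: topological order in O(nodes + edges).
    order = []
    queue = deque(n for n, d in indegree.items() if d == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in graph[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    # Relax edges in topological order, keeping the maximum distance from SRC.
    best = {SRC: 0}
    parent = {}
    for u in order:
        if u not in best:
            continue  # not reachable from any start node
        for v, w in graph[u]:
            if v not in best or best[u] + w > best[v]:
                best[v] = best[u] + w
                parent[v] = u

    if SINK not in best:
        return None  # no target is reachable from any start

    # Walk back from SINK, dropping both virtual nodes from the reported path.
    path = []
    node = parent[SINK]
    while node is not SRC:
        path.append(node)
        node = parent[node]
    path.reverse()
    return best[SINK], path
```

For example, with edges `a→c (3)`, `b→c (5)`, `c→d (2)`, `c→e (4)`, `a→e (1)`, start set `{a, b}` and target set `{d, e}`, the call returns `(9, ['b', 'c', 'e'])`. The two zero-weight virtual edges contribute nothing to the length and are removed from the path.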

By the way, this solution essentially runs the algorithm from every node in the first set, but in parallel rather than one after another as in the approach your question describes.
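One way to read the "in parallel" remark is that the virtual source is equivalent to giving every start node an initial distance of 0 within a single pass. The sketch below illustrates that reading without adding any nodes; the function name `longest_from_any_start` is hypothetical, and it relies on `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def longest_from_any_start(edges, starts, targets):
    """Length of the longest path from any start node to any target node, or None."""
    starts = set(starts)

    # preds[v] maps each predecessor u of v to the weight of edge u -> v.
    preds = {}
    for u, v, w in edges:
        preds.setdefault(u, {})
        incoming = preds.setdefault(v, {})
        incoming[u] = max(w, incoming.get(u, w))  # keep the heavier of parallel edges
    for n in list(starts) + list(targets):
        preds.setdefault(n, {})

    best = {}
    # static_order() yields every node after all of its predecessors.
    for v in TopologicalSorter(preds).static_order():
        candidates = [best[u] + w for u, w in preds[v].items() if u in best]
        if v in starts:
            candidates.append(0)  # a path may begin here: all starts are seeded at once
        if candidates:
            best[v] = max(candidates)

    reached = [best[t] for t in targets if t in best]
    return max(reached) if reached else None
```

Each node is visited once regardless of how many start nodes can reach it, which is where the saving over one search per start/target pair comes from.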

Palec
  • I hope I did not accept the answer too early, because intuitively this makes a lot of sense and is amazingly easy! On the other hand I still want to test it against the brute force solution. But as far as I can think this will totally work! Thanks! – starmole Apr 17 '15 at 02:02
  • If it’s clear that my solution works, the only thing that could go wrong is performance. Topological sort takes linear time in the size of the graph (nodes + edges), so the two added nodes with a few edges make virtually no difference. If a node lies on a path from a start node to an end node, my algorithm processes it once, while the brute-force algorithm processes it once for each pair of start and end nodes it lies between. Also, my solution needs initialization and postprocessing only once, while the brute-force algorithm needs to extract the actual candidate longest path for each pair to avoid extreme memory consumption. – Palec Apr 17 '15 at 10:43