
I am trying to understand why Prim's and Kruskal's algorithms have different time complexities on sparse versus dense graphs. After using a couple of applets that demonstrate how each works, I am still left a little confused about how the density of the graph affects the algorithms. I hope someone can give me a nudge in the right direction.

tommy
  • Do you have any examples of different time complexities? All I can find are upper limits that work in both cases... – wds Jan 06 '10 at 09:30
  • dense: Prim = O(N²), Kruskal = O(N² log N); sparse: Prim = O(N²), Kruskal = O(N log N) – tommy Jan 06 '10 at 09:35
  • Unless I am remembering this wrong, with the right data structures you shouldn't have any N^2 terms in there. Regardless, the answer to the OP is to go find a copy of Cormen, which has a runtime analysis and comparison of the two. – abeyer Jan 06 '10 at 09:58
  • that depends on what N is, it looks likely to be ~= the number of vertices in this case – jk. Jan 06 '10 at 10:09

3 Answers


Wikipedia gives the complexity of these algorithms in terms of E, the number of edges, and V, the number of vertices, which is a good practice because it lets you do exactly this sort of analysis.

Kruskal's algorithm is O(E log V). Prim's complexity depends on which data structure you use for it. Using an adjacency matrix, it's O(V²).

Now if you plug in V² for E, behold you get the complexities that you cited in your comment for dense graphs, and if you plug in V for E, lo, you get the sparse ones.
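To spell the substitutions out for Kruskal (note that E ≤ V² gives log E ≤ 2 log V, so O(E log E) and O(E log V) are interchangeable):

```latex
\begin{aligned}
\text{dense: } E = \Theta(V^2) &\implies O(E \log V) = O(V^2 \log V)\\
\text{sparse: } E = O(V) &\implies O(E \log V) = O(V \log V)
\end{aligned}
```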

Why do we plug in V² for a dense graph? Well, even in the densest possible graph you can't have as many as V² edges, so clearly E = O(V²).

Why do we plug in V for a sparse graph? Well, you have to define what you mean by sparse, but suppose we call a graph sparse if each vertex has no more than five edges. I would say such graphs are pretty sparse: once you get up into the thousands of vertices, the adjacency matrix would be mostly empty space. That would mean that for sparse graphs, E ≤ 5 V, so E = O(V).
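And to see where Prim's O(V²) comes from, here is a minimal sketch of the adjacency-matrix version (a Python illustration of my own, assuming a connected graph; none of the names come from the answer). The outer loop runs V times and each iteration does two O(V) scans, with no priority queue involved, so the bound is O(V²) no matter how many edges there are:

```python
import math

def prim_mst_weight(adj):
    """Prim's algorithm on an adjacency matrix.
    adj[u][v] is the edge weight, or math.inf if there is no edge.
    Assumes the graph is connected. Runs in O(V^2) regardless of E."""
    V = len(adj)
    in_tree = [False] * V
    dist = [math.inf] * V  # cheapest known edge from each vertex to the tree
    dist[0] = 0            # start growing the tree from vertex 0
    total = 0
    for _ in range(V):
        # O(V) scan: pick the closest vertex not yet in the tree
        u = min((v for v in range(V) if not in_tree[v]), key=lambda v: dist[v])
        in_tree[u] = True
        total += dist[u]
        # O(V) scan: relax distances through the newly added vertex
        for v in range(V):
            if not in_tree[v] and adj[u][v] < dist[v]:
                dist[v] = adj[u][v]
    return total

# Triangle with edge weights 1, 2, 3: the MST keeps weights 1 and 2.
INF = math.inf
print(prim_mst_weight([[INF, 2, 3],
                       [2, INF, 1],
                       [3, 1, INF]]))  # 3
```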

Jason Orendorff

Are these different complexities with respect to the number of vertices by any chance?

There is often a slightly handwavy argument that says that for a sparse graph, the number of edges is E = O(V), where V is the number of vertices, while for a dense graph E = O(V^2). As both algorithms potentially have a complexity that depends on E, converting this to a complexity that depends on V gives different results for dense and sparse graphs.

edit:

Different data structures will also affect the complexity, of course; Wikipedia has a breakdown of this.

jk.
  • I don't think that argument is handwavy. It seems straightforward to formalize it. Pick any *k* and call graphs where E ≤ *k* V "sparse". Now you've greatly limited the problem domain, asymptotically anyway: no matter what *k* you pick, as V gets large, a tiny fraction of graphs will be sparse. An algorithm that's O(E log V) will be O(V log V) on sparse graphs. – Jason Orendorff Jan 06 '10 at 15:39
  • It's handwavy as I don't believe there is a strict definition of what sparse is, and I've seen it described as E = V^k for k somewhere between 1 and 2. – jk. Jan 06 '10 at 15:48
  • But yes, I think it's generally accepted that with E = V^k, sparse is going to mean k ~= 1 and dense is going to mean k ~= 2. – jk. Jan 06 '10 at 15:51
  • Oh, I see. But that's just a terminology thing, namely, the word "sparse" doesn't have any one, precise standard meaning. The argument is blameless. – Jason Orendorff Jan 06 '10 at 16:04

Introduction to Algorithms by Cormen et al. does indeed give an analysis, in both cases using a sparse (adjacency-list) representation of the graph. With Kruskal's algorithm (link vertices in disjoint components until everything joins up), the first step is to sort the edges of the graph, which takes time O(E lg E), and they simply establish that nothing else takes longer than this. With Prim's algorithm (extend the current tree by adding the closest vertex not already on it), they use a Fibonacci heap to store the queue of pending vertices and get O(E + V lg V), because with a Fibonacci heap decreasing the distance to a vertex in the queue is only O(1) amortized, and you do this at most once per edge.
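For concreteness, here is a minimal Kruskal sketch (my own Python illustration with a plain path-compressed union-find, not code from the book). The sort is the O(E lg E) step; everything after it costs far less:

```python
def kruskal_mst_weight(num_vertices, edges):
    """Kruskal's algorithm. `edges` is a list of (weight, u, v) tuples.
    The sort is the O(E log E) step; the union-find operations that
    follow cost far less than the sort."""
    parent = list(range(num_vertices))

    def find(x):
        # Follow parent links to the component root, compressing as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    for weight, u, v in sorted(edges):  # O(E log E)
        root_u, root_v = find(u), find(v)
        if root_u != root_v:       # different components: keep the edge
            parent[root_u] = root_v
            total += weight
    return total

# Triangle with edge weights 1, 2, 3: the MST keeps weights 1 and 2.
print(kruskal_mst_weight(3, [(2, 0, 1), (3, 0, 2), (1, 1, 2)]))  # 3
```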

mcdowella