6

I have a directed graph G given as a list of adjacency lists:

newtype Graph = Graph [(Int, [Int])]

G has n vertices and m edges. I'm trying to implement a BFS algorithm in Haskell that runs in O(m) time (maybe amortized), but the best solution I was able to come up with runs in O(m * log n) and uses a data structure from the Data.Map module.

My idea for a linear solution is as follows: use the structure from Data.Sequence as an efficient FIFO queue and do everything as an imperative BFS would, but I'm stuck at the point where I have to mark nodes as visited.

My question is: is it possible to implement BFS in Haskell (or any other purely functional language) so that it runs in O(m)? And if it's not, what argument can be used to prove that?

KCH
  • Are you asking if such an algorithm is possible, or specifically about implementing it in Haskell? – Scott Hunter Nov 04 '14 at 21:51
  • @ScottHunter: I was thinking about any purely functional language (I edited my question) – KCH Nov 04 '14 at 21:53
  • Is your problem implementing a `O(1)` queue? – alternative Nov 04 '14 at 21:54
  • lazy BFS queue for *structural* trees is [easy](http://stackoverflow.com/a/21240679/849891) (see also [this answer with the cycle detection](http://stackoverflow.com/a/20162027/849891)). – Will Ness Nov 05 '14 at 09:45

1 Answer

6

I'm presuming your problem is that you can't implement a good queue.

Take a look at Data.Sequence - it should do fine for a double-ended queue, because operations at either end of a sequence are very fast: adding or removing an element at either end is O(1).
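As a rough sketch of the queue part only (the helper names `enqueue`, `dequeue`, and `drain` are made up for illustration, not part of the library), a FIFO queue on top of `Data.Sequence` could look like this. As far as I know the containers documentation gives the bounds for `|>` and `viewl` as amortized O(1).

```haskell
import qualified Data.Sequence as S
import Data.Sequence (Seq, ViewL (..), (|>))

-- Hypothetical helper: enqueue at the right end of the sequence.
enqueue :: a -> Seq a -> Seq a
enqueue x q = q |> x

-- Hypothetical helper: dequeue from the left end; Nothing if the queue is empty.
dequeue :: Seq a -> Maybe (a, Seq a)
dequeue q = case S.viewl q of
  EmptyL    -> Nothing
  x :< rest -> Just (x, rest)

-- Drain the whole queue in FIFO order.
drain :: Seq a -> [a]
drain q = case dequeue q of
  Nothing        -> []
  Just (x, rest) -> x : drain rest
```

For example, `drain (enqueue 3 (enqueue 2 (enqueue 1 S.empty)))` gives the elements back in insertion order.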

Once you have the queue, it should perform just as well as a DFS would.

Instead of using a Map Int [Int] you can probably get away with a Vector [Int] (if your vertices are consecutive integers; note that vectors are indexed from 0, so vertices 1 to n need an offset or an unused slot at index 0).
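A minimal sketch of that conversion, assuming the vertices are numbered 0 to n - 1 and the `vector` package is available (`toVector` is a made-up name):

```haskell
import qualified Data.Vector as V

-- Hypothetical helper: build a vector of adjacency lists for vertices 0 .. n-1.
-- Vertices that do not appear in the input keep an empty neighbor list;
-- if a vertex appears more than once, its neighbor lists are concatenated.
toVector :: Int -> [(Int, [Int])] -> V.Vector [Int]
toVector n adj = V.accum (++) (V.replicate n []) adj
```

After this, `graph V.! v` looks up the neighbors of `v` in constant time, which is what replaces the logarithmic `Map` lookup.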

To mark nodes as checked you can use an IntSet.

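This should get you O(V + E), treating IntSet operations as effectively constant time (they are bounded by the number of bits in an Int).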

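import Data.List (foldl')
import qualified Data.IntSet as IS
import qualified Data.Sequence as S
import qualified Data.Vector as V

bfs :: V.Vector [Int] -> Int -> [Int]
bfs graph start = go (IS.singleton start) graph (S.singleton start)

go :: IS.IntSet -> V.Vector [Int] -> S.Seq Int -> [Int]
go seen graph queue =
  case S.viewl queue of
    S.EmptyL -> []
    vertex S.:< rest -> vertex : go seen' graph queue'
      where
        -- Mark each neighbor as seen when it is enqueued, so no vertex is queued twice.
        (seen', queue') = foldl' enqueue (seen, rest) (graph V.! vertex)
        enqueue (s, q) v
          | v `IS.member` s = (s, q)
          | otherwise       = (IS.insert v s, q S.|> v)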

Note that the way we build this list is totally lazy! So if you only needed, for example, the first half of the BFS, it would not calculate the rest.

alternative
  • Implementing an O(1) queue is not a problem, because as you said it's already in the standard library. I have a problem with marking nodes as visited. – KCH Nov 04 '14 at 21:57
  • @KCH Hmm, let me look up the performance of `Set` and `HashSet`. Obviously an mvector of bools would work as well – alternative Nov 04 '14 at 21:58
  • @KCH `IntSet` should work for your needs, it has essentially constant time operations, since they are capped at the number of bits in an `Int` – alternative Nov 04 '14 at 22:01
  • Thank you for mentioning this structure, I didn't know about it. It solves the problem of a practical implementation, but the theoretical complexity is still O(m * log n). – KCH Nov 04 '14 at 22:17
  • @KCH where you are getting your `log` from? None of the data structures I indicated have an operation that takes `log` time. – alternative Nov 04 '14 at 22:39
  • @KCH I added an example implementation but I haven't even tried compiling it. But it should have linear running time. – alternative Nov 04 '14 at 22:50
  • @alternative: In the case of IntSet, the log n is hidden in the expression O(min(n, W = log n)) describing the insert cost. – KCH Nov 04 '14 at 23:04
  • @KCH I suppose you can argue that. However, it's not quite log n: the moment n > 64, it turns into 64, and log n =/= 64. If you really, really care about getting rid of this, you can run the whole thing through the ST monad... Not advised. This might also help: https://hackage.haskell.org/package/bitset-1.0/docs/Data-BitSet.html – alternative Nov 04 '14 at 23:06
  • @KCH Oops, linked to old version of docs. It appears that this is `O(1)`: https://hackage.haskell.org/package/bitset-1.4.8/docs/Data-BitSet-Dynamic.html. Also keep in mind that when you start considering the length of an integer in bits as a parameter, you are going down a rabbit hole. Do you consider addition of two `Int`s to be `O(1)` or `O(log n)`? – alternative Nov 04 '14 at 23:11
  • @alternative why would you advise against using ST? Sure, in this case there wouldn't be a benefit – Cubic Nov 07 '14 at 01:59
  • @Cubic makes the code ugly & less readable. Also makes it less lazy than we want. – alternative Nov 07 '14 at 02:03
  • @alternative "Also keep in mind that when you start considering the length of an integer in bits as a parameter, you are going down a rabbit hole" actually, you're just using a better model. Sure, adding two integers might be O(1) in the [RAM model](http://en.wikipedia.org/wiki/Random-access_machine) (which is close to how real CPUs behave), but even in that model a data structure like `IntSet` would have O(word size), not O(1) operations – Niklas B. Feb 16 '15 at 20:52
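
For reference, the ST-monad route mentioned in the comments above could be sketched roughly as follows. This is only an illustration under some assumptions: the graph is stored as an `Array Int [Int]` (from the `array` package) rather than a `Vector`, `start` is a valid index, and the names `bfsST` and `exampleGraph` are made up. The visited marks live in a mutable unboxed array and the queue is a second mutable array, since every vertex is enqueued at most once. Each step is then a constant-time array read or write in the RAM model, giving O(n + m) overall, at the cost of the laziness of the list-based version.

```haskell
import Control.Monad (foldM)
import Control.Monad.ST (ST, runST)
import Data.Array (Array, bounds, listArray, (!))
import Data.Array.ST (STUArray, newArray, readArray, writeArray)

-- Hypothetical example graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
exampleGraph :: Array Int [Int]
exampleGraph = listArray (0, 3) [[1, 2], [3], [3], []]

-- BFS order from `start`, using mutable arrays inside ST.
bfsST :: Array Int [Int] -> Int -> [Int]
bfsST graph start = runST $ do
  let (lo, hi) = bounds graph
      n = hi - lo + 1
  visited <- newArray (lo, hi) False :: ST s (STUArray s Int Bool)
  -- Array-backed queue: slots [front, back) hold the pending vertices.
  queue <- newArray (0, max 0 (n - 1)) 0 :: ST s (STUArray s Int Int)
  writeArray visited start True
  writeArray queue 0 start
  let -- Enqueue w unless it was already marked; returns the new back index.
      visit back w = do
        seen <- readArray visited w
        if seen
          then return back
          else do
            writeArray visited w True
            writeArray queue back w
            return (back + 1)
      -- Process vertices until the queue is empty; returns how many were reached.
      loop front back
        | front == back = return back
        | otherwise = do
            v <- readArray queue front
            back' <- foldM visit back (graph ! v)
            loop (front + 1) back'
  total <- loop 0 1
  -- The queue array now holds the vertices in BFS order.
  mapM (readArray queue) [0 .. total - 1]
```

With the example graph, `bfsST exampleGraph 0` evaluates to `[0,1,2,3]`.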