6

Please forgive me if I misused the big words in the title; I'm not too knowledgeable about them but hope they describe my problem. I wrote an elaborate scheme to try and encode strings according to these requirements. For strings of length 10^4 and higher, the code I wrote is quite slow, and I'm wondering - since it processes chunks of 200 at a time (albeit moving only one character forward sometimes to take the next chunk), could it be modified to output the result faster or in a more linear fashion (e.g., immediately output the result for each 200 characters processed). Any help with that or other noticeable optimizations would be appreciated.

Per tel's suggestion, I simplified my example:

encode xs = encode' xs [] where
  encode' []     result = result
  encode' (z:zs) result
    | null test = encode' zs (result ++ [z])
    | otherwise = encode' (drop numZsProcessed zs) (result ++ processed)
   where test = ..some test
         toProcess = take 200 (z:zs)
         processed = ..do something complicated with toProcess
         numZsProcessed = ..number of z's processed
Community
  • 1
  • 1
גלעד ברקן
  • 21,095
  • 3
  • 19
  • 57
  • 2
    It's very possible to write constant time, constant space online algorithms in Haskell. It'd be much easier to explain those mechanisms with a simpler example, though. Could you create a simple instance of your problem? Either that or move the question to Code Review. – J. Abrahamson Jun 03 '13 at 14:49
  • @tel thank you for your suggestion. I tried to simplify my example. How does it look now? – גלעד ברקן Jun 03 '13 at 16:32
  • 1
    It's noticeable that the above code (a) is tail-recursive, building up an accumulator which is only delivered once all of the input has been consumed, and (b) computes a left-nested tower of ++ applications, which gives a slowdown as ++ takes time linear in the length of its first argument. Try to make your code deliver as large a prefix of the input as possible, as soon as possible, putting recursive calls to the right of ++. – pigworker Jun 03 '13 at 17:03
  • @pigworker Thank you for your suggestion - could you please give an example modification to point me in the right direction? Since the output does get delivered only very slowly, I assume that the code did not consume all of the input before starting output, no? – גלעד ברקן Jun 03 '13 at 17:10
  • Try using "naive" recursion rather than tail recursion. – luqui Jun 03 '13 at 17:16
  • A tail recursive list function must consume all the input before giving any output. The slowness you are experiencing is probably from from the left-nested chain of `(++)`. – luqui Jun 03 '13 at 17:17
  • Looking at your code, you have three cases. In two of these cases, the right-hand side is a recursive call. In the base case, the accumulator is given. That means no output can be delivered until the base case is reached. The accumulator is, by that stage, a large (and in this case, avoidably inefficient) computation which (thanks to laziness) is still to be performed. Better to go directly with `encode [] = []` as the base case, and step cases of form `encode (z:zs) |` *condition* `=` *someOutput* `++ encode` *restOfInput*, which will give output as soon as it is available. – pigworker Jun 03 '13 at 17:22
  • @luqui thanks, it's working as expected now! I got so used to writing the "tail-recursive" form and did not understand it's implications. Thanks! – גלעד ברקן Jun 03 '13 at 17:25
  • @pigworker thanks, it's working as expected now! I got so used to writing the "tail-recursive" form and did not understand it's implications. Thanks! – גלעד ברקן Jun 03 '13 at 17:26

1 Answers1

6

Haskell and tail recursion don't get along as well as other functional languages and tail recursion. Let's do some manual reduction on some very simple code to see what's going on with tail recursion. Here's a tail-recursive implementation of map (1+).

go [] r = r
go (x:xs) r = go xs (r ++ [1+x])

Also we must keep in mind the definition of (++):

[] ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)

Now let's reduce go [1,2,3,4,5] []. Keep in mind that [x,y,z] is notation for x:(y:(z:[])), or for short x:y:z:[].

go [1,2,3,4,5] []
go [2,3,4,5] ([] ++ [2])   -- 2 here is actually the thunk 1+1, but
                           -- for compactness I am reducing earlier
go [3,4,5] (([] ++ [2]) ++ [3])
go [4,5] ((([] ++ [2]) ++ [3]) ++ [4])
go [5] (((([] ++ [2]) ++ [3]) ++ [4]) ++ [5])
go [] ((((([] ++ [2]) ++ [3]) ++ [4]) ++ [5]) ++ [6])
(((([] ++ [2]) ++ [3]) ++ [4]) ++ [5]) ++ [6]
((([2] ++ [3]) ++ [4]) ++ [5]) ++ [6]
(((2:([] ++ [3]) ++ [4]) ++ [5]) ++ [6]
((2:(([] ++ [3]) ++ [4]) ++ [5]) ++ [6]
(2:((([] ++ [3]) ++ [4]) ++ [5]) ++ [6]
2:(((([] ++ [3]) ++ [4]) ++ [5]) ++ [6])    -- first observable output
2:((([3] ++ [4]) ++ [5]) ++ [6])
2:((3:([] ++ [4]) ++ [5]) ++ [6])
2:(3:(([] ++ [4]) ++ [5]) ++ [6])
2:3:((([] ++ [4]) ++ [5]) ++ [6])           -- second observable output
2:3:(([4] ++ [5]) ++ [6])
2:3:(4:([] ++ [5]) ++ [6])
2:3:4:(([] ++ [5]) ++ [6])                  -- third observable output
2:3:4:([5] ++ [6])
2:3:4:5:([] ++ [6])                         -- fourth observable output
2:3:4:5:6:[]                                -- final output

See how each item in the output needs to work its way outward from a deeply nested series of parentheses? This causes it to take quadratic time in the size of the input to get all the output. You'll also see a behavior that the first few items are yielded slowly, and it gets faster and faster as you reach the end of the list. This reduction explains that.

The main performance problem here is appending the new element to the end of the list, which takes time proportional to the size of the list you are appending to. A better way is to cons on the front, which is a constant-time operation. This will cause the output to come out reversed, so you need to reverse the result.

go [] r = reverse r
go (x:xs) r = go xs ((1+x):r)

reverse xs = rev xs []      -- roughly from the report prelude
rev [] r = r
rev (x:xs) r = rev xs (x:r)

And, let's reduce:

go [1,2,3,4,5] []
go [2,3,4,5] [2]
go [3,4,5] [3,2]
go [4,5] [4,3,2]
go [5] [5,4,3,2]
go [] [6,5,4,3,2]
reverse [6,5,4,3,2]
rev [6,5,4,3,2] []
rev [5,4,3,2] [6]
rev [4,3,2] [5,6]
rev [3,2] [4,5,6]
rev [2] [3,4,5,6]
rev [] [2,3,4,5,6]
[2,3,4,5,6]          -- first and all observable output!

So this is clearly less work than the first version. This is the style is used in strict languages like Scheme and ML, because it has good memory performance. However, it has some disadvantages:

  • All the input must be consumed before any output can be produced. Indeed the entire computation is performed before anything results are produced.
  • Because of this, it does not ever yield any output when given an infinite list.
  • It involves reverse, which is takes an extra O(n) time and has nothing to do with what we are doing (what does reversing have to do with adding one to every element and preserving the order?).

In a lazy language like Haskell, we can do better. Strangely, and beautifully, the way we do is by writing it even more naively.

go [] = []
go (x:xs) = (1+x):go xs

and reduce:

go [1,2,3,4,5]
2:(go [2,3,4,5])     -- first observable output
2:3:(go [3,4,5])     -- second observable output
2:3:4:(go [4,5])     -- third observable output
2:3:4:5:(go [6])     -- fourth observable output
2:3:4:5:6:(go [])    -- fifth observable output
2:3:4:5:6:[]         -- final output

It takes even less work, and it starts yielding output before even looking at the remainder of the list, so it has good performance in a stream computation and works on infinite inputs. And the implementation is about as simple and obvious as you could hope for.

I hope this gives you some intuition for how tail recursion works in Haskell. For your example, I suggest removing the tail recursion and rewriting in a naive style similar to our final go, using the intuition I hope I suggested from this post to yield "as large a prefix of the input as possible, as soon as possible" (notice that returning x:xs immediately yields x, even if there is some more work to be done to compute xs -- that's laziness in (non-)action).

luqui
  • 57,324
  • 7
  • 134
  • 191