Accumulators, conj and recursion

Question

I've solved 45 problems from 4clojure.com and I noticed a recurring problem in the way I try to solve some problems using recursion and accumulators.

I'll try to explain the best I can what I'm doing to end up with fugly solutions hoping that some Clojurers would "get" what I'm not getting.

For example, problem 34 asks to write a function (without using range) taking two integers as arguments and creates a range (without using range). Simply put you do (... 1 7) and you get (1 2 3 4 5 6).

Now this question is not about solving this particular problem.

What if I want to solve this using recursion and an accumulator?

My thought process goes like this:

I need to write a function taking two arguments, I start with (fn [x y] )
I'll need to recurse and I'll need to keep track of a list, I'll use an accumulator, so I write a 2nd function inside the first one taking an additional argument:

(fn [x y]
((fn g [x y acc] ...) x y '())

(apparently I can't properly format that Clojure code on SO!?)

Here I'm already not sure I'm doing it correctly: the first function must take exactly two integer arguments (not my call) and I'm not sure: if I want to use an accumulator, can I use an accumulator without creating a nested function?

Then I want to conj, but I cannot do:

(conj 0 1)

so I do weird things to make sure I've got a sequence first and I end up with this:

(fn
   [x y]
   ((fn g [x y acc] (if (= x y) y (conj (conj acc (g (inc x) y acc)) x)))
    x
    y
    '()))

But then this produce this:

(1 (2 (3 4)))

Instead of this:

(1 2 3 4)

So I end up doing an additional flatten and it works but it is totally ugly.

I'm beginning to understand a few things and I'm even starting, in some cases, to "think" in a more clojuresque way but I've got a problem writing the solution.

For example here I decided:

to use an accumulator
to recurse by incrementing x until it reaches y

But I end up with the monstrosity above.

There are a lot of way to solve this problem and, once again, it's not what I'm after.

What I'm after is how, after I decided to cons/conj, use an accumulator, and recurse, I can end up with this (not written by me):

#(loop [i %1
        acc nil]
  (if (<= %2 i)
    (reverse acc)
    (recur (inc i) (cons i acc))))

Instead of this:

((fn
  f
  [x y]
  (flatten
   ((fn
     g
     [x y acc]
     (if (= x y) acc (conj (conj acc (g (inc x) y acc)) x)))
    x
    y
    '())))
 1
 4)

I take it's a start to be able to solve a few problems but I'm a bit disappointed by the ugly solutions I tend to produce...

Don't be a afraid to throw away bad solutions. If you start to see your code become unwieldy, take a step back and think it over. When it doesn't *feel* right, it probably isn't. — Jeremy, May 19 '12 at 15:05
@JeremyHeiler: ok but the "idea" isn't that bad, it's the "realization"/actual code that is bad. For example the short solution using and accumulator+recursivity was writing by someone who solved the 150 4clojure problems (and some of them are really not trivial). So my idea doesn't seem to bad: but I cannot (yet) cleanly implement my ideas. I take it it takes some time for the puzzle pieces to fall into place :-/ — Cedric Martin, May 19 '12 at 15:13
It certainly does. Just keep practicing and playing with code! — Jeremy, May 19 '12 at 15:13
I'm a bit surprised by the upvotes and favorites: I may not be the only one facing the same kind of issue when trying to learn Clojure (or any Lisp dialect for that matter). — Cedric Martin, May 20 '12 at 10:38

andrew cooke · Accepted Answer · 2012-05-21T13:41:52.527

i think there are a couple of things to learn here.

first, a kind of general rule - recursive functions typically have a natural order, and adding an accumulator reverses that. you can see that because when a "normal" (without accumulator) recursive function runs, it does some work to calculate a value, then recurses to generate the tail of the list, finally ending with an empty list. in contrast, with an accumulator, you start with the empty list and add things to the front - it's growing in the other direction.

so typically, when you add an accumulator, you get a reversed order.

now often this doesn't matter. for example, if you're generating not a sequence but a value that is the repeated application of a commutative operator (like addition or multiplication). then you get the same answer either way.

but in your case, it is going to matter. you're going to get the list backwards:

(defn my-range-0 [lo hi] ; normal recursive solution
  (if (= lo hi)
    nil
    (cons lo (my-range-0 (inc lo) hi))))

(deftest test-my-range-1
  (is (= '(0 1 2) (my-range-0 0 3))))

(defn my-range-1 ; with an accumulator
  ([lo hi] (my-range-1 lo hi nil))
  ([lo hi acc]
    (if (= lo hi)
      acc
      (recur (inc lo) hi (cons lo acc)))))

(deftest test-my-range-1
  (is (= '(2 1 0) (my-range-1 0 3)))) ; oops!  backwards!

and often the best you can do to fix this is just reverse that list at the end.

but here there's an alternative - we can actually do the work backwards. instead of incrementing the low limit you can decrement the high limit:

(defn my-range-2
  ([lo hi] (my-range-2 lo hi nil))
  ([lo hi acc]
    (if (= lo hi)
      acc
      (let [hi (dec hi)]
        (recur lo hi (cons hi acc))))))

(deftest test-my-range-2
  (is (= '(0 1 2) (my-range-2 0 3)))) ; back to the original order

[note - there's another way of reversing things below; i didn't structure my argument very well]

second, as you can see in my-range-1 and my-range-2, a nice way of writing a function with an accumulator is as a function with two different sets of arguments. that gives you a very clean (imho) implementation without the need for nested functions.

also you have some more general questions about sequences, conj and the like. here clojure is kind-of messy, but also useful. above i've been giving a very traditional view with cons based lists. but clojure encourages you to use other sequences. and unlike cons lists, vectors grow to the right, not the left. so another way to reverse that result is to use a vector:

(defn my-range-3 ; this looks like my-range-1
  ([lo hi] (my-range-3 lo hi []))
  ([lo hi acc]
    (if (= lo hi)
      acc
      (recur (inc lo) hi (conj acc lo)))))

(deftest test-my-range-3 ; except that it works right!
  (is (= [0 1 2] (my-range-3 0 3))))

here conj is adding to the right. i didn't use conj in my-range-1, so here it is re-written to be clearer:

(defn my-range-4 ; my-range-1 written using conj instead of cons
  ([lo hi] (my-range-4 lo hi nil))
  ([lo hi acc]
    (if (= lo hi)
      acc
      (recur (inc lo) hi (conj acc lo)))))

(deftest test-my-range-4
  (is (= '(2 1 0) (my-range-4 0 3))))

note that this code looks very similar to my-range-3 but the result is backwards because we're starting with an empty list, not an empty vector. in both cases, conj adds the new element in the "natural" position. for a vector that's to the right, but for a list it's to the left.

and it just occurred to me that you may not really understand what a list is. basically a cons creates a box containing two things (its arguments). the first is the contents and the second is the rest of the list. so the list (1 2 3) is basically (cons 1 (cons 2 (cons 3 nil))). in contrast, the vector [1 2 3] works more like an array (although i think it's implemented using a tree).

so conj is a bit confusing because the way it works depends on the first argument. for a list, it calls cons and so adds things to the left. but for a vector it extends the array(-like thing) to the right. also, note that conj takes an existing sequence as first arg, and thing to add as second, while cons is the reverse (thing to add comes first).

all the above code available at https://github.com/andrewcooke/clojure-lab

update: i rewrote the tests so that the expected result is a quoted list in the cases where the code generates a list. = will compare lists and vectors and return true if the content is the same, but making it explicit shows more clearly what you're actually getting in each case. note that '(0 1 2) with a ' in front is just like (list 0 1 2) - the ' stops the list from being evaluated (without it, 0 would be treated as a command).

+1, that is a great answer (other answers are great too). I'll need some time to digest it all but some things already "clicked" in for me : ) — Cedric Martin, May 20 '12 at 10:37

Joost Diepenmaat · Answer 2 · 2012-05-19T15:38:50.493

After reading all that, I'm still not sure why you'd need an accumulator.

((fn r [a b]
    (if (<= a b) 
       (cons a (r (inc a) b)))) 
  2 4)
=> (2 3 4)

seems like a pretty intuitive recursive solution. the only thing I'd change in "real" code is to use lazy-seq so that you won't run out of stack for large ranges.

how I got to that solution:

When you're thinking of using recursion, I find it helps to try and state the problem with the fewest possible terms you can think up, and try to hand off as much "work" to the recursion itself.

In particular, if you suspect you can drop one or more arguments/variables, that is usually the way to go - at least if you want the code to be easy to understand and debug; sometimes you end up compromising simplicity in favor of execution speed or reducing memory usage.

In this case, what I thought when I started writing was: "the first argument to the function is also the start element of the range, and the last argument is the last element". Recursive thinking is something you kind of have to train yourself to do, but a fairly obvious solution then is to say: a range [a, b] is a sequence starting with element a followed by a range of [a + 1, b]. So ranges can indeed be described recursively. The code I wrote is pretty much a direct implementation of that idea.

addendum:

I've found that when writing functional code, accumulators (and indexes) are best avoided. Some problems require them, but if you can find a way to get rid of them, you're usually better off if you do.

addendum 2:

Regarding recursive functions and lists/sequences, the most useful way to think when writing that kind of code is to state your problem in terms of "the first item (head) of a list" and "the rest of the list (tail)".

OK that is great but could you explain *how* you end up writing that once you decide you'll use recursivity? Note that the other solution (using an accumulator) was written by someone who solved the 150 4clojure problems (and some of the harder ones are quite hard), so it's not that far-fetched to use an accumulator : ) — Cedric Martin, May 19 '12 at 15:10

Nathan Hughes · Answer 3 · 2012-05-19T20:29:47.733

If I solved this using an accumulator I would do something like:

user=> (defn my-range [lb up c]
         (if (= lb up)
           c
           (recur (inc lb) up (conj c lb))))
#'user/my-range

then call it with

#(my-range % %2 [])

Of course, I'd use letfn or something to get around not having defn available.

So yes, you do need an inner function to use the accumulator approach.

My thought process is that once I'm done the answer I want to return will be in the accumulator. (That contrasts with your solution, where you do a lot of work on finding the ending-condition.) So I look for my ending-condition and if I've reached it, I return the accumulator. Otherwise I tack on the next item to the accumulator and recur for a smaller case. So there are only 2 things to figure out, what the end-condition is, and what I want to put in the accumulator.

Using a vector helps a lot because conj will append to it and there's no need to use reverse.

I'm on 4clojure too, btw. I've been busy so I've fallen behind lately.

+1... And nice "score" on 4clojure : ) I'll post a new question regarding the "inner function" vs "different set of arguments". — Cedric Martin, May 20 '12 at 10:40

score 3 · Answer 4 · answered May 19 '12 at 17:11

I cannot add to the already good answers you have received, but I will answer in general. As you go through the Clojure learning process, you may find that many but not all solutions can be solved using Clojure built-ins, like map and also thinking of problems in terms of sequences. This doesn't mean you should not solve things recursively, but you will hear -- and I believe it to be wise advice -- that Clojure recursion is for solving very low level problems you cannot solve another way.

I happen to do a lot of .csv file processing, and recently received a comment that nth creates dependencies. It does, and use of maps can allow me to get at elements for comparison by name and not position.

I'm not going to throw out the code that uses nth with clojure-csv parsed data in two small applications already in production. But I'm going to think about things in a more sequency way the next time.

It is difficult to learn from books that talk about vectors and nth, loop .. recur and so on, and then realize learning Clojure grows you forward from there.

One of the things I have found that is good about learning Clojure, is the community is respectful and helpful. After all, they're helping someone whose first learning language was Fortran IV on a CDC Cyber with punch cards, and whose first commercial programming language was PL/I.

score 1 · Answer 5 · answered May 19 '12 at 16:48

1

It looks like your question is more about "how to learn" then a technical/code problem. You end up writing that kind of code because from whatever way or source you learned programming in general or Clojure in specific has created a "neural highway" in your brain that makes you thinking about the solutions in this particular way and you end up writing code like this. Basically whenever you face any problem (in this particular case recursion and/or accumulation) you end up using that "neural highway" and always come up with that kind of code .

The solution for getting rid of this "neural highway" is to stop writing code for the moment, keep that keyboard away and start reading a lot of existing clojure code (from existing solutions of 4clojure problem to open source projects on github) and think about it deeply (even read a function 2-3 times to really let it settle down in your brain). This way you would end up destroying your existing "neural highway" (which produce the code that you write now) and will create a new "neural highway" that would produce the beautiful and idiomatic Clojure code. Also, try not to jump to typing code as soon as you saw a problem, rather give yourself some time to think clearly and deeply about the problem and solutions.

answered May 19 '12 at 16:48

Ankur

32,475
2
42
69

it's interesting but actually I think that it's a lack of any "Lisp neural highway" that makes me write code like this: I kinda "see" what I want the solution to look like (my idea is basically the same as the one from the user who solved all 150 problems) but then... I start playing at the REPL and end up with the wrong solution but then realize I "know" a (super crappy) to solve my issue. For example I end up with (1 (2 (3 4))) and think: *"ah, I can just flatten that"*. Of course it's rubbish : ( – Cedric Martin May 20 '12 at 10:32
Hmm.. your solution end up creating another problem (the nested list) and then you solve this new problem and hopefully end up with required solution and if not then you solve the new resulting problem :) and in the end you somehow manage to solve the original problem but the intermediate problems makes your code like that. Looks like you are missing to "view the whole problem" back again and again while solving the problem and rather continue to sort of apply brute force – Ankur May 20 '12 at 15:36
-1... That was precisely my question. I specifically stated that I understand the problem and that I understand what is required to solve it but that it's the implementation that causes an issue. You keep rewriting exactly what I wrote only to emphasis that I *"don't get it"*. Sorry but your answer is not constructive. You're apparently not willing to help here: the only thing you're willing to do is re-state that: "I don't get it". And you're doing it again in the comment: -1. – Cedric Martin May 21 '12 at 13:18

Accumulators, conj and recursion

5 Answers5