How to duplicate the behavior of predefined length/2 in SWI-Prolog?

Question

I'm trying to duplicate the behavior of the standard length/2 predicate. In particular, I want my predicate to work for bounded and unbounded arguments, like in the example below:

% Case 1
?- length(X, Y).
X = [],
Y = 0 ;
X = [_G4326],
Y = 1 ;
X = [_G4326, _G4329],
Y = 2 ;
X = [_G4326, _G4329, _G4332],
Y = 3 .

% Case 2    
?- length([a,b,c], X).
X = 3.

% Case 3
?- length(X, 4).
X = [_G4314, _G4317, _G4320, _G4323].

% Case 4
?- length([a,b,c,d,e], 5).
true.

The plain&simple implementation:

my_length([], 0).
my_length([_|T], N) :- my_length(T, X), N is 1+X.

has some problems. In Case 3, after producing the correct answer, it goes into an infinite loop. Could this predicate be transformed into a deterministic one? Or non-deterministic that halts with false?

YES! But using red cut. See: https://stackoverflow.com/a/15123016/1545971

After some time, I've managed to code a set of predicates, that mimic the behavior of the build-in length/2. my_len_tail is deterministic and works correct in all Cases 1-4. Could it be done simpler?

my_len_tail(List, Len) :- var(Len)->my_len_tailv(List, 0, Len);
                          my_len_tailnv(List, 0, Len).

my_len_tailv([], Acc, Acc).
my_len_tailv([_|T], Acc, Len) :-
    M is Acc+1,
    my_len_tailv(T, M, Len).

my_len_tailnv([], Acc, Acc) :- !. % green!
my_len_tailnv([_|T], Acc, Len) :-
    Acc<Len,
    M is Acc+1,
    my_len_tailnv(T, M, Len).

As @DanielLyons suggested in the comments, one can use clpfd to defer less than check. But it still leaves one problem: in Case 3 (my_len_clp(X, 3)) the predicate is nondeterministic. How it could be fixed?

:-use_module(library(clpfd)).
my_len_clp(List, Len) :- my_len_clp(List, 0, Len).

my_len_clp([], Acc, Acc).
my_len_clp([_|T], Acc, Len) :-
    Acc#<Len,
    M is Acc+1,
    my_len_clp(T, M, Len).

It can be fixed using zcompare/3 from the CLP(FD) library. See: https://stackoverflow.com/a/15123146/1545971

It will be especially interesting if someone can provide a solution without resorting to vendor extensions or clp(fd). — Daniel Lyons, Feb 27 '13 at 20:19
Thanks for the suggestion @DanielLyons . I've updated my post. But the problem remains open. — damisan, Feb 27 '13 at 20:45
+1 for `my_len_clp/3` which happens to be remarkably efficient! — false, Feb 28 '13 at 13:41

score 5 · Accepted Answer · answered Feb 27 '13 at 21:50

5

In SWI-Prolog, the nondeterminism issue can be solved with CLP(FD)'s zcompare/3, which reifies the inequality to a term that can be used for indexing:

:- use_module(library(clpfd)).

my_length(Ls, L) :-
        zcompare(C, 0, L),
        my_length(Ls, C, 0, L).

my_length([], =, L, L).
my_length([_|Ls], <, L0, L) :-
        L1 #= L0 + 1,
        zcompare(C, L1, L),
        my_length(Ls, C, L1, L).

Your example is now deterministic (since recent versions of SWI-Prolog perform just-in-time indexing):

?- my_length(Ls, 3).
Ls = [_G356, _G420, _G484].

All serious Prolog implementations ship with CLP(FD), and it makes perfect sense to use it here. Ask your vendor to also implement zcompare/3 or a better alternative if it is not already available.

answered Feb 27 '13 at 21:50

mat

39,707
3
42
68

It does make sense to use it here, but it is didactically interesting to know how one solves this sort of problem without it. – Daniel Lyons Feb 27 '13 at 22:07
I find this application more explicative than the documentation entry. – CapelliC Feb 27 '13 at 22:10
CLP(FD) is definitely useful! I've done a `trace/1` of a query `my_length(X,1)`. The call tree was ~30 levels deep and there was _a lot of_ unfication steps. As usual, we have speed/power tradeoff. Is `L1 #= L0 + 1` somehow better than `L1 is L0 + 1` in this case? – damisan Feb 27 '13 at 23:12
3

A general note: When teaching about predicates that involve integers, it is in my opinion didactically best to use finite domain constraints since they are true relations that can be used in all directions. Without constraints, you would have to introduce non-monotonic predicates for handling integers. `L1 #= L0 + 1` gives you a more general predicate than using `is/2`, try for example the most general query `my_length(Ls, C, L0, L)` with both versions. The speed difference is easy to measure (it is typically negligible in serious applications) and you gain a lot of generality. – mat Feb 27 '13 at 23:17
1

One remark about `zcompare(C, 0, L)` above: this involves the creation of a frozen goal that immediately thereafter will be executed. It is such example that make me think to prefer a nondeterminate solution. – false Feb 28 '13 at 13:40

score 3 · Answer 2 · edited May 23 '17 at 12:25

For a set of test cases, please refer to this table and to the current definition in the prologue. There are many more odd cases to consider.

Defining length/2 with var/nonvar, is/2 and the like is not entirely trivial, because (is)/2 and arithmetic comparison is so limited. That is, they produce very frequently instantiation_errors instead of succeeding accordingly. Just to illustrate that point: It is trivial to define length_sx/2 using successor-arithmetics.

length_sx([], 0).
length_sx([_E|Es], s(X)) :-
   length_sx(Es, X).

This definition is pretty perfect. It even fails for length_sx(L, L). Alas, successor arithmetics is not supported efficiently. That is, an integer i requires O(i) space and not O(log i) as one would expect.

The definition I would have preferred is:

length_fd([],0).
length_fd([_E|Es], L0) :-
   L0 #> 0,
   L1 #= L0-1,
   length_fd(Es, L1).

Which is the most direct translation. It is quite efficient with a known length, but otherwise the overhead of constraints behind shows. Also, there is this asymmetry:

?- length_fd(L,0+0).
false.

?- length_fd(L,0+1).
L = [_G919] ;
false.

However, your definition using library(clpfd) is particularly elegant and efficient even for more elaborate cases.. It isn't as fast as the built-in length...

?- time(( length_fd(L,N),N=1000 )).
% 29,171,112 inferences, 4.110 CPU in 4.118 seconds (100% CPU, 7097691 Lips)
L = [_G67, _G98, _G123, _G159, _G195, _G231, _G267, _G303, _G339|...],
N = 1000 .

?- time(( my_len_clp(L,N),N=10000 )).
% 1,289,977 inferences, 0.288 CPU in 0.288 seconds (100% CPU, 4484310 Lips)
L = [_G67, _G79, _G82, _G85, _G88, _G91, _G94, _G97, _G100|...],
N = 10000 .

?- time(( length(L,N),N=10000 )).
% 30,003 inferences, 0.006 CPU in 0.006 seconds (100% CPU, 4685643 Lips)
L = [_G67, _G70, _G73, _G76, _G79, _G82, _G85, _G88, _G91|...],
N = 10000 .

... but then it is able to handle constraints correctly:

?- N in 1..2, my_len_clp(L,N).
N = 1,
L = [_G1439] ;
N = 2,
L = [_G1439, _G1494] ;
false.

?- N in 1..2, length(L,N).
N = 1,
L = [_G1445] ;
N = 2,
L = [_G1445, _G1448] ;
*LOOPS*

score 1 · Answer 3 · edited Jul 03 '18 at 18:22

1

I am not especially confident in this answer but my thinking is no, you have to do some extra work to make Prolog do the right thing for length/2, which is a real shame because it's such a great "tutorial" predicate in the simplest presentation.

I submit as proof, the source code to this function in SWI-Prolog and the source in GNU Prolog. Neither of these is a terse, cute trick, and it looks to me like they both work by testing the arguments and then deferring processing to different internal functions depending on which argument is instantiated.

I would love to be wrong about this though. I have often wondered why it is, for instance, so easy to write member/2 which does the right thing but so hard to write length/2 which does. Prolog isn't great at arithmetic, but is it really that bad? Here's hoping someone else comes along with a better answer.

edited Jul 03 '18 at 18:22

num1

4,498
4
27
48

answered Feb 27 '13 at 21:03

Daniel Lyons

21,545
2
48
73

As for `member/2`: Its implementation is not **that** easy to describe as you suggest. [Here is the current definition](http://www.complang.tuwien.ac.at/ulrich/iso-prolog/prologue#member). To cover its meaning the format of definitions built-in of predicates had to be changed, and a very contrived notion of a [list prefix](http://www.complang.tuwien.ac.at/ulrich/iso-prolog/dtc2#7.1.6.9) was needed. Easy? I would say not. – false Feb 28 '13 at 14:04
1

Arithmetics based on expression evaluation as in `(is)/2` is a very ad-hoc addition to the Prolog language. If you want it pure, either us s(X)-notation or constraints. – false Feb 28 '13 at 14:08
@false I'm not sure I'm following you. The `member/2` definition is less code to write and less work to understand than a non-clpfd definition of `length/2`. Even with clpfd, it's longer and more complex than the naive definition of `member/2`. What knowledge must you possess to understand `member/2` that you don't also need for `length/2`? – Daniel Lyons Feb 28 '13 at 14:55
Look at the definition of `member/2` above: You claim that `member(a,[a|nonlist])` is easy to understand? `length/2` does not have such odd properties at all, which by no means implies that it is easy to define. – false Feb 28 '13 at 14:58
... and I still do not like [the definition of `length/2`](http://www.complang.tuwien.ac.at/ulrich/iso-prolog/prologue#length) – false Feb 28 '13 at 14:59
@false My point is simply that it is not intuitive why `length/2` should be harder to write or understand than `member/2`. You have to understand `[X|Xs]` to understand either one, so why should `length` be harder? – Daniel Lyons Feb 28 '13 at 15:00
1

`member/2` requires an odd notion and `length/2` does not require such an odd notion at all. So `length/2` is easier to define/specify. But it is much more difficult to implement, because `(is/2)` is so incomplete. – false Feb 28 '13 at 15:03

Sergii Dymchenko · Answer 4 · 2013-02-28T20:44:03.523

-1

This works for all your test cases (but it has red cut):

my_length([], 0).
my_length([_|T], N) :- 
    ( integer(N) ->
        !, 
        N > 0, 
        my_length(T, X), N is 1 + X, !
    ;
        my_length(T, X), N is 1 + X
    ).

edited Feb 28 '13 at 20:44

answered Feb 27 '13 at 21:43

Sergii Dymchenko

6,052
1
16
34

Nice. It seems, that your solution uses basically the same idea as mine my_len_tail: test if we have fully instantiated length and if so, abort after first choice at a checkpoint. – damisan Feb 27 '13 at 22:33
But in case `my_len([1,2,3], X)` PROLOG tries to unify 3 times with second predicate (and fail every time). To fix that problem, I used a "trampoline" to choose the right flavor of the length function at the first unification. Could you give some examples where one should prefer ground/1 over nonvar/1 (in this function)? – damisan Feb 27 '13 at 22:38
`nonvar/1` is probably better here, but `integer/1` is even better. And `N > 0` should be added to the second clause. – Sergii Dymchenko Feb 28 '13 at 01:12
false, yes, that's why I told "N > 0 should be added to the second clause" in my previous comment. – Sergii Dymchenko Feb 28 '13 at 20:38
`my_length(L,N), N = 100000` – false Feb 28 '13 at 21:02

score -1 · Answer 5 · edited May 23 '17 at 12:16

-1

(I've tried to edit @false's response, but it was rejected)

my_len_tail/2 is faster (in terms of both the number of inferences and actual time) than buldin length/2 when generating a list, but has problem with N in 1..2 constraint.

?- time(( my_len_tail(L,N),N=10000000 )).
% 20,000,002 inferences, 2.839 CPU in 3.093 seconds (92% CPU, 7044193 Lips)
L = [_G67, _G70, _G73, _G76, _G79, _G82, _G85, _G88, _G91|...],
N = 10000000 .

?- time(( length(L,N),N=10000000 )).
% 30,000,004 inferences, 3.557 CPU in 3.809 seconds (93% CPU, 8434495 Lips)
L = [_G67, _G70, _G73, _G76, _G79, _G82, _G85, _G88, _G91|...],
N = 10000000 .

edited May 23 '17 at 12:16

Community

1
1

answered Feb 28 '13 at 16:44

damisan

977
4
16

1

? you need to compare things on the very same machine. – false Feb 28 '13 at 17:30
Either your machine as faster or your clock is coarser. Try the query with N = 100000 for **both** my_len_tail and length – false Feb 28 '13 at 18:02
By "faster" I meant the number of inferences (the same on different machines), not the actual time. I've edited my post. – damisan Feb 28 '13 at 19:48
Misunderstanding: I compared my_len_clp vs. length. You compare my_len_tail and length! – false Feb 28 '13 at 20:17
1

W.r.t "faster": The different number of inferences here is due to one version using (is)/2 and the other using succ/2. succ/2 counts as an inference but (is)/2 does not. So that is pretty arbitrary and unrelated to actual execution. – false Feb 28 '13 at 22:27
Nice observation. After changing `(is)/2` to `succ/2`, `my_len_tail` uses the same number of inferences and runs in ~ same time that build-in `length/2`. – damisan Mar 01 '13 at 00:26

score -1 · Answer 6 · answered Mar 07 '19 at 15:31

implementation

goal_expansion((_lhs_ =:= _rhs_),(when(ground(_rhs_),(_lhs_ is _rhs_))))  .

:- op(2'1,'yfx','list')  .

_list_ list [size:_size_] :-
_list_ list [size:_size_,shrink:_shrink_] ,
_list_ list [size:_size_,shrink:_shrink_,size:_SIZE_]  .

_list_ list [size:0,shrink:false]  .

_list_ list [size:_size_,shrink:true] :-
when(ground(_size_),(_size_ > 0))  .

[] list [size:0,shrink:false,size:0] .

[_car_|_cdr_] list [size:_size_,shrink:true,size:_SIZE_] :-
(_SIZE_ =:= _size_ - 1) ,
(_size_ =:= _SIZE_ + 1) ,
_cdr_ list [size:_SIZE_]  .

testing

/*
   ?- L list Z .
L = [],
Z = [size:0] ? ;
L = [_A],
Z = [size:1] ? ;
L = [_A,_B],
Z = [size:2] ? ;
L = [_A,_B,_C],
Z = [size:3] ?
yes

   ?- L list [size:0] .
L = [] ? ;
no
   ?- L list [size:1] .
L = [_A] ? ;
no
   ?- L list [size:2] .
L = [_A,_B] ? ;
no

   ?- [] list [size:S] .
S = 0 ? ;
no
   ?- [a] list [size:S] .
S = 1 ? ;
no
   ?- [a,b] list [size:S] .
S = 2 ? ;
no
   ?- [a,b,c] list [size:S] .
S = 3 ? ;
no
   ?- 
*/

How to duplicate the behavior of predefined length/2 in SWI-Prolog?

6 Answers6

implementation

testing