6

I'm doing an academic exercise (for personal growth). I want to find programming languages that allow you to define functions that are capable of accepting themselves (i.e., pointers to themselves) as arguments.

For example, in JavaScript:

function foo(x, y) {
    if (y === 0) return;
    x(x, y - 1);
}
foo(foo, 10);

The code above calls foo exactly 11 times (with y going from 10 down to 0); once y reaches zero, the recursion terminates.

I tried defining a similar function in OCaml like this:

let rec foo x y = if y < 1 then "hi" else x x (y - 1);;

But it failed with a type error:

Error: This expression has type 'a -> 'b -> 'c
   but an expression was expected of type 'a
   The type variable 'a occurs inside 'a -> 'b -> 'c

I'm wondering, is it possible to define such a function in OCaml? I'm particularly interested in OCaml because I know it has a global type inference system. I want to know if such functions are compatible with global type inference. Thus, I'm looking for examples of these types of functions in any language with global type inference.

Will Ness
Joshua Wise
  • "languages that allow you to define functions that are capable of accepting themselves (i.e., pointers to themselves) as arguments": Machine. [The Story of Mel](http://www.catb.org/jargon/html/story-of-mel.html). – Mars Apr 25 '19 at 03:50

7 Answers

7

In any language that features mutability, recursion, or both, it is possible to call a function with a pointer to itself. Essentially all conventional Turing-complete languages have those features, which is why there are so many answers.

The real question is how to type such functions. Languages that are not strongly typed (like C/C++), or that are dynamically (or gradually) typed, are of no interest here, as they allow some form of type coercion, which makes the task trivial: they rely on the programmer to provide a type and take it for granted. Therefore, we should be interested in strictly typed languages with a static type system.

If we focus on OCaml, then your definition is accepted by the compiler if you pass the -rectypes option, which disables the occurrence check that otherwise disallows recursive types. Indeed, the type of your function is ('a -> int -> string as 'a) -> int -> string:

 # let foo x y = if y < 1 then "hi" else x x (y - 1);;
 val foo : ('a -> int -> string as 'a) -> int -> string = <fun>

Note that you don't need rec here, as your function is not really recursive. What is recursive is the type, ('a -> int -> string as 'a). Here, as extends to the left up to the parenthesis, i.e., 'a = 'a -> int -> string. This is a recurrence, and by default many compilers disallow such equations (i.e., equations where the same type variable occurs on both sides, hence the name occurrence check). If this check is disabled, the compiler will allow this and similar definitions. However, it has been observed that the occurrence check catches more bugs than it rejects well-formed programs. In other words, when the occurrence check is triggered, it is more likely a bug than a deliberate attempt to write a well-typed function.

Therefore, in real life, programmers are reluctant to add this option to their build systems. The good news is that if we massage the original definition a little, we don't really need recursive types. For example, we can change the definition to the following:

 let foo x y = if y < 1 then "hi" else x (y - 1)

which now has type

 val foo : (int -> string) -> int -> string = <fun>

I.e., it is a function that takes another function of type (int -> string) and returns a function of type (int -> string). Therefore, to run foo we need to pass it a function that recursively calls foo, e.g.

 let rec run y = foo run y

This is where the recursion comes into play. Yes, we didn't pass the function to itself directly. Instead, we passed it a function that references foo, and when foo calls this function it in fact calls itself via an extra reference. We may also notice that wrapping our function in a value of some other kind (a record, a variant, or an object; see footnote 1) will also allow your definition. We can even mark that extra helper type as [@@unboxed] so that the compiler will not introduce extra boxing around the wrapper. But this is a sort of cheating. We still won't be passing the function to itself, but an object that contains this function (even though the compiler optimization will remove this extra indirection, from the perspective of the type system those are still different objects, therefore the occurrence check is not triggered). So we still need some indirection if we don't want to enable recursive types. Let's stick to the simplest form of indirection, the run function, and try to generalize this approach.
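
The wrapper idea mentioned above can be sketched as follows. This is only a minimal illustration, not code from the answer: the type name self and its constructor Self are hypothetical names chosen here, and the [@@unboxed] attribute is used in the way the paragraph describes.

```ocaml
(* Hypothetical wrapper type (not from the original answer): a
   single-constructor variant whose payload is a function that takes
   the wrapper itself. [@@unboxed] asks the compiler to represent
   Self f exactly like f at run time. *)
type self = Self of (self -> int -> string) [@@unboxed]

(* No rec and no -rectypes needed: foo receives the wrapper, unwraps
   the function f, and passes the same wrapper x back to it. *)
let foo (Self f as x) y = if y < 1 then "hi" else f x (y - 1)

(* Passing foo to itself now means passing it wrapped in Self. *)
let result = foo (Self foo) 10
```

From the type checker's point of view, foo has type self -> int -> string, so no type variable has to be equated with a type that contains it.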

In fact, our run function is a specific case of a more general fixed-point combinator. We can parametrize run with any function of type ('a -> 'b) -> ('a -> 'b), so that it will work not only for foo:

 let rec run foo y = foo (run foo) y

and in fact let's name it fix,

 let rec fix f n = f (fix f) n

which has type

 val fix : (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b = <fun>

And, we can still apply it to our foo

 # fix foo 10
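
As a self-contained sketch of the definitions above, the following puts fix and foo together and also applies fix to a second function. The fact function is a hypothetical example added here for illustration; it is not part of the original answer.

```ocaml
(* Fixed-point combinator from the answer: fix f behaves like f
   applied to (fix f), so f receives a function that calls f again. *)
let rec fix f n = f (fix f) n

(* foo takes the function to call next as x, instead of itself. *)
let foo x y = if y < 1 then "hi" else x (y - 1)

(* Hypothetical second example: factorial written in the same
   open-recursive style, with self standing for the recursive call. *)
let fact self n = if n <= 1 then 1 else n * self (n - 1)

let foo_result = fix foo 10
let fact_result = fix fact 5
```

Because fix is polymorphic, the same definition serves both foo (returning a string) and fact (returning an int).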

Oleg Kiselyov's web site is an excellent resource that shows many ways of defining the fixed-point combinator in OCaml, Scheme, and Haskell.


1) This is essentially the same as the delegate approach shown in other answers (both in languages with type inference, like Haskell and OCaml, and in languages without it, like C++ and C#).

Will Ness
ivg
  • of course to define the fix point combinator in languages without recursive `let` you will still need the self application, which will bring you back to the square one. another thing is, indirection step is still an indirection step, with the `run`-like recursive definition, or with a record, or a recursive type -- whether named, like in Haskell (`U a ~ U a -> a`), or unnamed, OCaml style (`t ~ t -> a`). You know the saying, "there ain't nothin' can't be solved with one of them indirection steps!" (my wording) :) there's not that much of a difference there, looking from sufficiently far away. – Will Ness Apr 24 '19 at 14:46
  • `OCaml... if you pass the -rectypes option, which will disable the occurrence check` Thanks. Always wondered why `rec` was required but never knew why. Now it makes sense and every time I type `rec` I will forever think of this answer. You have turned yourself into a virtual statue with that statement. – Guy Coder Mar 19 '20 at 13:43
4

Your OCaml function requires a recursive type, i.e., a type that contains a direct reference to itself. You can define such types (and have values of such types) if you specify -rectypes when you run OCaml.

Here's a session with your function:

$ rlwrap ocaml -rectypes
        OCaml version 4.06.1

# let rec foo x y = if y < 1 then "hi" else x x (y - 1);;
val foo : ('a -> int -> string as 'a) -> int -> string = <fun>
# foo foo 10;;
- : string = "hi"
#

The default is not to support recursive types, because they almost always are the result of programming errors.

Jeffrey Scofield
  • so with that flag, OCaml infers the recursive type all by itself? – Will Ness Apr 22 '19 at 09:11
  • @WillNess, yes. – Andreas Rossberg Apr 22 '19 at 09:12
  • @AndreasRossberg that's cool! thanks. in Haskell we have to do all the tagging/untagging by hand, with a special-purpose recursive type. what is the `{f = foo}` syntax that you're using? is your `rf` type a record type and `x.f` accesses its `f` field? (did I guess correctly?) :) – Will Ness Apr 22 '19 at 09:15
  • 1
    @WillNess, you guessed correctly. In OCaml, this lax occurs-check even used to be the default behaviour until version 1.05 (in 1997), but it was changed because users complained. – Andreas Rossberg Apr 22 '19 at 09:22
3

Some examples I can write:

  • C++
  • C
  • C#
  • Python
  • Scheme

C++

Ok, so not the first language you would think of, and definitely not a painless way of doing it, but it's very much possible. It's C++ and it's here because they say write about what you know :) Oh, and I wouldn't recommend doing this outside of academic interest.

#include <any>
#include <iostream>

void foo(std::any x, int y)
{
    std::cout << y << std::endl;

    if (y == 0)
        return;

    // one line, like in your example
    //std::any_cast<void (*) (std::any, int)>(x)(x, y - 1);

    // or, more readable:

    auto f = std::any_cast<void (*) (std::any, int)>(x);
    f(x, y - 1);
}

int main()
{
    foo(foo, 10);
}

If the casts are too much (and too ugly), you can write a small wrapper like the one below. Its biggest advantage, though, is performance: you completely bypass the heavy std::any type.

#include <iostream>

class Self_proxy
{
    using Foo_t = void(Self_proxy, int);

    Foo_t* foo;

public:
    constexpr Self_proxy(Foo_t* f) : foo{f} {}

    constexpr auto operator()(Self_proxy x, int y) const
    {
        return foo(x, y);
    }
};

void foo(Self_proxy x, int y)
{
    std::cout << y << std::endl;

    if (y == 0)
        return;

    x(x, y - 1);
}

int main()
{
    foo(foo, 10);
}

And a generic version of the wrapper (forwarding omitted for brevity):

#include <iostream>

template <class R, class... Args>
class Self_proxy
{
    using Foo_t = R(Self_proxy<R, Args...>, Args...);

    Foo_t* foo;

public:
    constexpr Self_proxy(Foo_t* f) : foo{f} {}

    constexpr auto operator()(Self_proxy x, Args... args) const
    {
        return foo(x, args...);
    }
};

void foo(Self_proxy<void, int> x, int y)
{
    std::cout << y << std::endl;

    if (y == 0)
        return;

    x(x, y - 1);
}

int main()
{
    foo(foo, 10);
}

C

You can do this in C also:

https://ideone.com/E1LkUW

#include <stdio.h>

typedef void(* dummy_f_type)(void);

void foo(dummy_f_type x, int y)
{
    printf("%d\n", y);

    if (y == 0)
        return;

    void (* f) (dummy_f_type, int) = (void (*) (dummy_f_type, int)) x;
    f(x, y - 1);
}

int main()
{
    foo((dummy_f_type)foo, 10);
}

The trap to avoid here is that you cannot use void* as the type for x, because it's not valid to cast a function pointer type to a data pointer type.

Or, as shown by Leushenko in the comments, you can use the same pattern with a wrapper:

#include <stdio.h>

struct RF {
    void (* f) (struct RF, int);
};

void foo(struct RF x, int y)
{
    printf("%d\n", y);

    if (y == 0)
        return;

    x.f(x, y - 1);
}

int main()
{
    foo((struct RF) { foo }, 10);
}

C#

https://dotnetfiddle.net/XyDagc

using System;

public class Program
{
    public delegate void MyDelegate (MyDelegate x, int y);

    public static void Foo(MyDelegate x, int y)
    {
        Console.WriteLine(y);

        if (y == 0)
            return;

        x(x, y - 1);
    }

    public static void Main()
    {
        Foo(Foo, 10);
    }
}

Python

https://repl.it/repls/DearGoldenPresses

def f(x, y):
  print(y)
  if y == 0:
    return

  x(x, y - 1)

f(f, 10)

Scheme

And finally here's a functional language

https://repl.it/repls/PunyProbableKernelmode

(define (f x y)
  (print y)
  (if (not (= y 0)) (x x (- y 1)))
)

(f f 10)
bolov
  • I didn't find a C++17 online compiler with shareable link support. You can see it in action on https://wandbox.org/ . Select C++2a and copy paste the text. As far as I can see you can't share a link. – bolov Apr 22 '19 at 05:41
  • can it be done with functor objects, in C++, which could perhaps lead to a simpler code, without the casts? – Will Ness Apr 22 '19 at 09:10
  • @WillNess yes, thank you :). I guess you could write a small wrapper to hide the castings in its implementation. But you can't do with `std::function` because you would get into a recursive type declaration, unless you do something like the C version, something like `std::function` in which case you are back to casting. – bolov Apr 22 '19 at 09:26
  • so C++ types can't be recursive? there's this trick, isn't it, when a class must refer to self or something, or it derives something that refers to itself (with templates)? it has some acronym... (that's all I remember :) ) maybe that could be used? – Will Ness Apr 22 '19 at 09:37
  • @WillNess A class can contain a member of type itself only if it's a pointer or reference. E.g. `class X` can contain a member `X&` or `X*` but not `X` (you would need infinite space to store it). You might refer to [CRTP](https://en.wikipedia.org/wiki/Curiously_recurring_template_pattern). It doesn't apply to our situation. [cont] – bolov Apr 22 '19 at 09:47
  • [cont] What I meant is this: in `C++` a function has a type which contains the type of it's arguments. So you cannot have a function who accepts a type that's itself because you cannot declare such a function: you would get into infinite declaration, e.g.: `f` is a function who takes a parameter of type function who takes a parameter of type function who takes a parameter of type function ... . Templates don't solve this problem. That's why we need some other type that's unrelated in its type declaration to `f` but can be casted to `f` – bolov Apr 22 '19 at 09:48
  • @WillNess I've updated with a small wrapper. You can't solve with `typedef` because they are completely transparent. They don't create a new type (and don't mangle) – bolov Apr 22 '19 at 10:04
  • [casts are not needed in C either, you can use the same auxiliary type pattern as the ML examples above](https://godbolt.org/z/Mc0NZW) (casts aren't really an example of how to do this while maintaining static typing) – Leushenko Apr 22 '19 at 10:10
  • 1
    Why the roundabout means for the function call in C? `void (*)()` is a pointer to a function that can take [any number of arguments](https://softwareengineering.stackexchange.com/questions/286490/what-is-the-difference-between-function-and-functionvoid). [Try it online!](https://tio.run/##HY3BCsIwDIbPy1OEiZCWTtxZ8Um8jNRuAU1lbruIz16zHhK@fORPuBuZSzmI8nOND7x@lij5NN0AtiwR06pMlcinnB25gKILqoMvNO/ZOFF7lLu2weQFGklI6tCWycpk15v@Aeyx1yBKOwzzyAF5Gmb03oat3qvf9hawP9dUKX8 "C (gcc) – Try It Online") –  Apr 23 '19 at 15:17
  • @Rogem Well, first of all I don't like the "any type of argument" functions. Second the standard guarantees that a pointer to a function can be cast to a pointer of another function type and back and you will get the initial value, so I know for sure what I do is perfectly fine. For casting to `void (*)()` (and not going back to the initial function type) to work without UB the two function types have to be compatible, and I don't know if they are. And I didn't want to check the standard since I don't want to use it in the first place. – bolov Apr 23 '19 at 18:39
  • @bolov As if passing a function with arguments as `void (*)(void)` (explicitly no arguments), then casting it to a function *with* arguments and finally calling it was not UB. –  Apr 23 '19 at 19:21
  • In addition, you lose any safety that was there in the first place due to the casts - if you pass, say, `printf` to the function by accident you'll likely get a crash at runtime with no compiler warnings/errors because the compiler will think "Oh, he's casting. He knows what he's doing". With a `void (*)()` you don't *need* to do casts, and the compiler will let you know that you made the mistake and you can then insert the cast if you're absolutely sure about what you're doing. –  Apr 23 '19 at 19:31
  • @Rogem the standard guarantees that a function pointer can be cast to another function pointer type and then cast back to the original pointer function type and get the initial pointer value. It's a guarantee so it is not UB. Yes, if you call with another function type then the code is UB because it's not casted back to the original function pointer type, so you shouldn't. But if you call it with the correct function type then it is not UB. – bolov Apr 23 '19 at 20:41
3

As Jeffrey points out, OCaml can deal with this, if you activate -rectypes. And the reason that it is not turned on by default is not that it's a problem for ML-style type inference, but that it's usually not helpful to programmers (masks programming errors).

Even without the -rectypes mode, you can easily construct an equivalent function via an auxiliary type definition. For example:

type 'a rf = {f : 'a rf -> 'a}
let rec foo x y = if y < 1 then "hi" else x.f x (y - 1)

Note that this still infers everything else, such as the other function arguments. Sample use:

foo {f = foo} 11
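
To illustrate that the remaining types are still inferred, here is a small sketch that reuses the same record type with a different result type. The function count_down is a hypothetical example added here, not part of the original answer.

```ocaml
(* Auxiliary record type from the answer: field f holds a function
   that takes the record itself. *)
type 'a rf = { f : 'a rf -> 'a }

(* Same shape as foo in the answer; inferred as
   (int -> string) rf -> int -> string. *)
let foo x y = if y < 1 then "hi" else x.f x (y - 1)

(* Hypothetical variant returning a list; inferred as
   (int -> int list) rf -> int -> int list, without annotations. *)
let count_down x y = if y < 0 then [] else y :: x.f x (y - 1)

let greeting = foo { f = foo } 11
let numbers = count_down { f = count_down } 3
```

Only the record type is declared by hand; the parameter instantiating 'a is worked out by inference in each case.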

Edit: As far as ML type inference is concerned, the only difference between the algorithm with and without -rectypes is that the former omits the occurs-check during unification. That is, with -rectypes the inference algorithm actually becomes "simpler" in a sense. Of course, that assumes a suitable representation of types as graphs (rational trees) that allows cycles.
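
The role of the occurs-check can be sketched with a toy unifier. This is an illustrative assumption-laden sketch, not OCaml's actual type checker: the names ty, occurs, apply, unify, Occurs_check and Mismatch are hypothetical, and types are represented as plain trees, so it only models the default mode in which the check is performed.

```ocaml
(* Toy type terms: variables, two base types, and function arrows. *)
type ty =
  | Var of string
  | Int
  | String
  | Arrow of ty * ty

exception Occurs_check of string
exception Mismatch

(* Does variable v occur anywhere inside t? *)
let rec occurs v t =
  match t with
  | Var w -> v = w
  | Int | String -> false
  | Arrow (a, b) -> occurs v a || occurs v b

(* Apply a substitution (an association list) to a type. *)
let rec apply subst t =
  match t with
  | Var v ->
    (match List.assoc_opt v subst with
     | Some t' -> apply subst t'
     | None -> t)
  | Int | String -> t
  | Arrow (a, b) -> Arrow (apply subst a, apply subst b)

(* Unify two types, extending the substitution; the occurs-check
   rejects equations such as 'a = 'a -> int -> string. *)
let rec unify subst t1 t2 =
  match apply subst t1, apply subst t2 with
  | Var v, Var w when v = w -> subst
  | Var v, t | t, Var v ->
    if occurs v t then raise (Occurs_check v) else (v, t) :: subst
  | Int, Int | String, String -> subst
  | Arrow (a1, b1), Arrow (a2, b2) -> unify (unify subst a1 a2) b1 b2
  | _ -> raise Mismatch

(* An ordinary equation: ('a -> int) = (string -> 'b). *)
let ok = unify [] (Arrow (Var "a", Int)) (Arrow (String, Var "b"))

(* In x x (y - 1), x has some type 'a and is also applied to itself,
   so inference must solve 'a = 'a -> int -> string. *)
let self_application_rejected =
  match unify [] (Var "a") (Arrow (Var "a", Arrow (Int, String))) with
  | _ -> false
  | exception Occurs_check _ -> true
```

Simply deleting the check from this tree-based sketch would not be enough, since apply could then loop on a cyclic substitution; that is where the graph representation mentioned above comes in.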

Andreas Rossberg
2

One language that is incredibly well suited to recursion and iteration (which is what you're asking about) is Scheme, a dialect of Lisp. Check out the book SICP (Structure and Interpretation of Computer Programs) for this language. Passing a function to itself is a technique for implementing anonymous recursion.

Here is what your procedure would look like in Scheme:

(define (foo x y)
    (if (= y 0) '() (x x (- y 1))))  ; '() instead of null, which standard Scheme does not define

(foo foo 10)
Will Ness
Jodast
  • 1
    Scheme is a little bit cheaty, as it is neither statically typed nor has global type inference. I believe that the question is not really about whether it is possible to pass a function to itself (that's trivial), but rather about how to type (or infer the type of) such functions, which is not trivial. – ivg Apr 22 '19 at 20:38
2

For completeness, Haskell.

newtype U a = U { app :: U a -> a }

foo :: Int -> ()
foo y = f (U f) y
  where
  f x y | y <= 0    = ()
        | otherwise = app x x (y-1)

Trying:

> foo 10
()

The statically typed languages seem to be doing more or less the same thing to achieve this: put the function into a record and pass that to itself as an argument. Haskell's newtype creates ephemeral "records" though, so it really is the function itself, at run time.

The dynamically typed languages just pass self to self and are done with it.

Will Ness
0

You can do it in C, which supports function pointers; in C#, which supports delegates; and in Java, where you might need to declare your own @FunctionalInterface for the method to match.

daniu