48

Is there any way to define a sum type in Java? Java seems to naturally support product types directly, and I thought enums might allow it to support sum types, and inheritance looks like maybe it could do it, but there is at least one case I can't resolve. To elaborate, a sum type is a type whose values each hold exactly one of a fixed set of alternative types, like a tagged union in C. In my case, I'm trying to implement Haskell's Either type in Java:

data Either a b = Left a | Right b

but at the base level I'm having to implement it as a product type, and just ignore one of its fields:

public class Either<L,R>
{
    private L left = null;
    private R right = null;

    public static <L,R> Either<L,R> right(R right)
    {
        return new Either<>(null, right);
    }

    public static <L,R> Either<L,R> left(L left)
    {
        return new Either<>(left, null);
    }

    private Either(L left, R right) throws IllegalArgumentException
    {
        this.left = left;
        this.right = right;
        if (left != null && right != null)
        {
            throw new IllegalArgumentException("An Either cannot be created with two values");
        }
        if (left == right)
        {
            throw new IllegalArgumentException("An Either cannot be created without a value");
        }
    }

    .
    .
    .
}

I tried implementing this with inheritance, but I have to use a wildcard type parameter, or equivalent, which Java generics won't allow:

public class Left<L> extends Either<L,?>

I haven't used Java's Enums much, but while they seem the next best candidate, I'm not hopeful.
At this point, I think this might only be possible by type-casting Object values, which I would hope to avoid entirely, unless there's a way to do it once, safely, and be able to use that for all sum types.

Zoey Hewll
  • 3
    Scala uses inheritance to emulate sum types. You can't force Java to make a type "sealed" (that is, someone else could come along and write something foolish like `class Apollo11 extends Either`), but you can simply not write any more subclasses yourself and hope for the best. "Safely" doing this is outside of the abilities of Java's type system, but doing it is possible with inheritance. – Silvio Mayolo Jan 08 '18 at 01:45
  • 1
    Looking at the signature of [`Collections.newSetFromMap()`](https://docs.oracle.com/javase/7/docs/api/java/util/Collections.html#newSetFromMap(java.util.Map)) I think it should be perfectly ok to just define an arbitrary generic type, such as `class Left extends Either`. – Izruo Jan 08 '18 at 01:46
  • @silvio a `final` class cannot be extended in Java. – Dragonthoughts Jan 08 '18 at 01:48
  • 2
    @Dragonthoughts Yes, but we want to extend it a limited number of times and then lock it. Scala has this in the form of the `sealed` keyword, but in Java, you either leave a class open to everyone or make it `final` and block even yourself from inheriting. – Silvio Mayolo Jan 08 '18 at 01:49
  • 2
    By making all constructors `package-private`, inheritance can be limited to the enclosing package though. Instantiation must then be performed through static factory, as it is already done here. – Izruo Jan 08 '18 at 01:52
  • @Izruo Hmm, your solution seems right, but I've had trouble putting Functional Interface types into `Object`. Eg `Object a = () -> {};` doesn't compile. Is `Object` still the base of every concrete type? – Zoey Hewll Jan 08 '18 at 01:53
  • 2
    You could have a base that is package visible only but have the derived classes final and public. – Dragonthoughts Jan 08 '18 at 01:54
  • 1
    As the point of that 'wildcard' is, that the `right` member of an instance of type `Left` is never assigned, this won't matter. By the way, the compiler error is probably because it cannot determine the type of functional interface. It should therefore work with an explicit cast: `Object a = (Runnable) () -> {}`, although I don't see a point in this. – Izruo Jan 08 '18 at 01:58
  • check out https://github.com/pakoito/JavaSealedUnions – beluchin Sep 17 '19 at 23:16

4 Answers

61

Make Either an abstract class with no fields and only one constructor (private, no-args, empty) and nest your "data constructors" (left and right static factory methods) inside the class so that they can see the private constructor but nothing else can, effectively sealing the type.

Use an abstract method either to simulate exhaustive pattern matching, overriding appropriately in the concrete types returned by the static factory methods. Implement convenience methods (like fromLeft, fromRight, bimap, first, second) in terms of either.

import java.util.Optional;
import java.util.function.Function;

public abstract class Either<A, B> {
    private Either() {}

    public abstract <C> C either(Function<? super A, ? extends C> left,
                                 Function<? super B, ? extends C> right);

    public static <A, B> Either<A, B> left(A value) {
        return new Either<A, B>() {
            @Override
            public <C> C either(Function<? super A, ? extends C> left,
                                Function<? super B, ? extends C> right) {
                return left.apply(value);
            }
        };
    }

    public static <A, B> Either<A, B> right(B value) {
        return new Either<A, B>() {
            @Override
            public <C> C either(Function<? super A, ? extends C> left,
                                Function<? super B, ? extends C> right) {
                return right.apply(value);
            }
        };
    }

    public Optional<A> fromLeft() {
        return this.either(Optional::of, value -> Optional.empty());
    }
}

Pleasant and safe! No way to screw it up. Because the type is effectively sealed, you can rest assured that there will only ever be two cases, and every operation ultimately must be defined in terms of the either method, which forces the caller to handle both of those cases.
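As a rough sketch of what "convenience methods in terms of either" might look like: the answer names fromRight and bimap but does not show them, so the bodies below are an assumption rather than the answerer's code. The class is trimmed and made package-private only so that the sketch fits in a single file.

```java
import java.util.Optional;
import java.util.function.Function;

// Same sealed shape as the class above, trimmed to what this sketch needs.
abstract class Either<A, B> {
    private Either() {}

    abstract <C> C either(Function<? super A, ? extends C> left,
                          Function<? super B, ? extends C> right);

    static <A, B> Either<A, B> left(A value) {
        return new Either<A, B>() {
            @Override
            <C> C either(Function<? super A, ? extends C> left,
                         Function<? super B, ? extends C> right) {
                return left.apply(value);
            }
        };
    }

    static <A, B> Either<A, B> right(B value) {
        return new Either<A, B>() {
            @Override
            <C> C either(Function<? super A, ? extends C> left,
                         Function<? super B, ? extends C> right) {
                return right.apply(value);
            }
        };
    }

    // Assumed mirror image of fromLeft: empty for a left, present for a right.
    // Optional.of rejects null, so this assumes non-null contents.
    Optional<B> fromRight() {
        return this.<Optional<B>>either(a -> Optional.empty(), Optional::of);
    }

    // Assumed bimap: transform whichever side is present and keep its tag.
    <C, D> Either<C, D> bimap(Function<? super A, ? extends C> f,
                              Function<? super B, ? extends D> g) {
        return this.<Either<C, D>>either(a -> Either.left(f.apply(a)),
                                         b -> Either.right(g.apply(b)));
    }
}

class EitherDemo {
    public static void main(String[] args) {
        Either<String, Integer> ok = Either.right(20);
        Either<Integer, String> mapped = ok.bimap(String::length, n -> "n=" + n);
        Optional<String> shown = mapped.fromRight();
        System.out.println(shown);
    }
}
```

Neither helper inspects which case it is holding; both simply hand two functions to either, so they cannot forget a case.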

Regarding the problem you had trying to do class Left<L> extends Either<L,?>, consider the signature <A, B> Either<A, B> left(A value). The type parameter B doesn't appear in the parameter list. So, given a value of some type A, you can get an Either<A, B> for any type B.
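To illustrate the point about the free type parameter, here is a minimal sketch, assuming a trimmed, package-private copy of the class with only the left factory. The same call produces an Either with whatever right-hand type the context asks for.

```java
import java.util.function.Function;

// Trimmed copy of the sealed class: only the left factory is needed here.
abstract class Either<A, B> {
    private Either() {}

    abstract <C> C either(Function<? super A, ? extends C> left,
                          Function<? super B, ? extends C> right);

    static <A, B> Either<A, B> left(A value) {
        return new Either<A, B>() {
            @Override
            <C> C either(Function<? super A, ? extends C> left,
                         Function<? super B, ? extends C> right) {
                return left.apply(value);
            }
        };
    }
}

class FreeTypeParameterDemo {
    public static void main(String[] args) {
        // B is chosen by the assignment target, not by the argument.
        Either<Integer, String> a = Either.left(42);
        Either<Integer, Boolean> b = Either.left(42);
        // It can also be spelled out explicitly.
        Either<Integer, Double> c = Either.<Integer, Double>left(42);

        String sa = a.either(n -> "left " + n, s -> "right " + s);
        String sb = b.either(n -> "left " + n, flag -> "right " + flag);
        String sc = c.either(n -> "left " + n, d -> "right " + d);
        System.out.println(sa + ", " + sb + ", " + sc);
    }
}
```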

gdejohn
  • I would introduce an enum into this, to give a way to exhaustively match against the cases – Alexander Jan 08 '18 at 19:24
  • It might be wise to name the method generics differently from the class' generics. E.g., maybe `AL`, `BL`, `AR`, and `BR` so that it's easy to distinguish which ones are being used where. – jpmc26 Jan 09 '18 at 00:01
  • 1
    note this approach can be industrialised with a JSR269 annotation processor to generate the boilerplate (the static left and right methods and others, like hashCode/equals). See my project [Derive4J](https://github.com/derive4j/derive4j#derive4j-java-8-annotation-processor-for-deriving-algebraic-data-types-constructors-pattern-matching-and-more). – JbGi Jan 09 '18 at 11:10
  • 1
    @vtosh that was because I was using anonymous-class type-argument inference, which was added in Java 9, but I've edited it to make the type arguments explicit so that now it will compile on Java 8 – gdejohn Dec 10 '19 at 17:23
25

A standard way of encoding sum types is Boehm–Berarducci encoding (often referred to by the name of its cousin, Church encoding) which represents an algebraic data type as its eliminator, i.e., a function that does pattern-matching. In Haskell:

left :: a -> (a -> r) -> (b -> r) -> r
left x l _ = l x

right :: b -> (a -> r) -> (b -> r) -> r
right x _ r = r x

match :: (a -> r) -> (b -> r) -> ((a -> r) -> (b -> r) -> r) -> r
match l r k = k l r

-- Or, with a type synonym for convenience:

type Either a b r = (a -> r) -> (b -> r) -> r

left :: a -> Either a b r
right :: b -> Either a b r
match :: (a -> r) -> (b -> r) -> Either a b r -> r

In Java this would look like a visitor:

import java.util.function.Function;

public interface Either<A, B> {
    <R> R match(Function<A, R> left, Function<B, R> right);
}

public final class Left<A, B> implements Either<A, B> {

    private final A value;

    public Left(A value) {
        this.value = value;
    }

    public <R> R match(Function<A, R> left, Function<B, R> right) {
        return left.apply(value);
    }

}

public final class Right<A, B> implements Either<A, B> {

    private final B value;

    public Right(B value) {
        this.value = value;
    }

    public <R> R match(Function<A, R> left, Function<B, R> right) {
        return right.apply(value);
    }

}

Example usage:

Either<Integer, String> result = new Left<Integer, String>(42);
String message = result.match(
  errorCode -> "Error: " + errorCode.toString(),
  successMessage -> successMessage);

For convenience, you can make a factory for creating Left and Right values without having to mention the type parameters each time; you can also add a version of match that accepts Consumer<A> left, Consumer<B> right instead of Function<A, R> left, Function<B, R> right if you want the option of pattern-matching without producing a result.
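A possible shape for those two conveniences is sketched below. The static factories left and right and the Consumer-based method accept are hypothetical names, not part of the answer; the types are package-private only to keep the sketch in one file. The Consumer variant gets its own name here because overloading match with both Function and Consumer parameters can make calls with implicitly typed lambdas ambiguous.

```java
import java.util.function.Consumer;
import java.util.function.Function;

interface Either<A, B> {
    <R> R match(Function<A, R> left, Function<B, R> right);

    // Hypothetical side-effect-only variant, defined in terms of match.
    default void accept(Consumer<A> left, Consumer<B> right) {
        this.<Void>match(
            a -> { left.accept(a); return null; },
            b -> { right.accept(b); return null; });
    }

    // Hypothetical factories: the type arguments are inferred at the call site.
    static <A, B> Either<A, B> left(A value) {
        return new Left<>(value);
    }

    static <A, B> Either<A, B> right(B value) {
        return new Right<>(value);
    }
}

final class Left<A, B> implements Either<A, B> {
    private final A value;

    Left(A value) {
        this.value = value;
    }

    @Override
    public <R> R match(Function<A, R> left, Function<B, R> right) {
        return left.apply(value);
    }
}

final class Right<A, B> implements Either<A, B> {
    private final B value;

    Right(B value) {
        this.value = value;
    }

    @Override
    public <R> R match(Function<A, R> left, Function<B, R> right) {
        return right.apply(value);
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        Either<Integer, String> result = Either.left(42);
        String message = result.match(code -> "Error: " + code, msg -> msg);
        System.out.println(message);
        result.accept(
            code -> System.out.println("failed with " + code),
            msg -> System.out.println("succeeded with " + msg));
    }
}
```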

Jon Purdy
  • This already mirrors the remainder of my implementation that I'd hidden in the `...`, as it's adjacent to the actual issue of implementing the sum type, but it is definitely an important part of the Either type's implementation – Zoey Hewll Jan 08 '18 at 03:14
  • 4
    A main selling point of this approach is that there is no need to use type casts (nor `instanceof`) to eliminate the sum type. In other words, this would also work in a fragment of Java where potentially dangerous constructs such as type casts are forbidden. – chi Jan 08 '18 at 14:49
  • FWIW, I have explored a similar ADT encoding in Java in my blog: https://garciat.com/2020/05/07/java-adt/ – Gabriel Garcia May 26 '20 at 14:20
8

Alright, so the inheritance solution is definitely the most promising. The thing we would like to do is class Left<L> extends Either<L, ?>, which we unfortunately cannot do because of Java's generic rules. However, if we make the concession that the type of Left or Right must encode the "alternate" possibility, we can do this.

public class Left<L, R> extends Either<L, R>

Now, we would like to be able to convert Left<Integer, A> to Left<Integer, B>, since it doesn't actually use that second type parameter. We can define a method to do this conversion internally, thus encoding that freedom into the type system.

public <R1> Left<L, R1> phantom() {
  return new Left<L, R1>(contents);
}

Complete example:

public class EitherTest {

  public abstract static class Either<L, R> {}

  public static class Left<L, R> extends Either<L, R> {

    private L contents;

    public Left(L x) {
      contents = x;
    }

    public <R1> Left<L, R1> phantom() {
      return new Left<L, R1>(contents);
    }

  }

  public static class Right<L, R> extends Either<L, R> {

    private R contents;

    public Right(R x) {
      contents = x;
    }

    public <L1> Right<L1, R> phantom() {
      return new Right<L1, R>(contents);
    }

  }

}

Of course, you'll want to add some functions for actually accessing the contents, and for checking whether a value is Left or Right so you don't have to sprinkle instanceof and explicit casts everywhere, but this should be enough to get started, at the very least.
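One way those accessors might look is sketched below. The method names isLeft, isRight, left and right are assumptions, and the use of Optional assumes the contents are never null. The classes are top-level and package-private here only so the sketch fits in one file.

```java
import java.util.Optional;

abstract class Either<L, R> {
    abstract boolean isLeft();

    boolean isRight() {
        return !isLeft();
    }

    // Each accessor defaults to empty; the matching subclass overrides one.
    Optional<L> left() {
        return Optional.empty();
    }

    Optional<R> right() {
        return Optional.empty();
    }
}

final class Left<L, R> extends Either<L, R> {
    private final L contents;

    Left(L x) {
        contents = x;
    }

    @Override
    boolean isLeft() {
        return true;
    }

    @Override
    Optional<L> left() {
        return Optional.of(contents);
    }

    // Re-tag the unused right-hand type, as in the answer.
    <R1> Left<L, R1> phantom() {
        return new Left<>(contents);
    }
}

final class Right<L, R> extends Either<L, R> {
    private final R contents;

    Right(R x) {
        contents = x;
    }

    @Override
    boolean isLeft() {
        return false;
    }

    @Override
    Optional<R> right() {
        return Optional.of(contents);
    }

    // Re-tag the unused left-hand type, as in the answer.
    <L1> Right<L1, R> phantom() {
        return new Right<>(contents);
    }
}
```

With these in place a caller can write e.left().map(...) or branch on e.isLeft() without any instanceof checks or casts.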

Silvio Mayolo
  • What is the role of the enclosing `EitherTest` type? – Zoey Hewll Jan 08 '18 at 02:03
  • 2
    Nothing in particular. Just wrapping it all in one file so all the classes can be public. Java requires that each file have at most one public class. If you implement this solution in your project, there's no reason to have `EitherTest`. – Silvio Mayolo Jan 08 '18 at 02:05
1

Inheritance can be used to emulate sum types (disjoint unions), but there are a few issues you need to deal with:

  1. You need to take care to keep others from adding new cases to your type. This is especially important if you want to exhaustively handle every case you might encounter. This is possible with a non-final superclass whose constructors are package-private.
  2. The lack of pattern matching makes it quite difficult to consume a value of this type. If you want a compiler-checked way to guarantee that you've exhaustively handled all cases, you need to implement a match function yourself.
  3. You're forced into one of two styles of API, neither of which is ideal:
    • All cases implement a common API, throwing errors on the parts of the API they don't support themselves. Consider Optional.get(). Ideally, this method would only be available on a disjoint type whose value is known to be some rather than none. But there's no way to do that, so it's an instance member of a general Optional type. It throws NoSuchElementException if you call it on an optional whose "case" is "none".
    • Each case has a unique API that tells you exactly what it's capable of, but that requires a manual type check and cast every time you wish to call one of these subclass-specific methods.
  4. Changing "cases" requires new object allocation (and adds pressure on the GC if done often).
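The two API styles from point 3 can be sketched as follows. Result, Success and Failure are hypothetical types invented for this illustration; only Optional and NoSuchElementException come from the standard library. The package-private constructor on Result also shows the sealing trick from point 1.

```java
import java.util.NoSuchElementException;
import java.util.Optional;

// Hypothetical two-case type. The package-private constructor keeps code
// outside this package from adding further cases.
abstract class Result {
    Result() {}
}

final class Success extends Result {
    final String value;

    Success(String value) {
        this.value = value;
    }
}

final class Failure extends Result {
    final int code;

    Failure(int code) {
        this.code = code;
    }
}

class ApiStylesDemo {
    // Style 1: a common API that throws when the case does not support it.
    static String getOrMarker(Optional<String> maybe) {
        try {
            return maybe.get();
        } catch (NoSuchElementException e) {
            return "<none>";
        }
    }

    // Style 2: case-specific members, reached by a manual type check and cast.
    // The compiler does not verify that every case is covered.
    static String describe(Result r) {
        if (r instanceof Success) {
            return "ok: " + ((Success) r).value;
        }
        if (r instanceof Failure) {
            return "failed: " + ((Failure) r).code;
        }
        throw new IllegalStateException("unknown case");
    }
}
```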

TL;DR: Functional programming in Java is not a pleasant experience.

Alexander
  • You missed a third case, where the common API provides only the `switch`, which unwraps the structure for you and provides control flow, like Haskell's pattern match. I doubt there would be a situation where you need to decide control flow based on the type, but then use the whole object, instead of just the contained value. – Zoey Hewll Jan 09 '18 at 01:03
  • See [Jon Purdy's answer](https://stackoverflow.com/a/48143514/6112457) for something similar to my implementation – Zoey Hewll Jan 09 '18 at 01:04
  • @ZoeyHewll Yeah. One of the issues, though, is that different cases can have different numbers of values, with different types. Java doesn't give you a way to deal with that – Alexander Jan 09 '18 at 01:07
  • [It definitely does](https://gist.github.com/anonymous/0eea8d3768c571a9230f54fd31d43ba0) – Zoey Hewll Jan 09 '18 at 01:19
  • By passing in the functions to be called in each case, you can have the enclosed values passed to a function that expects exactly that number of values, of the required types – Zoey Hewll Jan 09 '18 at 01:21
  • @ZoeyHewll I'm not quite getting it, could you show me an example of the usage? – Alexander Jan 09 '18 at 01:37
  • See my comment on the gist for a better formatted example, but: `Maybe m = someOperation(); m.match( (Integer value) -> System.out.println("value = " + value), () -> System.out.println("there is no value") );` – Zoey Hewll Jan 09 '18 at 01:44
  • 1
    @ZoeyHewll I see, so the super class declares an abstract match method that takes `n` functions for each of the `n` cases, each taking as parameters the values associated with that case. Then each subclass calls one of those `n` functions, passing its members (associated values) as arguments. Correct? – Alexander Jan 09 '18 at 01:50
  • Yes. The `match` function directly mirrors Haskell's `case` expression in this regard – Zoey Hewll Jan 09 '18 at 01:58
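The scheme described in these comments could be sketched roughly as below. This is a hypothetical Maybe type written for illustration, not the code from the linked gist. The "just" case carries one value and so takes a one-argument function, while the "nothing" case carries none and takes a Supplier.

```java
import java.util.function.Function;
import java.util.function.Supplier;

abstract class Maybe<T> {
    private Maybe() {}

    // One handler per case; each handler's arity matches that case's payload.
    abstract <R> R match(Function<? super T, ? extends R> onJust,
                         Supplier<? extends R> onNothing);

    static <T> Maybe<T> just(T value) {
        return new Maybe<T>() {
            @Override
            <R> R match(Function<? super T, ? extends R> onJust,
                        Supplier<? extends R> onNothing) {
                return onJust.apply(value);
            }
        };
    }

    static <T> Maybe<T> nothing() {
        return new Maybe<T>() {
            @Override
            <R> R match(Function<? super T, ? extends R> onJust,
                        Supplier<? extends R> onNothing) {
                return onNothing.get();
            }
        };
    }
}

class MaybeDemo {
    public static void main(String[] args) {
        Maybe<Integer> m = Maybe.just(5);
        String text = m.match(value -> "value = " + value,
                              () -> "there is no value");
        System.out.println(text);
    }
}
```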
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/162777/discussion-between-alexander-and-zoey-hewll). – Alexander Jan 09 '18 at 02:10