7

The Sieve of Eratosthenes can be implemented very neatly in Haskell, using laziness to generate an infinite list and then remove all multiples of the head of the list from its tail:

primes :: [Int]
primes = sieve [2..]
sieve (x:xs) = x : sieve [y | y <- xs, y `mod` x > 0]

I'm trying to learn about using streams in Java 8, but I can't figure out whether there's a way of achieving the same result in Java as the Haskell code above. If I treat a Haskell lazy list as equivalent to a Java stream, it seems that I need to take a stream headed by 2 and produce a new stream with all multiples of 2 removed, and then take that stream and produce a new stream with all multiples of 3 removed, and...

And I have no idea how to proceed.

Is there any way of doing this, or am I deluding myself when I try to think of Java streams as comparable to Haskell lists?

Will Ness
user1636349
  • 2
    *"am I deluding myself when I try to think of Java streams as comparable to Haskell lists"* - I would say yes. *Maybe* you can do similar things but they certainly will be faaaaar more complicated to do in Java. – luk2302 May 03 '17 at 12:57
  • Having worked with C# (LINQ) and Haskell all I can say is getting Java's streams to do similar stuff is not straightforward. I would recommend staying closer to idiomatic Java. IMHO Java lacks the built-ins to express these complicated ideas cleanly and concisely. – ThreeFx May 03 '17 at 13:08
  • 11
    By the way, [that's not the Sieve of Eratosthenes](https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf). – Benjamin Hodgson May 03 '17 at 13:38
  • @Holger yes, "stream" is redundant for those coming from "java" tag, but what about others who come from "haskell" or even just "sieve-of-eratosthenes" for whom "stream" makes much more sense? What's the harm in having "stream" as additional tag? – Will Ness May 03 '17 at 16:00
  • @Will Ness: “stream” simply is a *wrong* tag. There is no stream here, unless you’re talking about the Java-specific `Stream` API, which is covered by “java-stream”. Why should a Haskell programmer ever consider the FP code of this question to have anything to do with *streams*? – Holger May 03 '17 at 16:03
  • 1
    @Holger because lazy lists are streams? – Will Ness May 03 '17 at 17:24
  • @Holger: I used the "stream" tag in the sense of the Java stream API. I was trying to refine things so that I wouldn't get conventional non-stream-based "solutions". – user1636349 May 04 '17 at 07:42
  • @user1636349 then you're looking for the "java-stream" tag. Your question will then even attract more attention from people familiar with Java 8 Stream. – Dušan May 04 '17 at 07:59
  • @Dušan yes they (the OP) are; but what about all others? Q&A entries on SO are to serve the whole community, and "stream" has a general meaning outside of Java concerns. I hope someone would restore the tag, still (don't particularly want to edit war). – Will Ness May 04 '17 at 11:01
  • @ThreeFX: now that I have a Java solution (below), I'd be interested to see how you'd do it in C# if you have a solution handy... – user1636349 May 06 '17 at 10:46
  • @user1636349 you could try asking another question with similar text and "C#" and "stream" tags instead of "java" and "java-stream". – Will Ness May 07 '17 at 09:31

5 Answers

12

Sure, it is possible, but greatly complicated by the fact that Java streams have no simple way of being decomposed into their head and their tail (you can easily get either one of these, but not both since the stream will already have been consumed by then - sounds like someone could use linear types...).

The solution is to keep a mutable variable around. For instance, that mutable variable can be a predicate that tests whether a number is a multiple of any prime seen so far.

import java.util.stream.*;
import java.util.function.IntPredicate;

public class Primes {

   static IntPredicate isPrime = x -> true;
   static IntStream primes = IntStream
                               .iterate(2, i -> i + 1)
                               .filter(i -> isPrime.test(i))
                               .peek(i -> isPrime = isPrime.and(v -> v % i != 0));

   public static void main(String[] args) {
      // Print out the first 10 primes.
      primes.limit(10)
            .forEach(p -> System.out.println(p));

   }
}

Then, you get the expected result:

$ javac Primes.java
$ java Primes
2
3
5
7
11
13
17
19
23
29
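For comparison, the head/tail decomposition that Java streams lack can be emulated with a small hand-written lazy list. The following is only a sketch: `LazyIntList` and its methods `from`, `filter` and `sieve` are hypothetical names invented for this illustration, not part of the JDK. The tail is recomputed on every call (no memoization) and the nested filters recurse, so it is suitable for a handful of primes, not for serious use.

```java
import java.util.function.IntPredicate;
import java.util.function.Supplier;

// Hypothetical minimal lazy cons list of ints: a head plus a deferred tail.
final class LazyIntList {
    final int head;
    private final Supplier<LazyIntList> tail;

    LazyIntList(int head, Supplier<LazyIntList> tail) {
        this.head = head;
        this.tail = tail;
    }

    // Forces the tail; not memoized, so each call recomputes it.
    LazyIntList tail() {
        return tail.get();
    }

    // The infinite list n, n+1, n+2, ... (like [n..] in Haskell).
    static LazyIntList from(int n) {
        return new LazyIntList(n, () -> from(n + 1));
    }

    // Skips to the first element that passes, then filters the rest lazily.
    LazyIntList filter(IntPredicate keep) {
        LazyIntList cur = this;
        while (!keep.test(cur.head)) {
            cur = cur.tail();
        }
        final LazyIntList hit = cur;
        return new LazyIntList(hit.head, () -> hit.tail().filter(keep));
    }

    // Direct transcription of: sieve (x:xs) = x : sieve [y | y <- xs, y `mod` x > 0]
    static LazyIntList sieve(LazyIntList xs) {
        int p = xs.head;
        return new LazyIntList(p, () -> sieve(xs.tail().filter(y -> y % p > 0)));
    }

    public static void main(String[] args) {
        LazyIntList primes = sieve(from(2));
        for (int i = 0; i < 10; i++) {
            System.out.println(primes.head);
            primes = primes.tail();
        }
    }
}
```

Like the Haskell original, this is trial division rather than a true sieve. It mainly shows that the missing ingredient is a re-readable head and tail, not laziness as such.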
Alec
  • Close to [this one](http://stackoverflow.com/a/20007272/2711488), though, I’d prefer to implement a true Sieve of Eratosthenes like [shown here](http://stackoverflow.com/a/37282074/2711488)… – Holger May 03 '17 at 17:01
  • 1
    @Holger: OK, but that requires a defined limit for the number of primes to be generated. The Haskell solution processes an infinite list. – user1636349 May 04 '17 at 07:32
  • Thanks for this! I can see I've got a way to go before streams become an automatic part of my Java vocabulary -- but I should expect that, it took ~2 years for OOP to sink in, and that was over 20 years ago -- my brain cells aren't more sprightly now than they were then! – user1636349 May 04 '17 at 07:34
  • @user1636349: that’s an intrinsic limitation of the actual Sieve of Eratosthenes algorithm. On the other hand, this Stream solution will build a linked `Predicate` chain behind the scenes, which eats up far more memory, though the performance degradation will be so big that it will appear to hang before ever reaching the `OutOfMemoryError`. You may run the solution of this answer without a `limit(…)`, to see when it will hang or throw either `StackOverflowError` or `OutOfMemoryError`. Then take the last prime it printed times ten, pass it to my `BitSet` based solution and watch the difference… – Holger May 04 '17 at 09:44
  • @Holger: fair enough, but I asked the question purely out of academic interest, to see if I can learn more about the uses (and limitations) of streams. I can now see more of the limitations, and the solution given is one I would never have thought of myself, even if it is implementationally a monster. I really don't give a flying F about the SoE or efficient ways of finding primes per se, just the relationship of streams to stuff I already know (like Haskell lists) to help me get my head around it all. – user1636349 May 06 '17 at 10:51
  • 1
    @Holger maybe it is possible to augment this algorithm by [*postponement*](http://stackoverflow.com/questions/1764163/explain-this-chunk-of-haskell-code-that-outputs-a-stream-of-primes/8871918#8871918) (also, [this](http://stackoverflow.com/questions/2211990/how-to-implement-an-efficient-infinite-generator-of-prime-numbers-in-python/10733621#10733621) and [this](https://stackoverflow.com/questions/13894417/double-stream-feed-to-prevent-unneeded-memoization/13895347#13895347)), to achieve dramatic speedups. – Will Ness May 07 '17 at 11:37
  • @Holger what that means is, having a limit is not an intrinsic trait of the sieve of Eratosthenes. – Will Ness May 07 '17 at 23:33
  • @Will Ness: did you read [this comment](http://stackoverflow.com/questions/43760641/java-8-streams-and-the-sieve-of-eratosthenes/43763404?noredirect=1#comment74564575_43760641)? Not every algorithm calculating primes is the [Sieve of Eratosthenes](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). The Sieve of Eratosthenes is what I have shown in my answer, not what has been shown in the question. The Sieve needs a size limit, but as explained in the previous comment, these FP solutions only pretend to be limitless while failing long before the `BitSet` based true Sieve hits its limit. – Holger May 08 '17 at 07:13
  • @Holger Yes I did (and didn't need to read the comment to know what it's about). Have you looked into any of the links I provided? the essence of the sieve is captured by equation `primes = [2..] \ [[p*p,p*p+p..] for p in primes]`, and there's no limit inherent in that. Merging the composites streams [via an array, by chunks](http://stackoverflow.com/a/42150321/849891) between successive squares of primes, we get the theoretical complexity of the sieve, too, per prime produced (amortized). Merging by a tree of `union` nodes (or a priority queue) we only pay an additional log factor. – Will Ness May 08 '17 at 08:10
  • @Will Ness: I’m not willing to read tons of links of unfamiliar-to-me Haskell code, when right the first one is full of `mod` or similar operations, which is immediate proof of not being the Sieve of Eratosthenes. There seems to be a habit amongst some developers to call everything that calculates prime numbers the “Sieve of Eratosthenes”. The Sieve of Eratosthenes is the algorithm that allows doing the entire thing with addition only, nothing else. – Holger May 08 '17 at 08:21
  • @Holger the links I gave you contain lots of verbal explanations, and a [44 upvotes Python answer](http://stackoverflow.com/questions/2211990/how-to-implement-an-efficient-infinite-generator-of-prime-numbers-in-python/10733621#10733621), that has no mod operations in it whatsoever. (and do you see any mods in the mathy pseudocode in my comment? I don't) – Will Ness May 08 '17 at 08:52
  • @Will Ness: As said, I’m not willing to read *all of them*, to find the one that might prove your point. But now that you made me do that anyway, I have to admit that `yield` seems to be quite a useful tool. I don’t see any way to do a similar thing in Java (with a reasonable code size). Still, I have my doubts whether this can still be considered the Sieve of Eratosthenes. It’s great, it’s efficient, it deserves its upvotes, but that doesn’t prove it to be the Sieve of Eratosthenes in its original sense. That might have been the starting point though… – Holger May 08 '17 at 09:12
  • @Holger just as you said, the true sieve does its job by (generating the composites of each found prime by) addition, and (my words) finding the primes in between. Which is what all of them do, that I linked to. as for `yield`, generators *are* streams, in a sense. --- as for the "starting point", it was in my [original comment](http://stackoverflow.com/questions/43760641/java-8-streams-and-the-sieve-of-eratosthenes/43763404?noredirect=1#comment74699095_43763404) (though it was the 2nd link there). – Will Ness May 08 '17 at 10:03
  • lastly, I should've stressed that the postponement technique is equally applicable to this (trial division) code and brings its complexity down from *n^2* to about *n^1.5*, in *n* primes produced (as was explained in the 1st link in my original comment). – Will Ness May 08 '17 at 13:49
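The unbounded, addition-only formulation discussed in the comments above (for each prime p, cross off p*p, p*p+p, ...) can be approximated in Java with an iterator that remembers the next composite for each prime. This is a rough sketch under stated assumptions: the class name `IncrementalSieve` and the map-based bookkeeping are choices made for this illustration, loosely following the dictionary-based generator idea from the linked Python answer, and it does not include the postponement refinement (every prime is entered into the map as soon as it is found).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PrimitiveIterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.IntStream;
import java.util.stream.StreamSupport;

// Hypothetical incremental sieve exposed as an unbounded IntStream.
class IncrementalSieve {

    static IntStream primes() {
        PrimitiveIterator.OfInt it = new PrimitiveIterator.OfInt() {
            // Maps an upcoming composite to the prime whose multiple it is.
            private final Map<Long, Long> nextComposite = new HashMap<>();
            private long candidate = 2;

            @Override
            public boolean hasNext() {
                return true; // conceptually infinite; valid up to the int range
            }

            @Override
            public int nextInt() {
                while (true) {
                    long n = candidate++;
                    Long p = nextComposite.remove(n);
                    if (p == null) {
                        // n was never crossed off, so it is prime;
                        // its first interesting composite is n * n.
                        nextComposite.put(n * n, n);
                        return (int) n;
                    }
                    // n is composite: move p on to its next free multiple,
                    // using addition only.
                    long m = n + p;
                    while (nextComposite.containsKey(m)) {
                        m += p;
                    }
                    nextComposite.put(m, p);
                }
            }
        };
        return StreamSupport.intStream(
                Spliterators.spliteratorUnknownSize(
                        it, Spliterator.ORDERED | Spliterator.NONNULL),
                false);
    }

    public static void main(String[] args) {
        primes().limit(10).forEach(System.out::println);
    }
}
```

No `%` appears anywhere: composites are produced by repeated addition and primes are whatever is left in between, which is the property the comment thread argues about.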
3

If you'd accept a Scala solution instead, here it is:

def sieve(nums:Stream[Int]):Stream[Int] = nums.head #:: sieve(nums.filter{_ % nums.head > 0})
val primes:Stream[Int] = sieve(Stream.from(2))

It is not as elegant as the Haskell solution but it comes pretty close IMO. Here is the output:

scala> primes take 10 foreach println
2
3
5
7
11
13
17
19
23
29

Scala's Stream is a lazy list which is far lazier than the Java 8 Stream. In the documentation you can even find an example Fibonacci sequence implementation that corresponds to the canonical Haskell zipWith implementation.

Holger
Dušan
3

EDIT: The sieve, unoptimised, returning an infinite stream of primes

public static Stream<Integer> primeStreamEra() {
    final HashMap<Integer, Integer> seedsFactors =
        new HashMap<Integer, Integer>();
    return IntStream.iterate(2, i -> i + 1) // start at 2: 1 is not a prime
                    .filter(i -> {
                        final int currentNum = i;
                        seedsFactors.entrySet().parallelStream()
                            .forEach(e -> {
                                // Advance each prime's multiple until it is
                                // the closest value that is >= currentNum
                                while(e.getValue() < currentNum)
                                    e.setValue(e.getValue() + e.getKey());
                            });
                        if(!seedsFactors.containsValue(i)) {
                            // No known prime has i as a multiple, so i is prime
                            seedsFactors.put(i, i);
                            return true;
                        }
                        return false;
                    }).boxed();
}

Test:

public static void main(String[] args) {
    primeStreamEra().forEach(i -> System.out.println(i));
}

Initial Post:

A somewhat simpler solution that avoids some unnecessary operations (such as testing even numbers).

We iterate all odd numbers from 3 until the limit.

Within the filter function:

  • We test the current number against every prime found so far that is less than or equal to sqrt(currentNumber), rounded down.
  • If any of them divides the current number, return false.
  • Otherwise, add the number to the list of found primes and return true.

Function:

public static IntStream primeStream(final int limit) {
    final ArrayList<Integer> primes = new ArrayList<Integer>();
    IntStream primesThreeToLimit =  
           IntStream.iterate(3, i -> i + 2)
                    .takeWhile(i -> i <= limit)
                    .filter(i -> {
                        final int testUntil = (int) Math.sqrt((double) i);
                        for(Integer p: primes) {
                            if(i % p == 0) return false;
                            if(p > testUntil) break;
                        }
                        primes.add(i);
                        return true;
                    });
    // 1 is not a prime, so only 2 is prepended to the odd primes
    return IntStream.concat(IntStream.of(2), primesThreeToLimit);
}

Test:

public static void main(String[] args) {
    System.out.println(Arrays.toString(primeStream(50).toArray()));
}

Output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

Edit: To convert from IntStream to Stream<Integer> just do primeStream(50).boxed().

  • I realize this is a more optimized trial division than the OP, but it is still trial division and not a sieve. – DanaJ Mar 27 '19 at 19:10
  • 1
    @DanaJ You are correct. I added a streams implementation of the actual sieve to my answer. – Nobby Nobbs Mar 28 '19 at 10:56
1

As an alternative solution, you can implement the Collector interface.

  public static void main(String[] args)
  {
    Collector<Integer, List<Integer>, List<Integer>> sieve = new Collector<Integer, List<Integer>, List<Integer>>()
    {
      @Override
      public Supplier<List<Integer>> supplier()
      {
        return ArrayList::new;
      }

      @Override
      public BiConsumer<List<Integer>, Integer> accumulator()
      {
        return (prevPrimes, candidate) ->
        {
          if (prevPrimes.stream().noneMatch(p -> candidate % p == 0))
          {
            prevPrimes.add(candidate);
          }
        };
      }

      @Override
      public BinaryOperator<List<Integer>> combiner()
      {
        return (list1, list2) ->
        {
          list1.addAll(list2);
          return list1;
        };
      }

      @Override
      public Function<List<Integer>, List<Integer>> finisher()
      {
        return Function.identity();
      }

      @Override
      public Set<Characteristics> characteristics()
      {
        Set<Characteristics> set = new HashSet<>();
        set.add(Characteristics.IDENTITY_FINISH);
        return set;
      }
    };

    List<Integer> primesBelow1000 = IntStream.range(2, 1000)
        .boxed()
        .collect(sieve);

    primesBelow1000.forEach(System.out::println);
  }

More concisely:

  public static void main(String[] args)
  {

    List<Integer> primesBelow1000 = IntStream.range(2, 1000)
        .boxed()
        .collect(
            ArrayList::new,
            (primes, candidate) ->
            {
              if (primes.stream().noneMatch(p -> candidate % p == 0))
              {
                primes.add(candidate);
              }
            },
            List::addAll
        );

    primesBelow1000.forEach(System.out::println);
  }

More efficient (using Java 9's takeWhile so that each candidate is tested only against the primes up to its square root, rather than against every prime found so far):

List<Long> primesBelowLimit = LongStream.range(2, below)
        .collect(
                ArrayList::new,
                (primes, candidate) ->
                {
                    long candidateRoot = (long) Math.sqrt(candidate);
                    if (primes.stream()
                        .takeWhile(p -> p <= candidateRoot)
                        .noneMatch(p -> candidate % p == 0))
                    {
                        primes.add(candidate);
                    }
                },
                List::addAll
        );
wilmol
0

Maybe a solution like this?

import java.util.BitSet;
import java.util.function.IntFunction;

public class ErythropheneSieveFunctionBitSet implements IntFunction<BitSet> {

    @Override
    public BitSet apply(int value) {
        BitSet set = new BitSet();
        fillSet(set, value);

        int s = set.stream().min().getAsInt();
        while (s * s <= value) {
            int temp = s;
            int i = 0;
            int multipleTemp;
            while ((multipleTemp = s * (s + i)) <= value) {
                set.clear(multipleTemp);
                i++;
            }
            s = set.stream().filter(x -> x > temp).min().getAsInt();
        }

        return set;
    }

    private void fillSet(BitSet set, int value) {
        for (int i = 2; i < value; i++) {
            set.set(i);
        }
    }
}
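A bounded sieve in the same spirit can also hand its result back as an `IntStream`. The following is a minimal sketch, assuming primes strictly below `limit` are wanted; the class name `BitSetSieve` and the method `primesBelow` are hypothetical names chosen for this illustration. It relies on `BitSet.nextSetBit` to find the next surviving candidate and crosses off composites by addition only.

```java
import java.util.BitSet;
import java.util.stream.IntStream;

// Hypothetical bounded sieve that returns its survivors as an IntStream.
class BitSetSieve {

    // Returns all primes strictly below limit, in ascending order.
    static IntStream primesBelow(int limit) {
        BitSet candidates = new BitSet(Math.max(limit, 0));
        if (limit > 2) {
            candidates.set(2, limit); // marks 2 .. limit-1 as candidates
        }
        // p walks over the surviving candidates; long arithmetic avoids overflow.
        for (int p = 2; p >= 0 && (long) p * p < limit;
                 p = candidates.nextSetBit(p + 1)) {
            // Cross off p*p, p*p+p, p*p+2p, ... by repeated addition.
            for (long m = (long) p * p; m < limit; m += p) {
                candidates.clear((int) m);
            }
        }
        return candidates.stream(); // indices of the bits still set
    }

    public static void main(String[] args) {
        primesBelow(50).forEach(System.out::println);
    }
}
```

Calling `primesBelow(50)` yields the fifteen primes from 2 to 47. Unlike the stream-of-filters approaches, memory use is fixed up front by `limit`.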