2

I have a Stream of data and I want to throw everything away except for the final n elements. If the input doesn't have enough elements then the resulting stream is padded with None. This is what I came up with:

def lastN[T](in: Stream[T], len: Int): Stream[Option[T]] =
  in.foldLeft(Vector.fill[Option[T]](len)(None))(_.tail :+ Option(_)).to[Stream]

I chose Vector for the internal buffer because of its tail and append performance characteristics.

This all works fine. Perhaps there's a better way? [Note: There's always a better way.]

But suppose Iterator is a more appropriate representation of the input data? No problem, just replace the 3 mentions of Stream with Iterator and it all works.

OK, so why not either/both?

I was hoping I could do something like this:

import scala.language.higherKinds
def lastN[T, C[U] <: TraversableOnce[U] ](in: C[T], len: Int): C[Option[T]] =
  in.foldLeft(Vector.fill[Option[T]](len)(None))(_.tail :+ Option(_)).to[C] 

Alas, no go.

error: Cannot construct a collection of type C[Option[T]] with elements of type Option[T] based on a collection of type Nothing.

I've tried futzing with CanBuildFrom directly but I'm just not coming up with the magic formula.

jwvh
  • 46,953
  • 7
  • 29
  • 59

2 Answers2

1

I think it's more natural to use Queue as an internal buffer. It's more semantically suitable for this sort of processing and scala.collection.immutable.Queue is implemented with two Lists and may actually be more efficient than Vector (you'd have to make a measurement to find out if that's the case of course). Otherwise the API stays completely the same: you can just replace the mention of Vector with Queue.

As for CanBuildFrom, it's used in your code to call the to method. You can consult its full signature to find out what CanBuildFrom you'd have to request:

def to[Col[_]](implicit cbf: CanBuildFrom[Nothing, A, Col[A]]): Col[A] 

So, you would need CanBuildFrom[Nothing, Option[T], C[Option[T]]].

Putting it all together a possible implementation looks like this:

import scala.collection.generic.CanBuildFrom
import scala.collection.immutable.Queue

def lastN[T, C[U] <: TraversableOnce[U]](in: C[T], len: Int)(
  implicit cbf: CanBuildFrom[Nothing, Option[T], C[Option[T]]]
): C[Option[T]] =
  in.foldLeft(Queue.fill(len)(None: Option[T]))(_.tail :+ Option(_)).to[C]

As for your comment, the compiler knows that to call to it needs CanBuildFrom[Nothing, Option[T], C[Option[T]]], but it can't automatically find an implicit argument with abstract types.

But if you put a request CanBuildFrom[Nothing, Option[T], C[Option[T]]] in the lastN signature, then when you call for example lastN(Vector(1,2,3), 2), the compiler knows that C is Vector, and T is Int, so it has to pass a CanBuildFrom[Nothing, Option[Int], Vector[Option[Int]]].

Here all types are concrete and the compiler can find a relevant instance of CanBuildFrom using the usual implicit lookup rules. I believe, it will find one in the companion object of Vector in this example.

Community
  • 1
  • 1
Kolmar
  • 13,241
  • 1
  • 19
  • 24
  • this shows no improvement over the original version (still fails compilation, with the same error). – Dima Mar 17 '16 at 21:28
  • Indeed, I am. Don't know what I was smoking :( Please, edit your answer, so I can retract my downvote – Dima Mar 17 '16 at 23:24
  • @Kolmar, many thanks. Exactly what I was looking for, but I can't say I completely understand it. The compiler knows what needs to be built (see error text) but the implicit CanBuildFrom that `to` pulls down is insufficient (the signature looks good to me). So if I pull in an implicit CanBuildFrom with the exact signature then `to` can use it? Back to the books for me. – jwvh Mar 18 '16 at 00:30
  • @jwvh Added a little explanation for your question. I hope it would be helpful. – Kolmar Mar 18 '16 at 10:06
1

If you want to take the last N elements of an Iterable, you can use the takeRight function. This should work for any collection that inherits from Iterable, so it will work for Stream. Unfortunately, it has been pointed out that this does not work for Iterator.

def lastN[T](in: Iterable[T], n: Int) = in.takeRight(n)

Traversable, the superclass of Iterator, does have a toIterable function you can use though. If you really want to make it as generic as possible, you can try:

def lastN[T](in: Traversable[T], n: Int) = in.toIterable.takeRight(n)
Ethan
  • 801
  • 4
  • 11
  • Beautifully simple and straight forward. Unfortunately `Iterator` doesn't have the `takeRight` method. – jwvh Mar 17 '16 at 22:37
  • Thanks for catching that, I don't know why I thought `Iterator` inherited from `Iterable`. – Ethan Mar 17 '16 at 22:39