65

Say I have a Map<? extends Object, List<String>>

I can get the values of the map easily enough, and iterate over it to produce a single List<String>.

    for (List<String> list : someMap.values()) {
        someList.addAll(list);
    }

Is there a way to flatten it in one shot?

  List<String> someList = someMap.values().flatten();
max.kuzmentsov
Tony Ennis
  • What's wrong with using a loop? – Josh M Aug 17 '13 at 16:28
  • 4
    @JoshM Nothing at all. But if I can use something built-in, I should. I usually know the answers to these types of questions but this time I don't, so I thought I'd ask. – Tony Ennis Aug 17 '13 at 16:36

9 Answers

78

With Java 8, the suggested (and accepted) solution requires you to instantiate the List yourself:

someMap.values().forEach(someList::addAll);

You could do it all by streaming with this statement:

List<String> someList = someMap.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
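
As a minimal, self-contained sketch of the streaming statement above (the class name `FlatMapExample` and the method name `flattenValues` are made up for illustration; `Collection::stream` is just the method-reference form of `c -> c.stream()`):

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FlatMapExample {
    // Hypothetical helper: flattens all value lists of the map into one new list.
    static List<String> flattenValues(Map<?, List<String>> someMap) {
        return someMap.values().stream()
                .flatMap(Collection::stream)   // each List<String> becomes a Stream<String>
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // LinkedHashMap is used only so that the output order is predictable.
        Map<String, List<String>> someMap = new LinkedHashMap<>();
        someMap.put("test", Arrays.asList("1", "2"));
        someMap.put("test2", Arrays.asList("10", "20"));
        System.out.println(flattenValues(someMap)); // [1, 2, 10, 20]
    }
}
```

With a plain HashMap the elements of each list stay together, but the order in which the lists appear depends on the map's iteration order.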

Interestingly, on Java 8 the accepted version does seem to be the fastest. Its timing is about the same as that of a

for (List<String> item : someMap.values()) ...

and is much faster than the pure streaming solution. Here is my little test code. I deliberately don't call it a benchmark, to avoid the resulting discussion of benchmark flaws. ;) I run every test twice so that the second pass hopefully measures fully compiled code.

    Map<String, List<String>> map = new HashMap<>();
    long millis;

    map.put("test", Arrays.asList("1", "2", "3", "4"));
    map.put("test2", Arrays.asList("10", "20", "30", "40"));
    map.put("test3", Arrays.asList("100", "200", "300", "400"));

    int maxcounter = 1000000;
    
    System.out.println("1 stream flatmap");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> someList = map.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
    }
    System.out.println(System.currentTimeMillis() - millis);
    
    System.out.println("1 parallel stream flatmap");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> someList = map.values().parallelStream().flatMap(c -> c.stream()).collect(Collectors.toList());
    }
    System.out.println(System.currentTimeMillis() - millis);

    System.out.println("1 foreach");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> mylist = new ArrayList<String>();
        map.values().forEach(mylist::addAll);
    }
    System.out.println(System.currentTimeMillis() - millis);        

    System.out.println("1 for");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> mylist = new ArrayList<String>();
        for (List<String> item : map.values()) {
            mylist.addAll(item);
        }
    }
    System.out.println(System.currentTimeMillis() - millis);
    
    
    System.out.println("2 stream flatmap");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> someList = map.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
    }
    System.out.println(System.currentTimeMillis() - millis);
    
    System.out.println("2 parallel stream flatmap");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> someList = map.values().parallelStream().flatMap(c -> c.stream()).collect(Collectors.toList());
    }
    System.out.println(System.currentTimeMillis() - millis);
    
    System.out.println("2 foreach");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> mylist = new ArrayList<String>();
        map.values().forEach(mylist::addAll);
    }
    System.out.println(System.currentTimeMillis() - millis);        

    System.out.println("2 for");
    millis = System.currentTimeMillis();
    for (int i = 0; i < maxcounter; i++) {
        List<String> mylist = new ArrayList<String>();
        for (List<String> item : map.values()) {
            mylist.addAll(item);
        }
    }
    System.out.println(System.currentTimeMillis() - millis);

And here are the results:

1 stream flatmap
468
1 parallel stream flatmap
1529
1 foreach
140
1 for
172
2 stream flatmap
296
2 parallel stream flatmap
1482
2 foreach
156
2 for
141

Edit 2016-05-24 (two years after):

Running the same test with a current Java 8 version (U92) on the same machine:

1 stream flatmap
313
1 parallel stream flatmap
3257
1 foreach
109
1 for
141
2 stream flatmap
219
2 parallel stream flatmap
3830
2 foreach
125
2 for
140

It seems that there is a speedup for sequential processing of streams and an even larger overhead for parallel streams.

Edit 2018-10-18 (four years after):

Now using Java 10 (10.0.2) on the same machine:

1 stream flatmap
393
1 parallel stream flatmap
3683
1 foreach
157
1 for
175
2 stream flatmap
243
2 parallel stream flatmap
5945
2 foreach
128
2 for
187

The overhead for parallel streaming seems to be larger.

Edit 2020-05-22 (six years after):

Now using Java 14 (14.0.0.36) on a different machine:

1 stream flatmap
299
1 parallel stream flatmap
3209
1 foreach
202
1 for
170
2 stream flatmap
178
2 parallel stream flatmap
3270
2 foreach
138
2 for
167

It should be noted that this was run on a different machine (though, I think, a comparable one). The parallel streaming overhead seems to be considerably smaller than before.

wumpz
  • 8
    While actually longer by a few characters, writing `flatMap(Collections::stream)` might be preferable in style to `flatMap(c -> c.stream())`. – Ian Robertson Sep 11 '14 at 19:20
  • 4
    It's `Collection::stream`, using `Collections` won't do in my test. – BAERUS Mar 29 '16 at 10:44
  • 1
    Which one is faster probably also depends on your input data. I wouldn't be surprised if the stream version were faster when the input is many small lists. If it is clever enough, it has a chance to allocate memory for the whole result at once, while the forEach version will have to reallocate a few times. – danadam Nov 23 '16 at 02:20
  • Sure, but this answer is more or less an overview of the overhead you have to expect and deal with. – wumpz May 22 '20 at 05:29
  • I think the results here for the parallelStream are a little misleading. At a certain collection size, the benefit of parallel processing will outweigh the overhead needed for creating multiple threads, and delegating actions to them. The map you are testing with doesn't even come close to that threshold. In a real world scenario where the map could have thousands of value collections, each with thousands of their own elements, the benefit of the parallelStream comes into play. – Nicholas Leach Apr 25 '21 at 18:06
  • You are right. However, it was this overhead I was interested in. Sure, it will outperform the non-parallel version if the list is large enough. – wumpz Apr 26 '21 at 08:58
59

If you are using Java 8, you could do something like this:

someMap.values().forEach(someList::addAll);
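
The one-liner above assumes that `someList` already exists. A minimal sketch of the whole thing, with a made-up class name `ForEachExample` and helper name `flattenValues`:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ForEachExample {
    // Hypothetical helper: creates the target list, then lets forEach fill it.
    static List<String> flattenValues(Map<?, List<String>> someMap) {
        List<String> someList = new ArrayList<>();
        // Iterable.forEach calls someList.addAll(list) once per value list.
        someMap.values().forEach(someList::addAll);
        return someList;
    }
}
```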
Josh M
  • 6
    If I am not wrong, this is actually not recommended. Please see the side-effects section of https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html > Side-effects in behavioral parameters to stream operations are, in general, discouraged, as they can often lead to unwitting violations of the statelessness requirement, as well as other thread-safety hazards. So in this case it is better to use `Collectors.toList()` – Anton Balaniuc Mar 14 '17 at 15:28
  • 1
    @AntonBalaniuc you are quoting the `stream` documentation, but this is not used here. `list.stream.forEach` != `list.forEach` – Eugene Jan 14 '19 at 14:53
37

When searching for "java 8 flatten", this is the only mention. And it is not about flattening a stream either. So, for the greater good, I will just leave this here:

.flatMap(Collection::stream)

I'm also surprised no one has given a concurrent Java 8 answer to the original question, which is:

.collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll);
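
A minimal sketch of how that three-argument collect could be applied to the map from the question (the class name `CollectExample` and helper name `flattenValues` are made up). The result is assigned to an `ArrayList<String>` local so that the `ArrayList::addAll` method references resolve against a concrete receiver type:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class CollectExample {
    // Hypothetical helper: mutable reduction into a single ArrayList.
    static List<String> flattenValues(Map<?, List<String>> someMap) {
        ArrayList<String> result = someMap.values().stream()
                .collect(ArrayList::new,      // supplier: a fresh result container
                         ArrayList::addAll,   // accumulator: append one value list
                         ArrayList::addAll);  // combiner: merge two partial results
        return result;
    }
}
```

The combiner only comes into play when the stream is parallel; each partial result is then built in its own container and merged afterwards.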
user2418306
  • 3
    I believe that `.collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll);` is the correct answer. `flatMap()` is not useful in this situation. `flatMap()` can be useful if you need to call another method on the argument before obtaining a stream (i.e., calling the `stream()` method). However, here, we already have a reference to an object for which we may directly retrieve a stream. – Raffi Khatchadourian Jul 24 '15 at 15:36
8

Suggested by a colleague:

listOfLists.stream().flatMap(e -> e.stream()).collect(Collectors.toList())

I like it better than forEach().

orbfish
7

If you're using Eclipse Collections, you can use Iterate.flatten().

MutableMap<String, MutableList<String>> map = Maps.mutable.empty();
map.put("Even", Lists.mutable.with("0", "2", "4"));
map.put("Odd", Lists.mutable.with("1", "3", "5"));
MutableList<String> flattened = Iterate.flatten(map, Lists.mutable.empty());
Assert.assertEquals(
    Lists.immutable.with("0", "1", "2", "3", "4", "5"),
    flattened.toSortedList());

flatten() is a special case of the more general RichIterable.flatCollect().

MutableList<String> flattened = 
    map.flatCollect(x -> x, Lists.mutable.empty());

Note: I am a committer for Eclipse Collections.

Donald Raab
Craig P. Motlin
5

No, there is no shorter method. You have to use a loop.

Update Apr 2014: Java 8 has finally come out. In the new version you can use the Iterable.forEach method to walk over a collection without using an explicit loop.

Update Nov 2017: Found this question by chance when looking for a modern solution. Ended up going with reduce:

someMap.values().stream().reduce(new ArrayList<>(), (accum, list) -> {
    accum.addAll(list);
    return accum;
});

This avoids both depending on the mutable external state of forEach(someList::addAll) and the overhead of flatMap(List::stream).
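
A runnable sketch of the reduce variant, wrapped in a made-up helper `flattenValues` inside a made-up class `ReduceExample`. One assumption to state loudly: because the accumulator mutates and returns the identity list, this sketch is only meant for sequential streams; with `parallelStream()` the same list instance would be shared between partial results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class ReduceExample {
    // Hypothetical helper: sequential reduce into one accumulating list.
    static List<String> flattenValues(Map<?, List<String>> someMap) {
        return someMap.values().stream()      // sequential on purpose, see note above
                .reduce(new ArrayList<>(), (accum, list) -> {
                    accum.addAll(list);       // mutates the identity list in place
                    return accum;
                });
    }
}
```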

Joni
0

If you just want to iterate through values, you can avoid all these addAll methods.

All you have to do is write a class that encapsulates your Map and implements Iterator:

import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

public class ListMap<K,V> implements Iterator<V>
{
  private final Map<K,List<V>> _map;
  private Iterator<Map.Entry<K,List<V>>> _it1 = null;
  private Iterator<V> _it2 = null;

  public ListMap(Map<K,List<V>> map)
  {
    _map = map;
    _it1 = map.entrySet().iterator();
    nextList();
  }

  public boolean hasNext()
  {
    return _it2!=null && _it2.hasNext();
  }

  public V next()
  {
    if(_it2==null || !_it2.hasNext())
    {
      throw new NoSuchElementException();
    }
    V value = _it2.next();
    // Move on to the next non-empty list once the current one is exhausted.
    nextList();
    return value;
  }

  public void remove()
  {
    throw new UnsupportedOperationException();
  }

  // Points _it2 at the iterator of the next non-empty list, if there is one.
  private void nextList()
  {
    while(_it1.hasNext() && (_it2==null || !_it2.hasNext()))
    {
      _it2 = _it1.next().getValue().iterator();
    }
  }
}
David
0

A nice solution for the subcase of a Map of Maps is to store, if possible, the data in Guava's Table.

https://github.com/google/guava/wiki/NewCollectionTypesExplained#table

So, for instance, a Map<String,Map<String,String>> is replaced by a Table<String,String,String>, which is already flattened. In fact, the docs say that HashBasedTable, Table's hash implementation, is essentially backed by a HashMap<R, HashMap<C, V>>.

Guy Grin
0

Flatten on a function:

    private <A, T> List<T> flatten(List<A> list, Function<A, List<T>> flattenFn) {
        return list
                .stream()
                .map(flattenFn)
                .flatMap(Collection::stream)
                .collect(Collectors.toUnmodifiableList());
    }
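
A usage sketch for the helper above, assuming Java 10 or later (for `Collectors.toUnmodifiableList()`). The `Team` class and its `getMembers()` accessor are hypothetical and exist only to give the function something to extract a list from:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class FlattenByFunctionExample {
    // Hypothetical element type, used only for illustration.
    static class Team {
        private final List<String> members;
        Team(String... members) { this.members = Arrays.asList(members); }
        List<String> getMembers() { return members; }
    }

    // Same shape as the helper above: extract a list from each element, then flatten.
    static <A, T> List<T> flatten(List<A> list, Function<A, List<T>> flattenFn) {
        return list.stream()
                .map(flattenFn)
                .flatMap(Collection::stream)
                .collect(Collectors.toUnmodifiableList());
    }

    public static void main(String[] args) {
        List<Team> teams = Arrays.asList(new Team("ann", "bob"), new Team("cy"));
        System.out.println(flatten(teams, Team::getMembers)); // [ann, bob, cy]
    }
}
```

The returned list is unmodifiable, so callers that need to add to it would have to copy it first.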
Leo Duarte