
Looking at the source code of java.util.AbstractCollection.toArray(), I see that it is implemented like this:

public Object[] toArray() {
    // Estimate size of array; be prepared to see more or fewer elements
    Object[] r = new Object[size()];
    Iterator<E> it = iterator();
    for (int i = 0; i < r.length; i++) {
        if (! it.hasNext()) // fewer elements than expected
            return Arrays.copyOf(r, i);
        r[i] = it.next();
    }
    return it.hasNext() ? finishToArray(r, it) : r;
}

private static <T> T[] finishToArray(T[] r, Iterator<?> it) {
    int i = r.length;
    while (it.hasNext()) {
        int cap = r.length;
        if (i == cap) {
            int newCap = cap + (cap >> 1) + 1;
            // overflow-conscious code
            if (newCap - MAX_ARRAY_SIZE > 0)
                newCap = hugeCapacity(cap + 1);
            r = Arrays.copyOf(r, newCap);
        }
        r[i++] = (T)it.next();
    }
    // trim if overallocated
    return (i == r.length) ? r : Arrays.copyOf(r, i);
}

As you can see, the implementation is not so easy to understand. My questions are:

  1. What will I get when the collection's elements change (but its size does not) during iteration? I guess the iterator may be some kind of snapshot.
  2. What will I get when the collection's size changes? I wonder if it can still work correctly.
Neil Masson
scugxl

4 Answers


As you can see, the implementation is not so easy to understand. My questions are:

  1. What will I get when the collection's elements change (but its size does not) during iteration? I guess the iterator may be some kind of snapshot.
  2. What will I get when the collection's size changes? I wonder if it can still work correctly.

The implementation is the way it is because it's intended to handle the case where the iterator returns a different number of elements than size(). This can occur if the collection's size changes during the iteration. The destination array is allocated based on size(), and in the optimistic case where the size doesn't change, it's pretty straightforward. The complexity of the code comes in where the actual number of elements returned by the iterator differs from the initial value returned by size(). If the actual number of elements is smaller, the elements are copied into a smaller array of the right size. If the actual number is bigger, the elements are copied into a larger array, and then more elements are iterated. The array is repeatedly reallocated larger if it fills up, until the iteration completes.
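
To make the "fewer elements than expected" branch concrete, here is a minimal, hypothetical sketch (the class name LyingSizeDemo and the over-reporting collection are invented for illustration): a subclass of AbstractCollection whose size() reports more elements than its iterator actually delivers, so toArray() ends up trimming the result with Arrays.copyOf(r, i).

import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo: size() over-reports, so AbstractCollection.toArray()
// runs out of elements early and trims the destination array.
public class LyingSizeDemo {
    public static void main(String[] args) {
        AbstractCollection<String> c = new AbstractCollection<String>() {
            private final List<String> data = Arrays.asList("a", "b", "c");

            @Override
            public Iterator<String> iterator() {
                return data.iterator(); // only ever yields 3 elements
            }

            @Override
            public int size() {
                return 5; // a stale, too-large estimate
            }
        };

        Object[] array = c.toArray();
        System.out.println(array.length);           // 3, not 5
        System.out.println(Arrays.toString(array)); // [a, b, c]
    }
}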

To your first question, the iterator doesn't necessarily take a snapshot of the elements. It depends on the actual collection implementation. Some collections (such as CopyOnWriteArrayList) do have snapshot semantics, so if the collection is modified, the modification won't be visible to the iterator. In this case the number of elements reported by the iterator will match size(), so no array reallocation is necessary.
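
A small sketch of those snapshot semantics (the class name SnapshotDemo is invented): an element added after the iterator is created is not seen by that iterator, even though the live list already reports the new size.

import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

// CopyOnWriteArrayList iterators traverse a snapshot taken at creation time.
public class SnapshotDemo {
    public static void main(String[] args) {
        CopyOnWriteArrayList<Integer> list = new CopyOnWriteArrayList<>();
        list.add(1);
        list.add(2);

        Iterator<Integer> it = list.iterator(); // snapshot taken here
        list.add(3);                            // not visible to 'it'

        int count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        System.out.println(count);       // 2 (the snapshot)
        System.out.println(list.size()); // 3 (the live list)
    }
}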

Other collection implementations have different policies for what happens if the collection is modified during iteration. Some are fail-fast which means they'll throw ConcurrentModificationException. Others are weakly consistent which means that modifications might or might not be visible to the iterator.
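
For instance, ArrayList's iterator is fail-fast: a structural modification made during iteration is usually detected and rejected. A minimal sketch (FailFastDemo is an invented name; fail-fast detection is best-effort, not guaranteed):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Removing an element directly from the list while iterating trips the
// fail-fast check on the next call to next().
public class FailFastDemo {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : list) {
                if (s.equals("a")) {
                    list.remove(s); // structural modification during iteration
                }
            }
        } catch (ConcurrentModificationException e) {
            System.out.println("fail-fast iterator detected the modification");
        }
    }
}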

This applies to your second question. If the collection's size changes during iteration, and if that collection's iterator supports this (i.e., it's not fail-fast), the code here will handle a different number of elements coming out of the iterator than was initially reported by size().

An example where this can occur is with ConcurrentSkipListSet. This class's iterator is weakly consistent, and it inherits the toArray() method from AbstractCollection. Thus, while toArray() is iterating the set in order to gather the elements into the destination array, it's entirely legal for another thread to modify the set, possibly changing its size. This can clearly cause the iterator to report a different number of elements from the initial value returned by size(), which will cause the array reallocation code in toArray() to be executed.
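
A sketch of that scenario (the class name and element counts are invented; the printed numbers will vary from run to run): one thread calls toArray(), inherited from AbstractCollection, on a ConcurrentSkipListSet while another thread keeps adding elements, and no exception is thrown.

import java.util.concurrent.ConcurrentSkipListSet;

// Concurrent growth during toArray(): the weakly consistent iterator may see
// some, all, or none of the writer's additions, so the array length can end
// up anywhere between the initial and the final size.
public class WeaklyConsistentToArrayDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentSkipListSet<Integer> set = new ConcurrentSkipListSet<>();
        for (int i = 0; i < 100_000; i++) {
            set.add(i);
        }

        Thread writer = new Thread(() -> {
            for (int i = 100_000; i < 200_000; i++) {
                set.add(i); // grows the set while toArray() runs
            }
        });
        writer.start();

        Object[] snapshot = set.toArray(); // may trigger finishToArray() internally
        writer.join();

        System.out.println("array length: " + snapshot.length);
        System.out.println("final size:   " + set.size());
    }
}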

Stuart Marks

What will I get when the collection's size changes?

  • If the collection's size is less than expected, the array is "reduced" with return Arrays.copyOf(r, i) in the toArray() method, as the comment indicates.
  • If the collection's size is more than expected, the it.hasNext() ? finishToArray(r, it) : r call handles the case. The finishToArray method keeps adding elements to the array and "expands" it when needed: a new capacity is computed (newCap = cap + (cap >> 1) + 1) and the array is "expanded" with r = Arrays.copyOf(r, newCap) (see the sketch below).
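
A hypothetical sketch of that "expand" path (UnderReportingSizeDemo is an invented name): a collection whose size() under-reports forces toArray() to fall through to finishToArray() and grow the array until the iterator is exhausted.

import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo: size() under-reports, so toArray() starts with a
// too-small array and finishToArray() has to enlarge it.
public class UnderReportingSizeDemo {
    public static void main(String[] args) {
        AbstractCollection<Integer> c = new AbstractCollection<Integer>() {
            private final List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);

            @Override
            public Iterator<Integer> iterator() {
                return data.iterator(); // yields 5 elements
            }

            @Override
            public int size() {
                return 2; // a stale, too-small estimate
            }
        };

        System.out.println(Arrays.toString(c.toArray())); // [1, 2, 3, 4, 5]
    }
}
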
Gaël J
  • Are you sure there won't be a `ConcurrentModificationException`? – Marvin Jan 07 '16 at 15:16
  • I guess you're right: if the size changes after iteration has started, you'll get an exception. Is that it? I'm interested in the answer :) – Gaël J Jan 07 '16 at 15:19
  • @Marvin I thought about this as well, but the `ConcurrentModificationException` may not be thrown in all cases (and is not guaranteed to be thrown at all). From a quick glance, the code seems like it tries to cope with this fact, but I'd have to digest it further to make a more profound answer. – Marco13 Jan 07 '16 at 15:44

I don't think all the Collection implementations are thread-safe. Instead of worrying, you can make your Collection synchronized using:

Collections.synchronizedCollection(myCollection);
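
A minimal usage sketch under the assumption that myCollection is a plain, unsynchronized collection (the class name is invented): single operations on the wrapper are synchronized for you, but manual iteration over it must still be synchronized on the wrapper, as the Collections.synchronizedCollection documentation points out.

import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

// Wrapping a collection so that individual operations are synchronized.
public class SynchronizedWrapperDemo {
    public static void main(String[] args) {
        Collection<String> myCollection = new ArrayList<>();
        Collection<String> sync = Collections.synchronizedCollection(myCollection);

        sync.add("a"); // individually thread-safe operations
        sync.add("b");

        synchronized (sync) {          // still required while iterating
            for (String s : sync) {
                System.out.println(s);
            }
        }
    }
}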

or you can take a look at:

https://docs.oracle.com/javase/tutorial/essential/concurrency/collections.html

Edit: Here I found a nice explanation

HRgiger

You can only be sure that the result of the iteration is undefined (unless you know the exact implementation of the collection in use). Usually a ConcurrentModificationException will be thrown, but you can't rely on that assumption.

If a Collection is modified while iterating over it, most implementations throw a ConcurrentModificationException. Iterators that do so are known as fail-fast iterators.

But this depends on each implementation: although all the general-purpose collection implementations provided by the JRE are fail-fast, not all Iterators are. Also note that fail-fast behavior cannot be guaranteed, as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification.

Why is toArray implemented like this in Java?

Because this implementation assumes that the size of the collection can change at any time, and that the iterator may not throw any exception. Therefore, the method is prepared for the iterator to provide more or fewer elements than the initially estimated size.

Tobías