
I have looked at some unofficial references and want to confirm here that my understanding is correct. Suppose we are adding new (unique) elements from time to time (a minimal sketch of this insertion pattern follows the list below):

  1. ArrayList<T> will reallocate memory, since its backing array must be contiguous: when newly inserted elements push the size past the current capacity, a larger contiguous block is allocated and the existing elements are copied into it;
  2. HashSet<T> and HashMap<K,V> have no such issue, since their entries are not required to live in contiguous memory?
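
For concreteness, here is a minimal sketch of the insertion pattern I have in mind (the class name InsertOverTime and the element count are just for illustration):

    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class InsertOverTime {
        public static void main(String[] args) {
            // Unique elements are added one at a time over time; the question is
            // what reallocation work each structure does internally as it fills up.
            List<String> list = new ArrayList<>();
            Set<String> set = new HashSet<>();
            for (int i = 0; i < 1_000_000; i++) {
                list.add("item-" + i); // point 1: may trigger a grow-and-copy of the backing array
                set.add("item-" + i);  // point 2: may trigger a rehash when the load threshold is exceeded
            }
            System.out.println(list.size() + " elements in each");
        }
    }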

BTW, if there are any good articles on these areas, I would appreciate a reference as well.

regards, Lin

Lin Ma
  • http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/ArrayList.java#ArrayList.ensureCapacity%28int%29 seems to prove point 1 – Mc Kevin Nov 10 '16 at 06:47
  • Read [the javadoc](https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html). It says: *When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets* – JB Nizet Nov 10 '16 at 07:06
  • @McKevin, thanks and upvoted; how about the 2nd item of my question? – Lin Ma Nov 10 '16 at 07:09
  • @JBNizet, how about `HashSet` and `HashMap`? – Lin Ma Nov 10 '16 at 07:11
  • @LinMa The quote in my comment is an extract from the javadoc of HashMap. The link in my comment points to the documentation of HashMap. HashSet is implemented using a HashMap. You really need to take some time to read, carefully. – JB Nizet Nov 10 '16 at 07:14
  • @JBNizet, thanks. I want to find a data structure in Java that does not have this kind of heavy operation (either reallocating space like ArrayList, or rehashing like Hashtable) when new elements are continually added. Is there such a data structure, or any setting in Java that could avoid it? I ask because my use case involves continually adding a large number of elements. – Lin Ma Nov 10 '16 at 07:16
  • LinkedList is such a data structure. TreeSet and TreeMap, too. But you're overthinking things. Not reallocating doesn't mean that it's faster. And a Map and a List serve completely different purposes. If you need a List, use a List. If you need a Set, use a Set. If you need a Map, use a Map. Except for very specific use-cases, an ArrayList is faster (and uses less memory) than LinkedList. – JB Nizet Nov 10 '16 at 07:20
  • @JBNizet, thanks for the advice. I read some other blogs and discussions; sometimes I see that even when memory reallocation happens, the `amortized performance` of ArrayList's add is O(1). Any thoughts on what `amortized` performance means? (A sketch illustrating this follows these comments.) – Lin Ma Nov 10 '16 at 19:41
  • http://stackoverflow.com/questions/200384/constant-amortized-time – JB Nizet Nov 10 '16 at 20:08
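
As a rough illustration of the amortized point discussed in the comments above, here is a sketch of my own (not from the thread): it replays ArrayList's 1.5x growth rule and counts how many element copies the reallocations cost in total over n appends. The total stays within a small constant factor of n, which is why the per-append cost is O(1) amortized.

    public class AmortizedCopies {
        public static void main(String[] args) {
            long copies = 0;   // total elements copied by all reallocations
            int capacity = 10; // ArrayList's default capacity
            int n = 1_000_000;

            for (int size = 0; size < n; size++) {
                if (size == capacity) {                    // backing array is full
                    copies += size;                        // grow() copies all existing elements
                    capacity = capacity + (capacity >> 1); // 1.5x growth, as in the JDK source
                }
            }
            System.out.printf("n=%d, total copies=%d, copies per add=%.2f%n",
                    n, copies, (double) copies / n);
        }
    }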

1 Answer


If you check out the source code of the add(E e) method of ArrayList in Java 8 (JRE 1.8.0_71), it calls a method named ensureCapacityInternal(int minCapacity); i.e. this method is called every time you add an object to the ArrayList. This in turn calls a series of methods, and finally, if the backing array is too small to hold the newly added element, a method called grow(int minCapacity) is invoked. This method is shown below:

    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

This allocates a new array roughly 1.5 times the size of the old one and copies all elements from the old array into it. That confirms your point no. 1.
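
To see the effect of that 1.5x factor without digging through the JDK, here is a small sketch of my own (not JDK code) that replays the same formula starting from the default capacity of 10, and shows how pre-sizing with the public ArrayList(int initialCapacity) constructor avoids the intermediate reallocations entirely when the final size is known up front:

    import java.util.ArrayList;
    import java.util.List;

    public class GrowthDemo {
        public static void main(String[] args) {
            // Replay the growth formula: newCapacity = oldCapacity + (oldCapacity >> 1)
            int capacity = 10; // default initial capacity
            StringBuilder sequence = new StringBuilder("capacity: " + capacity);
            while (capacity < 1_000) {
                capacity = capacity + (capacity >> 1);
                sequence.append(" -> ").append(capacity);
            }
            System.out.println(sequence);

            // Pre-sizing: grow() is never called because the backing array
            // is large enough for all elements from the start.
            List<Integer> presized = new ArrayList<>(1_000);
            for (int i = 0; i < 1_000; i++) {
                presized.add(i);
            }
            System.out.println("added " + presized.size() + " elements with no resize");
        }
    }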

Coming back to your point no. 2: a HashMap<K,V> is backed by an array whose slots are called buckets, each holding key-value entries. Every key you put into a HashMap should therefore override the hashCode() and equals() methods properly. When you call the put(K key, V value) method, it in turn calls a method named putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict), passing the hash of the key computed by hash(Object key). This hash determines the bucket where the entry goes. Hence, the array here only holds the bucket slots that point to the entries; the entries themselves do not have to sit in one contiguous block. This thread explains it in more detail. I hope this is what you were looking for.
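
To make the bucket indexing concrete, here is a simplified sketch of my own (the real java.util.HashMap does more, e.g. collision handling and treeified bins): it reproduces the Java 8 hash-spreading step and shows how a key's hashCode() is turned into a bucket index, which is why inserting never requires shifting other entries around in memory:

    public class BucketIndexDemo {
        // Same spreading step that HashMap.hash(Object) uses in Java 8:
        // XOR the high 16 bits of hashCode() into the low 16 bits.
        static int hash(Object key) {
            int h;
            return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
        }

        public static void main(String[] args) {
            int capacity = 16; // length of the bucket array, always a power of two
            for (String key : new String[] {"alpha", "beta", "gamma"}) {
                // Bucket index = (capacity - 1) & hash, i.e. the low bits of the spread hash.
                int index = (capacity - 1) & hash(key);
                System.out.println(key + " -> bucket " + index);
            }
        }
    }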

Shyam Baitmangalkar
  • Thanks sbaitmangalkar for the details. Just to clarify: for `HashMap` and `HashSet`, since there is no requirement for contiguous memory, no element reallocation happens, correct? – Lin Ma Nov 10 '16 at 19:38
  • Yes, there is no requirement for contiguous memory locations, and the entries are not copied around the way ArrayList elements are. But whenever the map gets dynamically resized, a larger bucket array is allocated and the existing entries are redistributed into it (see the sketch after these comments). – Shyam Baitmangalkar Nov 11 '16 at 09:34
  • I'm glad that it helped you! – Shyam Baitmangalkar Nov 14 '16 at 05:31
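
Following up on the resizing point in the last comments, a small sketch of my own (not from the answer): if the number of entries is known up front, the HashMap can be pre-sized so that the resize threshold (capacity * loadFactor) is never crossed and no redistribution of entries happens while filling it:

    import java.util.HashMap;
    import java.util.Map;

    public class PresizedMap {
        public static void main(String[] args) {
            int expectedEntries = 100_000;
            float loadFactor = 0.75f; // HashMap's default load factor

            // Choose an initial capacity large enough that expectedEntries never
            // exceeds capacity * loadFactor, the threshold that triggers a resize.
            int initialCapacity = (int) Math.ceil(expectedEntries / loadFactor);
            Map<Integer, String> map = new HashMap<>(initialCapacity, loadFactor);

            for (int i = 0; i < expectedEntries; i++) {
                map.put(i, "value-" + i); // no rehash/redistribution along the way
            }
            System.out.println("entries: " + map.size());
        }
    }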