Java uses type erasure

As I understand it, new ArrayList<String>() is compiled to its raw type, and the compiler then adds casts and compile-time checks so that this ArrayList of Object behaves like an ArrayList of String. Java calls this type erasure.

For example

This is what I wrote in Java:

public static void main(String[] args) {
    ArrayList<String> stringList = new ArrayList<>();
    stringList.add("foo");
    String s = stringList.get(0);
}

When I decompiled the byte code, I got this:

public static void main(String[] args) {
    ArrayList<String> stringList = new ArrayList();
    stringList.add("foo");
    String s = (String)stringList.get(0);
} 
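The erasure can also be observed at run time without a decompiler. This is only a minimal sketch; the class name ErasureDemo and the helper sameRuntimeClass are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Returns true when both lists are backed by the same runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        // The type arguments are erased, so both are plain ArrayList.
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```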

Therefore

Why couldn't new T[] be converted to (T[]) new Object[] automatically, using the same "shtick" the compiler pulls for type erasure?

Please don't refer me to this question: What's the reason I can't create generic array types in Java? In particular this comment:

The problem is deeper than pointed in that answer, so further investigation is needed. As you said type info is erased and in compiled code we have no difference between two generic types - all we have is base type - so why for T[] - compile to Object[]. In this case everything will be fine - array will remember that it was created with object type and will let save all types. However as for me the real problem is that arrays are covariant meaning that Animal[] can be assigned to Object[]. On the other hand generics are not ArrayList<Animal> can not be assigned to ArrayList<Object>

I think the logic in that comment is flawed!

There are two processes going on here.

  1. The compiler enforces an "artificial" invariance on ArrayList<String>.

  2. The compiler inserts casts from Object to T.

Again, why couldn't arrays of generics be implemented in Java using the same simple syntactic sugar that underlies all generics, while maintaining the normal covariance of arrays?

Pang
Nechemia Hoffmann

2 Answers

Type safety can be enforced in two ways: at runtime or compile time.

Arrays enforce it at runtime:

Object[] array = new Integer[1];
array[0] = "";
// ArrayStoreException!
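A self-contained version of the same check, as a sketch (the class name ArrayStoreDemo and the helper storeIsRejected are illustrative, not from any library):

```java
class ArrayStoreDemo {
    // Returns true if the JVM rejects the store at run time.
    static boolean storeIsRejected() {
        Object[] array = new Integer[1]; // legal: arrays are covariant
        try {
            array[0] = "";               // checked against the real type, Integer[]
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeIsRejected()); // prints "true"
    }
}
```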

Generics enforce it at compile time:

List<Object> list1 = new ArrayList<Integer>();
// does not compile!

List<? extends Object> list2 = new ArrayList<Integer>();
list2.add(1);
// does not compile!

Because T[] is an array, it must be covariant, but because of type erasure, there's no way to check the type at runtime, as demonstrated in this example:

Object[] array = (T[])new Object[1];
array[0] = 1;

Since at runtime array actually is of type Object[], this code will compile and run without error, regardless of what T happens to be. That causes heap pollution, which often makes your code fail at some unpredictable place or time, far from the faulty store, so it is difficult to diagnose and fix. Hence the unchecked cast warning the compiler emits for the (T[]) cast.
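To make the delayed failure concrete, here is a rough sketch. Holder, asObjectArray and run are hypothetical names invented for this example:

```java
class HeapPollutionDemo {
    static class Holder<T> {
        @SuppressWarnings("unchecked")
        final T[] items = (T[]) new Object[1]; // really an Object[] at run time

        Object[] asObjectArray() { return items; } // allowed by array covariance
        T get(int i) { return items[i]; }
    }

    static String run() {
        Holder<String> holder = new Holder<>();
        Object[] alias = holder.asObjectArray();
        alias[0] = 1; // no ArrayStoreException: the array is an Object[]
        try {
            String s = holder.get(0); // the compiler-inserted cast fails here
            return "read " + s;
        } catch (ClassCastException e) {
            return "ClassCastException at read";
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "ClassCastException at read"
    }
}
```

The bad store succeeds silently, and the error only shows up later, where the element is read back as a String.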

shmosel

This code throws an exception:

static <T> T[] newArray() {
    return (T[]) new Object[0];
}
static {
    // throws ClassCastException,
    // because Object[] is not a String[].
    String[] a = newArray();
}

This happens because String[] and Object[] actually have different types at run-time. (This is unlike generics: for example, List<String> and List<Object> have the same class.)
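A small sketch of that difference, with made-up names (RuntimeTypeDemo and its helper methods are only for illustration):

```java
class RuntimeTypeDemo {
    // String[] and Object[] are distinct classes at run time.
    static boolean sameArrayClass() {
        Class<?> stringArray = new String[0].getClass();
        Class<?> objectArray = new Object[0].getClass();
        return stringArray == objectArray;
    }

    // An Object[] is not a String[], so a cast to String[] would fail.
    static boolean objectArrayIsStringArray() {
        Object o = new Object[0];
        return o instanceof String[];
    }

    // The other direction works because arrays are covariant.
    static boolean stringArrayIsObjectArray() {
        Object o = new String[0];
        return o instanceof Object[];
    }

    public static void main(String[] args) {
        System.out.println(sameArrayClass());           // prints "false"
        System.out.println(objectArrayIsStringArray()); // prints "false"
        System.out.println(stringArrayIsObjectArray()); // prints "true"
    }
}
```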

With your suggestion, the method newArray would just look like this:

static <T> T[] newArray() {
    return new T[0];
}

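That doesn't look wrong, but it is: the array created would still be an Object[] at run-time, so the caller would get the same ClassCastException. I think this change would just cause more confusion for little actual gain.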

Conceivably, they could have completely changed the way that arrays work, so that they only have compile-time subtyping and are all just Object[] at run-time. This would have been a pretty major change, though, and would break old code.

Personally, I think they should just add a generic class Array<E> which would serve the same purpose as a generic array. It's pretty rare to actually need one in the first place.
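Something along these lines is what I have in mind. This is only a sketch of a hypothetical class; no Array<E> like this exists in the JDK, and the method names are my own choice:

```java
// Hypothetical fixed-length generic container backed by an Object[].
class Array<E> {
    private final Object[] elements;

    Array(int length) {
        elements = new Object[length];
    }

    // Safe in practice: set() only ever stores values of type E.
    @SuppressWarnings("unchecked")
    E get(int index) {
        return (E) elements[index];
    }

    void set(int index, E value) {
        elements[index] = value;
    }

    int length() {
        return elements.length;
    }

    public static void main(String[] args) {
        Array<String> names = new Array<>(2);
        names.set(0, "foo");
        System.out.println(names.get(0));   // prints "foo"
        System.out.println(names.length()); // prints "2"
    }
}
```

Because the Object[] never escapes the class, it is invariant like other generic types and sidesteps the covariance problem.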

Radiodef