
I'm trying to use the following data structures:

//all the words in each file as index, the label as value
static Map< ArrayList<String> , String > train__list_of_file_words = new HashMap<>();
static Map< ArrayList<String> , String > test__list_of_file_words = new HashMap<>();

//frequency count against global dictionary
static Map< int[] , String > train_freq_count_against_globo_dict = new HashMap<>();
static Map< int[] , String > test_freq_count_against_globo_dict = new HashMap<>();

But I was told that this is nonsensical, since these key types don't work smoothly in conjunction with equals() and are mutable.

I suppose the solution would be to write my own objects to store that information in a similar way, but I've never done that before. How do I do it?

smatthewenglish
  • Where do you get these quotes from? Define "don't work smoothly". Who does "they" refer to? What has mutable to do with this? How would creating your own objects solve this problem? Answer these questions and then we can continue. – Jeroen Vannevel Feb 26 '15 at 13:52
  • `int[]` keys will definitely not work, since arrays don't override Object's equals. `ArrayList` keys can work as long as you don't mutate those keys after they are put in the hash map. – Eran Feb 26 '15 at 13:55
  • @JeroenVannevel "What does mutable have to do with this?" All keys in a `Map` should be immutable. If the keys are mutable, a mutation might change the hash code of the object, and the value might be "lost" in the map. Creating immutable wrappers around the arrays and lists will solve the problem of immutability. – Harald K Feb 26 '15 at 14:02
  • @Eran, depends on how you define "work" :-) If you just want to associate some string with an int[] instance, then it works. In fact it works *thanks* to the fact that int[]s hashcode doesn't depend on their content. (Another way of putting it, it's as useful as IdentityHashMap) – aioobe Feb 26 '15 at 14:05
  • So I can use int[] as the key to a hash map? They don't get changed afterwards. – smatthewenglish Feb 26 '15 at 14:06
  • @flavius_valens, yes you can, but keep in mind that when you look up values (call `map.get`) you need to use *the same* array that you used when adding the value (when you called `map.put`). Another array with the same content cannot be used. – aioobe Feb 26 '15 at 14:07
  • @flavius_valens You can't use `int[]` or any other array as a key in any meaningful way, because arrays do not override equals and hashCode. – Sergey Kalinichenko Feb 26 '15 at 14:08
  • @dasblinkenlight, doesn't that imply that [`IdentityHashMap`](http://docs.oracle.com/javase/7/docs/api/java/util/IdentityHashMap.html) is useless? – aioobe Feb 26 '15 at 14:09
  • @aioobe Would it be possible to use it in [this](http://stackoverflow.com/questions/28742832/extract-keys-and-values-from-hashmap-and-insert-into-more-malleable-structures) situation? – smatthewenglish Feb 26 '15 at 14:10
  • @aioobe All it implies is that `IdentityHashMap` is not really a `HashMap`, because it breaks the contract :-) – Sergey Kalinichenko Feb 26 '15 at 14:11
  • @dasblinkenlight, sorry, I don't see the connection. A) `IdentityHashMap` is based on `Object.{hashCode,equals}`, so regardless of whether it breaks the contract or not, it's comparable to using `int[]` as key. B) Using `int[]` as key in a hash map *does* obey the contract, since the `int[]` implementations of `hashCode/equals` inherited from `Object` obey the contract. – aioobe Feb 26 '15 at 15:12

1 Answer


Since no one has stepped forward and given a full answer to your questions, I'll give it a shot:

Using Map< ArrayList<String> , String >

The problem with this approach is that ArrayList<String> is mutable, and its hashCode() depends on its contents. See this example for instance:

Map<ArrayList<String>, String> map = new HashMap<>();

ArrayList<String> l = new ArrayList<>();
l.add("a");
l.add("b");

map.put(l, "Hello");
System.out.println(map.get(l)); // "Hello"

l.add("c"); // Mutate key.
System.out.println(map.get(l)); // null (value lost!)
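
To see why the value goes missing, it can help to inspect the map after the mutation. The following is a minimal sketch rather than part of the original example; the class name `MutatedKeyDemo` and the helper `mapWithMutatedKey` are made up for illustration. The entry is still stored, but it was filed under the hash code the key had when it was inserted, so a lookup with the mutated key misses it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutatedKeyDemo {

    // Hypothetical helper: stores "Hello" under the given list, then mutates that list.
    static Map<List<String>, String> mapWithMutatedKey(List<String> key) {
        Map<List<String>, String> map = new HashMap<>();
        map.put(key, "Hello");
        key.add("c"); // Mutate the key after insertion.
        return map;
    }

    public static void main(String[] args) {
        List<String> l = new ArrayList<>(Arrays.asList("a", "b"));
        int hashBefore = l.hashCode();

        Map<List<String>, String> map = mapWithMutatedKey(l);
        int hashAfter = l.hashCode();

        System.out.println(hashBefore == hashAfter);    // false: the hash depends on the contents
        System.out.println(map.containsKey(l));         // false: lookup uses the new hash
        System.out.println(map.containsValue("Hello")); // true: the entry is still in the map
        System.out.println(map.size());                 // 1
    }
}
```

The value is not deleted, only unreachable through `get`; iterating over `map.entrySet()` would still show it.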

Using Map< int[] , String >

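This is possible, but may be confusing because two arrays can hold the same contents and still not be equal according to equals(): arrays inherit the identity-based equals() and hashCode() from Object. Consider the following example: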

Map<int[], String> map = new HashMap<>();

int[] arr1 = { 1, 2 };
map.put(arr1, "Hello");

int[] arr2 = { 1, 2 };
System.out.println(map.get(arr2));  // null, since arr1.equals(arr2) == false

So, to retrieve previously inserted values, you need to use the same instance as key. The above example only works if you use map.get(arr1).
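
If lookups by content are what you want, one option is to wrap the array in a small key class that delegates to `Arrays.equals` and `Arrays.hashCode`. The sketch below is only an illustration under that assumption; `IntArrayKey` and `IntArrayKeyDemo` are hypothetical names, not JDK classes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class IntArrayKeyDemo {
    public static void main(String[] args) {
        Map<IntArrayKey, String> map = new HashMap<>();
        map.put(new IntArrayKey(1, 2), "Hello");

        // A different instance with the same contents now finds the value.
        System.out.println(map.get(new IntArrayKey(1, 2))); // Hello
    }
}

// Hypothetical wrapper: compares int[] keys by content instead of identity.
final class IntArrayKey {
    private final int[] vals;

    IntArrayKey(int... vals) {
        this.vals = vals.clone(); // Defensive copy, so callers cannot mutate the key.
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(vals);
    }

    @Override
    public boolean equals(Object obj) {
        return (obj instanceof IntArrayKey)
            && Arrays.equals(vals, ((IntArrayKey) obj).vals);
    }
}
```

The defensive copy in the constructor is what keeps the key immutable even if the caller later changes the array that was passed in.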

So what to do?

  • You can, as you suggest, roll your own data structure that keeps track of the mapping in private data structures. You could for instance use a Map<List<...>, String> as a backing structure but make sure that you never mutate the keys you use in that map (for instance by keeping the map private and never ever publish references to the List-keys).

  • If you know the size of your keys in advance (two strings, say), you can use nested maps, with one level per key element: Map<String, Map<String, String>>.

  • You can use a third-party collections library such as Guava or Apache Commons Collections. They provide data structures called Table and MultiKeyMap, respectively, both of which seem to match your requirements.

  • You can create a new, immutable key object, as suggested by @dasblinkenlight. (A word of caution: this is safe only because String is immutable!) The code can be simplified slightly as follows:

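    import java.util.Arrays;
    
    final class StringTuple {
        // final, and never exposed, so the key cannot change after construction
        private final String[] vals;
    
        public StringTuple(String... vals) {
            this.vals = vals.clone(); // defensive copy of the caller's array
        }
    
        @Override
        public int hashCode() {
            return Arrays.hashCode(vals);
        }
    
        @Override
        public boolean equals(Object obj) {
            return (obj instanceof StringTuple)
                && Arrays.equals(vals, ((StringTuple) obj).vals);
        }
    }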
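
The first option in the list above (a private backing map whose keys are never published) could look roughly like the following. This is a sketch under assumptions, not a definitive design: the class name `LabelIndex`, its methods `put` and `labelOf`, and the sample words and label are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LabelIndexDemo {
    public static void main(String[] args) {
        LabelIndex train = new LabelIndex();

        List<String> words = new ArrayList<>(Arrays.asList("spam", "offer"));
        train.put(words, "junk");

        words.add("free"); // Changes the caller's list, not the stored key.

        System.out.println(train.labelOf(Arrays.asList("spam", "offer"))); // junk
        System.out.println(train.labelOf(words));                          // null
    }
}

// Hypothetical wrapper: maps the words of a file to its label.
final class LabelIndex {
    // Private backing map; references to its keys never leave this class.
    private final Map<List<String>, String> labels = new HashMap<>();

    void put(List<String> words, String label) {
        // Store an unmodifiable copy so later changes to 'words' cannot affect the key.
        labels.put(Collections.unmodifiableList(new ArrayList<>(words)), label);
    }

    String labelOf(List<String> words) {
        return labels.get(words); // Lists compare by content, so any equal list works.
    }
}
```

Because only copies are stored and no key is ever returned, callers have no way to mutate a key that is already in the map.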
    
aioobe
  • Table takes three elements, but I really only need two. Should I just make up something trivial to put in there for good measure? That seems wasteful. – smatthewenglish Feb 27 '15 at 02:20