4

I am trying to implement a unique hashCode based on six different values. My class has the following attributes:

private int id_place;
private String algorithm;
private Date mission_date;
private int mission_hour;
private int x;
private int y;

I am calculating the hashCode as follows:

id_place * (7 * algorithm.hashCode()) + (31 * mission_date.hashCode()) + (23 * mission_hour + 89089) + (x * 19 + 67067) + (y * 11 + 97097);

How can I turn it into a unique hashCode? I'm not confident it is unique...

Guillem
  • 57
  • 5
  • 4
    Hashcodes can't be guaranteed to be unique. You can create an infinite number of `MyClass` instances with different field values, but there are only 2^32 possible hash codes. – Andy Turner Jan 07 '19 at 12:10
  • Related: [Why hashcode does not generate unique hashcode?](https://stackoverflow.com/questions/26740091/why-hashcode-does-not-generate-unique-hashcode) There are more, just search. – Ole V.V. Jan 07 '19 at 12:35
  • @AndyTurner Thanks a lot for the answer. So, my question is... Imagine I want to use a data structure where I want to have a list of PointDensity (with the attributes above). And I want to know how much times they are repeated. I was thinking of using a HashMap like this Map or Map (where String is an unique Key and PointDensity holds also the times it has repeated). What is the optimal way of implementing this? – Guillem Jan 07 '19 at 13:26
  • 1
    A lazy alternative to the DIY-method would be to use [`Objects.hash`](https://docs.oracle.com/javase/9/docs/api/java/util/Objects.html#hash-java.lang.Object...-) – Paul Jan 07 '19 at 13:35

5 Answers

3

It doesn't have to be unique, and in general it cannot be. hashCode() returns an int (32 bits), so it could only be unique if your class had a single int property and nothing else.

The Integer class can (and does) have a unique hashCode(), but few other classes do.

Since you have multiple properties, some of which are int, a hashCode() that is a function of these properties can't be unique.

You should strive for a hashCode() function that gives a wide range of different values for different combinations of your properties, but it cannot be unique.
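The counting argument above is easy to observe with JDK classes alone. A minimal sketch (the class name `CollisionDemo` is made up for illustration): `Integer` has exactly as many values as there are hash codes, while two different strings can already share one.

```java
class CollisionDemo {
    public static void main(String[] args) {
        // Integer.hashCode() is the int value itself, so it is unique per value.
        System.out.println(Integer.valueOf(12345).hashCode()); // 12345

        // There are far more possible strings than int values, so collisions must exist.
        // "Aa" and "BB" are a well-known pair: 65*31 + 97 == 66*31 + 66 == 2112.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println("Aa".equals("BB"));                  // false
    }
}
```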

Eran
  • 359,724
  • 45
  • 626
  • 694
  • Thanks a lot for the answer. So, my question is... Imagine I want to use a data structure where I want to have a list of PointDensity (with the attributes above). And I want to know how much times they are repeated. I was thinking of using a HashMap like this Map or Map (where String is an unique Key and PointDensity holds also the times it has repeated). What is the optimal way of implementing this? – Guillem Jan 07 '19 at 13:23
  • The only way to know which alternative would be better is to try both and run a benchmark. You should be able to use `PointDensity` as a key if you override `equals` and `hashCode` properly. – Eran Jan 07 '19 at 13:31
2

The hash codes of two different objects need not be unique. According to https://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode() -

  1. Whenever it is invoked on the same object more than once during an execution of a Java application, hashCode() must consistently return the same value, provided no information used in equals comparisons on the object is modified. This value need not remain consistent from one execution of an application to another execution of the same application.
  2. If two objects are equal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce the same value.
  3. It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

So you don't have to create a hashCode() function that returns a distinct hash code every time.
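To illustrate how little the contract demands, here is a sketch of a class whose hashCode() returns the same constant for every instance. The class `ConstantHashPoint` is hypothetical and deliberately bad: it satisfies the rules quoted above and a HashSet still behaves correctly, only slowly, because every key lands in the same bucket.

```java
import java.util.HashSet;
import java.util.Set;

class ConstantHashPoint {
    final int x;
    final int y;

    ConstantHashPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ConstantHashPoint)) return false;
        ConstantHashPoint p = (ConstantHashPoint) o;
        return x == p.x && y == p.y;
    }

    // Legal: equal objects trivially share this value. Slow: all keys collide.
    @Override
    public int hashCode() {
        return 42;
    }

    public static void main(String[] args) {
        Set<ConstantHashPoint> set = new HashSet<>();
        set.add(new ConstantHashPoint(1, 2));
        set.add(new ConstantHashPoint(3, 4)); // same hash, not equal: kept
        set.add(new ConstantHashPoint(1, 2)); // equal to the first: rejected
        System.out.println(set.size()); // 2
    }
}
```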

p.bansal
  • 206
  • 1
  • 7
2

Unique is not a hard requirement, but the more unique the hash code is, the better.

Note first that the hash code is generally used by a HashMap as an index into a 'bucket'. Hence optimally it should be unique modulo the number of buckets. However, that number may change when the map grows.
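As a rough illustration of "unique modulo the number of buckets", the sketch below mirrors the index computation used by OpenJDK's HashMap since Java 8 (other implementations may differ, and the class and method names here are made up). Two hash codes that share a bucket in a 16-slot table can end up in different buckets once the table has grown to 32 slots.

```java
class BucketIndexDemo {
    // Mirrors the bit-spreading step in OpenJDK's HashMap (Java 8 and later).
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // The table length n is assumed to be a power of two, as in HashMap.
    static int bucketIndex(int hashCode, int n) {
        return spread(hashCode) & (n - 1);
    }

    public static void main(String[] args) {
        // Hash codes 1 and 17 share bucket 1 while the table has 16 slots...
        System.out.println(bucketIndex(1, 16) + " " + bucketIndex(17, 16)); // 1 1
        // ...but are separated after the table doubles to 32 slots.
        System.out.println(bucketIndex(1, 32) + " " + bucketIndex(17, 32)); // 1 17
    }
}
```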

But okay, towards an optimal hash code:

  • Ranges are important; if x and y were in 0..255, they could be packed uniquely into two bytes, or if they were in 0..999, as y*1000+x. For a LocalDateTime, one could take the long value in seconds (instead of ms or ns) counted from 2012-01-01, so you might assume a range from 0 up to, say, two years in the future.
  • You can explore existing test data or generate plausible test data. One can then mathematically optimize the hash code function through its arbitrary coefficients (7, 13, 23). This is linear optimisation, but one can also do it by simple trial and error: counting the clashes for varying (A, B, C).

    // int[] coefficients = ...;
    int[][] coefficientsCandidates = new int[NUM_OF_CANDIDATES][NUM_OF_COEFFS];
    // ... fill in the candidate coefficient sets
    int[] collisionCounts = new int[NUM_OF_CANDIDATES];
    for (Data data : allTestData) {
        // ... update collisionCounts for every candidate
    }
    // ... take the candidate with the smallest collision count
    // ... or sort by collisionCounts and pick other candidates to try out
    

In general such evaluation code is not needed for a working hash code, but it might detect bad hash codes, where some pseudo-randomness is going wrong. For instance, if a factor is way too large for the range (weekday * 1000), holes appear in the values.
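A runnable, scaled-down version of this trial-and-error idea could look like the sketch below. Everything in it is an assumption for illustration: only x and y are hashed, both are assumed to lie in 0..99, the test data is the full grid of points, and the candidate coefficient pairs are arbitrary. The last candidate, (1, 100), is the packing y*100+x and therefore produces no collisions on this data.

```java
import java.util.HashSet;
import java.util.Set;

class CoefficientSearch {
    static final int RANGE = 100; // assumed: x and y both lie in 0..99

    // Counts the (x, y) points whose hash a*x + b*y was already produced by an earlier point.
    static int countCollisions(int a, int b) {
        Set<Integer> seen = new HashSet<>();
        int collisions = 0;
        for (int x = 0; x < RANGE; x++) {
            for (int y = 0; y < RANGE; y++) {
                if (!seen.add(a * x + b * y)) {
                    collisions++;
                }
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        int[][] candidates = { {19, 11}, {31, 1}, {1, RANGE} };
        int[] collisionCounts = new int[candidates.length];
        int best = 0;
        for (int i = 0; i < candidates.length; i++) {
            collisionCounts[i] = countCollisions(candidates[i][0], candidates[i][1]);
            System.out.printf("a=%d b=%d collisions=%d%n",
                    candidates[i][0], candidates[i][1], collisionCounts[i]);
            if (collisionCounts[i] < collisionCounts[best]) {
                best = i;
            }
        }
        System.out.println("best candidate index: " + best);
    }
}
```

With real data the winner depends on how the values are actually distributed, which is why the test data should resemble production data.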

But in all honesty, all this effort is probably not needed.

Joop Eggen
  • 96,344
  • 7
  • 73
  • 121
1

In Eclipse, there is a function that generates the method public int hashCode() for you. I used the class attributes you provided and the result is as follows:

@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + ((algorithm == null) ? 0 : algorithm.hashCode());
    result = prime * result + id_place;
    result = prime * result + ((mission_date == null) ? 0 : mission_date.hashCode());
    result = prime * result + mission_hour;
    result = prime * result + x;
    result = prime * result + y;
    return result;
}

It looks a lot like your calculation. However, as Andy Turner pointed out in a comment on your question and Eran in an answer, you simply cannot make a unique hash code for every single instance of a class if the number of possible instances exceeds the number of possible hash codes.
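For what it's worth, the generated method computes the same value as `Objects.hash` called with the fields in the same order, because `Objects.hash` is documented to behave like `Arrays.hashCode(Object[])`, which uses the same 31-based scheme. A small sketch to check this (the class and method names are hypothetical, and the fields are passed as parameters to keep it short):

```java
import java.util.Date;
import java.util.Objects;

class HashEquivalenceDemo {
    // Same arithmetic as the Eclipse-generated method, with null-safe Objects.hashCode.
    static int eclipseStyle(String algorithm, int idPlace, Date missionDate,
                            int missionHour, int x, int y) {
        final int prime = 31;
        int result = 1;
        result = prime * result + Objects.hashCode(algorithm);
        result = prime * result + idPlace;
        result = prime * result + Objects.hashCode(missionDate);
        result = prime * result + missionHour;
        result = prime * result + x;
        result = prime * result + y;
        return result;
    }

    // One-liner with the fields in the same order as above.
    static int viaObjectsHash(String algorithm, int idPlace, Date missionDate,
                              int missionHour, int x, int y) {
        return Objects.hash(algorithm, idPlace, missionDate, missionHour, x, y);
    }

    public static void main(String[] args) {
        Date date = new Date(0L);
        System.out.println(eclipseStyle("algo", 7, date, 12, 3, 4)
                == viaObjectsHash("algo", 7, date, 12, 3, 4)); // true
    }
}
```

The generated version avoids the boxing and array allocation of the varargs call, which only matters on very hot code paths.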

deHaar
  • 11,298
  • 10
  • 32
  • 38
  • `((algorithm == null) ? 0 : algorithm.hashCode())` could simply be [`Objects.hashCode(algorithm)`](https://docs.oracle.com/javase/8/docs/api/java/util/Objects.html#hashCode-java.lang.Object-) (as of Java 7). – Andy Turner Jan 07 '19 at 12:17
  • @AndyTurner I know, I just pasted the code that was generated by Eclipse... It isn't perfect ;-) – deHaar Jan 07 '19 at 12:17
  • Thanks a lot for the answer. So, my question is... Imagine I want to use a data structure where I want to have a list of PointDensity (with the attributes above). And I want to know how much times they are repeated. I was thinking of using a HashMap like this Map or Map (where String is an unique Key and PointDensity holds also the times it has repeated). What is the optimal way of implementing this? – Guillem Jan 07 '19 at 13:24
0

Because you have multiple fields, use:

@Override
public int hashCode() {
    return Objects.hash(id_place, algorithm, mission_date, mission_hour, x, y);
}

If objA.equals(objB) is true, then objA and objB must return the same hash code. If objA.equals(objB) is false, then objA and objB might still return the same hash code; if your hashing algorithm happens to return different hash codes in this case, that is good for performance.

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    ClassA classA = (ClassA) o;
    return id_place == classA.id_place &&
            mission_hour == classA.mission_hour &&
            x == classA.x &&
            y == classA.y &&
            Objects.equals(algorithm, classA.algorithm) &&
            Objects.equals(mission_date, classA.mission_date);
}
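
With both methods in place, the object itself can serve as a map key for counting repeats, as asked in the comments. The sketch below is one possible approach, not the only one: the class name `PointDensity` comes from the comments, while the `Map<PointDensity, Integer>` layout, the constructor and the `countRepeats` helper are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class PointDensity {
    private final int id_place;
    private final String algorithm;
    private final Date mission_date; // Date is mutable: do not modify it while the object is a map key
    private final int mission_hour;
    private final int x;
    private final int y;

    PointDensity(int id_place, String algorithm, Date mission_date,
                 int mission_hour, int x, int y) {
        this.id_place = id_place;
        this.algorithm = algorithm;
        this.mission_date = mission_date;
        this.mission_hour = mission_hour;
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        PointDensity other = (PointDensity) o;
        return id_place == other.id_place
                && mission_hour == other.mission_hour
                && x == other.x
                && y == other.y
                && Objects.equals(algorithm, other.algorithm)
                && Objects.equals(mission_date, other.mission_date);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id_place, algorithm, mission_date, mission_hour, x, y);
    }

    // Hypothetical helper: maps each distinct point to the number of times it occurs.
    static Map<PointDensity, Integer> countRepeats(List<PointDensity> points) {
        Map<PointDensity, Integer> counts = new HashMap<>();
        for (PointDensity p : points) {
            counts.merge(p, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        PointDensity a = new PointDensity(1, "algo", new Date(0L), 10, 3, 4);
        PointDensity b = new PointDensity(1, "algo", new Date(0L), 10, 3, 4); // equal to a
        PointDensity c = new PointDensity(2, "algo", new Date(0L), 10, 3, 4);
        Map<PointDensity, Integer> counts = countRepeats(Arrays.asList(a, b, c));
        System.out.println(counts.get(a)); // 2
        System.out.println(counts.size()); // 2
    }
}
```

Hash collisions between unequal points do not corrupt the counts, because the map falls back to equals() within a bucket; they only cost time.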
JineshEP
  • 723
  • 4
  • 7