36

I need to hash multiple keys from multiple threads using MessageDigest in a performance-critical environment. I learned that MessageDigest is not thread-safe because it stores its state in the object. What is the best way to hash keys in a thread-safe manner?

Use case:

MessageDigest messageDigest = MessageDigest.getInstance("SHA-1");

//somewhere later, just need to hash a key, nothing else
messageDigest.update(key);
byte[] bytes = messageDigest.digest(); 

Specifically:

  1. Is ThreadLocal guaranteed to work? Will it carry a performance penalty?
  2. Are the objects returned by getInstance distinct, so that they do not interfere with each other? The documentation says a 'new' object, but I am not sure whether it is just a wrapper around a shared concrete class.
  3. If getInstance() returns 'real' new objects, is it advisable to create a new instance each time I need to calculate the hash? How costly is that in terms of performance?

My use case is very simple - just hash a simple key. I cannot afford to use synchronization.

Thanks,

Anil Padia

3 Answers

54

Create a new MessageDigest instance each time you need one.

All of the instances returned from getInstance() are distinct. They need to be, as they maintain separate digests (and if that's not enough for you, here's a link to the source).
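One way to check this claim yourself is a small experiment along the following lines. It is only a sketch: the class and method names (DistinctDigests, newSha1, instancesAreIndependent) are made up for illustration, and it assumes the default provider's SHA-1 implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical demo class; the names here are not from the JDK.
final class DistinctDigests {

    // Wraps the checked exception so callers stay simple.
    static MessageDigest newSha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Returns true if feeding data into one instance leaves another untouched.
    static boolean instancesAreIndependent() {
        byte[] key = "key".getBytes(StandardCharsets.UTF_8);

        MessageDigest a = newSha1();
        MessageDigest b = newSha1();

        // Put partial state into a only.
        a.update("some other key".getBytes(StandardCharsets.UTF_8));

        // If b shared state with a, this digest would differ from a fresh one.
        byte[] fromB = b.digest(key);
        byte[] fromFresh = newSha1().digest(key);

        return a != b && Arrays.equals(fromB, fromFresh);
    }

    public static void main(String[] args) {
        System.out.println("independent: " + instancesAreIndependent());
    }
}
```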

ThreadLocal can provide a performance benefit when used with a threadpool, to maintain expensive-to-construct objects. MessageDigest isn't particularly expensive to construct (again, look at the source).
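To make the two options concrete, here is a minimal sketch of both. KeyHasher and its method names are hypothetical, and ThreadLocal.withInitial assumes Java 8 or later. The reset() call in the ThreadLocal variant is there because a reused instance may hold partial state if an earlier call failed between update() and digest().

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper; a sketch of both approaches, not a tuned implementation.
final class KeyHasher {
    private KeyHasher() {}

    // Option 2 state: one MessageDigest per thread, created lazily.
    private static final ThreadLocal<MessageDigest> PER_THREAD =
            ThreadLocal.withInitial(KeyHasher::newSha1);

    // Option 1: a fresh instance per call, so nothing is shared between threads.
    static byte[] sha1(byte[] key) {
        return newSha1().digest(key);
    }

    // Option 2: reuse the calling thread's instance.
    static byte[] sha1ThreadLocal(byte[] key) {
        MessageDigest md = PER_THREAD.get();
        md.reset(); // discard any partial state left by an earlier failed call
        return md.digest(key); // digest() also resets the instance afterwards
    }

    private static MessageDigest newSha1() {
        try {
            return MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "key".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(sha1(key), sha1ThreadLocal(key)));
    }
}
```

Both methods are safe to call from any thread. Which one is faster in a given application is something to measure rather than assume.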

parsifal
  • 1
    If I look at the code of getInstance(), it doesn't seem to create a new object; rather, it calls Security to get the object: Object[] objs = Security.getImpl. I wrote the test case below: MessageDigest messageDigest1 = MessageDigest.getInstance("SHA-1"); MessageDigest messageDigest2 = MessageDigest.getInstance("SHA-1"); // update and digest. I saw that the two messageDigest objects are different, and that their inner objects/buffers are different as well. So I guess ThreadLocal should work. And yes, it is a web server with a thread pool. I will use ThreadLocal. Thanks, – Anil Padia Jul 10 '13 at 08:58
  • 8
    @AnilPadia - I **strongly** recommend **not** using `ThreadLocal`. It's premature optimization. I wrote a micro-benchmark that took approximately 2 *micro*-seconds to create a new `MessageDigest`. That is going to be *far* outweighed by the code that uses the digest. – parsifal Jul 10 '13 at 12:16
  • What problems do you see with using ThreadLocal? Even if I have hundreds of threads, there will only be hundreds of such objects, and I found their memory footprint to be really small. ThreadLocal is working fine for me. I also tested creating objects, and it took 4 microseconds. I would really like to know why you are against ThreadLocal. – Anil Padia Jul 11 '13 at 08:02
  • @AnilPadia - As I already said, creating a new `MessageDigest` is a very fast operation. You may save a microsecond or two by using `ThreadLocal`, but that's almost certainly an infinitesimal part of your overall execution time. And with the `ThreadLocal`, you need additional code that has to be maintained, and you also have to be more careful with exception handling. You *do* have a finally block to clean up your digest on any exception, right? – parsifal Jul 11 '13 at 21:02
  • But, I realize that you're happy with the `ThreadLocal`, and I'm unlikely to sway your opinion. So do what makes you happy. – parsifal Jul 11 '13 at 21:03
  • 1
    There _can_ be significant performance issues in heavy multi-threaded systems (I've verified this on jdk1.7.0_80) as the underlying `java.security.Provider.getService` method is synchronized and appears to be a singleton. – Martin Serrano Oct 30 '16 at 13:34
  • Will creating an instance every time we need one introduce any performance issue? I am using an MD5 instance. – JaskeyLam Aug 18 '17 at 03:26
6

As an alternative, use DigestUtils, Apache Commons' thread-safe wrapper for MessageDigest.

sha1() does what you need:

byte[] bytes = sha1(key);

fishyfriend
  • 2
    See the answer below; DigestUtils is no more thread-safe than MessageDigest, as DigestUtils.getDigest() just calls MessageDigest.getInstance() and transforms the checked exception into an unchecked one. – Siddhu Oct 12 '15 at 07:46
  • 6
    The point here is that MessageDigest is not thread-safe, so reusing the same instance in a concurrent environment leads to unpredictable results. Using a new/different instance every time (for example, by calling MessageDigest.getInstance) solves your problem. DigestUtils is thread-safe in the sense that each of its convenience methods uses a new MessageDigest instance: each of them ends up (after a couple of calls) in MessageDigest.getInstance, which creates a new instance. For example, each call to DigestUtils.sha256Hex("My string"); will use a different instance of MessageDigest. – Legna Sep 24 '16 at 03:27
1

You could use ImmutableMessageDigest from Caesar, an open source library I wrote.

It essentially wraps a MessageDigest instance and clones it before every digest() or update() call.
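The clone-before-use idea can be sketched with the JDK alone, roughly as follows. This is not the library's actual code: CloningDigest is a hypothetical name, and the sketch assumes the provider's implementation supports clone(), which is not guaranteed for every provider.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical illustration of the clone-before-use technique.
final class CloningDigest {
    // Never updated after construction, so threads only ever read it.
    private final MessageDigest prototype;

    CloningDigest(String algorithm) {
        this.prototype = create(algorithm);
    }

    private static MessageDigest create(String algorithm) {
        try {
            return MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown algorithm: " + algorithm, e);
        }
    }

    // Hashes input on a private copy, leaving the prototype untouched.
    byte[] digest(byte[] input) {
        try {
            MessageDigest copy = (MessageDigest) prototype.clone();
            return copy.digest(input);
        } catch (CloneNotSupportedException e) {
            // Assumption: not every provider implements clone().
            throw new IllegalStateException("digest does not support clone()", e);
        }
    }

    public static void main(String[] args) {
        CloningDigest sha1 = new CloningDigest("SHA-1");
        byte[] key = "key".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(sha1.digest(key), sha1.digest(key)));
    }
}
```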

Janez Kuhar