4

I'm having a real curious situation, in a production environment using wildfly 8.2 and Java 1.7.

The situation is that when the server is been up for more than 2 weeks the login begins to drop performance. I have been looking for clues that may indicate where the issue is. Then after doing some testing, I came to the conclusion that the problem is when the password inserted in plain text is been encrypted to be compared with the one already inserted.

When the function that encrypts the password is executed it takes almost 2 minutes, but when the server is restarted the same execution takes less than 30 seconds.

The encryption is using java.security.MessageDigest to generate the hash. Specifically using SHA-256 with 50000 iterations. Any idea why this process could get slower with time? I'm using /dev/urandom for the generation of random so that shouldn't be the problem.

Here is the funtion code:

protected byte[] hash(byte[] bytes, byte[] salt, int hashIterations) throws UnknownAlgorithmException {
    MessageDigest digest = getDigest(getAlgorithmName());
    if (salt != null) {
        digest.reset();
        digest.update(salt);
    }
    byte[] hashed = digest.digest(bytes);
    int iterations = hashIterations - 1; //already hashed once above
    //iterate remaining number:
    for (int i = 0; i < iterations; i++) {
        digest.reset();
        hashed = digest.digest(hashed);
    }
    return hashed;
}
A.rolo
  • 93
  • 10
  • Are you 100% sure that exactly the shown code block is the culprit? Not something around or inside it, e.g. `getDigest()` or `getAlgorithmName()` or something which compares the hash with some existing data? – Dominik Sandjaja Jun 14 '17 at 09:10
  • Have you checked that there are no memory leaks? Getting slower as living time goes up can be a symptom – Pelocho Jun 14 '17 at 09:20
  • Hi @DominikSandjaja thanks for your quick answer. Since is a production enviroment my testing capacity is limited, so I have been able to reduce the issue to that fragment of code ( is in an external library that is been used for authentication ) but I cant ensure exactly wich line, I guess is the digest.digest(hashed) part. – A.rolo Jun 14 '17 at 09:25
  • @Pelocho thanks four asnwer too. I have check the memory and is not full , and the GC graphics shown that the behaviour is as expected. So I dont think that problem could be a memory leak. – A.rolo Jun 14 '17 at 09:28
  • Then I'd guess that you try to find out if one of the called methods does something weird. Maybe try putting it in a JMH test and have it run a gazillion times? – Dominik Sandjaja Jun 14 '17 at 11:34

3 Answers3

5

After somedays of research I have finally found the answer to my question. I want to share it here in case it is useful to someone else.

The problem was caused because of the code cache memory. I was focusing on the heap memory and I didn't see any problem, but when I inspect the non-heap memory I find out that just as the login process began to slow down there was a drop of more than half on the code cache memory used.

Investigating this memory, I have found that when there are large drops in this space, it may happen that the JIT compiler stops working. In conclusion, that was what was happening, and having the JIT compiler turned off caused that each iteration of my encryption loop had to be interpreted each execution, which logically made the process much slower.

Here I leave some links that I find helpful in this topic.

[1] - https://www.quora.com/In-Java-what-exactly-will-the-JVM-interpreter-and-the-JIT-compiler-do-with-the-bytecode

[2] - https://www.atlassian.com/blog/archives/codecache-is-full-compiler-has-been-disabled

Thanks to those who take the time to answer it anyways

A.rolo
  • 93
  • 10
0

Get rid of the statement: int iterations = hashIterations - 1; and just use hashIterations.

In the best case it reduces the iterations from 50000 (in the stated case) to 49999 and in the worst case causes integer underflow and increases the iterations to the max value of int.

At a minimum guard against subtracting 1 when hashIterations is zero.

Also consider instrumentation, for debugging, by logging the value of iterations.

zaph
  • 108,117
  • 19
  • 176
  • 215
  • Hi @zaph in my case the number of iterarion it's fixed, so if I correctly understand what you mean, that case is not possible to happen. Thaks anyway for your response. – A.rolo Jun 14 '17 at 13:57
  • The function call is `hash(byte[] bytes, byte[] salt, int hashIterations)` which means that `0` *could* be passed in for `hashIterations` and then what happens? Can you guarantee that no one will ever call with an iteration count of `0`? If the input is always `50000` there is no reason to have that line of code but is can cause unexpected results. Code really should be written to behave in a reasonable and expected manor given any input. Otherwise the code is a ticking time-bomb. tick-tick-tick-... – zaph Jun 14 '17 at 16:24
  • 1
    That fragment of code belongs to an external library, the number of iteration have a default value, and the calls to that functions are been made by the library as well. I'm sure that the function is not been call with the value 0 because the number of iterations are been taken from the encrypted password ( also I did some debugging and make sure the number is always the expected) – A.rolo Jun 15 '17 at 07:07
0

Why will somebody close this??? Maybe because there's nothing there what could probably cause the problem.

While digest.digest is normally the time-consuming part, it's a pure computation and there's nothing what could slow it down. So what remains is getAlgorithmName() and getDigest(String). The former is probably a trivial getter, but the latter probably uses MessageDigest.getInstance which locates the digest. Just guessing: There's a lookup over all security providers and all the stuff they provide and someone may lengthen this list somehow.

You can sort of benchmark this library method even in production: Just copy the method in a new source file and add some logging and some code invoking it periodically (or manually, if you prefer). When the slowdown happens, you'll have something to compare to and you'll find some detailed timings in your logs.

When all imaginable reasons are exhausted, try the non-imaginable ones like varying iterations (which you believe is a constant), etc.

maaartinus
  • 40,991
  • 25
  • 130
  • 292
  • 1
    Your are right, the functions MessageDigest.getInstance is using MessageDigest.getInstance(algorithmName) which as you said is doing a look up through all security providers. I already did what you are suggesting about replicating classes and adding log. But I have to wait until the issue reproduces again, so I will be able to see that log. – A.rolo Jun 15 '17 at 07:12