
There's a Java application that occasionally starts to utilise all available cores, starves GC and crashes with OOM. The application is quite complex. It uses Akka Streams, Kafka, Etcd and has a built-in HTTP server for metrics reporting. I've added -XX:+CrashOnOutOfMemoryError to create core dumps, but they aren't helpful: they show the thread that couldn't allocate memory, but what I need is which threads are actually eating all the CPU. Any guides or ideas on what could be done to find out what's going on?

The application is running on OpenJDK 11 on Linux with 3 cores assigned via cgroups. The maximum heap size is set to 3 GB and the initial heap size to 1.5 GB; it uses G1 without any tweaks.

UPD: At the time of the crash the heap_lock is held neither by the active thread nor by a GC thread, and the last 10 GC events essentially failed to free any memory.
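
For reference, the per-thread CPU numbers I'm after could in principle be collected in-process via the standard `ThreadMXBean` API; a rough, hypothetical sketch of a helper that could be exposed through the existing metrics HTTP server (not something the application does today):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical diagnostic helper: prints cumulative CPU time per live thread
// so the hottest threads can be spotted when CPU usage spikes.
public class ThreadCpuSnapshot {

    public static void print() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (threads.isThreadCpuTimeSupported() && !threads.isThreadCpuTimeEnabled()) {
            threads.setThreadCpuTimeEnabled(true);
        }
        for (long id : threads.getAllThreadIds()) {
            ThreadInfo info = threads.getThreadInfo(id);
            long cpuNanos = threads.getThreadCpuTime(id); // -1 if unavailable for this thread
            if (info != null && cpuNanos >= 0) {
                System.out.printf("%-50s %8d ms%n", info.getThreadName(), cpuNanos / 1_000_000);
            }
        }
    }
}
```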

synapse
  • You have to use a tool like New Relic to identify it. – Sambit Jul 29 '19 at 19:07
  • Get a profiler. [VisualVM](https://visualvm.github.io/) is free and does quite a bit. You can use it to see snapshots of memory usage, or see CPU activity by thread, or see which functions are being called the most; other stuff too. – kaan Jul 29 '19 at 20:15
  • To close voters: Asking for a minimal reproducible example does not seem helpful, because pinpointing the cause of the memory leak is exactly what OP seems to have trouble with. And as for the tool recommendation, OP did not ask which tool is best, but seems unaware what kind of tool to look for. That can be answered objectively. – meriton Jul 29 '19 at 22:07

2 Answers


If you can reproduce the problem, run your application with Java Flight Recorder or another CPU profiler. This will give you more information, although most JVM-based profilers are slightly skewed because they profile the JVM rather than the OS, e.g. they might not report time spent in native methods correctly. If you are on Linux you can try OS-level profiling with a Flame Graph.
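
Since the problem is only occasional, one option is to keep a rolling recording running from inside the JVM using the `jdk.jfr` API that ships with OpenJDK 11, and dump it when the CPU spike is noticed. A rough sketch (class name and retention window are arbitrary):

```java
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

// Sketch: keep a rolling "profile" flight recording and dump it on demand
// (e.g. from a diagnostic endpoint) for analysis in JDK Mission Control.
public class RollingFlightRecording {

    private final Recording recording;

    public RollingFlightRecording() throws Exception {
        Configuration profile = Configuration.getConfiguration("profile"); // built-in JFR config
        recording = new Recording(profile);
        recording.setMaxAge(Duration.ofMinutes(10)); // keep only the most recent events
        recording.setToDisk(true);
        recording.start();
    }

    public void dump(Path target) throws Exception {
        recording.dump(target); // writes a .jfr file
    }
}
```

The same recording can also be started externally with `jcmd <pid> JFR.start` if someone is around to catch the spike.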

If you suspect that the CPU runs wild due to insufficient memory, follow this answer and open the heap dump in Eclipse Memory Analyzer (MAT). Your application is multi-threaded, so unless a single thread is allocating a humongous array, the thread that throws the OutOfMemoryError might not be the thread that consumes the memory. I'd probably do this as step one and ensure that the application has enough heap memory before diving into CPU profiling.
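
The heap dump itself can come from `-XX:+HeapDumpOnOutOfMemoryError`, from `jmap`, or from inside the application via the HotSpot diagnostic MXBean. A minimal sketch of the in-process route (class name and output path are just examples):

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

// Sketch: capture a heap dump on demand so it can be opened in MAT.
public class HeapDumper {

    public static void dump(String outputPath) throws IOException {
        HotSpotDiagnosticMXBean hotspot =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        hotspot.dumpHeap(outputPath, true); // true = dump only live (reachable) objects
    }
}
```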

Karol Dowbecki

A JVM about to run out of memory spends most of its time doing garbage collection.

Specifically, the cost of garbage collection is proportional to the memory used, and its frequency is inversely proportional to the memory freed. Therefore, as memory usage approaches 100%, garbage collection overhead tends towards infinity ... and that's actually what causes the JVM to abort with an OutOfMemoryError (it's not that the JVM couldn't free any more memory if it tried, but that the effort of collection is wholly disproportionate to the memory freed).

You can check whether this is the cause of your CPU problems by inspecting JVM metrics, specifically the garbage collection overhead. You can inspect these metrics with JConsole or any other JMX client.
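
If you'd rather check this from inside the application (or script it), the underlying numbers (collection counts and accumulated collection time) are exposed through the platform `GarbageCollectorMXBean`s. A minimal sketch that estimates the share of JVM uptime spent collecting:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: approximate GC overhead as accumulated collection time divided by JVM uptime.
// A value approaching 1.0 means the JVM spends almost all of its time collecting garbage.
public class GcOverhead {

    public static double fractionOfUptimeInGc() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 if undefined for this collector
            if (time > 0) {
                gcMillis += time;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis > 0 ? (double) gcMillis / uptimeMillis : 0.0;
    }
}
```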

If JVM metrics confirm that most CPU time is spent in GC, fixing your memory problem will be enough. To learn how to fix memory problems, see How to identify the issue when Java OutOfMemoryError?

meriton
  • What’s strange is that the application uses all the CPU cores assigned via cgroups (specifically all three) before going down. Could it be indicative of insufficient heap memory assigned? – synapse Jul 29 '19 at 20:11
  • That would be consistent with a GC problem; most GC algorithms employ several threads. So yes, it could be too little memory ... or a memory leak. – meriton Jul 29 '19 at 20:45
  • See update. I thought it would start doing "stop the world" collections in a single thread when really low on memory. Anyway, I'll start by examining heap dump and then move to profilers and `perf`. – synapse Jul 29 '19 at 22:00