7

I'm running Tomcat 7.0.53 on CentOS 6.5 64 bit and OpenJDK 1.7 64.

I have several servers where occasionally - like randomly once per week - Tomcat will just gracefully shut itself down. I've been looking at this for months and cannot find the reason. The only pattern seems to be a long period of nothing going on then the shutdown.

  • I have upgraded to latest versions of the underlying tools (Tomcat, Java, etc)
  • I am running with the default memory settings
  • I have disabled the SHUTDOWN in the server.xml
  • I have moved MySQL to another box to ensure Tomcat is the only substantial application (server also runs nginx)
  • I have verified that I have no System.exit() in any of my code. I have not scanned the libs as I can't figure out how to do that. And I don't expect I will find anything there either. Why would a library exit anyway?
  • I have enabled logging of garbage collection. But they seem pretty quick - like full GC in 0.15s

Here is the memory usage of one of the servers. Lots of RAM free.

                 total       used       free     shared    buffers     cached
    Mem:          2006        771       1234          0        176        281
    -/+ buffers/cache:        313       1692
    Swap:         2047          0       2047

Below is one of the events in the Catalina.out. You can see it starts then does nothing exciting for several hours. Then gracefully shuts itself down as if it was told to.

I've researched this to death and have not bee able to get a handle on this.

Can someone please propose a plan of action for me?

Thanks

From the Server.xml:

    <Server port="-1" shutdown="__SHUTDOWN__">

From Catalina.out:

Apr 28, 2014 5:34:50 PM org.apache.tomcat.util.digester.SetPropertiesRule begin
WARNING: [SetPropertiesRule]{Server/Service/Engine/Host/Valve} Setting property 'remoteIpProxiesHeader' to 'x-forwarded-by' did not find a matching property.
Apr 28, 2014 5:34:50 PM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["http-apr-8080"]
Apr 28, 2014 5:34:50 PM org.apache.coyote.AbstractProtocol init
INFO: Initializing ProtocolHandler ["ajp-apr-8009"]
Apr 28, 2014 5:34:50 PM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 3247 ms
Apr 28, 2014 5:34:50 PM org.apache.catalina.core.StandardService startInternal
INFO: Starting service Catalina
Apr 28, 2014 5:34:50 PM org.apache.catalina.core.StandardEngine startInternal
INFO: Starting Servlet Engine: Apache Tomcat/7.0.53
Apr 28, 2014 5:34:59 PM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["http-apr-8080"]
Apr 28, 2014 5:34:59 PM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler ["ajp-apr-8009"]
Apr 28, 2014 5:34:59 PM org.apache.catalina.startup.Catalina start
INFO: Server startup in 8278 ms
Apr 28, 2014 5:41:53 PM org.apache.jasper.compiler.TldLocationsCache tldScanJar
INFO: At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol pause
INFO: Pausing ProtocolHandler ["http-apr-8080"]
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol pause
INFO: Pausing ProtocolHandler ["ajp-apr-8009"]
Apr 28, 2014 10:32:32 PM org.apache.catalina.core.StandardService stopInternal
INFO: Stopping service Catalina
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["http-apr-8080"]
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol stop
INFO: Stopping ProtocolHandler ["ajp-apr-8009"]
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol destroy
INFO: Destroying ProtocolHandler ["http-apr-8080"]
Apr 28, 2014 10:32:32 PM org.apache.coyote.AbstractProtocol destroy
INFO: Destroying ProtocolHandler ["ajp-apr-8009"]
May 05, 2014 8:10:32 PM org.apache.catalina.core.AprLifecycleListener init
INFO: Loaded APR based Apache Tomcat Native library 1.1.29 using APR version 1.3.9.
May 05, 2014 8:10:32 PM org.apache.catalina.core.AprLifecycleListener init
INFO: APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true].
May 05, 2014 8:10:32 PM org.apache.tomcat.util.digester.SetPropertiesRule begin
PrecisionPete
  • 2,417
  • 4
  • 25
  • 42
  • Use a profiler to evaluate the application in live. You should also analyze a heap dump and a thread dump to have better understanding of the problems within your applications. – Luiggi Mendoza May 06 '14 at 14:31
  • 1
    It's difficult to say, but tomcat shutting down is typically due to one or two things. 1) It's leaking ThreadLocals, check your JDBC driver, this is the most common source. 2) The server, or one of its applications has crashed irrecoverably (for instance in native code). Edit: Also explicit calls to the shutdown hook. :) – Mikkel Løkke May 06 '14 at 14:46
  • My initial gut feeling would be that this is not happening from inside the server, something in the OS is tripping the shutdown. But I haven't the faintest clue what magical process that could be. – Gimby May 06 '14 at 14:55
  • 1
    Change your tomcat ports in server.xml (specifically your shutdown port) and update your tomcat script that does shutdown / start to log when it's called and taking a snapshot of what's running. Also, do you have multiple versions of tomcat running or something else that might be hitting your shutdown listener port? – Jan Vladimir Mostert May 06 '14 at 15:17
  • check that your application does not have a memory leak look at 2nd answer here http://stackoverflow.com/questions/6470651/creating-a-memory-leak-with-java – mosaad May 06 '14 at 15:42
  • Periodic job that stupidly stops Tomcat? Perhaps some kind of attempt at log rotation? – Christopher Schultz May 06 '14 at 18:44
  • Thanks for the ideas... I did disable the shutdown port. So I know it's not that. And I believe a Kill will make more noise in the log than this. Also... I'm pretty sure it's not log rotation. I checked and the Linux logrotate does not have a definition for tomcat. And any log date changes do not coincide with the pause times. That leaves me with the really hard ones. – PrecisionPete May 07 '14 at 01:48

5 Answers5

8

Something is telling Tomcat to shut down.

Tomcat does not gracefully shut down when it's "in trouble" (out of memory, or anything else). Linux does not gracefully shut down processes when it's low on memory, it stops them hard with a kill -9, which leave no trace in the Tomcat log.

  • kill -15 will shut down tomcat gracefully, via the shut down hook. (The answer previously said kill -3 here, which is simply causing a thread dump.)

  • An internal exit() will shut it down gracefully as well, again via the shut down hook.

  • It can be shut down via JMX, and, of course the shut down command from Catalina.

So, someone, somewhere, is telling Tomcat to do this. This isn't some unknown, internal "problem". Those kind of things just kill tomcat outright, they don't ask politely.

One thing you might try is to enable DEBUG for logging, to see what chatter you get. If you get too much, you can try looking specifically at org.apache.catalina.core.StandardContext. This may well not give you more than it already being logged, but it may tell you something.

After that, if so motivated, and it continues, I'd suggest building Tomcat yourself, and interjecting some stack trace dumps.

For example, in org.apache.coyote.AbstractProtcol, you can add:

Exception e = new Exception("Shut down trace");
getLog().info("Shutting down protocol", e);

to the stop() method.

That will give you a solid stack trace to learn from whence this came. Tomcat isn't really a nasty event storm of just random messages floating around. The shut down is pretty synchronous. So, that stack trace will give you a solid lead as to who started the party. Then you can look from there to see how that could have happened.

There's a bunch of interfaces and abstractions in Tomcat, but really only a few implementations, so it's not impenetrable. And having a smoking stack trace will help immensely in pinning it down.

nwinkler
  • 46,078
  • 18
  • 137
  • 157
Will Hartung
  • 107,347
  • 19
  • 121
  • 195
  • Good point regarding injecting our own logs in the `stop()` method – Saif Asif May 07 '14 at 05:50
  • Short typo above : kill -3 (SIGQUIT) does not kill the JVM but produces a thread dump (you meant kill -15 (SIGTERM) that indeed gracefully shutdown the JVM and activates the various shutdown hooks). – SRG Aug 07 '15 at 06:49
1

A case in DC/OS

Recently we had a similar incident. Our tomcat is running in a docker as a service in a DC/OS environment. The stderr just said tomcat was killed, and there was no useful traces in any log files of tomcat.

It turned out the reason was simply that we did not allocate enough memory to the service. More debugging suggestions are here, but the key technique we used is to check the kernel log:

# journalctl -f _TRANSPORT=kernel
...
Mar 29 15:09:09 host-17 kernel: Memory cgroup out of memory: Kill process 7935 (java) score 1021 or sacrifice child
Mar 29 15:09:09 host-17 kernel: Killed process 7906 (java) total-vm:11583840kB, anon-rss:1032860kB, file-rss:30924kB, shmem-rss:0kB
...

Once we found that out, we increased the memory allocation to the service in DC/OS and the problem was solved.

leeyuiwah
  • 5,456
  • 4
  • 31
  • 54
0

How is your Tomcat started? If there is a parent process, a signal sent to the parent could be the cause of a clean shutdown.

nwinkler
  • 46,078
  • 18
  • 137
  • 157
Gerard H. Pille
  • 2,147
  • 1
  • 10
  • 15
0

There is a known (and fixed bug) in Tomcat versions prior to v7.0.55:

https://bz.apache.org/bugzilla/show_bug.cgi?id=56684

The thread is really good reading but I can summarize it for you. When you start Tomcat there is a wait state that starts waiting for the shutdown command. This is normal, but the bug in these versions is an implicit timeout on this wait of about 50 days. When the wait state timed out, the Tomcat would shutdown.

So there ARE cases of odd things happening in Tomcat and NO, you are not crazy. This was happening to a SOLR cluster that I maintained for more than a year before I found this bug and fixed it with an upgrade. No one ever noticed because it took so long to crash and with all that goes on it can be easy to brush past finding the root cause. The easiest thing is to just restart Tomcat and forget it ever happened.

I hope this answer helps you. I see that you are/were running a version with this bug.

Good luck!

e.thompsy
  • 1,208
  • 1
  • 15
  • 18
0

How is your Tomcat started? If there is a parent process, a signal sent to the parent could be the cause of a clean shutdown. I suggest adding set -m option into start_script.sh.

qudongfang
  • 141
  • 8