When facing a stuck Thread problem, a common question is: how can I dynamically kill the observed stuck Threads in order to quickly recover my middleware environment?
This is a question I get quite often from colleagues and clients.
This short article explains why killing Threads is neither a good idea nor possible under the current Java specifications, and outlines high-level strategies to prevent stuck Threads in the first place.
Stuck Thread – What is it?
Stuck Thread problems are very common and can be hard to solve. A stuck Thread is a Thread that is hanging and / or has stopped making progress on its assigned task, for reasons such as:
· Thread waiting on a blocking IO call, e.g. a hanging socket.read() operation
· Thread waiting to acquire a lock on an Object monitor (synchronized)
· Thread forced into the wait() state, e.g. waiting to acquire a free JDBC Connection from a pool
· Thread involved in a real deadlock scenario, e.g. Thread A waiting on an Object monitor held by Thread B while Thread B waits on an Object monitor held by Thread A
· Thread hanging on a disk IO operation
· Thread hanging / paused due to excessive garbage collection
· More scenarios…
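The deadlock scenario in the list above can be reproduced and detected from inside the JVM. The following is a minimal sketch, not a production diagnostic: the class name DeadlockProbe, the thread names and the polling interval are assumptions made for illustration. It relies on the standard ThreadMXBean.findMonitorDeadlockedThreads() call, which reports Threads stuck in a cycle of Object monitors.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: creates a real monitor deadlock, then asks the JVM to report it.
class DeadlockProbe {

    /** Starts two daemon threads that take the same two monitors in opposite order. */
    static void startDeadlockedPair() {
        final Object lockA = new Object();
        final Object lockB = new Object();
        // Each thread counts down once it holds its first monitor, so both
        // own one lock before either of them asks for the other one.
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread a = new Thread(() -> lockInOrder(lockA, lockB, bothHoldFirstLock), "demo-thread-A");
        Thread b = new Thread(() -> lockInOrder(lockB, lockA, bothHoldFirstLock), "demo-thread-B");
        a.setDaemon(true); // daemon threads do not keep the JVM alive at exit
        b.setDaemon(true);
        a.start();
        b.start();
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch latch) {
        synchronized (first) {
            latch.countDown();
            try {
                latch.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // never reached: the other thread owns this monitor and waits for ours
            }
        }
    }

    /** Polls the JVM until it reports monitor-deadlocked threads or the timeout elapses. */
    static ThreadInfo[] waitForDeadlock(long timeoutMillis) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock exists
            if (ids != null) {
                return mx.getThreadInfo(ids);
            }
            Thread.sleep(50);
        }
        return new ThreadInfo[0];
    }

    public static void main(String[] args) throws InterruptedException {
        startDeadlockedPair();
        for (ThreadInfo info : waitForDeadlock(5000)) {
            if (info != null) {
                System.out.println(info.getThreadName() + " is " + info.getThreadState()
                        + ", waiting on a monitor held by " + info.getLockOwnerName());
            }
        }
    }
}
```

Detection of this kind only tells you that the Threads are stuck and which Thread owns the contended monitor; it does not free them.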
Exactly the problem I’m facing, please tell me how I can kill these stuck Threads
The simple answer is that you cannot, at least not safely. Earlier Java specifications defined two methods for this purpose: Thread.stop() and Thread.destroy(). Thread.stop() was implemented but has long been deprecated as inherently unsafe: stopping a thread forces it to release all the monitors it holds, which can expose objects in an inconsistent state to other threads.
Thread.destroy() was originally designed to destroy a thread without any cleanup:
· Any monitors it held would have remained locked. However, the method was never implemented. If it had been, it would be deadlock-prone in much the same manner as suspend().
· If the target thread held a lock protecting a critical system resource when it was destroyed, no thread could ever access this resource again.
· If another thread ever attempted to lock this resource, deadlock would result. Such deadlocks typically manifest themselves as "frozen" processes.
You can also consult the official Oracle documentation on this subject.
As you can see, because of the above risk scenarios, the JVM provides no safe Thread stop mechanism, which is why your middleware vendor cannot expose a Thread stop button / functionality to the end user / middleware administrator.
In order to terminate stuck Threads, you have to bring down the entire JVM.
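To illustrate why there is no gentle alternative either, here is a small sketch showing that Thread.interrupt() has no effect on a Thread waiting to enter a synchronized block: the interrupt flag is set, but the Thread stays BLOCKED until the monitor is released. The class name BlockedThreadDemo and the timings are assumptions chosen for the demo.

```java
// Hypothetical demo class: a thread stuck on an Object monitor ignores interrupt().
class BlockedThreadDemo {

    /**
     * Returns the worker's state observed after interrupt() was called on it
     * while it was waiting to enter a synchronized block.
     */
    static Thread.State interruptWhileBlocked() throws InterruptedException {
        final Object monitor = new Object();
        Thread worker = new Thread(() -> {
            synchronized (monitor) {
                // only reached once the calling thread releases the monitor
            }
        }, "demo-blocked-worker");
        worker.setDaemon(true);

        Thread.State observed;
        synchronized (monitor) {      // the calling thread owns the monitor
            worker.start();
            while (worker.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10);     // wait until the worker is stuck on the monitor
            }
            worker.interrupt();       // only sets the interrupt flag
            worker.join(200);         // give the worker a chance to terminate (it will not)
            observed = worker.getState();
        }
        worker.join(2000);            // monitor released: the worker can now finish
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("State after interrupt(): " + interruptWhileBlocked());
    }
}
```

In this sketch the calling thread eventually releases the monitor, so the worker finishes. In a real deadlock the owner never does, which is why a restart of the JVM is the only way out.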
Any proposed solution?
Your best strategy is to prevent stuck Threads in the first place as much as possible, and to perform proper root cause analysis & Thread Dump analysis after any stuck Thread incident. As the scenarios above show, typical stuck Thread problems are just “symptoms” of other problems, so solutions include:
· Proper timeout implementation to prevent Threads from hanging forever on blocking IO calls. Most communication APIs expose timeout settings that let you cap, in seconds, how long your Threads may wait on remote resources (Web Services etc.)
· Application code should be reviewed and optimized in order to eliminate incorrect synchronized usage
· Proper capacity planning of your environment is a must in order to prevent overload of key infrastructure components such as an Oracle DB, which can trigger slow running queries and stuck Threads on the middleware side
· Deadlock scenarios are usually symptoms of a code problem / code that is not Thread safe, e.g. re-using the same JDBC Connection object between 2..n Threads
· Excessive IO / disk access can trigger stuck Threads; again, symptoms of application problem(s) such as performing too many IO / class loading calls
· Lack of tuning, capacity planning and / or a Java Heap memory leak may lead to excessive garbage collection and inevitably to stuck Threads; again, symptoms of a bigger problem which requires proper Java Heap tuning and Heap Dump / application memory footprint analysis
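The first two strategies above can be sketched in plain Java. This is an illustration under stated assumptions, not a recommended implementation: the class name TimeoutSketch and its method names are hypothetical, a local server socket that never answers stands in for a hung remote service, and ReentrantLock.tryLock() with a timeout is shown as one possible alternative to a plain synchronized block rather than something this article prescribes. Socket.setSoTimeout() and ReentrantLock.tryLock(long, TimeUnit) are standard JDK calls.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: bounded waits instead of waits that can last forever.
class TimeoutSketch {

    /** Reads from a peer that never answers; returns true if the read timed out. */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        // The server socket lets the connection complete through its backlog
        // but never writes a byte, like a hung remote service.
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(), silentServer.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // cap on how long read() may block
            try {
                client.getInputStream().read();
                return false;
            } catch (SocketTimeoutException expected) {
                return true; // the thread gets control back instead of hanging
            }
        }
    }

    /** Tries to take a lock held by another thread; returns true if acquired in time. */
    static boolean acquireWithinTimeout(long timeoutMillis) throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch lockTaken = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                lockTaken.countDown();
                done.await();          // keep the lock until the caller has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        }, "demo-lock-holder");
        holder.setDaemon(true);
        holder.start();
        lockTaken.await();             // make sure the other thread owns the lock

        // Unlike synchronized, tryLock gives up after the timeout.
        boolean acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            lock.unlock();
        }
        done.countDown();              // let the holder thread release and exit
        return acquired;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("read timed out: " + readTimesOut(500));
        System.out.println("lock acquired: " + acquireWithinTimeout(500));
    }
}
```

In both cases the calling Thread regains control after the timeout and can log, retry or fail the request, instead of staying stuck until the JVM is restarted.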
I hope this article has helped you understand why you cannot and should not rely on Thread.stop() to fix your stuck Thread problems, and which strategies help prevent these problems in the first place.
Please don’t hesitate to post any comment or question about any stuck Thread problem you are currently facing.
7 comments:
Hi P-H
How do I do Thread Dump analysis after a stuck thread incident? I read your thread dump analysis series; it talks about the various sections in a thread dump. I hope you will soon cover how to identify the root cause of a stuck thread from the dump.
Regards,
Neel
Hi Neel and thanks for your comment,
This subject will be covered starting with part 5 of the series. The first parts were meant to give you background and an understanding of the Thread Dump structure.
Part 5 and later will show you how to identify patterns and pinpoint the root cause.
Regards,
P-H
Hi PH,
I have found this article and the 5-part series on Thread Dump Analysis extremely useful. I thank you for sharing your knowledge!
Hi,
I am also facing the same issue. Ultimately I am using Thread.stop() to kill the thread. Is there any other way?
Regards,
Abhishek Kumar
Well-written IO/sync APIs are supposed to support Thread.interrupt(), which is a clean and reliable way to stop a Thread.
See
java.nio.channels.InterruptibleChannel
java.util.concurrent.locks.Lock.lockInterruptibly()
Thanks for your comments,
You are correct, well-written IO/sync APIs do provide a reliable way to stop a thread via Thread.interrupt(). Unfortunately, a lot of APIs out there still use plain blocking IO implementations. There are also many other scenarios, not IO related, that can cause threads to get stuck forever, such as heavy lock contention (especially when not using a ReentrantLock strategy), deadlock etc.
The main point of the article is that current Java EE containers & JVM specifications do not offer any reliable & safe way to terminate stuck threads via admin tasks. A full shutdown & restart of the affected JVM processes is required to clean up those forever stuck threads.
Regards,
P-H
Thanks, another good video.
The producer–consumer problem (also known as the bounded-buffer problem) is a classic example of a multi-process synchronization problem. The problem describes two processes, the producer and the consumer, who share a common, fixed-size buffer used as a queue.
http://www.youtube.com/watch?v=dUwboVZ59KM
Learn when and how to stop a Thread, using jconsole or jvisualvm to identify whether the thread is running
http://www.youtube.com/watch?v=3_Bqhw0d2ko
http://www.youtube.com/watch?v=3E3gNReWCfM