
7.05.2012

How to analyze Thread Dump – Part 5: Thread Stack Trace

This article is part 5 of our Thread Dump analysis series. So far you have learned the basic principles of Threads and their interactions with your Java EE container & JVM. You have also learned the different Thread Dump formats for the HotSpot and IBM Java VMs. It is now time to dive deep into the analysis process.

** UPDATE: Thread Dump analysis tutorial videos now available here.

In order to quickly identify a problem pattern from a Thread Dump, you first need to understand how to read a Thread Stack Trace and how to get the “story” right. This means that if I ask you what Thread #38 is doing, you should be able to answer precisely, including whether the Thread Stack Trace shows a healthy (normal) condition or a hang condition.

Java Stack Trace revisited

Most of you are familiar with Java stack traces. This is typical data that we find in server and application log files when a Java Exception is thrown. In this context, a Java stack trace gives us the code execution path of the Thread that triggered the Java Exception, such as a java.lang.NoClassDefFoundError, java.lang.NullPointerException etc. This code execution path allows us to see the different layers of code that ultimately led to the Java Exception.

Java stack traces must always be read from the bottom up:
-        The line at the bottom exposes the originator of the request, such as a Java / Java EE container Thread.
-        The first line at the top of the stack trace shows the Java class where the last Exception was triggered.

Let’s go through this process via a simple example. We created a sample Java program that simply executes some Class method calls and throws an Exception. The program output is shown below:

JavaStrackTraceSimulator
Author: Pierre-Hugues Charbonneau

Exception in thread "main" java.lang.IllegalArgumentException:
        at org.ph.javaee.training.td.Class2.call(Class2.java:12)
        at org.ph.javaee.training.td.Class1.call(Class1.java:14)
        at org.ph.javaee.training.td.JavaSTSimulator.main(JavaSTSimulator.java:20)

-        Java program JavaSTSimulator is invoked (via the “main” Thread)
-        The simulator then invokes method call() from Class1
-        Class1 method call() then invokes Class2 method call()
-        Class2 method call() throws a Java Exception: java.lang.IllegalArgumentException
-        The Java Exception is then displayed in the log / standard output

As you can see, the code execution path that led to this Exception is always displayed from the bottom up.
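
For reference, a program producing a stack trace of this shape could look like the minimal sketch below. This is a hypothetical reconstruction, not the original source: only the class and method names are taken from the trace above. The package declaration is omitted, the method bodies are assumptions, and the line numbers in its output will differ from those shown.

```java
// Hypothetical reconstruction of the simulator; only the class and method
// names come from the stack trace above, the bodies are assumptions.
class JavaSTSimulator {
    public static void main(String[] args) {
        // Bottom of the stack trace: the "main" Thread enters here
        new Class1().call();
    }
}

class Class1 {
    void call() {
        // Middle frame: Class1 delegates to Class2
        new Class2().call();
    }
}

class Class2 {
    void call() {
        // Top frame: the Exception is thrown here and, being uncaught,
        // propagates back through Class1.call() and main()
        throw new IllegalArgumentException();
    }
}
```

Running it prints a trace whose top frame is Class2.call() and whose bottom frame is JavaSTSimulator.main(), which matches the bottom-up reading described above.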

The above analysis process should be well known to any Java programmer. What you will see next is that the Thread Dump Thread stack trace analysis process is very similar to the Java stack trace analysis above.

Thread Dump: Thread Stack Trace analysis

A Thread Dump generated from the JVM provides you with a code-level execution snapshot of all the “created” Threads of the entire JVM process. “Created” does not mean that all these Threads are actually doing something. In a typical Thread Dump snapshot generated from a Java EE container JVM:

-        Some Threads could be performing raw computing tasks such as XML parsing, IO / disk access etc.
-        Some Threads could be waiting for some blocking IO calls such as a remote Web Service call, a DB / JDBC query etc.
-        Some Threads could be involved in garbage collection at that time e.g. GC Threads
-        Some Threads will be waiting for some work to do (Threads not doing any work typically go in wait() state)
-        Some Threads could be waiting for some other Threads’ work to complete e.g. Threads waiting to acquire a monitor lock (synchronized block{}) on some objects
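
Two of the situations listed above (a Thread with no work parked in wait(), and a Thread waiting to acquire a monitor lock) can be reproduced with a few lines of Java. The sketch below is only an illustration under assumptions: the class, method and Thread names are hypothetical, and it reads the states through Thread.getState() rather than from an actual Thread Dump.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch: park one Thread in wait() and block another on a monitor,
// then read their states. All names here are hypothetical.
class ThreadStateDemo {

    // Returns { state of the idle Thread, state of the blocked Thread }
    static Thread.State[] observe() {
        final Object idleSignal = new Object();
        final Object sharedLock = new Object();
        final AtomicBoolean workAvailable = new AtomicBoolean(false);

        // A Thread with no work to do: it parks itself in wait()
        Thread idle = new Thread(() -> {
            synchronized (idleSignal) {
                while (!workAvailable.get()) {
                    try {
                        idleSignal.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        }, "idle-worker");

        // A Thread that needs a monitor lock currently held by another Thread
        Thread blocked = new Thread(() -> {
            synchronized (sharedLock) {
                // nothing to do once the lock is finally acquired
            }
        }, "blocked-worker");

        idle.setDaemon(true);
        blocked.setDaemon(true);

        Thread.State[] states = new Thread.State[2];
        try {
            synchronized (sharedLock) { // the calling Thread holds the lock
                idle.start();
                blocked.start();
                long deadline = System.currentTimeMillis() + 5000;
                // Poll until both Threads reached the expected state (or give up)
                while ((idle.getState() != Thread.State.WAITING
                        || blocked.getState() != Thread.State.BLOCKED)
                        && System.currentTimeMillis() < deadline) {
                    Thread.sleep(10);
                }
                states[0] = idle.getState();
                states[1] = blocked.getState();
            }
            // Release both Threads so the demo terminates cleanly
            synchronized (idleSignal) {
                workAvailable.set(true);
                idleSignal.notifyAll();
            }
            idle.join();
            blocked.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing", e);
        }
        return states;
    }

    public static void main(String[] args) {
        Thread.State[] states = observe();
        System.out.println("idle-worker:    " + states[0]);
        System.out.println("blocked-worker: " + states[1]);
    }
}
```

Under normal conditions the idle Thread is reported as WAITING and the other one as BLOCKED. A Thread Dump taken while the lock is still held would show both Threads in those same conditions.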

I will get back to the above with more diagrams in my next article but for now let’s focus on the stack trace analysis process. Your next task is to be able to read a Thread stack trace and understand what the Thread is doing, to the best of your knowledge.

A Thread stack trace provides you with a snapshot of its current execution. The first line typically includes native information of the Thread such as its name, state, address etc. The current execution stack trace has to be read from the bottom up. The more experience you get with Thread Dump analysis, the faster you will be able to read a stack trace and identify the work performed by each Thread. Please follow the analysis process below:

-        Start to read the Thread stack trace from the bottom
-        First, identify the originator (Java EE container Thread, custom Thread, GC Thread, JVM internal Thread, standalone Java program “main” Thread etc.)
-        The next step is to identify the type of request the Thread is executing (WebApp, Web Service, JMS, Remote EJB (RMI), internal Java EE container etc.)
-        The next step is to identify from the execution stack trace the application module(s) involved i.e. the actual core work the Thread is trying to perform. The complexity of the analysis will depend on the layers of abstraction of your middleware environment and application
-        The next step is to look at the last ~10-20 lines you reach before the first line (that is, the lines just below the top of the stack trace). Identify the protocol or work the Thread is involved with e.g. HTTP call, Socket communication, JDBC or raw computing tasks such as disk access, class loading etc.
-        The next step is to look at the first line. The first line usually tells a LOT about the Thread state since it is the piece of code being executed at the time you took the snapshot
-        The combination of the last 2 steps will give you the core information needed to conclude what work and / or hang condition the Thread is involved with
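
The first and last of these steps can be sketched in code. The example below is an illustration only, with hypothetical class and method names: it uses the standard Thread.getAllStackTraces() call to take an in-process snapshot of the live Java Threads, then prints the bottom frame (the originator) and the top frame (the code executing at snapshot time) for each one. It does not replace reading a real Thread Dump, which contains more detail.

```java
import java.util.Map;

// Minimal sketch of the first and last steps of the process above.
// The class and method names are hypothetical.
class StackTraceReader {

    // Bottom frame: the originator of the Thread
    static String originator(StackTraceElement[] stack) {
        return describe(stack[stack.length - 1]);
    }

    // Top frame: the code being executed when the snapshot was taken
    static String currentFrame(StackTraceElement[] stack) {
        return describe(stack[0]);
    }

    private static String describe(StackTraceElement frame) {
        return frame.getClassName() + "." + frame.getMethodName();
    }

    public static void main(String[] args) {
        // Thread.getAllStackTraces() returns a snapshot of all live Java Threads
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] stack = entry.getValue();
            if (stack.length == 0) {
                continue; // some Threads expose no Java frames
            }
            Thread thread = entry.getKey();
            System.out.println(thread.getName() + " [" + thread.getState() + "]");
            System.out.println("  originator: " + originator(stack));
            System.out.println("  current:    " + currentFrame(stack));
        }
    }
}
```

The middle steps (type of request, application modules, protocol) still require reading the frames in between, since they depend on the middleware and application packages involved.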

Find below a visual breakdown of the above steps using a real example of a Thread Dump Thread stack trace captured from a JBoss 5 production environment. In this example, many Threads were showing a similar problem pattern of excessive IO when creating new JAX-WS Service instances.


As you can see, the last 10 lines along with the first line will tell us what hanging or slow condition the Thread is involved with, if any. The lines from the bottom will give us details on the originator and type of request.

I hope this article has helped you understand the importance of proper Thread stack trace analysis. I will get back with many more Thread stack trace examples when we cover the most common Thread Dump problem patterns in future articles. The next article will teach you how to break down the Thread Dump threads into logical silos and come up with a potential list of root cause “suspects”.

Please feel free to post any comments or questions.

54 comments:

  1. Emanuele Gherardini, July 7, 2012 at 7:19 AM

    Thanks a lot for this series, it is indeed useful and very well explained.

  2. Thanks Emanuele for your comments,

    Please feel free to ask any question on this. I'm planning to release the next article of this series in about 2 weeks.

    Regards,
    P-H

  3. Thanks for the details on this.. Can you please explain some threads waiting on SocketRead, ASyncLibrary etc

  4. Hi Anonymous,

    Sure, the next articles will include many more examples, including common blocking IO calls.

    Regards,
    P-H

  5. Very informative. Eagerly waiting for more thread dump analysis

  6. Thanks Allidonow,

    I'm hoping to have the next article ready in one week from now.

    Regards,
    P-H

  7. Hi Pierre,

    I've been reading your Thread Dump series of posts and found them most educating. I'm new to WebLogic administration and not being from a Java background have been struggling to grasp the concepts of JVM & Threads Dumps. So these posts are helpful to me.
    Could you possibly include a few sample dumps at the end of each post. Would serve as a small exercise to newbies like me.

    Regards,
    Ani

  8. Hi Ani,

    Glad to see that this series is helpful to you.

    The next articles will include sample Thread Dumps as well so you can practice yourself.

    Regards,
    P-H

  9. P-H

    This an awesum tutorial on thread dump. i read all of them and practiced on my real dumps. And i guess it worked very well. Thanks for sharing such a good stuff !!!

  10. Thanks Gaurav for your comments.

    The series is not over yet so please stay tuned for more updates.

    Thanks.
    P-H

  11. Thanks Rob for your comments,

    Not at this time but I will continue to write articles to complete the Thread Dump analysis series with more problem patterns and examples.

    Regards,
    P-H

  12. thanks a lot for the nice articles you are posting, waiting for the next one
    we faced a very critical problem in our production environment, the CPU becomes full and everything is stuck till we restart the whole server (we are using weblogic 10.3.5.0)

    after checking the thread dumps we found stuck threads, the code related to our application reads from shared application module, but it is a very normal viewobject



    any idea why does this happen? thank you in advance

  13. Thanks for the nice post!

    The thread dump is not give us the full stack trace -
    ajp-0.0.0.0-8009-7 tid=32483751528 [NEW]
    java.net.SocketInputStream.$$YJP$$socketRead0(FileDescriptor, byte[], int, int, int)
    java.net.SocketInputStream.socketRead0(FileDescriptor, byte[], int, int, int)
    java.net.SocketInputStream.read(byte[], int, int)
    org.apache.coyote.ajp.AjpProcessor.read(byte[], int, int)
    org.apache.coyote.ajp.AjpProcessor.readMessage(AjpMessage)
    org.apache.coyote.ajp.AjpProcessor.process(Socket)
    org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(Socket)
    org.apache.tomcat.util.net.JIoEndpoint$Worker.run()
    java.lang.Thread.run()


    We are using jboss 5. do we have to pass any jvm params?

  14. Hi P-H,

    The posts were really awesome !!

    As a weblogic administrator I really find it hard to pin point the application which is causing stuck threads . The problem pops when the application doesn't use any workmanager and threads are tagged to the default one. Is there an easy way to find out which application is causing the issue ? In my environment we have around 20 SOA Composites & 30 + OSB Projects - in which only a few has workmanager tagged to them.

    Thanks
    Shankar M

  15. Hi anonymous,

    This is actually a full Stack Trace. It simply means that the Thread is waiting for an incoming request i.e. it is not stuck.

    Do you have any other STUCK thread you want to share from one of your problem cases?

    Thanks.
    P-H

  16. Thanks Shankar for your comments,

    Actually yes. In order to correlate stuck threads with which EAR file etc. you need to go to the monitoring/Threads tab. Then select customize and add the "application" & work manager columns.

    The thread matrix will then display the full list of threads along with the associated WorkManager and application (EAR file).

    Regards,
    P-H

  17. Thanks Nice Article,

    Loooking forward for next articles

  18. Nice work. I was eagerly looking for this kind of article.

    Eagerly waiting for your upcoming articles regarding the same.

    I love the way you have expalianed the things.

  19. Thanks Satish and Srinivas for your comments,

    The next articles of this series will show you the common problem patterns found from various Thread Dump sample data.

    Regards,
    P-H

  20. This is wonderful! Eagerly waiting for the Part-6 release. Please post it ASAP

  21. Thanks anonymous,

    I'm currently working on part 6 and expecting to post shortly.

    Regards,
    P-H

  22. Thanks for the nice posts, I am waiting for your part 6 for common pattern problems.

  23. Thanks P Samaiya for your positive feedback,

    The part 6 will describe the general Thread Dump analysis approach I use and recommend. Future articles will describe the various problem patterns.

    Thanks.
    P-H

  24. Hi P-H,

    Very useful information I got from this article. The way you have explained that is really helpful to understand TD in right direction.

    Thanks,
    Sujit

  25. Thanks Sujit,

    I will work on part 6 and publish it shortly. I will then kick off a new series discussing the different Thread Dump problem patterns.

    Thanks.
    P-H

  26. Hi P-H thanks for such well written posts. I am waiting for your 6th page, hope to see it soon

  27. Thanks gcharan09 for your comments,

    The part6 of the series will be coming soon...

    Thanks.
    P-H

  28. Hi P-H,

    Thanks a lot for the nice articles, waiting for the next article.

  29. How to create thread dump??? is there any software required for this or by coding itself it can be created.whatever will be ,,plz give proper light on how to create dump???

  30. Hi anonymous,

    I will release a separate post & video on how to create a thread dump from various OS. Essentially the easiest way to create a Thread Dump is to run the following command from a UNIX-based OS:

    kill -3 <PID>
    CTRL + Break for Windows (if you have access to the Java console)

    You can also use the jstack utility (HotSpot JVM) or the JVisualVM monitoring tool.

    Regards,
    P-H

  31. Hi P-H,

    When are you going to post the next part?

  32. My next article will be on thread dump analysis part 6. It should be released in the next 5-7 days.

    After that, we will study different thread dump analysis patterns.

    Regards,
    P-H

  33. Wonderful series of unique knowledge. Really helpful.
    Thanks a lot & waiting for the next

  34. HI P-H,

    This is Gaurav and i just start following you articles. All are really nice and knowledgeable.
    I am running with a serious server down problem on production from last couple of weeks. I will send you thread dump from prod.

    Please share you email id...

    Also please share your latest (6th) thread dump analysis link in which you will show how to conclude Root Cause and provide solution........

  35. Fantastic job Pierre, i have learned a lot with this serie and i hope that you may continue to share your knowledge with us.

    Thank you very much.

    PD
    this site is already on my bookmarks along with stackoverflow, theserverside and javacodegeeks

  36. Excellent series of Articles. Looking forward to next parts.

  37. Thank you Upendra,

    My next article will provide 10 tips on how to analyze a JVM thread dump. We will then proceed with problem patterns and case studies using simulation programs.

    Regards,
    P-H

  38. Great work Pierre , Your time in sharing the knowledge is really appreciated.

    Thank you for all the good work.

  39. Please advise how we can analyse a "STUCK THREAD" problem.

  40. Thanks for all the comments, I will release more tips regarding thread dump analysis as soon as possible.

    P-H

  41. Hi Pierre,

    Please advise how to find the details about waiting on condition threads. I am using IBM thread and monitor anlayser.

  42. Best tutorial on thread dump analysis on the Internet. I'm sure there are lots of people keen for Part 6!

  43. Talk about a cliff hanger. Part 6 is exactly what I need. Hope you continue with this blog

  44. Hi, yes I will move on with part 6 shortly, thank you for your support.

    P-H

  45. Hello P-H,

    Very nice. Thanks for providing us very informative concepts about thread dump analysis.
    If possible, can you also post some articles about the usage of tools for ThreadDump Analysis. I am familiar with Samurai and IBM TDA but do not know how to extensively use those tools. If you know about any tool which is good to use and you have good knowledge, please share with us how to use that. It will be helpful for many others who ae willing to learn.

    Thanks.
    Pranay.

  46. Thanks, I am planning to release thread dump analysis articles and videos using TDA and one more tool. That being said, these tools only help to consolidate the thread data; they won't figure out the root cause for you, so I would still recommend learning how to analyze thread dumps using the raw data in parallel with other tools.

    Thanks.
    P-H

  47. Good to hear about part 6 coming out. I just stumbled across your blog and plan to read a lot more. -Allan

  48. Thanks. I will be publishing more articles on Thread Dump analysis given it is a common learning pressure point across delivery and support individuals.

    I am also working on a very low cost course that will be available via the Udemy platform. Please stay tuned for more updates...

    Thanks.
    P-H

  49. PH,

    This thread dump doesn't give much information, could you please let us know what's going on?, tomcat crashed is the issue here

    com.adtran.pma.agent.PmaAgent.sendReceiveNetconfMsg(java.lang.String, java.lang.String) @bci=0 (Interpreted frame)
    - com.adtran.pma.northbound.camel.NetConfProducer.process(org.apache.camel.Exchange) @bci=152, line=39 (Interpreted frame)
    - org.apache.camel.util.AsyncProcessorConverterHelper$ProcessorToAsyncProcessorBridge.process(org.apache.camel.Exchange, org.apache.camel.AsyncCallback) @bci=21, line=61 (Interpreted frame)
    - org.apache.camel.processor.SendProcessor$2.doInAsyncProducer(org.apache.camel.Producer, org.apache.camel.AsyncProcessor, org.apache.camel.Exchange, org.apache.camel.ExchangePattern, org.apache.camel.AsyncCallback) @bci=45, line=152 (Interpreted frame)
    - org.apache.camel.impl.ProducerCache.doInAsyncProducer(org.apache.camel.Endpoint, org.apache.camel.Exchange, org.apache.camel.ExchangePattern, org.apache.camel.AsyncCallback, org.apache.camel.AsyncProducerCallback) @bci=203, line=304 (Interpreted frame)
    - org.apache.camel.processor.SendProcessor.process(org.apache.camel.Exchange, org.apache.camel.AsyncCallback) @bci=189, line=147 (Interpreted frame)
    - org.apache.camel.management.InstrumentationProcessor.process(org.apache.camel.Exchange, org.apache.camel.AsyncCallback) @bci=47, line=72 (Interpreted frame)
    - org.apache.camel.processor.RedeliveryErrorHandler.process(org.apache.camel.Exchange, org.apache.camel.AsyncCallback) @bci=459, line=416 (Interpreted frame)
    - org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun() @bci=166, line=1526 (Interpreted frame)
    - org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run() @bci=63, line=1482 (Interpreted frame)
    - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Interpreted frame)
    - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Interpreted frame)
    - org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run() @bci=4, line=61 (Interpreted frame)
    - java.lang.Thread.run() @bci=11, line=745 (Interpreted frame)

  50. Hi PH,
    I am not able to find the next article in this series as
    How to analyze Thread Dump – Part 6: Thread Stack Trace,Please help me to find that. also thanks for such a great series.

  51. Hi PH,
    Even I am not able to find the next article in this series as
    How to analyze Thread Dump – Part 6: Thread Stack Trace,Please help me to find that. also thanks for such a great series.

  52. Hi,

    The next focus will be on YouTube videos, which are a much more interactive way to describe the Thread Dump analysis process and understand common problem patterns.

    I will release a new series this fall on YouTube, including JVM troubleshooting on the Cloud.

    http://www.youtube.com/c/PierreHuguesCharbonneau

    Thanks.
