/ How to analyze Thread Dump – Part 3: HotSpot VM ~ Java EE Support Patterns

2.22.2012

How to analyze Thread Dump – Part 3: HotSpot VM

This is part 3 of our Thread Dump analysis series which will provide you with an overview of what is a JVM Thread Dump for the HotSpot VM and the different Threads that you will find. Detail for the IBM VM Thread Dump format will be provided in the part 4.

** UPDATE: Thread Dump analysis tutorial videos now available here.

Please note that you will find the Thread Dump sample used for this article from the root cause analysis forum.

JVM Thread Dump – what is it?

A JVM Thread Dump is a snapshot taken at a given time which provides you with a complete listing of all created Java Threads.

Each individual Java Thread found gives you information such as:

-        Thread name; often used by middleware vendors to identify the Thread Id along with its associated Thread Pool name and state (running, stuck etc.)

-        Thread type & priority ex: daemon prio=3 ** middleware softwares typically create their Threads as daemon meaning their Threads are running in background; providing services to its user e.g. your Java EE application **

-        Java Thread ID ex: tid=0x000000011e52a800 ** This is the Java Thread Id obtained via java.lang.Thread.getId() and usually implemented as an auto-incrementing long 1..n**

-        Native Thread ID ex: nid=0x251c** Crucial information as this native Thread Id allows you to correlate for example which Threads from an OS perspective are using the most CPU within your JVM etc. **

-        Java Thread State and detail ex: waiting for monitor entry [0xfffffffea5afb000] java.lang.Thread.State: BLOCKED (on object monitor)
** Allows to quickly learn about Thread state and its potential current blocking condition **

-        Java Thread Stack Trace; this is by far the most important data that you will find from the Thread Dump. This is also where you will spent most of your analysis time since the Java Stack Trace provides you with 90% of the information that you need in order to pinpoint root cause of many problem pattern types as you will learn later in the training sessions

-        Java Heap breakdown; starting with HotSpot VM 1.6, you will also find at the bottom of the Thread Dump snapshot a breakdown of the HotSpot memory spaces utilization such as your Java Heap (YoungGen, OldGen) & PermGen space. This is quite useful when excessive GC is suspected as a possible root cause so you can do out-of-the-box correlation with Thread data / patterns found

Heap
 PSYoungGen      total 466944K, used 178734K [0xffffffff45c00000, 0xffffffff70800000, 0xffffffff70800000)
  eden space 233472K, 76% used [0xffffffff45c00000,0xffffffff50ab7c50,0xffffffff54000000)
  from space 233472K, 0% used [0xffffffff62400000,0xffffffff62400000,0xffffffff70800000)
  to   space 233472K, 0% used [0xffffffff54000000,0xffffffff54000000,0xffffffff62400000)
 PSOldGen        total 1400832K, used 1400831K [0xfffffffef0400000, 0xffffffff45c00000, 0xffffffff45c00000)
  object space 1400832K, 99% used [0xfffffffef0400000,0xffffffff45bfffb8,0xffffffff45c00000)
 PSPermGen       total 262144K, used 248475K [0xfffffffed0400000, 0xfffffffee0400000, 0xfffffffef0400000)
  object space 262144K, 94% used [0xfffffffed0400000,0xfffffffedf6a6f08,0xfffffffee0400000)

Thread Dump breakdown overview

In order for you to better understand, find below a diagram showing you a visual breakdown of a HotSpot VM Thread Dump and its common Thread Pools found:


As you can there are several pieces of information that you can find from a HotSpot VM Thread Dump. Some of these pieces will be more important than others depending of your problem pattern (problem patterns will be simulated and explained in future articles).

For now, find below a detailed explanation for each Thread Dump section as per our sample HotSpot Thread Dump:

# Full thread dump identifier
This is basically the unique keyword that you will find in your middleware / standalong Java standard output log once you generate a Thread Dump (ex: via kill -3 <PID> for UNIX). This is the beginning of the Thread Dump snapshot data.

Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.0-b11 mixed mode):

# Java EE middleware, third party & custom application Threads
This portion is the core of the Thread Dump and where you will typically spend most of your analysis time. The number of Threads found will depend on your middleware software that you use, third party libraries (that might have its own Threads) and your application (if creating any custom Thread, which is generally not a best practice).

In our sample Thread Dump, Weblogic is the middleware used. Starting with Weblogic 9.2, a self-tuning Thread Pool is used with unique identifier “'weblogic.kernel.Default (self-tuning)”

"[STANDBY] ExecuteThread: '414' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=3 tid=0x000000010916a800 nid=0x2613 in Object.wait() [0xfffffffe9edff000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xffffffff27d44de0> (a weblogic.work.ExecuteThread)
        at java.lang.Object.wait(Object.java:485)
        at weblogic.work.ExecuteThread.waitForRequest(ExecuteThread.java:160)
        - locked <0xffffffff27d44de0> (a weblogic.work.ExecuteThread)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:181)

# HotSpot VM Thread
This is an internal Thread managed by the HotSpot VM in order to perform internal native operations. Typically you should not worry about this one unless you see high CPU (via Thread Dump & prstat / native Thread id correlation).

"VM Periodic Task Thread" prio=3 tid=0x0000000101238800 nid=0x19 waiting on condition


# HotSpot GC Thread
When using HotSpot parallel GC (quite common these days when using multi physical cores hardware), the HotSpot VM create by default or as per your JVM tuning a certain # of GC Threads. These GC Threads allow the VM to perform its periodic GC cleanups in a parallel manner, leading to an overall reduction of the GC time; at the expense of increased CPU utilization.

"GC task thread#0 (ParallelGC)" prio=3 tid=0x0000000100120000 nid=0x3 runnable
"GC task thread#1 (ParallelGC)" prio=3 tid=0x0000000100131000 nid=0x4 runnable
………………………………………………………………………………………………………………………………………………………………

This is crucial data as well since when facing GC related problems such as excessive GC, memory leaks etc, you will be able to correlate any high CPU observed from the OS / Java process(es) with these Threads using their native id value (nid=0x3). You will learn how to identify and confirm this problem is future articles.

# JNI global references count
JNI (Java Native Interface) global references are basically Object references from the native code to a Java object managed by the Java garbage collector. Its role is to prevent collection of an object that is still in use by native code but technically with no "live" references in the Java code.

It is also important to keep an eye on JNI references in order to detect JNI related leaks. This can happen if you program use JNI directly or using third party tools like monitoring tools which are prone to native memory leaks.

JNI global references: 1925

# Java Heap utilization view
This data was added back to JDK 1 .6 and provides you with a short and fast view of your HotSpot Heap. I find it quite useful when troubleshooting GC related problems along with HIGH CPU since you get both Thread Dump & Java Heap in a single snapshot allowing you to determine (or to rule out) any pressure point in a particular Java Heap memory space along with current Thread computing currently being done at that time. As you can see in our sample Thread Dump, the Java Heap OldGen is maxed out!

Heap
 PSYoungGen      total 466944K, used 178734K [0xffffffff45c00000, 0xffffffff70800000, 0xffffffff70800000)
  eden space 233472K, 76% used [0xffffffff45c00000,0xffffffff50ab7c50,0xffffffff54000000)
  from space 233472K, 0% used [0xffffffff62400000,0xffffffff62400000,0xffffffff70800000)
  to   space 233472K, 0% used [0xffffffff54000000,0xffffffff54000000,0xffffffff62400000)
 PSOldGen        total 1400832K, used 1400831K [0xfffffffef0400000, 0xffffffff45c00000, 0xffffffff45c00000)
  object space 1400832K, 99% used [0xfffffffef0400000,0xffffffff45bfffb8,0xffffffff45c00000)
 PSPermGen       total 262144K, used 248475K [0xfffffffed0400000, 0xfffffffee0400000, 0xfffffffef0400000)
  object space 262144K, 94% used [0xfffffffed0400000,0xfffffffedf6a6f08,0xfffffffee0400000)

I hope this article has helped to understand the basic view of a HotSpot VM Thread Dump.

The Thread Dump Analysis Part 4 is now available and will provide you a high level overview of a Thread Dump and breakdown for the IBM VM 1.6 Thread Dump format.

7 comments:

Good post..Highly expecting the next chapter

When is your next article coming out???

Hi anonymous,

In about 1 week, I will release part 4 and part 5. Part 4 will cover Thread Dump format for IBM VM. Part 5, i will share my Thread Dump analysis technique at a deeper level.

Thanks for your interest in this series.
P-H

Hi, the Thread Dump analysis part 4 (IBM format) is now available.

Thanks.
P-H

Eagerly waiting for part 5....

Thanks, the next article of the series will provide you an "how to" put all the stuff together and how to analyze the thread dump so you can narrow down the root cause...

Post a Comment