This article is part 4 of our Thread Dump analysis series which will provide you with an overview of what is a JVM Thread Dump for the IBM VM and the different Threads and data points that you will find. As you will see and learn, the IBM VM Thread Dump format is different but provides even more out-of-the-box troubleshooting data.
At this point, you should know how Threads interact with the Java EE container and what a Thread Dump is. Before we go any further in the deep dive analysis patterns, you also need to understand the IBM VM Thread Dump format since this is the typical Thread Dump data to expect when using IBM WAS on IBM VM.
IBM VM Thread Dump breakdown overview
In order for you to better understand, find below a diagram showing you a visual breakdown of an IBM 1.6 VM Thread Dump and its common data points found:
As you can, there are extra runtime data that you will not find from a HotSpot VM Thread Dump. Please keep in mind that you may not need to review all these data points but you still need to understand what data is available depending of your problem case.
The rest of the article will cover each Thread Dump portion in more detail.
# Thread Dump generation event
The first portion provides you with detail on how this Thread Dump was generated. IBM Thread Dump can be generated as a result of a “signal 3” or “user” e.g. kill -3 <Java pid> or automatically as a result of severe JVM conditions such as an OutOfMemoryError.
0SECTION TITLE subcomponent dump routine
NULL ===============================
1TISIGINFO Dump Event "user" (00004000) received
1TIDATETIME Date: 2012/03/12 at 20:52:13
1TIFILENAME Javacore filename: /apps/wl11g/domains/app/javacore.20120312.205205.1949928.0004.txt
1TIREQFLAGS Request Flags: 0x81 (exclusive+preempt)
1TIPREPSTATE Prep State: 0x4 (exclusive_vm_access)
0SECTION TITLE subcomponent dump routine
NULL ===============================
1TISIGINFO OUTOFMEMORY received
1TIDATETIME Date: 2012/06/01 at 09:52:12
1TIFILENAME Javacore filename: /usr/WebSphere/AppServer/javacore311328.1338524532.txt
# HW and OS environment detail
The next section provides you with some detail on the current hardware and OS that this IBM VM is running from:
0SECTION GPINFO subcomponent dump routine
NULL ================================
2XHOSLEVEL OS Level : AIX 5.3
2XHCPUS Processors -
3XHCPUARCH Architecture : ppc64
3XHNUMCPUS How Many : 6
3XHNUMASUP NUMA is either not supported or has been disabled by user
# JRE detail and Java start-up arguments
This section is very useful as it provides you with a full view on your JRE major version and patch level along with all JVM start-up arguments.
0SECTION ENVINFO subcomponent dump routine
NULL =================================
1CIJAVAVERSION JRE 1.6.0 IBM J9 2.4 AIX ppc64-64 build jvmap6460sr9-20101124_69295
1CIVMVERSION VM build 20101124_069295
1CIJITVERSION JIT enabled, AOT enabled - r9_20101028_17488ifx2
1CIGCVERSION GC - 20101027_AA
1CIRUNNINGAS Running as a standalone JVM
…………………………………………………………………………………………
# User and environment variables
This section provides you with a listing of current user and environment variables such as File Descriptor limit.
1CIUSERLIMITS User Limits (in bytes except for NOFILE and NPROC)
NULL ------------------------------------------------------------------------
NULL type soft limit hard limit
2CIUSERLIMIT RLIMIT_AS unlimited unlimited
2CIUSERLIMIT RLIMIT_CORE 1073741312 unlimited
2CIUSERLIMIT RLIMIT_CPU unlimited unlimited
2CIUSERLIMIT RLIMIT_DATA unlimited unlimited
2CIUSERLIMIT RLIMIT_FSIZE unlimited unlimited
2CIUSERLIMIT RLIMIT_NOFILE 4096 4096
2CIUSERLIMIT RLIMIT_RSS 33554432 unlimited
2CIUSERLIMIT RLIMIT_STACK 33554432 4294967296
# Java Heap detail and GC history
Similar to HotSpot VM 1.6+, IBM VM Thread Dump also contains information on the Java Heap capacity and utilization along with memory segments allocated for each memory space of the Java process.
Please keep in mind that deeper Java Heap analysis will require you to analyze the Heap Dump binary snapshot as per below tutorial.
Finally, an history of the garbage collection process is also present.
0SECTION MEMINFO subcomponent dump routine
NULL =================================
1STHEAPFREE Bytes of Heap Space Free: 51104BC8
1STHEAPALLOC Bytes of Heap Space Allocated: 80000000
1STSEGTYPE Internal Memory
…………………………………………………………………………………………
1STSEGTYPE Object Memory
…………………………………………………………………………………………
1STSEGTYPE Class Memory
…………………………………………………………………………………………
1STSEGTYPE JIT Code Cache
…………………………………………………………………………………………
1STSEGTYPE JIT Data Cache
…………………………………………………………………………………………
STGCHTYPE GC History
3STHSTTYPE 00:52:07:523048405 GMT j9mm.51 - SystemGC end: newspace=466136480/483183616 oldspace=899251600/1610612736 loa=80530432/80530432
3STHSTTYPE 00:52:07:523046694 GMT j9mm.139 - Reference count end: weak=40149 soft=87504 phantom=33 threshold=17 maxThreshold=32
3STHSTTYPE 00:52:07:522164027 GMT j9mm.91 - GlobalGC end: workstackoverflow=0 overflowcount=0 weakrefs=40149 soft=87504 threshold=17 phantom=33 finalizers=4947 newspace=466136480/483183616 oldspace=899251600/1610612736 loa=80530432/80530432
3STHSTTYPE 00:52:07:522152764 GMT j9mm.90 - GlobalGC collect complete
# Java and JVM object monitor lock and deadlock detail
This Thread Dump portion is very important. Quite often Thread problems involve Threads waiting between each other due to locks on particular Object monitors e.g. Thread B waiting to acquire a lock on Object monitor held by Thread A. Deadlock conditions can also be triggered from time to time; especially for non-Thread safe implementations.
The IBM VM Thread Dump provides a separate section where you can analyze lock(s) held by each Thread including waiting chain(s) e.g. Many Threads waiting to acquire the same Object monitor lock.
0SECTION LOCKS subcomponent dump routine
NULL ===============================
NULL
1LKPOOLINFO Monitor pool info:
2LKPOOLTOTAL Current total number of monitors: 1034
NULL
1LKMONPOOLDUMP Monitor Pool Dump (flat & inflated object-monitors):
2LKMONINUSE sys_mon_t:0x0000000115B53060 infl_mon_t: 0x0000000115B530A0:
3LKMONOBJECT java/util/Timer$TimerImpl@0x0700000000C92AA0/0x0700000000C92AB8: <unowned>
3LKNOTIFYQ Waiting to be notified:
3LKWAITNOTIFY "Thread-7" (0x0000000114CAB400)
…………………………………………………………………………
## Threads waiting chain
2LKMONINUSE sys_mon_t:0x000000012462FE00 infl_mon_t: 0x000000012462FE40:
3LKMONOBJECT com/inc/server/app/Request@0x07000000142ADF30/0x07000000142ADF48: owner "Thread-30" (0x000000012537F300), entry count 1
3LKNOTIFYQ Waiting to be notified:
3LKWAITNOTIFY "Thread-26" (0x0000000125221F00)
3LKWAITNOTIFY "Thread-27" (0x0000000125252000)
3LKWAITNOTIFY "Thread-28" (0x000000012527B800)
3LKWAITNOTIFY "Thread-29" (0x00000001252DDA00)
3LKWAITNOTIFY "Thread-31" (0x0000000125386200)
3LKWAITNOTIFY "Thread-32" (0x0000000125423600)
3LKWAITNOTIFY "Thread-33" (0x000000012548C500)
3LKWAITNOTIFY "Thread-34" (0x00000001255D6000)
3LKWAITNOTIFY "Thread-35" (0x00000001255F7900)
…………………………………………………………………………
# Java EE middleware, third party & custom application Threads
Similar to the HotSpot VM Thread Dump format, this portion is the core of the Thread Dump and where you will typically spend most of your analysis time. The number of Threads found will depend on your middleware software that you use, third party libraries (that might have its own Threads) and your application (if creating any custom Thread, which is generally not a best practice).
The following Thread in the example below is in BLOCK state which typically means it is waiting to acquire a lock on an Object monitor. You will need to search in the earlier section and determine which Thread is holding the lock so you can pinpoint the root cause.
3XMTHREADINFO "[STUCK] ExecuteThread: '162' for queue: 'weblogic.kernel.Default (self-tuning)'" J9VMThread:0x000000013ACF0800, j9thread_t:0x000000013AC88B20, java/lang/Thread:0x070000001F945798, state:B, prio=1
3XMTHREADINFO1 (native thread ID:0x1AD0F3, native priority:0x1, native policy:UNKNOWN)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at org/springframework/jms/connection/SingleConnectionFactory.createConnection(SingleConnectionFactory.java:207(Compiled Code))
4XESTACKTRACE at org/springframework/jms/connection/SingleConnectionFactory.createQueueConnection(SingleConnectionFactory.java:222(Compiled Code))
4XESTACKTRACE at org/springframework/jms/core/JmsTemplate102.createConnection(JmsTemplate102.java:169(Compiled Code))
4XESTACKTRACE at org/springframework/jms/core/JmsTemplate.execute(JmsTemplate.java:418(Compiled Code))
4XESTACKTRACE at org/springframework/jms/core/JmsTemplate.send(JmsTemplate.java:475(Compiled Code))
4XESTACKTRACE at org/springframework/jms/core/JmsTemplate.send(JmsTemplate.java:467(Compiled Code))
…………………………………………………………………………………………………………
# JVM class loader summary
Finally, the last section of the IBM VM Thread Dump provides you with a detailed class loader summary. This is very crucial data when dealing with Class Loader related issues and leaks. You will find the number and type of loaded Classes for each active Class loader in the running JVM.
I suggest that you review the following case study for a complete tutorial on how to pinpoint root cause for this type of issues when using IBM VM.
0SECTION CLASSES subcomponent dump routine
NULL =================================
1CLTEXTCLLOS Classloader summaries
1CLTEXTCLLSS 12345678: 1=primordial,2=extension,3=shareable,4=middleware,5=system,6=trusted,7=application,8=delegating
2CLTEXTCLLOADER p---st-- Loader *System*(0x0700000000878898)
3CLNMBRLOADEDLIB Number of loaded libraries 6
3CLNMBRLOADEDCL Number of loaded classes 3721
2CLTEXTCLLOADER -x--st-- Loader sun/misc/Launcher$ExtClassLoader(0x0700000000AE8F40), Parent *none*(0x0000000000000000)
3CLNMBRLOADEDLIB Number of loaded libraries 0
3CLNMBRLOADEDCL Number of loaded classes 91
2CLTEXTCLLOADER -----ta- Loader sun/misc/Launcher$AppClassLoader(0x07000000008786D0), Parent sun/misc/Launcher$ExtClassLoader(0x0700000000AE8F40)
3CLNMBRLOADEDLIB Number of loaded libraries 3
3CLNMBRLOADEDCL Number of loaded classes 15178
……………………………………………………………………………………………
I hope this article has helped to understand the basic view of an IBM VM Thread Dump. The next article (part 5) will provide you with a tutorial on how to analyze a JVM Thread Dump via a step by step tutorial and technique I have used over the last 10 years.
Please feel free to post any comment and question.
Hi all,
ReplyDeletePlease note that part 5 which is a step by step tutorial on Thread Dump analysis will be available in about 2 weeks from now.
Please ensure that you understand the base and Thread Dump format before moving to part 5.
Thanks.
P-H
By IBM VM, do you mean J9? By JVM Thread Dump, do you mean HotSpot?
ReplyDeleteHi Stephane,
ReplyDeleteYes, IBM VM ex: JRE 1.6.0 IBM J9 2.4
IBM VM is not HotSpot. HotSpot refers to Sun JVM implementation. They share common properties but these are 2 different JVM implementation.
Thread Dump format is also different between Sun HotSpot and IBM VM, please review part 3 for Thread Dump format for HotSpot.
http://javaeesupportpatterns.blogspot.ca/2012/02/how-to-analyze-thread-dump-part-3.html
Thanks.
P-H
Hi,
ReplyDeleteCan you please send the link for part 5.
Hi Jagadeesh,
ReplyDeleteThe part 5 is not available yet. I'm planning to complete and release part 5 in about 1 week.
Thanks for your interest in this series.
P-H
Hi,
ReplyDeletePlease note that the part 5 is now available.
http://javaeesupportpatterns.blogspot.ca/2012/07/how-to-analyze-thread-dump-part-5.html
Thanks.
P-H
Hello,
ReplyDeleteWe are using Weblogic version 10 and our application is in cluster environment.
Very often our portal dispatcher goes to FAILED state. When we verified the log files, we got this error.
This is the current setting in weblogic.
Max Stuck Thread Time : 600 sec
Stuck Thread Count: 1
By increasing the struck thread count will our problem gets resolved?
or do we need to increase both?
Please share your suggestion
Thanks,
Raghav
Hello,
ReplyDeleteMy application error is not displayed in my previous comments.
Hence posting again.
------ Critical Subsystem Thread Pool has failed. Setting server state to FAILED. Server failed as the number of stuck threads has exceeded the max limit of 1
Hi Raghavendra,
ReplyDeleteLooks like you are hitting the classic issue of the over sensitive Weblogic stuck thread detection feature added since WLS 9.2. When enabled, it can disable/force shutdown the entire server for only 1 stuck thread.
Like we did for many of my clients production system, my recommendation is for you to disable the feature forcing Weblogic to take action when stuck thread is detected. You can still monitor the stuck threads and determine root cause as per Thread Dump analysis while keeping server up and running. It is only useful in the event stuck threads are using resources such as infinite looping, eating CPU & Java Heap.
To disable go to >> Environment >> WorkManager >> Select the workmanager you have facing problem with >> then check the Ignore Stuck Threads field
Also go under >> Environment >> servers (select each managed server) >> Overload >> Select ignore, take no action for both Failure Action & Panic Action
Regards,
P-H
Hello,
ReplyDeleteThanks for your response.
Cannot find any entry/record in below path!
To disable go to >> Environment >> WorkManager >> Select the workmanager you have facing problem with >> then check the Ignore Stuck Threads field
This option is already set in the weblogic.
Also go under >> Environment >> servers (select each managed server) >> Overload >> Select ignore, take no action for both Failure Action & Panic Action
How to solve my problem? Please suggest some solution :(
Many thanks,
Raghav
hi Raghav,
ReplyDeletePlease go again under Environment >> servers (select each managed server) >> Overload >>
Can you please tell me what you see in the Stuck Thread Count field? The value should be 0. This will prevent Weblogic to change server in FAILED state.
Please confirm.
Thanks.
P-H
Hello,
ReplyDeleteDetails are below.
Stuck Thread Count: 1
Max Stuck Thread Time : 600 sec
Thanks,
Raghav
Perfect, this is the problem. With this value, Weblogic will automatically go in FAILED state as soon as only ONE stuck Thread is found.
ReplyDeletePlease change the value from 1 to 0. This should fix your problem.
Regards,
P-H
Hi!
ReplyDeleteCould you please hint me how to match thread id from trace.log and thread id from thread dump?
Hi Banepmn,
ReplyDeleteCan you please give me detail of data found from your trace.log? Are you referring to the AIX CPU per Thread data?
Thanks.
P-H
Hi HeapUpNoRegrets,
ReplyDeleteI will see what I can find as reference such as IBM related articles.
Please stay tuned as I will also release Thread Dump analysis videos shortly which will be very useful for you to understand the analysis process.
Regards,
P-H
Hi,
ReplyDeleteIn future articles will you teach us how to use IBM thread analyser tool?