/ January 2013 ~ Java EE Support Patterns

1.25.2013

Java concurrency: the hidden thread deadlocks

Most Java programmers are familiar with the Java thread deadlock concept. It essentially involves 2 threads waiting forever for each other. This condition is often the result of flat (synchronized) or ReentrantReadWriteLock (read or write) lock-ordering problems.

Found one Java-level deadlock:
=============================
"pool-1-thread-2":
  waiting to lock monitor 0x0237ada4 (object 0x272200e8, a java.lang.Object),
  which is held by "pool-1-thread-1"
"pool-1-thread-1":
  waiting to lock monitor 0x0237aa64 (object 0x272200f0, a java.lang.Object),
  which is held by "pool-1-thread-2"
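
A report like the one above can also be obtained programmatically. The following is a minimal sketch, assuming the standard java.lang.management API; the class name ClassicDeadlockDemo and its methods are hypothetical and not part of the original sample program. It provokes a classic lock-ordering deadlock on 2 monitors and asks the JVM which threads are involved.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class, not part of the original sample program.
class ClassicDeadlockDemo {

    /**
     * Starts 2 daemon threads that lock the same 2 monitors in opposite order,
     * then polls the JVM deadlock detector. Returns the deadlocked thread ids,
     * or null if nothing was reported within roughly 10 seconds.
     */
    static long[] provokeAndDetect() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        startDaemon("demo-thread-1", lockA, lockB, bothHoldFirstLock);
        startDaemon("demo-thread-2", lockB, lockA, bothHoldFirstLock);

        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = threadBean.findDeadlockedThreads();
            if (ids != null) {
                return ids;
            }
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return null;
    }

    private static void startDaemon(String name, Object first, Object second,
                                    CountDownLatch bothHoldFirstLock) {
        Thread thread = new Thread(() -> {
            synchronized (first) {
                bothHoldFirstLock.countDown();
                try {
                    // Wait until the other thread also holds its first monitor
                    bothHoldFirstLock.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // Blocks forever: the other thread owns this monitor
                synchronized (second) {
                    System.out.println(name + " acquired both locks");
                }
            }
        }, name);
        thread.setDaemon(true); // let the JVM exit despite the deadlock
        thread.start();
    }

    public static void main(String[] args) {
        long[] ids = provokeAndDetect();
        System.out.println("Deadlocked threads reported: " + (ids == null ? 0 : ids.length));
    }
}
```

The latch is only there to make the deadlock deterministic; the original program relies on a 2 second sleep for the same purpose.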

The good news is that the HotSpot JVM is always able to detect this condition for you…or is it?

A recent thread deadlock problem affecting an Oracle Service Bus production environment has forced us to revisit this classic problem and identify the existence of “hidden” deadlock situations.

This article will demonstrate and replicate, via a simple Java program, a very special lock-ordering deadlock condition which is not detected by the latest HotSpot JVM 1.7. You will also find a video at the end of the article explaining the Java sample program and the troubleshooting approach used.

The crime scene

I usually like to compare major Java concurrency problems to a crime scene where you play the lead investigator role. In this context, the “crime” is an actual production outage of your client’s IT environment. Your job is to:

  • Collect all the evidence, hints & facts (thread dump, logs, business impact, load figures…)
  • Interrogate the witnesses & domain experts (support team, delivery team, vendor, client…)

The next step of your investigation is to analyze the collected information and establish a list of one or more “suspects” along with clear proof. Eventually, you want to narrow it down to a primary suspect or root cause. Obviously the principle of “innocent until proven guilty” does not apply here; it is exactly the opposite.

Lack of evidence can prevent you from achieving the above goal. What you will see next is that the lack of deadlock detection by the HotSpot JVM does not necessarily prove that you are not dealing with this problem.

The suspect

In this troubleshooting context, the “suspect” is defined as the application or middleware code with the following problematic execution pattern.

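  • Usage of ReentrantReadWriteLock READ lock followed by the usage of FLAT lock (execution path #1)
  • Usage of FLAT lock followed by the usage of ReentrantReadWriteLock WRITE lock (execution path #2)
  • Concurrent execution performed by 2 Java threads but via a reversed execution order

The above lock-ordering deadlock criteria can be summarized as follows: thread #1 holds the READ lock and waits for the FLAT lock, while thread #2 holds the FLAT lock and waits for the WRITE lock, which cannot be granted as long as the READ lock is held.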


Now let’s replicate this problem via our sample Java program and look at the JVM thread dump output.

Sample Java program

The above deadlock condition was first identified from our Oracle OSB problem case. We then re-created it via a simple Java program. You can download the entire source code of our program here.

The program simply creates and fires 2 worker threads. Each of them executes a different execution path and attempts to acquire locks on shared objects, but in a different order. We also created a deadlock detector thread for monitoring and logging purposes.

For now, find below the Java class implementing the 2 different execution paths.

package org.ph.javaee.training8;

import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * A simple thread task representation
 * @author Pierre-Hugues Charbonneau
 *
 */
public class Task {
      
       // Object used for FLAT lock
       private final Object sharedObject = new Object();
       // ReentrantReadWriteLock used for WRITE & READ locks
       private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
      
       /**
        *  Execution pattern #1
        */
       public void executeTask1() {
            
             // 1. Attempt to acquire a ReentrantReadWriteLock READ lock
             lock.readLock().lock();
            
             // Wait 2 seconds to simulate some work...
             try { Thread.sleep(2000); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            
             try {              
                    // 2. Attempt to acquire a Flat lock...
                    synchronized (sharedObject) {}
             }
             // Remove the READ lock
             finally {
                    lock.readLock().unlock();
             }           
            
             System.out.println("executeTask1() :: Work Done!");
       }
      
       /**
        *  Execution pattern #2
        */
       public void executeTask2() {
            
             // 1. Attempt to acquire a Flat lock
             synchronized (sharedObject) {                 
                   
                    // Wait 2 seconds to simulate some work...
                    try { Thread.sleep(2000); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
                   
                    // 2. Attempt to acquire a WRITE lock                   
                    lock.writeLock().lock();
                   
                    try {
                           // Do nothing
                    }
                   
                    // Remove the WRITE lock
                    finally {
                           lock.writeLock().unlock();
                    }
             }
            
             System.out.println("executeTask2() :: Work Done!");
       }
      
       public ReentrantReadWriteLock getReentrantReadWriteLock() {
             return lock;
       }
}

As soon as the deadlock situation was triggered, a JVM thread dump was generated using JVisualVM.

As you can see from the Java thread dump sample, the JVM did not detect this deadlock condition (i.e. no Found one Java-level deadlock section is present), yet it is clear that these 2 threads are in a deadlock state.
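
The same observation can be made without a thread dump. The sketch below is an illustration only: HiddenDeadlockDemo and its helper methods are hypothetical names, and it assumes the standard ThreadMXBean.findDeadlockedThreads() call. It reproduces the 2 execution patterns of the Task class with daemon threads, waits until both are stuck, and then queries the JVM deadlock detector. On a JVM that behaves as described in this article, the detector is expected to return null even though neither thread can make progress.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; it mirrors the 2 execution patterns of the Task class.
class HiddenDeadlockDemo {

    /**
     * Provokes the READ lock / FLAT lock deadlock and returns what the JVM
     * deadlock detector reports once both threads are stuck (null = nothing found).
     */
    static long[] provokeAndDetect() {
        Object sharedObject = new Object();
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Pattern #1: READ lock, then FLAT lock
        Thread reader = startDaemon("demo-reader", () -> {
            lock.readLock().lock();
            try {
                if (meet(bothHoldFirstLock)) {
                    synchronized (sharedObject) { /* never entered */ }
                }
            } finally {
                lock.readLock().unlock();
            }
        });

        // Pattern #2: FLAT lock, then WRITE lock
        Thread writer = startDaemon("demo-writer", () -> {
            synchronized (sharedObject) {
                if (meet(bothHoldFirstLock)) {
                    lock.writeLock().lock(); // parks forever: a READ lock is held
                    lock.writeLock().unlock();
                }
            }
        });

        // Wait (up to ~10 seconds) until both threads are stuck on their second lock
        for (int attempt = 0; attempt < 100 && !stuck(reader, writer, lock); attempt++) {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        if (!stuck(reader, writer, lock)) {
            throw new IllegalStateException("threads did not block as expected");
        }
        return ManagementFactory.getThreadMXBean().findDeadlockedThreads();
    }

    private static boolean stuck(Thread reader, Thread writer, ReentrantReadWriteLock lock) {
        return reader.getState() == Thread.State.BLOCKED   // waiting for the monitor
                && writer.getState() == Thread.State.WAITING // parked on the WRITE lock
                && lock.hasQueuedThreads();
    }

    /** Counts down and waits for the other thread; returns false if interrupted. */
    private static boolean meet(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    private static Thread startDaemon(String name, Runnable body) {
        Thread thread = new Thread(body, name);
        thread.setDaemon(true); // let the JVM exit despite the deadlock
        thread.start();
        return thread;
    }

    public static void main(String[] args) {
        long[] ids = provokeAndDetect();
        System.out.println("Both threads are stuck; detector reported: "
                + (ids == null ? "nothing" : ids.length + " thread(s)"));
    }
}
```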

Root cause: ReentrantReadWriteLock READ lock behavior

The main explanation we found at this point is associated with the usage of the ReentrantReadWriteLock READ lock. Read locks are normally not designed to have a notion of ownership. Since there is no record of which thread holds a read lock, this appears to prevent the HotSpot JVM deadlock detector logic from detecting deadlocks involving read locks.

Some improvements have been implemented since then, but we can see that the JVM still cannot detect this special deadlock scenario.

Now if we replace the read lock (execution pattern #1) in our program with a write lock, the JVM will finally detect the deadlock condition. But why?

Found one Java-level deadlock:
=============================
"pool-1-thread-2":
  waiting for ownable synchronizer 0x272239c0, (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync),
  which is held by "pool-1-thread-1"
"pool-1-thread-1":
  waiting to lock monitor 0x025cad3c (object 0x272236d0, a java.lang.Object),
  which is held by "pool-1-thread-2"

Java stack information for the threads listed above:
===================================================
"pool-1-thread-2":
       at sun.misc.Unsafe.park(Native Method)
       - parking to wait for  <0x272239c0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
       at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
       at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
       at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
       at org.ph.javaee.training8.Task.executeTask2(Task.java:54)
       - locked <0x272236d0> (a java.lang.Object)
       at org.ph.javaee.training8.WorkerThread2.run(WorkerThread2.java:29)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
       at java.lang.Thread.run(Thread.java:722)
"pool-1-thread-1":
       at org.ph.javaee.training8.Task.executeTask1(Task.java:31)
       - waiting to lock <0x272236d0> (a java.lang.Object)
       at org.ph.javaee.training8.WorkerThread1.run(WorkerThread1.java:29)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
       at java.lang.Thread.run(Thread.java:722)

This is because write locks are tracked by the JVM similarly to flat locks. This means the HotSpot JVM deadlock detector appears to be currently designed to detect:

  • Deadlock on Object monitors involving FLAT locks
  • Deadlock involving Locked ownable synchronizers associated with WRITE locks

The lack of per-thread read lock tracking appears to prevent deadlock detection for this scenario and significantly increases the troubleshooting complexity.
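
The difference in tracking can be observed directly. The following sketch is an illustration under assumptions: LockOwnershipDemo is a hypothetical class, and it relies on the standard ThreadMXBean.getThreadInfo(long[], boolean, boolean) and ThreadInfo.getLockedSynchronizers() calls. It counts the locked ownable synchronizers that the JVM attributes to the current thread while it holds a WRITE lock, and again while it holds a READ lock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not part of the original sample program.
class LockOwnershipDemo {

    /** Number of locked ownable synchronizers the JVM reports for the calling thread. */
    static int ownedSynchronizers() {
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        long[] ids = { Thread.currentThread().getId() };
        // lockedMonitors = false, lockedSynchronizers = true
        ThreadInfo[] infos = threadBean.getThreadInfo(ids, false, true);
        return infos[0].getLockedSynchronizers().length;
    }

    /** Same count, taken while the calling thread holds a WRITE lock. */
    static int ownedWhileHoldingWriteLock() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.writeLock().lock();
        try {
            return ownedSynchronizers();
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Same count, taken while the calling thread holds a READ lock. */
    static int ownedWhileHoldingReadLock() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            return ownedSynchronizers();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("baseline:   " + ownedSynchronizers());
        System.out.println("WRITE held: " + ownedWhileHoldingWriteLock());
        System.out.println("READ held:  " + ownedWhileHoldingReadLock());
    }
}
```

The WRITE lock count should be one higher than the baseline, while the READ lock count should equal the baseline, since no owner thread is recorded for a read lock.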

I suggest that you read Doug Lea’s comments on this whole issue, since concerns were raised back in 2005 that adding per-thread read-hold tracking could introduce some lock overhead.

Find below my troubleshooting recommendations if you suspect a hidden deadlock condition involving read locks:

  • Closely analyze the thread call stack traces; they may reveal some code potentially acquiring read locks and preventing other threads from acquiring write locks.
  • If you are the owner of the code, keep track of the read lock count via the usage of the lock.getReadLockCount() method
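
As a sketch of the second recommendation, the helper below builds a one-line snapshot of a ReentrantReadWriteLock that can be written to your logs. ReadLockDiagnostics and describe() are hypothetical names; the calls made on the lock are standard ReentrantReadWriteLock methods.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper, not part of the original sample program.
class ReadLockDiagnostics {

    /** Builds a one-line snapshot of the lock state, suitable for logging. */
    static String describe(ReentrantReadWriteLock lock) {
        return "readLockCount=" + lock.getReadLockCount()        // READ holds across all threads
                + ", readHoldCount=" + lock.getReadHoldCount()   // READ holds of the calling thread
                + ", writeLocked=" + lock.isWriteLocked()
                + ", queuedThreads=" + lock.getQueueLength();    // estimate of waiting threads
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        try {
            System.out.println(describe(lock));
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

A read lock count that stays above zero while threads are queued for the write lock is a hint that a reader may be stuck, which is the situation the JVM does not report.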

I’m looking forward to your feedback, especially from individuals with experience with this type of deadlock involving read locks.

Finally, find below a video explaining such findings via the execution and monitoring of our sample Java program.



1.12.2013

QOTD: Java Thread vs. Java Heap Space

The following question is quite common and relates to OutOfMemoryError: unable to create new native thread problems, which occur during the JVM thread creation process, and to the JVM thread capacity.

This is also a typical interview question I ask new technical candidates (senior role). I recommend that you attempt to provide your own response before looking at the answer.

Question:

Why can’t you increase the JVM thread capacity (total # of threads) by expanding the Java heap space capacity via -Xmx?

Answer:

The Java thread creation process requires native memory to be available for the JVM process. Expanding the Java heap space via the -Xmx argument will actually reduce your Java thread capacity since this memory will be “stolen” from the native memory space.

  • For a 32-bit JVM, the Java heap space is in a race with the native heap, including the thread capacity
  • For a 64-bit JVM, the thread capacity will mainly depend on your OS physical & virtual memory availability along with your current OS process related tuning parameters
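
The 32-bit case can be illustrated with some back-of-the-envelope arithmetic. The sketch below is a deliberately rough model, not a measurement: the class ThreadCapacityEstimate, its method, and every figure used (2048 MB of process address space, 256 MB for other native usage, 512 KB per thread stack) are hypothetical assumptions chosen for illustration.

```java
// Hypothetical rough model: remaining address space divided by thread stack size.
class ThreadCapacityEstimate {

    /**
     * Estimates how many thread stacks fit in what is left of the process
     * address space once the Java heap and other native usage are subtracted.
     */
    static long estimateMaxThreads(long addressSpaceMb, long heapMb,
                                   long otherNativeMb, long stackKb) {
        long remainingKb = (addressSpaceMb - heapMb - otherNativeMb) * 1024;
        return Math.max(0, remainingKb / stackKb);
    }

    public static void main(String[] args) {
        // Assumed figures: 2048 MB address space, 256 MB other native usage, 512 KB stacks
        System.out.println("-Xmx1024m: " + estimateMaxThreads(2048, 1024, 256, 512)); // 1536
        System.out.println("-Xmx1536m: " + estimateMaxThreads(2048, 1536, 256, 512)); // 512
    }
}
```

Under these assumptions, growing the heap from 1024 MB to 1536 MB cuts the estimated thread capacity from 1536 to 512 threads. Real limits also depend on the OS and on other JVM memory spaces, so treat the numbers as an illustration of the trend only.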

In order to better understand this limitation, I suggest the following video tutorial.

You can also download the sample Java program from the link below:
https://docs.google.com/file/d/0B6UjfNcYT7yGazg5aWxCWGtvbm8/edit

1.11.2013

Java Verbose GC - tutorial video


This is my first tutorial video, which will provide you with technical details on how to enable and analyze the verbose:gc output data of your JVM process.

You can also download the Java sample program from the link below. Please make sure that you configure your Java runtime with a heap space of only 1 GB (-Xmx1024m).
https://docs.google.com/file/d/0B6UjfNcYT7yGenZCU3FfTHFfNnc/edit
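
Verbose GC output is normally enabled with the -verbose:gc startup argument. As a complement, the sketch below shows a runtime alternative, assuming the standard MemoryMXBean.setVerbose() and GarbageCollectorMXBean APIs; the exact output and its destination are implementation dependent. VerboseGcDemo is a hypothetical class and is not the sample program linked above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class, not the sample program linked above.
class VerboseGcDemo {

    /** Turns on verbose output for the memory system at runtime. */
    static void enableVerboseGc() {
        ManagementFactory.getMemoryMXBean().setVerbose(true);
    }

    /** Total number of collections reported by all garbage collectors so far. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Allocates and immediately discards the given number of 1 MB arrays. */
    static long churn(int megabytes) {
        long checksum = 0;
        for (int i = 0; i < megabytes; i++) {
            byte[] garbage = new byte[1024 * 1024];
            garbage[0] = (byte) i;
            checksum += garbage[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        enableVerboseGc();
        long before = totalCollections();
        churn(2048); // about 2 GB of short-lived allocations
        long after = totalCollections();
        System.out.println("Collections during churn: " + (after - before));
    }
}
```

Running it with a small heap (for example -Xmx64m) makes collections more frequent and the verbose output easier to study.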




Future videos may also include scripts so that the non-English-speaking audience can perform proper language translation.

I’m looking forward to your feedback and suggestions on topics and the video format you would like to see.

1.09.2013

Java video tutorials: A picture is worth a thousand words

This is my first post for 2013. I want to thank all my readers for their great feedback and comments on the various articles I posted in 2012.

It is sometimes quite difficult to truly show and explain some troubleshooting techniques via articles only. For 2013, on top of my regular posting, I will be introducing tutorial and troubleshooting videos. These training videos will be available for free on YouTube. I really hope that you will appreciate this addition. Watch for a new Videos section in the top navigation bar.

I’m really looking forward to your suggestions and recommendations on what type of tutorials and videos you would like to watch and learn from. Here are some examples:

  • Live Thread Dump analysis tutorial
  • Live Heap Dump analysis tutorial
  • JVM monitoring and tuning
  • JVM verbose:gc live analysis
  • Java CPU troubleshooting
  • JVisualVM tutorial
  • Weblogic & JBoss console management
  • Weblogic & JBoss monitoring
  • Java & Java EE sample program creation via Eclipse
  • Many more…

Regards,
Pierre-Hugues Charbonneau