HotSpot 32-bit to 64-bit upgrade: what to look for ~ Java EE Support Patterns

5.02.2013

HotSpot 32-bit to 64-bit upgrade: what to look for

This short post will test your knowledge of the JVM and your project delivery skills, especially regarding JVM upgrades. I’m looking forward to your comments and answers on how to approach this type of project in order to stay away from performance problems.

Background

I was recently involved in a problem case affecting a production environment running on WebLogic 10 with the 32-bit HotSpot JVM 1.6. Given recent challenges and the forecast load increase, the decision was taken to upgrade the HotSpot JVM 1.6 from 32-bit to 64-bit.

Please note that no change was applied to the JVM arguments.

After a few weeks of functional testing and planning, the upgrade was deployed successfully to the production environment. However, the next day the support team observed major performance degradation, including thread lock contention, forcing the deployment team to roll back the upgrade.

The root cause was eventually found and the upgrade will be re-attempted in the near future.

Question:

Based on the above background, provide a list of possible root causes that may explain this performance degradation.

Propose a list of improvements to the project delivery and recommendations on how to properly manage and de-risk this type of upgrade.

Answer:

I often hear the assumption that switching from a 32-bit JVM to a 64-bit JVM will automatically bring a performance improvement. This is only partially true. Performance improvements will be observed only if you were dealing with memory footprint problems prior to the upgrade, such as excessive GC or java.lang.OutOfMemoryError: Java heap space conditions, and if you performed proper tuning and Java heap sizing.

Unfortunately, we often overlook the fact that in a 64-bit JVM process, native pointers take up 8 bytes instead of 4. This can increase the memory footprint of your application, leading to more frequent GC and performance degradation.
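To make the pointer-size effect concrete, here is a minimal sketch that estimates the bytes spent on reference slots alone. The class name, method name and the figure of 50 million live references are hypothetical and chosen only for illustration. The model ignores object headers, alignment padding and primitive fields, so it shows the direction of the effect rather than a real application's footprint.

```java
/** Rough model: bytes consumed by reference slots only (headers and padding ignored). */
final class ReferenceFootprint {

    /** Bytes needed to hold referenceCount references of bytesPerReference bytes each. */
    static long referenceBytes(long referenceCount, int bytesPerReference) {
        return referenceCount * bytesPerReference;
    }

    public static void main(String[] args) {
        long references = 50000000L; // hypothetical number of live references in the heap

        long bytes32 = referenceBytes(references, 4); // 32-bit JVM: 4-byte references
        long bytes64 = referenceBytes(references, 8); // 64-bit JVM, no compressed oops: 8-byte references

        System.out.println("32-bit reference slots: " + bytes32 / (1024 * 1024) + " MB");
        System.out.println("64-bit reference slots: " + bytes64 / (1024 * 1024) + " MB");
    }
}
```

Under this simplified model, the space taken by references doubles, while the rest of each object stays the same size. That is why the overall heap growth depends on how reference-heavy your object graph is.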

Here is the official explanation from Oracle:

What are the performance characteristics of 64-bit versus 32-bit VMs?

"Generally, the benefits of being able to address larger amounts of memory come with a small performance loss in 64-bit VMs versus running the same application on a 32-bit VM.  This is due to the fact that every native pointer in the system takes up 8 bytes instead of 4.  The loading of this extra data has an impact on memory usage which translates to slightly slower execution depending on how many pointers get loaded during the execution of your Java program.  The good news is that with AMD64 and EM64T platforms running in 64-bit mode, the Java VM gets some additional registers which it can use to generate more efficient native instruction sequences.  These extra registers increase performance to the point where there is often no performance loss at all when comparing 32 to 64-bit execution speed.  
The performance difference comparing an application running on a 64-bit platform versus a 32-bit platform on SPARC is on the order of 10-20% degradation when you move to a 64-bit VM.  On AMD64 and EM64T platforms this difference ranges from 0-15% depending on the amount of pointer accessing your application performs."   

Now back to our original problem case: this memory footprint increase was found to be significant and was the root cause of our problem. Depending on the GC policy that you are using, an increase in major collections will lead to higher JVM and thread pause times, opening the door to thread lock contention and other problems. As you can see below, the upgrade to the 64-bit JVM increased our existing application static memory footprint (tenured space) by 45%.

Java Heap footprint ~900 MB after major collections (32-bit)

Java Heap footprint ~1.3 GB after major collections (64-bit)

** 45% increase of the application Java heap memory footprint (retained tenured space). Again, this is due to the expanded size of managed native pointers.
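The increase can be checked with simple arithmetic. The sketch below uses the two approximate figures quoted above and assumes 1.3 GB is read as 1300 MB; the class and method names are hypothetical.

```java
/** Computes the relative growth of the retained heap between two measurements. */
final class FootprintGrowth {

    /** Percentage increase from beforeMb to afterMb. */
    static double percentIncrease(double beforeMb, double afterMb) {
        return (afterMb - beforeMb) / beforeMb * 100.0;
    }

    public static void main(String[] args) {
        double before = 900.0;  // ~900 MB after major collections (32-bit)
        double after = 1300.0;  // ~1.3 GB after major collections (64-bit), taken as 1300 MB
        System.out.printf("Retained heap growth: %.1f%%%n", percentIncrease(before, after));
    }
}
```

With these rounded inputs the result is roughly 44%, in line with the ~45% quoted above.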

The other part of the problem is that no performance and load testing was performed prior to production implementation; only functional testing was done. Also, since no change or tuning was applied to the JVM settings, the larger footprint automatically triggered an increased frequency of major collections and higher JVM pause time. The final solution involved increasing the Java heap capacity from 2 GB to 2.5 GB and using the Compressed Oops option available with HotSpot JDK 6u23.
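One way to confirm that such settings are actually in effect is to query the running JVM. The following is a minimal sketch, assuming a HotSpot-based JVM that exposes the com.sun.management.HotSpotDiagnosticMXBean; the class name JvmSettingsCheck and the "n/a" fallback are my own choices for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Prints the maximum heap size and the UseCompressedOops flag of the running JVM. */
final class JvmSettingsCheck {

    /** Returns the value of a HotSpot -XX flag, or "n/a" if this JVM does not expose it. */
    static String vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                    ManagementFactory.getPlatformMBeanServer(),
                    "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (Exception e) {
            // For example, the flag does not exist on a 32-bit JVM.
            return "n/a";
        }
    }

    public static void main(String[] args) {
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap (MB): " + maxHeapMb);
        System.out.println("UseCompressedOops: " + vmOption("UseCompressedOops"));
    }
}
```

As an assumption about how the settings described above would typically be expressed, you could start the JVM with -Xmx2560m -XX:+UseCompressedOops and run this class to verify both values. Note that the reported maximum heap can be slightly lower than the -Xmx value.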

Now that you understand the root cause, find below my recommendations when performing this type of upgrade:

  • Execute performance and load testing cycles and compare your application memory footprint and GC behaviour before (32-bit) and after (64-bit).
  • Ensure you spend enough time tuning your GC settings and the Java heap size in order to minimize the JVM pause time.
  • If you are using HotSpot JVM 6u23 or later, take advantage of the Compressed Oops tuning parameter (-XX:+UseCompressedOops). Compressed oops allow the HotSpot JVM to represent many (not all) managed pointers as 32-bit object offsets from the 64-bit Java heap base address, resulting in a reduced memory footprint following the upgrade.
  • Perform proper capacity planning of the hardware hosting your JVM processes. Ensure that you have enough physical RAM and CPU capacity to handle the extra memory and CPU footprint associated with this upgrade.
  • Develop a low-risk implementation strategy by upgrading only a certain percentage of your production environment to the 64-bit JVM, e.g. 25%-50%. This will also allow you to compare the behavior with the existing 32-bit JVM processes and make sure performance is aligned with your performance and load (P & L) test results.
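To compare memory footprint and GC behaviour between the 32-bit and 64-bit processes, you can capture the same metrics from both. Here is a minimal sketch using the standard java.lang.management API; the class name GcSnapshot and the report format are hypothetical, and in practice you would likely rely on verbose GC logs or a monitoring tool instead.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

/** One-shot report of collector activity and heap pool usage after the last collection. */
final class GcSnapshot {

    static String report() {
        StringBuilder sb = new StringBuilder();

        // Cumulative collection count and time for each collector (young and old generation).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", totalTimeMs=").append(gc.getCollectionTime())
              .append('\n');
        }

        // Usage of each heap pool measured right after its most recent collection.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if the pool does not support it
            if (afterGc == null) {
                continue;
            }
            sb.append(pool.getName())
              .append(": usedAfterLastGcMb=").append(afterGc.getUsed() / (1024 * 1024))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Running the same report under the same load on a 32-bit and a 64-bit process gives you a like-for-like view of collection frequency, cumulative pause time and retained old generation footprint.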

6 comments:

The simplest possibility is that a 64-bit JVM requires more memory than a 32-bit JVM to handle the same amount of working data: my hypothesis is that performance was worse because the 64-bit JVM was busy performing garbage collection to work with the same memory allocation that was tuned for the 32-bit JVM.

As for the improvements to the delivery procedure, I can again think of a couple of obvious ones:

- perform load testing in a production-like environment, with the most recent data from the production environment
- migrate to a 64-bit JVM on only one of the machines in the production environment, and run it side-by-side with the other 32-bit JVM, then observe and analyze the differences

Hi Sebastiano,

Great reply, you are right on. A 32-bit to 64-bit upgrade with no JVM tuning will lead to increased GC frequency as per your explanation.

Your recommendations are right on as well.

I will provide more details on analysis done, recommendations and tuning strategies next week.

Regards,
P-H

Hi PH,

Should Parallel GC be triggered explicitly? Shouldn't it be left to the server and JVM to scavenge the old generation, unless and until we are really running into memory issues?

One more thing: do the min and max memory assigned to the server's heap need to be close to each other? For instance, the application won't need 2048 MB when it starts, so could min be set to 512 MB and max to 2048 MB?

Please let me know your opinion

Thanks,




Hi PH,

Thank you for this very interesting article. Loved it.

Eli

Hi anonymous,

With current JVM implementations, a Full GC, such as one performed by the parallel collector, will always work best if managed by the JVM itself, i.e. without explicit GC. Explicit GC should ONLY be used if you want to do a live calculation of the retained memory footprint. It will NOT fix any issue such as a leak.

I always recommend setting the MIN and MAX JVM heap settings to the same value in order to prevent the JVM from dynamically expanding the heap, which results in a performance penalty.

Regards,
P-H
