This short
post will test your knowledge of the JVM and your project delivery skills, especially
regarding JVM upgrades. I'm looking forward to your comments and answers
on how to approach this type of project in order to stay away from performance
problems.
Background
I was
recently involved in a problem case affecting a production environment running
on Weblogic 10 and using the 32-bit HotSpot JVM 1.6. Given recent challenges
and the forecast load increase, the decision was taken to upgrade the HotSpot JVM 1.6
from 32-bit to 64-bit.
Please note that no change was applied to the JVM arguments.
After a
few weeks of functional testing and planning, the upgrade was deployed
successfully to the production environment. However, the next day the support team observed major performance degradation, including thread lock contention, forcing the
deployment team to roll back the upgrade.
The root
cause was eventually found and the upgrade will be re-attempted in the near
future.
Question:
Based on
the above background, provide a list of possible root causes that may explain
this performance degradation.
Propose a
list of improvements to the project delivery and recommendations on how to properly
manage and de-risk this type of upgrade.
Answer:
I often
hear the assumption that switching from a 32-bit JVM to a 64-bit JVM will
automatically bring a performance improvement. This is partially true.
Performance improvements will only be observed if you were dealing with
memory footprint problems prior to the upgrade, such as excessive GC or java.lang.OutOfMemoryError: Java heap space conditions, and if you performed proper tuning & Java heap
sizing.
Unfortunately,
we often overlook the fact that for a 64-bit JVM process, native pointers in
the system take up 8 bytes instead of 4. This can increase the
memory footprint of your application, leading to more frequent GC and
performance degradation.
Here is
the official explanation from Oracle:
What are the performance characteristics of
64-bit versus 32-bit VMs?
"Generally, the benefits of being able to
address larger amounts of memory come with a small performance loss in
64-bit VMs versus running the same application on a 32-bit VM. This is due to the fact that every native
pointer in the system takes up 8 bytes instead of 4. The loading of this extra data has an impact
on memory usage which translates to slightly slower execution depending on how
many pointers get loaded during the execution of your Java program. The good news is that with AMD64 and EM64T
platforms running in 64-bit mode, the Java VM gets some additional registers
which it can use to generate more efficient native instruction sequences. These extra registers increase performance to
the point where there is often no performance loss at all when comparing 32 to
64-bit execution speed.
The performance difference comparing an
application running on a 64-bit platform versus a 32-bit platform on SPARC is
on the order of 10-20% degradation when you move to a 64-bit VM. On AMD64 and EM64T platforms this difference
ranges from 0-15% depending on the amount of pointer accessing your application
performs."
Now back
to our original problem case: this memory footprint increase was found to be
significant and was the root cause of our problem. Depending on the GC policy
that you are using, an increase in major collections will lead to higher JVM & thread pause times, opening the door for thread lock contention and other
problems. As you can see below, the upgrade to the 64-bit JVM increased our
existing application static memory footprint (tenured space) by about 45%.
Java Heap footprint ~900 MB after major
collections (32-bit)
Java Heap footprint ~1.3 GB after major
collections (64-bit)
** 45% increase of the application Java heap
memory footprint (retained tenured space). Again, this is due to the expanded
size of managed native pointers.
The other
part of the problem is that no performance and load testing was performed prior
to production implementation, only functional testing. Also, since no change or
tuning was applied to the JVM settings, the larger footprint automatically triggered an increased
frequency of major collections and JVM pause time. The final solution involved increasing the Java heap capacity from 2 GB to 2.5 GB and using the Compressed Oops option available with HotSpot JDK 6u23.
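To make the arithmetic behind these figures concrete, here is a minimal sketch. It only uses the numbers quoted in this post; the class and method names are made up for illustration, and "~1.3 GB" is treated as 1300 MB, which is an assumption.

```java
/** Back-of-the-envelope heap arithmetic for the figures quoted in this post. */
final class HeapHeadroom {

    /** Percentage growth from one retained footprint to another. */
    static double percentIncrease(double beforeMb, double afterMb) {
        return (afterMb - beforeMb) / beforeMb * 100.0;
    }

    /** Share of the heap still occupied after a major collection, in percent. */
    static double occupancyPercent(double retainedMb, double heapMb) {
        return retainedMb / heapMb * 100.0;
    }

    public static void main(String[] args) {
        double retained32 = 900.0;   // ~900 MB after major collections (32-bit)
        double retained64 = 1300.0;  // ~1.3 GB, treated as 1300 MB (assumption)
        double heapBefore = 2048.0;  // 2 GB heap, left unchanged by the upgrade
        double heapAfter = 2560.0;   // 2.5 GB heap used in the final solution

        System.out.printf("Footprint growth: %.1f%%%n",
                percentIncrease(retained32, retained64));
        System.out.printf("32-bit, 2 GB heap: %.1f%% occupied%n",
                occupancyPercent(retained32, heapBefore));
        System.out.printf("64-bit, 2 GB heap: %.1f%% occupied%n",
                occupancyPercent(retained64, heapBefore));
        System.out.printf("64-bit, 2.5 GB heap: %.1f%% occupied%n",
                occupancyPercent(retained64, heapAfter));
    }
}
```

With the same 2 GB heap, the retained data goes from roughly 44% to roughly 63% of the heap, which leaves much less room between major collections. The 2.5 GB heap brings it back to roughly 51%, before counting any saving from Compressed Oops, which this sketch does not model.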
Now that
you understand the root cause, find below my recommendations when performing
this type of upgrade:
- Execute performance and load testing cycles and
compare your application memory footprint and GC behaviour before (32-bit)
and after (64-bit).
- Ensure you spend enough time tuning your GC settings
and the Java heap size in order to minimize the JVM pause time.
- If you are using HotSpot JVM 1.6 update 23 (6u23) or later,
take advantage of the Compressed Oops tuning parameter (-XX:+UseCompressedOops). Compressed oops
allow the HotSpot JVM to represent many (not all) managed pointers as
32-bit object offsets from the 64-bit Java heap base address, resulting in
a reduced memory footprint following the upgrade.
- Perform proper capacity planning of the hardware
hosting your JVM processes. Ensure that you have enough physical RAM and
CPU capacity to handle the extra memory and CPU footprint associated with
this upgrade.
- Develop a low-risk implementation strategy by upgrading only a certain percentage of your production environment to the 64-bit JVM, e.g. 25%-50%. This will also allow you to compare the behavior with the existing 32-bit JVM processes and make sure performance is aligned with your performance and load (P & L) testing results.
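One way to gather the before/after numbers mentioned in these recommendations is to take the same snapshot inside a 32-bit and a 64-bit process and compare them. The sketch below is only an illustration, not the tooling used in this case: the class name is made up, it relies on the standard java.lang.management beans plus the HotSpot-specific HotSpotDiagnosticMXBean, and it assumes Java 7 or later for ManagementFactory.getPlatformMXBean.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

/** Plain-text snapshot of GC activity and heap pool usage for the current JVM. */
final class GcSnapshot {

    /** Builds the report; run it in both JVMs under the same load and compare. */
    static String describe() {
        StringBuilder out = new StringBuilder();
        // "sun.arch.data.model" is a HotSpot-specific property, hence the default.
        out.append("Data model: ")
           .append(System.getProperty("sun.arch.data.model", "unknown"))
           .append("-bit\n");
        out.append("UseCompressedOops: ").append(compressedOops()).append('\n');

        // Collection count and accumulated collection time per collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.append(String.format("GC %s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }

        // Current usage and usage after the last collection for each heap pool.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage current = pool.getUsage();
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if unsupported
            String used = current == null ? "n/a" : (current.getUsed() >> 20) + " MB";
            String retained = afterGc == null ? "n/a" : (afterGc.getUsed() >> 20) + " MB";
            out.append(String.format("Pool %s: used=%s, after last GC=%s%n",
                    pool.getName(), used, retained));
        }
        return out.toString();
    }

    /** Returns "true", "false", or "unavailable" (e.g. 32-bit or non-HotSpot JVM). */
    static String compressedOops() {
        try {
            HotSpotDiagnosticMXBean hotspot =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return hotspot.getVMOption("UseCompressedOops").getValue();
        } catch (RuntimeException e) {
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

The "after last GC" value of the old generation pool is the closest equivalent here to the retained tenured footprint discussed above. Verbose GC logs remain the more complete source for pause time analysis.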
The simplest possibility is that a 64-bit JVM requires more memory than a 32-bit JVM to handle the same amount of working data: my hypothesis is that performance was worse because the 64-bit JVM was busy performing garbage collection to work with the same memory allocation that was tuned for the 32-bit JVM.
As for the improvements to the delivery procedure, I can again think of a couple of obvious ones:
- perform load testing in a production-like environment, with the most recent data from the production environment
- migrate to a 64-bit JVM on only one of the machines in the production environment, and run it side-by-side with the other 32-bit JVM, then observe and analyze the differences
Hi Sebastiano,
Great reply, you are right on. A 32-bit to 64-bit upgrade with no JVM tuning will lead to increased GC frequency as per your explanation.
Your recommendations are right on as well.
I will provide more details on analysis done, recommendations and tuning strategies next week.
Regards,
P-H
Hi PH,
Should Parallel GC be triggered explicitly? Shouldn't it be left to the server and JVM to scavenge the OLD gen, unless and until we are really running into memory issues?
One more thing: do the min and max memory assigned to the heap of the server need to be close to each other? For instance, the application won't need 2048 M when it starts, so could min be set to 512 M and max to 2048 M?
Please let me know your opinion
Thanks,
Hi PH,
Thank you for this very interesting article. Loved it.
Eli
Thanks Eli!
P-H
Hi anonymous,
With current JVMs, a Full GC (for example with the parallel collector) will always work best if managed by the JVM itself, i.e. without explicit GC. Explicit GC should ONLY be used if you want to do a live calculation of the memory footprint retention. It will NOT fix any issue such as a leak.
I always recommend syncing the MIN & MAX JVM heap settings so that the JVM does not have to expand the heap dynamically, which results in a performance penalty.
Regards,
P-H
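As an illustration of the MIN & MAX recommendation above, the following sketch checks whether a JVM was started with matching -Xms and -Xmx values. The class and method names are hypothetical, and it only looks at the -Xms/-Xmx forms with k, m, or g suffixes; heap sizes set through other flags such as -XX:MaxHeapSize are not inspected.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Locale;

/** Checks whether the -Xms and -Xmx startup flags specify the same heap size. */
final class HeapFlagCheck {

    /** Parses a size such as "512m", "2g" or "2048M" into bytes. */
    static long parseSize(String text) {
        String s = text.trim().toLowerCase(Locale.ROOT);
        if (s.isEmpty()) {
            throw new NumberFormatException("empty heap size");
        }
        long multiplier = 1L;
        char unit = s.charAt(s.length() - 1);
        if (unit == 'k') {
            multiplier = 1L << 10;
        } else if (unit == 'm') {
            multiplier = 1L << 20;
        } else if (unit == 'g') {
            multiplier = 1L << 30;
        }
        if (multiplier != 1L) {
            s = s.substring(0, s.length() - 1); // drop the unit suffix
        }
        return Long.parseLong(s) * multiplier;
    }

    /** Returns the last value given for a flag prefix, or -1 if the flag is absent. */
    static long lastValue(List<String> jvmArgs, String prefix) {
        long value = -1L;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                value = parseSize(arg.substring(prefix.length()));
            }
        }
        return value;
    }

    /** True only when both flags are present and resolve to the same byte count. */
    static boolean minEqualsMax(List<String> jvmArgs) {
        long min = lastValue(jvmArgs, "-Xms");
        long max = lastValue(jvmArgs, "-Xmx");
        return min > 0 && min == max;
    }

    public static void main(String[] args) {
        // Startup arguments of the JVM running this check.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Min and max heap in sync: " + minEqualsMax(jvmArgs));
    }
}
```

For example, a process started with -Xms2048m -Xmx2g is reported as in sync, while -Xms512m -Xmx2048m (the setup asked about in the comment above) is not.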