This case study describes the complete root cause analysis and resolution of a Java ZipFile OutOfMemoryError problem triggered during the deployment of an Oracle Weblogic Integration 9.2 application.
Environment Specs
Monitoring and troubleshooting tools
Problem overview
Environment Specs
· Java EE server: Oracle Weblogic Integration 9.2 MP2
· OS: Sun Solaris 5.10
· JDK: Sun Java VM 1.5.0_10-b03 - 32-bit
· Java VM arguments: -server -Xms2560m -Xmx2560m -XX:PermSize=256m -XX:MaxPermSize=512m
· Platform type: BPM
Monitoring and troubleshooting tools
· Solaris pmap command
· Java verbose GC (for Java Heap and PermGen memory monitoring)
Problem overview
· Problem type: java.lang.OutOfMemoryError at java.util.zip.ZipFile.open(Native Method)
An OutOfMemoryError was observed during the deployment of one of our Weblogic Integration 9.2 applications.
Gathering and validation of facts
As usual, a Java EE problem investigation requires gathering technical and non-technical facts so we can either derive other facts or conclude on the root cause. Before applying a corrective measure, the facts below were verified:
· What is the client impact? HIGH (full outage of our application)
· Recent change to the affected platform? Yes, a minor update to the application was deployed, along with an increase of the minimum and maximum Java Heap size from 2 GIG (2048m) to 2.5 GIG (2560m) in order to reduce the frequency of full garbage collections
· Any recent traffic increase to the affected platform? No
· How long has this problem been observed? Since the increase of the Java Heap to 2.5 GIG
· Is the OutOfMemoryError happening on start-up or under load? The OOM is triggered at deployment time only with no traffic to the environment
· What was the utilization of the Java Heap at that time? Very low, at only ~20% utilization (no traffic)
· What was the utilization of the PermGen space at that time? Healthy at ~ 70% utilization and not leaking
· Did a restart of the Weblogic server resolve the problem? No
- Conclusion #1: The problem trigger appears to be related to the 500 MB increase of the Java Heap to 2.5 GIG. ** This problem initially puzzled the troubleshooting team until a deeper analysis was done as per below **
Weblogic log file analysis
A first analysis of the problem was done by reviewing the OOM error in the Weblogic managed server log.
java.lang.OutOfMemoryError
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:203)
at java.util.jar.JarFile.<init>(JarFile.java:132)
at java.util.jar.JarFile.<init>(JarFile.java:97)
at weblogic.utils.jars.JarFileDelegate.<init>(JarFileDelegate.java:32)
at weblogic.utils.jars.VirtualJarFactory.createVirtualJar(VirtualJarFactory.java:24)
at weblogic.application.ApplicationFileManagerImpl.getVirtualJarFile(ApplicationFileManagerImpl.java:194)
at weblogic.application.internal.EarDeploymentFactory.findOrCreateComponentMBeans(EarDeploymentFactory.java:162)
at weblogic.application.internal.MBeanFactoryImpl.findOrCreateComponentMBeans(MBeanFactoryImpl.java:48)
at weblogic.application.internal.MBeanFactoryImpl.createComponentMBeans(MBeanFactoryImpl.java:110)
at weblogic.application.internal.MBeanFactoryImpl.initializeMBeans(MBeanFactoryImpl.java:76)
at weblogic.management.deploy.internal.MBeanConverter.createApplicationMBean(MBeanConverter.java:88)
at weblogic.management.deploy.internal.MBeanConverter.createApplicationForAppDeployment(MBeanConverter.java:66)
at weblogic.management.deploy.internal.MBeanConverter.setupNew81MBean(MBeanConverter.java:314)
.........................................................................
As you can see, the OutOfMemoryError is thrown during the loading of our application (EAR file). The Weblogic server relies on the Java JDK ZipFile class to load any application EAR / JAR files.
- Conclusion #2: The OOM error is triggered during a native call (ZipFile.open(Native Method)) made by the Java JDK ZipFile class to load our application EAR file. This native JVM operation requires that proper native memory and virtual address space be available in order to execute its loading operation. The conclusion at this point was that our Java VM 1.5 was running out of native memory / virtual address space at deployment time.
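To illustrate the code path involved, the sketch below (a hypothetical stand-alone demo, not the actual Weblogic code) opens an archive through java.util.zip.ZipFile, the same JDK class Weblogic uses. The constructor is what ends up in ZipFile.open(Native Method), which on JDK 1.4 / 1.5 maps the whole file into the process address space:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipOpenDemo {
    public static void main(String[] args) throws IOException {
        // Build a small stand-in archive for an application EAR file
        File archive = File.createTempFile("sample-app", ".ear");
        ZipOutputStream out = new ZipOutputStream(new FileOutputStream(archive));
        out.putNextEntry(new ZipEntry("META-INF/application.xml"));
        out.write("<application/>".getBytes("UTF-8"));
        out.closeEntry();
        out.close();

        // new ZipFile(...) calls ZipFile.open(Native Method) under the hood.
        // On JDK 1.4 / 1.5 the whole file is mapped into native address space,
        // which is why this call can throw OutOfMemoryError when a 32-bit
        // process is near its 4 GIG limit even with a near-empty Java Heap.
        ZipFile zip = new ZipFile(archive);
        System.out.println("entries=" + zip.size());
        zip.close();
        archive.delete();
    }
}
```

Note that the OutOfMemoryError in this scenario reports native address space exhaustion, not Java Heap exhaustion, which is why heap utilization looked healthy in our case.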
Sun Java VM native memory and MMAP files
Before we go any further in the root cause analysis, you may want to revisit the internal structure of the Sun Java HotSpot VM for JDK 1.5 and JDK 1.6. A proper understanding of the internal Sun Java VM is quite important, especially when dealing with native memory problems.
When using JDK 1.4 / 1.5, any JAR / ZIP file loaded by the Java VM gets mapped entirely into the process address space. This means that the more EAR / JAR files you load into a single JVM, the higher the native memory footprint of your Java process.
This also means that the larger your Java Heap and PermGen space, the less memory is left for native memory spaces such as the C-Heap and MMAP files; this can definitely be a problem if you are deploying many separate applications (EAR files) to a single 32-bit Java process.
Please note that Sun came up with improvements in JDK 1.6 (Mustang) and changed this behaviour: the JAR file's central directory is still mapped, but the entries themselves are read separately, reducing the native memory requirement.
I suggest that you review the Sun Bug Id link below for more detail on this JDK 1.4 / 1.5 limitation.
Native memory footprint: Solaris pmap to the rescue!
The Solaris pmap command allows Java EE application developers to analyse how an application uses memory by providing a breakdown of all allocated address spaces of the Java process.
Now back to our OOM problem: pmap snapshots were generated following the OutOfMemoryError, and analysis of the data below revealed that we were getting very close to the upper 4 GIG limit of a 32-bit process on Solaris following our Java Heap increase to 2.5 GIG.
Find below the reduced raw data snapshot along with an explanation of the findings.
20361: ../bea922/jdk150_10/bin/java -Dweblogic.Name=server1 -Dbea
Address Kbytes RSS Anon Locked Mode *Mapped File*
00010000 64 64 - - r-x-- java
0002E000 16 16 8 - rwx-- java
00032000 56 56 56 - rwx-- [ heap ]
00040000 3840 3840 3840 - rwx-- [ heap ]
00400000 229376 229376 229376 - rwx-- [ heap ]
25400000 6528 176 - - r--s- dev:256,65555 ino:315953
279E8000 32 32 32 - rwx-R [ stack tid=17152 ]
27A00000 752 32 - - r--s- dev:256,65545 ino:21843
.........................................................................
31100000 2704 2696 - - r--s- dev:256,65555 ino:466131
313E6000 40 40 40 - rwx-R [ stack tid=70 ]
31400000 9344 9328 - - r--s- dev:256,65555 ino:373905
31D30000 32 32 - - r--s- dev:256,65555 ino:317152
31D40000 24 24 - - r--s- dev:256,65555 ino:471460
31D94000 48 48 48 - rwx-R [ stack tid=69 ]
31DE8000 32 32 32 - rwx-R [ stack tid=67 ]
31E00000 1952 1936 - - r--s- dev:256,65555 ino:466129
32000000 10864 10840 - - r--s- dev:256,65555 ino:362036
32AA0000 128 128 128 - rwx-- [ anon ]
32B06000 40 40 40 - rwx-R [ stack tid=66 ]
32B20000 8 8 - - r-x-- librmi.so
32B30000 8 8 8 - rwx-- librmi.so
32C00000 8616 8576 - - r--s- dev:256,65555 ino:319156
33480000 8 8 - - r--s- dev:256,65555 ino:89436
.........................................................................
4C400000 102400 90336 90336 - rwx-- [ anon ]
52810000 160 144 - - r--s- dev:256,65545 ino:21876
52A00000 792 352 - - r--s- dev:256,65545 ino:35669
52AD0000 8 8 - - r--s- dev:256,65545 ino:20564
52AE0000 112 112 - - r--s- dev:256,65545 ino:21850
52B76000 8 8 8 - rwx-R [ anon ]
52B7A000 16 16 8 - rwx-R [ anon ]
52B7E000 8 8 8 - rwx-R [ stack tid=16 ]
52B90000 48 48 - - r--s- dev:256,65545 ino:23136
52BA0000 104 104 - - r--s- dev:256,65545 ino:21890
52BC0000 96 96 - - r--s- dev:256,65545 ino:21870
52BE0000 72 72 - - r--s- dev:256,65545 ino:8995
52C00000 81920 57472 57472 - rwx-- [ anon ]
57C10000 216 216 - - r--s- dev:256,65545 ino:21856
57C50000 120 104 - - r--s- dev:256,65545 ino:9002
57C80000 440 344 - - r--s- dev:256,65545 ino:8954
57D76000 16 16 8 - rwx-R [ anon ]
57D7A000 24 24 24 - rwx-R [ stack tid=15 ]
57D90000 8 8 - - r--s- dev:256,65545 ino:8879
57DA0000 192 192 - - r--s- dev:256,65545 ino:8881
57DE0000 88 88 - - r--s- dev:256,65545 ino:8972
57E7E000 8 8 8 - rwx-R [ stack tid=14 ]
57E90000 40 40 - - r--s- dev:256,65545 ino:8899
57EA0000 192 184 - - r--s- dev:256,65545 ino:8985
57EE0000 112 112 - - r--s- dev:256,65545 ino:8866
57F7C000 16 16 16 - rwx-R [ stack tid=10 ]
57F90000 400 280 - - r--s- dev:256,65545 ino:13398
58000000 512 512 512 - rwx-- [ anon ]
58080000 288 288 288 - rwx-- [ anon ]
580C8000 40 40 40 - rwx-- [ anon ]
58100000 3072 3072 3072 - rwx-- [ anon ]
58400000 1024 1024 1024 - rwx-- [ anon ]
58500000 8 8 8 - rwx-- [ anon ]
..........................................................................
58800000 262144 262144 262144 - rwx-- [ anon ]
68800000 147456 135168 135168 - rwx-- [ anon ]
71800000 20480 - - - rwx-- [ anon ]
78800000 1572864 1572864 1572864 - rwx-- [ anon ]
D8800000 924288 524288 524288 - rwx-- [ anon ]
F8810000 16 16 - - r--s- dev:256,65545 ino:8941
F8820000 16 16 - - r--s- dev:256,65545 ino:8935
F897E000 8 8 8 - rwx-R [ stack tid=7 ]
F8990000 112 112 - - r--s- dev:256,65545 ino:8986
F89B0000 136 136 - - r--s- dev:256,65545 ino:1339
F89E0000 64 64 - - r--s- dev:256,65545 ino:8970
F8A7E000 8 8 8 - rwx-R [ stack tid=6 ]
F8A90000 392 392 - - r--s- dev:256,65545 ino:8961
F8B7E000 8 8 8 - rwx-R [ stack tid=5 ]
F8B90000 8 8 - - r--s- dev:256,65545 ino:8944
F8BA0000 24 24 - - r--s- dev:256,65545 ino:8997
F8BB0000 256 248 - - r--s- dev:256,65545 ino:8867
F8C00000 4096 4096 4096 - rwx-- [ anon ]
FA800000 4096 4096 4096 - rwx-- [ anon ]
FAC10000 456 456 - - r--s- dev:256,65545 ino:9003
FAC90000 392 392 - - r--s- dev:256,65545 ino:9001
FB000000 8640 4256 - - r--s- dev:256,65545 ino:36123
FB880000 3072 3072 3072 - rwx-- [ anon ]
FBB90000 16 16 - - r--s- dev:256,65545 ino:84387
FBBA0000 16 16 - - r--s- dev:256,65545 ino:84392
FBBEE000 8 8 8 - rwx-R [ stack tid=13 ]
FBC00000 38976 36904 - - r--s- dev:256,65545 ino:35671
FE220000 16 16 - - r--s- dev:256,65545 ino:84614
.......................................................................
FE5A0000 64 64 - - r-x-- libzip.so
FE5B0000 8 8 8 - rwx-- libzip.so
FE5C0000 144 144 - - r-x-- libjava.so
FE5F4000 8 8 8 - rwx-- libjava.so
FE600000 8 8 8 - rwx-- [ anon ]
FE610000 56 48 - - r-x-- libverify.so
FE62E000 8 8 8 - rwx-- libverify.so
FE640000 64 64 64 - rwx-- [ anon ]
FE654000 8 8 - - rwxs- [ anon ]
FE660000 32 32 - - rw-s- dev:321,8 ino:1319830649
FE670000 24 24 - - r-x-- nss_files.so.1
FE686000 8 8 8 - rwx-- nss_files.so.1
FE690000 8 8 - - r--s- dev:256,65545 ino:218017
FE6A0000 64 64 64 - rwx-- [ anon ]
FE6C0000 32 24 - - r-x-- libhpi.so
FE6D8000 8 8 8 - rwx-- libhpi.so
FE6DA000 8 - - - rwx-- libhpi.so
FE6E0000 16 16 - - r-x-- libmp.so.2
FE6F4000 8 8 8 - rwx-- libmp.so.2
FE700000 680 640 - - r-x-- libm.so.2
FE7B8000 32 32 24 - rwx-- libm.so.2
FE7D0000 8 8 - - r--s- dev:256,65545 ino:35666
FE7E0000 8 8 - - r-x-- libmd5.so.1
FE7F2000 8 8 8 - rwx-- libmd5.so.1
FE800000 7936 6592 - - r-x-- libjvm.so
FEFC0000 384 384 384 - rwx-- libjvm.so
FF020000 16 16 16 - rwx-- dev:256,65545 ino:35653
FF024000 64 56 56 - rwx-- libjvm.so
.............................................................
FF3B0000 208 208 - - r-x-- ld.so.1
FF3EC000 16 16 - - r-x-- libthread.so.1
FF3F4000 8 8 8 - rwx-- ld.so.1
FF3F6000 8 8 8 - rwx-- ld.so.1
FFBBE000 24 - - - ----- [ anon ]
FFBE0000 128 128 128 - rwx-- [ stack ]
-------- ------- ------- ------- -------
total Kb 3922736 3411512 2942456
Root cause: Java Heap increase!
As you can see in the above pmap analysis, our 32-bit Java process size was getting very close to the 4 GIG limit; leaving no room for additional EAR file deployment.
We can see a direct correlation between the Java Heap increase and the appearance of this OutOfMemoryError. Since the Java 1.5 VM maps the entire EAR file to a native memory address space, proper native memory and address space must be available to fulfil such a ZipFile.open() native operation.
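As a rough sanity check, the arithmetic below shows how little of the 4 GIG 32-bit address space was left for MMAP'd EAR / JAR files after the heap increase. The Java Heap and PermGen figures come from the JVM arguments in this case study; the code cache, thread stack and C-Heap figures are assumptions for illustration only:

```java
public class AddressSpaceBudget {
    public static void main(String[] args) {
        // All figures in MB
        int addressSpace = 4096;   // 32-bit process ceiling on Solaris
        int javaHeap     = 2560;   // -Xms2560m / -Xmx2560m (from this case study)
        int permGen      = 512;    // -XX:MaxPermSize=512m (from this case study)
        int codeAndLibs  = 256;    // JVM code cache + shared libraries (assumed)
        int threadStacks = 128;    // thread stacks (assumed)
        int cHeap        = 256;    // malloc'd C-Heap (assumed)

        int leftForMmap = addressSpace - javaHeap - permGen
                        - codeAndLibs - threadStacks - cHeap;
        // Only a few hundred MB remain for MMAP'd EAR / JAR files; before the
        // heap increase (2048 MB) there would have been 512 MB more headroom.
        System.out.println("leftForMmapMB=" + leftForMmap);
    }
}
```

Under these assumptions only ~384 MB remain for archive mappings, which lines up with the pmap total of ~3.9 GIG reported above.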
The solution was simply to revert our Java Heap increase, which allowed the deployment of our EAR file.
Other long term solutions will be discussed shortly, such as vertical scaling of the Weblogic Integration platform (adding more JVMs / managed servers to the existing physical servers), switching to a 64-bit JVM and / or upgrading to the Sun Java VM 1.6.
Conclusion and recommendations
· When facing an OutOfMemoryError with the Sun Java VM, always do proper analysis and due diligence to determine which memory space is the problem (Java Heap, PermGen, native / C-Heap)
· When facing an OutOfMemoryError due to native memory exhaustion, always generate Solaris pmap snapshots of your Java process and do your analysis. Do not increase the Java Heap (Xms / Xmx) as this will make the problem even worse
· Be very careful before attempting to increase your Java Heap or PermGen space. Always ensure that you understand your total Java VM memory footprint and that you have enough native memory available for non Java Heap allocations such as MMAP files for your application EAR files