This case study describes the complete steps from root cause analysis to resolution of a native memory problem experienced with a Weblogic 10.0 environment using Oracle JRockit R27.5.0.
Environment specifications
· Java EE server: Weblogic 10.0
· OS: Microsoft Windows server 2007 (32 bit) with 4G of RAM
· JDK: Oracle JRockit(R) R27.5.0 (JDK 1.5.0_14)
· RDBMS: Oracle Database 10g Enterprise Edition Release 10.2.0.2.0
· Platform type: Internal Billing System
Monitoring and troubleshooting tools
· JVM Thread Dump (JRockit JDK 1.5 format)
· netstat OS command
· Windows System Monitor tool
Problem overview
· Problem type: Intermittent OutOfMemoryError and java.net.SocketException: No buffer space available
Intermittent problems were experienced on a Weblogic 10.0 environment, each time causing a full outage. The problem manifested itself in 2 different flavours:
1) java.net.SocketException: No buffer space available
Caused by: java.net.SocketException: No buffer space available (maximum connections reached?): connect
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:520)
at weblogic.net.http.HttpsClient.openWrappedSSLSocket(HttpsClient.java:652)
at weblogic.net.http.HttpsClient.openServer(HttpsClient.java:282)
at weblogic.net.http.HttpsClient.openServer(HttpsClient.java:524)
at weblogic.net.http.HttpsClient.<init>(HttpsClient.java:226)
at weblogic.net.http.HttpsClient.<init>(HttpsClient.java:214)
at weblogic.net.http.HttpsURLConnection.getHttpClient(HttpsURLConnection.java:262)
at weblogic.net.http.HttpURLConnection.getInputStream(HttpURLConnection.java:414)
at weblogic.net.http.SOAPHttpsURLConnection.getInputStream(SOAPHttpsURLConnection.java:37)
at weblogic.wsee.connection.transport.TransportUtil.getInputStream(TransportUtil.java:82)
at weblogic.wsee.connection.transport.http.HTTPClientTransport.receive(HTTPClientTransport.java:196)
at weblogic.wsee.ws.dispatch.client.ConnectionHandler.handleResponse(ConnectionHandler.java:162)
2) java.lang.OutOfMemoryError
--------------------------------------
/app/ejb/AppEJB.jar#orderService.; remaining name 'comp/env/loggingConfig'
javax.ejb.EJBException: EJB Exception: : java.lang.OutOfMemoryError: class allocation, 206363036 loaded, 8428437362228854784 footprint JVM@check_alloc (src/jvm/model/classload/classalloc.c:118). 3
at java.lang.ClassLoader.defineClass1(Native Method)
--------------------------------------
Problem mitigation involved restarting the affected Weblogic server(s) whenever the problem was observed.
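To see how often each of the two flavours shows up in a server log, the log can be scanned for their signatures. The following is only a minimal sketch, assuming the log lines have already been read into memory; the class name ErrorSignatureCounter and the sample lines are hypothetical and not part of the original investigation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ErrorSignatureCounter {
    // The two error signatures quoted in the problem overview.
    static final String[] SIGNATURES = {
        "java.net.SocketException: No buffer space available",
        "java.lang.OutOfMemoryError"
    };

    /** Counts how many log lines contain each signature. */
    static Map<String, Integer> count(String[] logLines) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (String signature : SIGNATURES) {
            counts.put(signature, Integer.valueOf(0));
        }
        for (String line : logLines) {
            for (String signature : SIGNATURES) {
                if (line.contains(signature)) {
                    int previous = counts.get(signature).intValue();
                    counts.put(signature, Integer.valueOf(previous + 1));
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines, shortened from the traces above.
        String[] sample = {
            "Caused by: java.net.SocketException: No buffer space available (maximum connections reached?): connect",
            "javax.ejb.EJBException: EJB Exception: : java.lang.OutOfMemoryError: class allocation",
            "an unrelated log line"
        };
        System.out.println(count(sample));
    }
}
```

Comparing the two counts over the same time window is one way to check whether both flavours tend to appear together.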
Gathering and validation of facts
As usual, a Java EE problem investigation requires gathering technical and non-technical facts so that we can either derive other facts and/or conclude on the root cause. Before applying a corrective measure, the facts below were verified in order to conclude on the root cause:
· Recent change of the affected platform? Yes, the application was recently migrated from Weblogic 7.0 and Sun JDK 1.4 to Weblogic 10.0 and JRockit 27.5
· Any recent traffic increase to the affected platform? No
· Was Windows server performance data collected during the problem? Yes, the performance data was collected during the problem and revealed that the server was running low on physical RAM and virtual memory
· Any upstream and/or downstream system problem reported? No problem was reported for any downstream application, including the internal Oracle10g database
· Did a restart of the Weblogic server resolve the problem? No, restarting the Weblogic server only alleviated the problem temporarily, for just a few hours
Netstat analysis
Given the severe Socket Exception found in the log (java.net.SocketException: No buffer space available), the initial thinking was to analyse the number of connected and hanging sockets during the problem, e.g. to investigate any possible Socket leak issue.
This was achieved by executing the Windows command: netstat (note that you can also use netstat -an to list all connections in numeric form, which bypasses the DNS resolution)
Active Connections
Proto Local Address Foreign Address State
TCP SERVER_HOSTNAME :1039 SERVER_HOSTNAME xyz.xyz.xyz.xy:1040 ESTABLISHED
TCP SERVER_HOSTNAME :1040 SERVER_HOSTNAME xyz.xyz.xyz.xy:1039 ESTABLISHED
TCP SERVER_HOSTNAME :1044 SERVER_HOSTNAME xyz.xyz.xyz.xy:1045 ESTABLISHED
TCP SERVER_HOSTNAME :1045 SERVER_HOSTNAME xyz.xyz.xyz.xy:1044 ESTABLISHED
TCP SERVER_HOSTNAME :1046 SERVER_HOSTNAME xyz.xyz.xyz.xy:5988 CLOSE_WAIT
TCP SERVER_HOSTNAME :1047 SERVER_HOSTNAME xyz.xyz.xyz.xy:1196 ESTABLISHED
TCP SERVER_HOSTNAME :1196 SERVER_HOSTNAME xyz.xyz.xyz.xy:1047 ESTABLISHED
TCP SERVER_HOSTNAME :microsoft-ds 111.11.11.11:1451 ESTABLISHED
TCP SERVER_HOSTNAME :1028 SERVER_HOSTNAME xyz.xyz.xyz.xy:7001 ESTABLISHED
TCP SERVER_HOSTNAME :1031 xyz.xyz.xyz.xy:1521 ESTABLISHED
TCP SERVER_HOSTNAME :1032 xyz.xyz.xyz.xy:1521 ESTABLISHED
…
However, the analysis was not conclusive: we found only a low volume of active and hanging sockets, and no pattern suggesting any sort of Socket leak as the root cause.
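The kind of per-state count used in this analysis can be automated. Below is a minimal sketch, assuming the netstat output has been captured as text lines and that the state is the last column of each TCP row; the class name NetstatStateTally and the sample addresses are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

class NetstatStateTally {
    /** Tallies TCP rows by state (assumed to be the last column of each row). */
    static Map<String, Integer> tally(String[] netstatLines) {
        Map<String, Integer> byState = new TreeMap<String, Integer>();
        for (String line : netstatLines) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 4 || !"TCP".equals(cols[0])) {
                continue; // skip headers, non-TCP rows and blank lines
            }
            String state = cols[cols.length - 1];
            Integer seen = byState.get(state);
            byState.put(state, Integer.valueOf(seen == null ? 1 : seen.intValue() + 1));
        }
        return byState;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows in the netstat -an layout.
        String[] sample = {
            "Active Connections",
            "  Proto  Local Address          Foreign Address        State",
            "  TCP    10.0.0.5:1039          10.0.0.5:1040          ESTABLISHED",
            "  TCP    10.0.0.5:1046          10.0.0.9:5988          CLOSE_WAIT",
            "  TCP    10.0.0.5:1031          10.0.0.7:1521          ESTABLISHED"
        };
        System.out.println(tally(sample));
    }
}
```

A steadily growing count for one state (for example CLOSE_WAIT) across successive snapshots would point towards a socket leak; a low and stable count, as observed here, does not.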
JRockit, OutOfMemoryError and log analysis
The analysis of the following error, found in the log, led to the key finding.
--------------------------------------
/app/ejb/AppEJB.jar#orderService.; remaining name 'comp/env/loggingConfig'
javax.ejb.EJBException: EJB Exception: : java.lang.OutOfMemoryError: class allocation, 206363036 loaded, 8428437362228854784 footprint JVM@check_alloc (src/jvm/model/classload/classalloc.c:118). 3
at java.lang.ClassLoader.defineClass1(Native Method)
--------------------------------------
This error indicated that JRockit was unable to complete a simple class allocation. This type of operation requires native memory to be available, since JRockit loads class definitions into native memory rather than into the Java heap.
Here is some background on the Oracle JRockit memory management:
· Java heap – This is the memory that the JVM uses to allocate java objects.
· The maximum value of java heap memory is specified using the -Xmx flag in the java command line. If the maximum heap size is not specified, then the limit is decided by the JVM considering factors like the amount of physical memory in the machine and the amount of free memory available at that moment. It is always recommended to specify the max java heap value.
· Native memory – This is the memory that the JVM uses for its own internal operations. The amount of native memory that will be used by the JVM depends on the amount of code generated, the threads created, the memory used during GC for keeping java object information and the temporary space used during code generation, optimization etc.
· JRockit tends to use more native memory in exchange for better performance. JRockit has no interpretation mode (compilation only), so due to its additional native memory needs its process size tends to be a couple of hundred MB larger than that of the equivalent Sun JVM.
· JRockit does not have a separate permanent generation space (PermSize); class definitions live in native memory, where they are unbounded (for JRockit: heap + native memory [footprint] = process size).
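The split between Java heap and the rest of the process can be observed from inside the JVM through the standard java.lang.management API (available since JDK 1.5). The following is a minimal sketch; the class name HeapSnapshot is hypothetical, and the non-heap figure covers only the memory pools the JVM chooses to report, so the OS process size remains the reference for the total footprint.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class HeapSnapshot {
    static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Formats the heap and non-heap usage currently reported by the JVM. */
    static String describe() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
        // getMax() returns -1 when the JVM reports no defined maximum.
        String heapMax = heap.getMax() < 0 ? "undefined" : toMb(heap.getMax()) + " MB";
        return "heap used=" + toMb(heap.getUsed()) + " MB"
            + ", heap max=" + heapMax
            + ", non-heap used=" + toMb(nonHeap.getUsed()) + " MB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Comparing the reported heap maximum with the process size shown by the OS gives a rough idea of how much memory is used outside the Java heap.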
Windows system performance data analysis
The team then reviewed the performance data captured by the Windows System Monitor tool.
The logs indicated that Windows was running quite low on physical RAM, which correlated with the earlier OutOfMemoryError analysis.
Root cause
The review and analysis of the 2 errors confirmed the following root cause:
· Migrating from Weblogic 7.0 with Sun JDK 1.4 to Weblogic 10.0 with JRockit 27.5 increased the native memory footprint required to run our application, ultimately leading to a breaking point under load.
· The OutOfMemoryError was due to the Windows server running out of physical RAM, which prevented JRockit from allocating its Java class definitions in native memory.
· The java.net.SocketException: No buffer space available error was a different flavour of the same problem, also caused by a shortage of native / virtual memory.
Solution and tuning
3 areas were looked at during the resolution phase:
· After proper analysis, we decided to reduce the Java Heap size by 256 MB in order to increase our native memory capacity; the heap had plenty of spare capacity to absorb the reduction.
· Revisit and reduce the application native memory footprint. That task was put on hold until implementation and monitoring of the first solution.
· Consider upgrading to a 64bit JRockit VM. That task was put on hold until implementation and monitoring of the first solution.
Results: reducing the Java Heap provided instant relief, as it freed more physical RAM / native memory for our application and JRockit needs. Approaches #2 and #3 are still under consideration for future releases.
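The effect of the heap reduction can be illustrated with simple arithmetic: for a fixed memory budget, every MB taken from the Java heap becomes available for native allocations. This is only a sketch under stated assumptions: the 2048 MB budget (the default user address space of a 32-bit Windows process) and the 1024 MB starting heap are illustrative values not given in this case study, and the class name NativeHeadroom is hypothetical. Only the 256 MB reduction comes from the text above.

```java
class NativeHeadroom {
    /** Memory left for native allocations once the Java heap is set aside. */
    static long headroomMb(long processBudgetMb, long maxHeapMb) {
        if (maxHeapMb > processBudgetMb) {
            throw new IllegalArgumentException("heap exceeds the process budget");
        }
        return processBudgetMb - maxHeapMb;
    }

    public static void main(String[] args) {
        long budgetMb = 2048;                   // assumption: illustrative process budget
        long heapBeforeMb = 1024;               // assumption: hypothetical original -Xmx
        long heapAfterMb = heapBeforeMb - 256;  // the 256 MB reduction from the case study

        System.out.println("native headroom before: " + headroomMb(budgetMb, heapBeforeMb) + " MB");
        System.out.println("native headroom after:  " + headroomMb(budgetMb, heapAfterMb) + " MB");
    }
}
```

Whatever the real starting values are, the headroom grows by exactly the amount removed from the heap, which is why the change only makes sense when the heap has spare capacity.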
Recommendations and best practices
· Always perform proper capacity planning when working on a Weblogic and JDK upgrade, including a proper analysis of your application Java Heap and native memory footprint before and after the upgrade.
· When using Oracle JRockit, especially on a 32bit system, make sure to fine tune your memory ratio properly, e.g. Java Heap vs. native memory.