
3.05.2012

OutOfMemoryError: Out of swap space - Problem Patterns

Today we will revisit a common Java HotSpot VM problem that you have probably already faced at some point in your JVM troubleshooting work, especially on Solaris OS with a 32-bit JVM.

This article will provide you with a description of this particular type of OutOfMemoryError, the common problem patterns and the recommended resolution approach.

If you are not familiar with the different HotSpot memory spaces, I recommend that you first review the article Java HotSpot VM Overview before going any further in this reading.

java.lang.OutOfMemoryError: Out of swap space? – what is it?

This error message is thrown by the Java HotSpot VM (native code) following a failure to allocate native memory from the OS, either for the HotSpot C-Heap or to dynamically expand the Java Heap, etc. This problem is very different from a standard OutOfMemoryError (normally due to exhaustion of the Java Heap or PermGen space).

A typical error found in your application / server logs is:

Exception in thread "main" java.lang.OutOfMemoryError: requested 53459 bytes for ChunkPool::allocate. Out of swap space?

Also, please note that depending on the OS that you use (Windows, AIX, Solaris etc.), some OutOfMemoryError occurrences due to C-Heap exhaustion may not include a detail such as “Out of swap space”. In this case, you will need to review the OOM error stack trace, identify the computing task that triggered the OOM and determine which OutOfMemoryError problem pattern your problem is related to (Java Heap, PermGen or Native Heap exhaustion).
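
For reference, here is a short, non-exhaustive list of typical HotSpot error messages and the memory space they usually point to (the exact wording varies between JVM versions and OS):

java.lang.OutOfMemoryError: Java heap space                                                   >> Java Heap depletion
java.lang.OutOfMemoryError: PermGen space                                                     >> PermGen space depletion
java.lang.OutOfMemoryError: unable to create new native thread                                >> Native Heap (C-Heap) depletion
java.lang.OutOfMemoryError: requested <n> bytes for ChunkPool::allocate. Out of swap space?   >> Native Heap (C-Heap) / OS memory depletion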

Ok so can I increase the Java Heap via -Xms & -Xmx to fix it?

Definitely not! This is the last thing you want to do, as it will make the problem worse. As you learned from my other article, the Java HotSpot VM is split into 3 memory spaces (Java Heap, PermGen, C-Heap). For a 32-bit VM, all these memory spaces compete with each other for the same limited process address space. Increasing the Java Heap will further reduce the capacity available to the C-Heap and reserve more memory from the OS.
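
To put rough numbers on this, here is a simplified, illustrative budget for a 32-bit HotSpot process (figures are example values only; the real address space limit depends on your OS, e.g. roughly 2 GB on some Windows versions vs. about 4 GB on Solaris):

32-bit process address space (example)            ~4096 MB
- Java Heap (-Xmx2048m)                            2048 MB
- PermGen (-XX:MaxPermSize=256m)                    256 MB
- Thread stacks (500 threads x 1 MB -Xss1m)         500 MB
= Left for the C-Heap (JIT, GC, JNI, NIO etc.)    ~1300 MB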

Your first task is to determine if you are dealing with a C-Heap depletion or OS physical / virtual memory depletion.

Now let’s see the most common patterns of this problem.

Common problem patterns

There are multiple scenarios which can lead to a native OutOfMemoryError. I will share with you what I have seen in my past experience as the most common patterns.

-        Native Heap (C-Heap) depletion due to too many Java EE applications deployed on a single 32-bit JVM (combined with a large Java Heap e.g. 2 GB) * most common problem *
-        Native Heap (C-Heap) depletion due to a non-optimal Java Heap size e.g. a Java Heap too large for the application(s) needs on a single 32-bit JVM
-        Native Heap (C-Heap) depletion due to too many created Java Threads e.g. allowing the Java EE container to create too many Threads on a single 32-bit JVM
-        OS physical / virtual memory depletion preventing the HotSpot VM from allocating native memory to the C-Heap (32-bit or 64-bit VM)
-        OS physical / virtual memory depletion preventing the HotSpot VM from expanding its Java Heap or PermGen space at runtime (32-bit or 64-bit VM)
-        C-Heap / native memory leak (third party monitoring agent / library, JVM bug etc.)

Troubleshooting and resolution approach

Please keep in mind that each HotSpot native memory problem can be unique and requires its own troubleshooting & resolution approach.

Find below a list of high level steps you can follow in order to further troubleshoot:

-        First, determine if the OOM is due to C-Heap exhaustion or to OS physical / virtual memory depletion. For this task, you will need to perform close monitoring of your OS memory utilization and of the Java process size. For example, on Solaris a 32-bit JVM process size can grow to about 3.5 GB (technically a 4 GB limit) before you can expect native memory allocation failures. Monitoring the Java process size will also allow you to determine if you are dealing with a native memory leak (size growing over time / over several days…)
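
As an illustration, on Solaris the process size monitoring could be done with something along these lines (the pid and sampling interval are placeholders; adjust to your environment):

prstat -p <java_pid> 60          # refresh the SIZE / RSS of the Java process every 60 seconds
pmap -x <java_pid> | tail -1     # print the total address space footprint of the process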

-        The OS vendor and version that you use is important as well. For example, some versions of Windows (32-bit) by default support a process size of up to 2 GB only (leaving you with minimal flexibility for Java Heap and Native Heap allocations). Please review your OS and determine its maximum process size, e.g. 2 GB, 3 GB, 4 GB or more (64-bit OS)

-        As with the OS, it is also important to review and determine whether you are using a 32-bit or a 64-bit VM. Native memory depletion for a 64-bit VM typically means that your OS is running out of physical / virtual memory

-        Review your JVM memory settings. For a 32-bit VM, a Java Heap of 2 GB+ can really start to put pressure on the C-Heap, depending on how many applications you have deployed, how many Java Threads are created etc. In that case, please determine if you can safely reduce your Java Heap by about 256 MB (as a starting point) and see if it helps improve your JVM memory “balance”.

-        Analyze the verbose GC output or use a tool like JConsole to determine your Java Heap footprint. This will allow you to determine if you can reduce your Java Heap in a safe manner or not
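
If verbose GC is not enabled yet, a typical set of HotSpot flags for the JVM versions discussed in this article would look like the following; the log file path is just an example:

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/path/to/gc.log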

-        When the OutOfMemoryError is observed, generate a JVM Thread Dump and determine how many Threads are active in your JVM; the more Threads, the more native memory your JVM will use. You will then be able to combine this data with the OS, Java process size and verbose GC data, allowing you to determine where the problem is
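
For example, with the JDK jstack utility you could capture the thread dump and get a rough thread count at the time of the problem (the pid is a placeholder; on Solaris/Linux, sending kill -3 <pid> to the Java process achieves a similar result via the standard output logs):

jstack -l <java_pid> > threaddump.txt
grep -c "java.lang.Thread.State" threaddump.txt     # rough count of the Java threads in the dump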

Once you have a clear view of the situation in your environment and root cause, you will be in a better position to explore potential solutions as per below:

-        Reduce the Java Heap (if possible / after close monitoring of the Java Heap) in order to give that memory back to the C-Heap / OS
-        Increase the physical RAM / virtual memory of your OS (only applicable if depletion of the OS memory is observed; especially for a 64-bit OS & VM)
-        Upgrade your HotSpot VM to 64-bit (for some Java EE applications, a 64-bit VM is more appropriate) or segregate your applications to different JVM’s (this increases demand on your hardware but reduces the C-Heap utilization per JVM)
-        Native memory leaks are trickier and require a deeper dive analysis, such as analysis of the Solaris pmap / AIX svmon data (see the sketch below) and a review of any third party library (e.g. monitoring agents). Please also review the Oracle Sun Bug database and determine if the HotSpot version you use is exposed to known native memory problems
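
For the native leak pattern specifically, a simple approach is to snapshot the process memory mappings at regular intervals and compare them over time; a minimal sketch on Solaris (pid and file names are illustrative):

pmap -x <java_pid> > pmap_before.txt      # first snapshot of the process mappings
pmap -x <java_pid> > pmap_after.txt       # second snapshot, taken hours or days later
diff pmap_before.txt pmap_after.txt       # growing or new segments point to the native leak source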

Still struggling with this problem? Don’t worry, simply post a comment / question at the end of this article. I also encourage you to post your problem case to the root cause analysis forum.

35 comments:

Hello, we have been experiencing the out of swap space problem on our production server for 4 months now. It does not happen on a regular basis; it may be 2-3 days before the crash, or 10 days. I have tried almost everything regarding the jvm.config with no success.

The problem has started since we switched to a new windows 2008 64bit server.

We are running Coldfusion 8 32bit, windows has 8 gb of physical ram.

The jvm args are -Xmx1024m -XX:MaxPermSize=192m.

I am definitely sure that it has something to do with the OS. Can you provide any help?

Hi k liakos and thanks for your comments & questions,

Since OOM events are happening after a few days, this could be a symptom of a native memory leak (Java process size growing over time). An OOM out of swap space means that the OS is refusing a native memory allocation to the JVM process. This can happen as a result of 32-bit Java native memory space depletion (4 GB limit for a 32-bit JVM) or overall OS depletion.

It is very important, as a starting point, to identify which one of the problem patterns above you are dealing with, e.g. OS depletion vs. 32-bit native memory space depletion.

Can you please answer the following questions:

- What is the Java process size observed from Windows when the OOM is observed? e.g. look at the Windows Virtual Bytes counter in PerfMon
- What is the overall Windows OS RAM/virtual memory utilization when OOM event is observed?
- How many Java threads were active in the JVM at the time of the OOM event?

If not enabled already, please ensure that you configure all perfmon counters allowing you to determine your problem type.

Regards,
P-H

Thanks for your answer!

Here are some clues about our system:
- Coldfusion 8,0,1 Enterprise
- Java Version 1.6.0_24 32bit
- Operating System Windows Server 2008 R2 64 bit 8GB RAM installed

We had ColdFusion running on a dedicated standalone non-VM Windows 2008 server with 4 GB RAM. No problems at all. We switched to a datacenter VM Windows 2008 server with 4 GB RAM. The problems started from the first week. I thought we had a physical RAM problem so I upgraded to 8 GB RAM. No effect at all.

The last 4 months we had 36 jvm crashes, all with the same reason: java.lang.OutOfMemoryError: requested 1122096 bytes for Chunk::new. Out of swap space?

The coldfusion has not crashed for any other reason than that.

I enabled VisualVM to monitor the JVM heap, but I cannot get a heap dump during the problem although I have enabled HeapDumpOnOutOfMemoryError on ColdFusion and the "Heap dump on OOME" option in VisualVM. Can't figure out why.

I have done a lot of searching in forums and read many articles. It all comes down to the point where the JVM itself is not the problem, but something combined with the OS RAM.

My Windows knowledge is limited so I do not know how to monitor the OS RAM behavior. Can you help me with this?

As for the questions you are asking, I only have the error logs from Coldfusion8/runtime/bin. Can we dig anything out of these?

This is during the last crash:

Heap
PSYoungGen total 27712K, used 15129K [0x3a980000, 0x3d1a0000, 0x4fed0000)
eden space 24192K, 48% used [0x3a980000,0x3b4dc188,0x3c120000)
from space 3520K, 99% used [0x3c180000,0x3c4ea2a0,0x3c4f0000)
to space 8448K, 0% used [0x3c960000,0x3c960000,0x3d1a0000)
PSOldGen total 370304K, used 336351K [0x0fed0000, 0x26870000, 0x3a980000)
object space 370304K, 90% used [0x0fed0000,0x24747f70,0x26870000)
PSPermGen total 104832K, used 99394K [0x03ed0000, 0x0a530000, 0x0fed0000)
object space 104832K, 94% used [0x03ed0000,0x09fe0a50,0x0a530000)

Memory: 4k page, physical 8388088k(4229416k free), swap 16774324k(12724416k free)

Thanks in advance for your help.

Hi and thanks for the detail,

Monitoring the Java Heap will not help here since this is not where the problem is. We can see that from your last crash, your Java Heap utilization is not even at 50% vs. 1 GB max capacity. Generating a Heap Dump will not help either, since Heap Dump will only show you the Java Heap footprint + some PermGen & class loading data; again this is not where the problem is.

The OS is also showing plenty of physical & virtual memory left unless you are dealing with memory constraint from your allocated Windows VM.

The next step is to deep dive into the Windows OS level and monitor the Java process size (Virtual Bytes) using perfmon. This memory space is outside the Java Heap. The upper limit for a 32-bit JVM is 4 GB, and even lower on some Windows OS versions. Once the Java process size is close to this value, the Windows OS will start to refuse further memory allocations and the JVM will throw java.lang.OutOfMemoryError: requested ? bytes for Chunk::new. Out of swap space?

My recommendation is as per below:

- Engage your Windows OS administrator (unless you have access yourself)

1) Open perfmon (perfmon.exe) tool from Windows OS

2) Under Monitoring Tools >> Performance Monitor, right click in the right-hand pane and select "Add Counters..."

3) Select: Process >> Virtual Bytes
4) Now in the list below, select your desired Java process
5) Click on the "Add >>" button to add the new Virtual Bytes counter

You should see a new counter with Virtual Bytes for your Java (ColdFusion) process. This will allow you to monitor it and determine if it is growing over time. Look at the Last, Average and Maximum values.
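
If you prefer the command line over the perfmon UI, the Windows typeperf utility can log the same counter to a CSV file; the process instance name below (jrun) is an assumption and must match the instance name perfmon shows for your ColdFusion/JRun process:

typeperf "\Process(jrun)\Virtual Bytes" -si 60 -o jrun_virtualbytes.csv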

Once it is set up, please let me know what value you see. It will be very important to capture the Virtual Bytes value the next time you see an OOM event.

My last recommendation: the next time you see an OOM event, please generate a JVM thread dump as well and determine how many Threads were active at the time of the crash. Threads require native memory and can also be a source of native memory depletion.

For the Thread Dump, you can use the jstack.exe utility under your JDK bin directory.

Thanks.
P-H

Thanks again for the answer.

I have enabled the perfmon counter for the virtual bytes of jrun. The jrun is currently allocating 1.436.848.128 = 1,33 GB (normal). It is not growing past this value for now.

As for the thread dump, I have the JVM dumps from the time of the crash. Nothing unusual there. The total number varies from 300 to 500 threads max, which is the normal range I see every day under normal use.

The problem with dumps is that there is a service that automatically restarts the JVM when it is unresponsive for a long time. So I rarely have the chance to experience the crash in real time.

And the other major problem is that I cannot reproduce the crash, no matter what I try. This is really frustrating when I have to deal with bugs.

Let's go to the possible solutions now:

- The last thing I tried was to set the min total heap size equal to the max. I read somewhere that this is good for avoiding RAM allocation problems when the heap shrinks and expands. But from what you tell me, the problem is not the heap itself, but the rest of the Java process. So I don't expect any results from this.

- The other solution, and the one that has the best chance to succeed, is to switch to a 64-bit JVM. ColdFusion 8 supports it. Another post here http://www.codingthearchitecture.com/2008/01/14/jvm_lies_the_outofmemory_myth.html suggests the same thing. This way I will not have the 32-bit limitations, and moreover it will allow me to expand the total heap size past the 1 GB limit and get the most out of the 8 GB physical memory.

What do you think about this solution? I am really curious to find out the root of the problem, but it is also very urgent to find a permanent solution.

Ok, I got the first weird behavior from monitoring the jrun virtual bytes! It has grown up to 1.87 GB, although taskmgr and resource monitor report 1.3 GB.

Do you have a clue what this means? I need to be prepared to check everything in case the crash happens, so please tell me what to look for according to this situation.

Hi,

This is normal, Task Manager is not fully reliable from a Java process size footprint perspective. Virtual Bytes will give you the true picture so please keep a close eye on it.

For now, simply make sure that you gather the Virtual Bytes data the next time you see an OOM event. This will allow us to conclude if you are dealing with a native memory leak.

In terms of solutions, this will of course depend on the root cause. Upgrading to a 64-bit JVM is an option to consider, assuming you are not dealing with a fast memory leak. In that scenario, it may buy you some time but eventually your OS would run out of memory.

For now, check memory usage on a regular basis and attempt to determine a growth pattern, hourly or daily.

OOM errors due to native memory problems are more complex in nature but not impossible to resolve.

Regards,
P-H

So your suggestion is that if I have a memory leak, it does not make any difference whether I have a 64-bit JVM or a larger heap. It will eventually exhaust my system memory.

Another question please: Is it safe to monitor the jrun threads with the perfmon by selecting the counter processes >> threads >> jrun?

Regarding my case, is there anything else useful I should set a counter on with perfmon?

Thanks for all the help so far.

Hi,

Yes, it is totally safe to monitor the jrun threads using this counter. But please ensure that you also capture a Thread Dump as soon as you see any new OOM occurrences.

Threads & Virtual Bytes are the 2 primary counters that you need right now, along with thread dump generation. The leak theory will only be proven when observed; you also need to look for a "sudden" event behaviour pushing the process size towards the 4 GB limit. This is why Thread Dump generation is critical when the OOM is observed.

Have you seen OOM errors such as "unable to create new native thread" ?

Hope this helps.
Regards,
P-H

At the same time the OOM occurs, there are several "unable to create new native thread" entries in the application log.

Is this a symptom or a cause of the OOM?

Perfect, glad I asked. Again, these are all symptoms of native memory space depletion. In this scenario, an excessive number of created threads could be the root cause (threads require available native memory as per their stack size etc.) but it could also be just a symptom. This will be proven with the threads counter and a thread dump capture the next time you see the OOM.

Can you share the Java thread stack trace generated from the "unable to create new native thread" errors?

It will be interesting and important to correlate the thread utilization vs. Virtual Bytes.

Thanks.
P-H

You can view the log as reported by coldfusion before a crash here https://docs.google.com/open?id=0B7whP3AgNi1IRUkwRmNwVk5mRlU

If you do a search in the txt you will find 14 "java.lang.OutOfMemoryError: unable to create new native thread" entries

Hi,

I just reviewed your logs. We can see that ColdFusion is trying to allocate a new JDBC Connection and failing due to "unable to create new native thread". This implementation strategy (ColdFusion engine), by default, will be heavy on threads and native memory. This means that "non happy path" scenarios can increase the pressure on the native memory space.

## New thread created during JDBC Connection reservation
java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:597)
at coldfusion.server.j2ee.sql.pool.ConnectionRunner.fetchConnection(ConnectionRunner.java:42)
at coldfusion.server.j2ee.sql.pool.JDBCPool.create(JDBCPool.java:522)
at coldfusion.server.j2ee.sql.pool.JDBCPool._checkOut(JDBCPool.java:447)
................

What I'm suspecting right now is that you are either leaking JDBC Connections/threads or dealing with a sudden surge/event (e.g. DB slowdown etc.). This leak or sudden event depletes your JVM native memory space, which is capped at 4 GB for a 32-bit JVM process. At that point, OOM errors are thrown.

Find below a summary of recommendations:

-        Continue with close Threads & Virtual Bytes monitoring and draw a correlation between the 2. Make sure to capture a JVM thread dump *VERY important* at the next re-occurrence or when close to the 4 GB process limit
-        Add close monitoring of the JRun/ColdFusion JDBC Pool
-        Review your JRun/ColdFusion JDBC & thread tuning parameters, and make sure they do not allow too many connections/threads

A typical 32-bit JVM should not allow more than 300-400 threads created/active at a given time.

Regards,
P-H

Hello, here we are again with a new crash today.

I was lucky enough to have the ColdFusion monitor open at the time of the crash. To be accurate, I restarted the ColdFusion server before I let the crash happen.

So let me tell you about the facts I witnessed when the server became unresponsive:
- The JVM heap size was around 600 MB. Pretty normal.
- The garbage collection was not behaving as usual. This means I was not seeing the usual spikes on the JVM heap graph, but flat lines.
- The thread number had dropped to 304. This is almost the same number as when ColdFusion starts with no applications running (only native threads). This can be explained by the fact that new threads could not be created.
- Some applications were running normally. This means some threads were still running normally. This is something strange.
- Because this is a production server, I saved 3 heap dumps and 3 thread dumps and then restarted the JVM. If I had left the problem to continue, I would have got a complete crash and the out of swap message in the log.

Now the facts from the performance monitor. Nothing outrageous there:
- The jrun process starts with 1.9 GB of virtual bytes. At the time of the crash it reached a peak of 2.115 GB, the highest number in the last 5 days. A relatively small increase, but this is something to work with.
- The jrun threads start at 300. At the time of the crash there were 365 threads. This was not the peak, however; there were values up to 378 threads recorded in the previous days. Normal, I think.
- Available MBytes on the server were plentiful: 3.5 GB of free physical memory.

You can find the thread dump here: https://docs.google.com/open?id=0B7whP3AgNi1ISEtFM3QzaW1FaEE

I do not know how to debug the thread dump.

So what is the conclusion from all this? Is the jrun 2.115 GB the upper limit for the process? Isn't this pretty low? How do I deal with the situation?

Hi k.liakos,

After a closer review and your latest results, I'm suspecting that the problem is that you are reaching the 2 GB process size limit for Java on Windows when using a 32-bit JVM. Can you paste the EXACT numbers of virtual bytes you saw?

This 2 GB cap is also applicable to Java on 64-bit Windows, since HotSpot no longer supports the /3GB switch as of version 1.6.

## Oracle HotSpot limitation reference
http://www.oracle.com/technetwork/java/hotspotfaq-138619.html
"As of Java SE 6, the Windows /3GB boot.ini feature is not supported."

http://msdn.microsoft.com/en-us/library/windows/desktop/aa366778(v=vs.85).aspx

At this point my main recommendation is that you upgrade your HotSpot JVM to 64-bit. This will allow you to scale it properly and take advantage of your 64-bit OS and RAM availability.

In the short term, you can also try to reduce your Java Heap by 128 MB; this can buy you some time until you upgrade your JVM to 64-bit. This is only an option if you see your Java Heap utilization not higher than 60% after a major collection (Full GC).
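
As a concrete illustration of that short term tuning (values are examples only, to be validated against your own verbose GC / JConsole data first), the ColdFusion JVM arguments would change along these lines:

## current settings
-Xmx1024m -XX:MaxPermSize=192m
## reduced Java Heap, giving ~128 MB of address space back to the C-Heap
-Xmx896m -XX:MaxPermSize=192m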

Thanks.
P-H

The exact number of the jrun peak virtual bytes is 2.115.321.856 bytes = 1.97005 GB

The numbers in my previous post were not properly converted from bytes.

Thanks for all the answers.

Just out of curiosity I will reduce the heap size by 128 MB to see what will happen. And the next step is to migrate to a 64-bit JVM, which I have already set up successfully on our dev server.


You are welcome.

It is now very clear that you are reaching the 2 GB limit, which I definitely believe is your root cause as per the limitations I highlighted.

Please keep me posted on your next steps and results.

Regards,
P-H

For now I reduced the JVM heap max size to 896 MB.

The jrun process started with 1.766.178.816 virtual bytes, so I am expecting it not to reach the 2 GB limit.

I will monitor the behavior for a week to see if everything is as expected. And then I will probably switch to the 64-bit JVM to avoid any similar limitations.

Sounds like a plan.

Thanks.
P-H

Here we are again after a week.

Let me present some facts:
- The server has been running smoothly for a week
- The max JVM heap was set to 960 MB
- The jrun virtual bytes initial value was 1.780.224.000
- After 8 hours of running, the jrun virtual bytes were 1.938.030.592
- From that time on, and for the next 7 days, I observed a minor daily increase of the virtual bytes
- The threads stabilised after a couple of days to standard range values (330-360 average)
- Today the virtual bytes eventually reached the value of 2.090.000.00

This means the jrun is getting closer to the 2.147.483.648 virtual bytes value, which is the upper limit for the process. So it is a matter of two, maybe three days, before the jvm crashes again with an out of swap space error message.

I thought I had to give the 32-bit JVM a chance to prove that it can run smoothly under certain JVM settings. But the measurements show me that I have a native jrun memory leak, which has nothing to do with the JVM heap size.

Shall I wait for the crash to happen? Certainly not, because this is our production server. I am keeping a close eye on it and will probably force a JVM restart today.

I am starting to worry that there is a problem with our applications. Because if this is the case, then the 64-bit JVM will have the same results (probably just crashing with more delay).

Hi k.liakos,

Thanks for posting your latest results. That would give you a survival or leak rate around 7-10 days.

That being said, it is always hard to conclude on a true native leak with a JVM process limit of 2 GB. HotSpot performs some optimizations over time which will consume native memory, but this should stabilize at some point.

There is definitely no value in waiting for the crash to happen.
Given it is a production environment, I recommend the following:

- Setup a regular and preventive restart schedule e.g. every 7-10 days
- In parallel, upgrade your HotSpot JVM to 64-bit

Once you upgrade to 64-bit, we can revisit this native memory increase behavior and determine if the problem still exists, along with the next steps.

Another strategy would be to attempt to simulate the problem as per below:
- Simulate the native leak / process size increase up to the crash e.g. OOM using the 32-bit JVM
- Re-run the same load test scenario, this time with the 64-bit JVM, and compare the results / determine if the leak still exists

Regards,
P-H

Hi PH,

I have gone through the article, very informative. Thanks for the help. I am at a client site and they are facing problems with swap space and the JVM. Below are the details of the problems.

What is the impact of swap space usage on Unix servers?
How is it used by the JVM and how can it affect the functionality of the JVM?
We have seen occurrences of swap usage taking down the whole server. What are the possibilities that swap space can hamper application behavior?

Note: we are Virtual Machine users and environments, with 32-bit JVMs.

We appreciate your solutions for the above queries, or a pointer to the right info.


Let me give some feedback from my experience.

After installing the 64-bit ColdFusion JVM (5 months ago now), the out of swap space error never appeared again. To be honest, the ColdFusion server never experienced any out of memory problems of any kind again.

So, if you have the ability to go 64-bit, don't even bother with any other kind of solution.

Hi Pradeep,

As per k.liakos, an upgrade from 32-bit to 64-bit is typically recommended, especially when dealing with native heap depletion.

That being said, I still recommend performing a root cause analysis of your client's problem(s) so you can provide clear facts and justification for such an upgrade. You also have to rule out any OS related problem.

The HotSpot JVM does require SWAP memory to be available from the Solaris OS for native related tasks such as Thread creation etc. You will need to monitor the SWAP memory utilization very closely and determine the root cause of the failures observed, e.g. OS level depletion vs. 32-bit JVM level depletion (Java process memory address space depletion).
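
As a starting point for that monitoring, standard Solaris utilities can be scheduled (e.g. via cron) and correlated with the JVM failures; a minimal sketch:

swap -s          # summary of allocated, reserved and available swap
vmstat 30        # memory and paging activity, sampled every 30 seconds
prstat -s size   # processes sorted by size, to identify the top memory consumers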

Regards,
P-H

Hi P-H,

Thanks for the articles! Very insightful!

We had a system crash two weeks ago due to a "java.lang.OutOfMemoryError" exception. Here is a snippet of the error I got from the log:

java.lang.OutOfMemoryError
at sun.security.pkcs11.wrapper.PKCS11.C_DecryptUpdate(Native Method)
at sun.security.pkcs11.P11Cipher.implUpdate(P11Cipher.java:550)
at sun.security.pkcs11.P11Cipher.engineUpdate(P11Cipher.java:465)
at javax.crypto.Cipher.update(DashoA13*..)
at com.sun.net.ssl.internal.ssl.CipherBox.decrypt(CipherBox.java:228)

Based on your post regarding the error patterns, I reckon that it was because of C-Heap depletion and it failed to assign memory to a new thread. We are running a 64-bit JVM and the OS has 8 GB of memory. The heap size is defined as follows:

CATALINA_OPTS=$CATALINA_OPTS$JDD_SEP"-Xmx4000m"
CATALINA_OPTS=$CATALINA_OPTS$JDD_SEP"-Xms4000m"
CATALINA_OPTS=$CATALINA_OPTS$JDD_SEP"-XX:MaxPermSize=160m"

I intend to reduce the heap size settings above to prevent the same scenario from happening again. Would you recommend me to do so? If so, what values should I put for the first two settings?

Besides, you mentioned in another post that "For a 64-bit VM, the C-Heap capacity = Physical server total RAM & virtual memory – Java Heap - PermGen"; do "Java Heap" and "PermGen" mean the actual space occupied or the maximum size we specified in the configuration?

Hi Larry,

C-Heap depletion or process address space depletion will mainly occur with a 32-bit JVM. For 64-bit JVM processes, a native OOM normally indicates physical + virtual memory depletion.

Did you have a chance to monitor the footprint of your 64-bit Java process? Since you are using a 4 GB Heap + 160 MB PermGen + X number of threads, I expect a total Java process size of around 5 GB+.

Do you know what was the physical/virtual memory utilization of your OS at the time of the OOM?

I don't recommend reducing the Java Heap yet, until we get a better view of the problem. Also, before reducing the Java Heap, you will need to understand the footprint and GC health, otherwise you could trade one problem for another.

Regards,
P-H

Hi P-H,

Thanks for your input on this matter. I have been closely monitoring our system before and after our scheduled system restart on Sunday. However, I didn't realize the importance of capturing the footprint of the Java processes using "pmap" as you suggested in another article, so I used "vmstat" and "prstat" instead. Please refer to the attached files "vmstat output" and "prstat output" for more detailed information.

I didn't get a chance to collect the system statistics when the OOM happened last time. But from the "vmstat output" you can see that the free physical memory is almost running out and "sr" is pushing high values. It was almost another OOM. At the same time, the "prstat output" captured the Java processes and their memory utilization. Please shed some more light on these statistics to troubleshoot the problem.

Much appreciate it,
Larry

Hi,

First, thank you for all the information. It is really useful.
I have a problem where I am getting this error.

# A fatal error has been detected by the Java Runtime Environment:
#
# java.lang.OutOfMemoryError: requested 8589934608 bytes for Chunk::new. Out of swap space?
#
# Internal Error (allocation.cpp:215), pid=8244, tid=10028
# Error: Chunk::new
#
# JRE version: 6.0_18-b07
# Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b13 mixed mode windows-amd64 )
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp
#

--------------- T H R E A D ---------------

Current thread (0x0000000005833000): JavaThread "CompilerThread1" daemon [_thread_in_native, id=10028, stack(0x0000000007ad0000,0x0000000007bd0000)]

Stack: [0x0000000007ad0000,0x0000000007bd0000]

Current CompileTask:
C2:452 ! net.sf.saxon.event.ReceivingContentHandler.startElement(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Lorg/xml/sax/Attributes;)V (651 bytes)


--------------- P R O C E S S ---------------

Java Threads: ( => current thread )
0x00000000089d3800 JavaThread "connection manager" daemon [_thread_blocked, id=8480, stack(0x0000000009d90000,0x0000000009e90000)]
0x000000000804a800 JavaThread "XLoaderTask_loadXml" [_thread_blocked, id=2140, stack(0x0000000008370000,0x0000000008470000)]
0x00000000080ab800 JavaThread "XLoaderTask_flat2xml" [_thread_in_vm, id=11920, stack(0x0000000008270000,0x0000000008370000)]
0x0000000005836000 JavaThread "Low Memory Detector" daemon [_thread_blocked, id=10216, stack(0x0000000007bd0000,0x0000000007cd0000)]
=>0x0000000005833000 JavaThread "CompilerThread1" daemon [_thread_in_native, id=10028, stack(0x0000000007ad0000,0x0000000007bd0000)]
0x000000000582e800 JavaThread "CompilerThread0" daemon [_thread_blocked, id=9792, stack(0x00000000079d0000,0x0000000007ad0000)]
0x000000000582a000 JavaThread "Attach Listener" daemon [_thread_blocked, id=11440, stack(0x00000000078d0000,0x00000000079d0000)]
0x0000000005829000 JavaThread "Signal Dispatcher" daemon [_thread_blocked, id=10440, stack(0x00000000077d0000,0x00000000078d0000)]
0x00000000057bf000 JavaThread "Finalizer" daemon [_thread_blocked, id=10032, stack(0x00000000076d0000,0x00000000077d0000)]
0x00000000057b6800 JavaThread "Reference Handler" daemon [_thread_blocked, id=8052, stack(0x00000000075d0000,0x00000000076d0000)]
0x0000000001e28000 JavaThread "main" [_thread_blocked, id=10300, stack(0x0000000001e50000,0x0000000001f50000)]

Other Threads:
0x00000000057b1800 VMThread [stack: 0x00000000074d0000,0x00000000075d0000] [id=9960]
0x000000000583c000 WatcherThread [stack: 0x0000000007cd0000,0x0000000007dd0000] [id=4108]
VM state:not at safepoint (normal execution)
VM Mutex/Monitor currently owned by a thread: None

Heap
PSYoungGen total 691840K, used 451002K [0x00000000da950000, 0x00000001053f0000, 0x00000001053f0000)
eden space 684672K, 64% used [0x00000000da950000,0x00000000f5abe9d8,0x00000001045f0000)
from space 7168K, 100% used [0x00000001045f0000,0x0000000104cf0000,0x0000000104cf0000)
to space 7168K, 0% used [0x0000000104cf0000,0x0000000104cf0000,0x00000001053f0000)
PSOldGen total 286656K, used 166448K [0x00000000853f0000, 0x0000000096be0000, 0x00000000da950000)
object space 286656K, 58% used [0x00000000853f0000,0x000000008f67c1d0,0x0000000096be0000)
PSPermGen total 43904K, used 21793K [0x000000007fff0000, 0x0000000082ad0000, 0x00000000853f0000)
object space 43904K, 49% used [0x000000007fff0000,0x0000000081538518,0x0000000082ad0000)

Dynamic libraries:
0x0000000000400000 - 0x000000000042e000 C:\Program Files\Java\jdk1.6.0_18\bin\java.exe
0x0000000077ec0000 - 0x0000000077ffc000 C:\WINDOWS\system32\ntdll.dll
...
0x000007ff77170000 - 0x000007ff7717b000 C:\WINDOWS\System32\wshtcpip.dll

VM Arguments:
jvm_args: -Xms256m -Xmx2048m -Xbootclasspath/p:config -Djava.util.logging.config.file=D:\xGen\config\xgen_logging.properties -Dfile.encoding=ISO-8859-1
java_command: com.gbp.apl65.xgen.presentation.XLoader TERCIOS TERCIOS_EBUZON2_130925_6
Launcher Type: SUN_STANDARD

Environment Variables:
...



--------------- S Y S T E M ---------------

OS: Windows Server 2003 family Build 3790 Service Pack 2

CPU:total 16 (8 cores per cpu, 2 threads per core) family 6 model 26 stepping 5, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, ht

Memory: 4k page, physical 8377804k(1055808k free), swap 20136256k(8284244k free)

vm_info: Java HotSpot(TM) 64-Bit Server VM (16.0-b13) for windows-amd64 JRE (1.6.0_18-b07), built on Dec 17 2009 13:24:11 by "java_re" with MS VC++ 8.0 (VS2005)

time: Fri Oct 04 10:59:21 2013
elapsed time: 273 seconds


Any advice?

I am testing a decreased Java heap size to see if it works.
Thank you for your help.

Regards

Hi,
We are seeing swap space usage increasing to more than 85% even though the JVM heap utilization is under 50%:
free -m
total used free shared buffers cached
Mem: 16050 13573 2477 0 126 465
-/+ buffers/cache: 12980 3070
Swap: 2047 1746 301

The OS is Linux, the JVM is IBM WAS7.
Please advise on the direction for me to debug the swap usage.

Hi,

As a starting point, please look at your Java VM process(es) memory size (top & ps commands). See how much memory the IBM JVM(s) is/are using vs. your total capacity.
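
For example, on Linux something like the following gives the virtual (VSZ) and resident (RSS) size of the WebSphere Java process; the pid is a placeholder:

ps -o pid,vsz,rss,args -p <java_pid>     # virtual and resident size in KB
top -b -n 1 -p <java_pid>                # one-shot top snapshot for the same process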

Also, can you please share your JVM memory settings such as -Xms & -Xmx.

Regards,
P-H

Hi ,

I need urgent help and a recommendation on this issue. We are having a production issue wherein Tomcat crashes by itself.

We are using JDK 1.7 64-bit and a 64-bit RHEL OS.

java version "1.7.0_15"
Java(TM) SE Runtime Environment (build 1.7.0_15-b03)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

Linux saa3app2011xrp 2.6.32-504.8.1.el6.x86_64 #1 SMP Fri Dec 19 12:09:25 EST 2014 x86_64 x86_64 x86_64 GNU/Linux


-Xms4096m -Xmx4096m -XX:PermSize=256m -XX:MaxPermSize=256m

Getting below error

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 32744 bytes for ChunkPool::allocate
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (allocation.cpp:222), pid=1905293, tid=139870058260224
#
# JRE version: 7.0_15-b03
# Java VM: Java HotSpot(TM) 64-Bit Server VM (23.7-b01 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#

--------------- T H R E A D ---------------

Current thread (0x00007f3618310800): VMThread [stack: 0x00007f3609132000,0x00007f3609233000] [id=1905321]

Stack: [0x00007f3609132000,0x00007f3609233000], sp=0x00007f3609230c20, free space=1019k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x8a540a] VMError::report_and_die()+0x2ea
V [libjvm.so+0x40abfb] report_vm_out_of_memory(char const*, int, unsigned long, char const*)+0x9b
V [libjvm.so+0x22b037] Chunk::operator new(unsigned long, unsigned long)+0x107
V [libjvm.so+0x22b0af] Arena::grow(unsigned long)+0x3f
V [libjvm.so+0x89bd2c] vframe::new_vframe(frame const*, RegisterMap const*, JavaThread*)+0x16c
V [libjvm.so+0x89be9e] vframe::sender() const+0x7e
V [libjvm.so+0x89a1ae] vframe::java_sender() const+0xe
V [libjvm.so+0x288a1c] get_or_compute_monitor_info(JavaThread*)+0x23c
V [libjvm.so+0x289dcc] bulk_revoke_or_rebias_at_safepoint(oopDesc*, bool, bool, JavaThread*)+0x43c
V [libjvm.so+0x28a565] VM_BulkRevokeBias::doit()+0x35
V [libjvm.so+0x8ae35c] VM_Operation::evaluate()+0x4c
V [libjvm.so+0x8acd70] VMThread::evaluate_operation(VM_Operation*)+0x80
V [libjvm.so+0x8ad2f6] VMThread::loop()+0x1e6
V [libjvm.so+0x8ad990] VMThread::run()+0x70
V [libjvm.so+0x746cf0] java_start(Thread*)+0x100

VM_Operation (0x00007f360902f290): BulkRevokeBias, mode: safepoint, requested by thread 0x00007f361831a800


Please help me in determining the root cause of this issue.

Hi Varunn,

The error indicates the JVM ran out of native memory (C-Heap). This would indicate that, in your case, you ran out of OS virtual memory on your Linux production server.

Can you please verify the physical RAM capacity and availability of your Linux production server/VM?

Thanks.
P-H

Below are the memory details that were recorded during the crash:

/proc/meminfo:
MemTotal: 264507108 kB
MemFree: 671268 kB
Buffers: 1689156 kB
Cached: 147725232 kB
SwapCached: 648 kB
Active: 218932660 kB
Inactive: 39529128 kB
Active(anon): 104740964 kB
Inactive(anon): 4667380 kB
Active(file): 114191696 kB
Inactive(file): 34861748 kB
Unevictable: 11820 kB
Mlocked: 11820 kB
SwapTotal: 4194300 kB
SwapFree: 4175480 kB
Dirty: 18836 kB
Writeback: 0 kB
AnonPages: 109059208 kB
Mapped: 578772 kB
Shmem: 350152 kB
Slab: 3050908 kB
SReclaimable: 2661592 kB
SUnreclaim: 389316 kB
KernelStack: 339784 kB
PageTables: 386140 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 136447852 kB
Committed_AS: 183563508 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 691024 kB
VmallocChunk: 34221513036 kB
HardwareCorrupted: 0 kB
AnonHugePages: 101425152 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 4104 kB

Can this be a cause of the issue?

The JVM ran out of the allowed number of memory mappings (65530), due to which it failed to create new native threads and in turn crashed.

Hi Varun,

Yes, your data indicates potentially very low available physical RAM at the time of the crash, which correlates with the error, e.g. the JVM C-Heap or C++ code being unable to perform native malloc operations.

Please have a look at your OS's top memory consuming processes and determine their footprint. You will likely have to increase the physical RAM capacity and/or look for any other rogue processes consuming too much memory.
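
For example, something like the following would list the top memory consuming processes on a Linux server:

ps aux --sort=-rss | head -15     # top processes by resident memory (RSS)
free -m                           # overall physical RAM and swap utilization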

Thanks.
PH
