Reason: weblogic.security.SecurityInitializationException: Authentication for user <user> denied
weblogic.security.SecurityInitializationException: Authentication for user <user> denied
at weblogic.security.service.CommonSecurityServiceManagerDelegateImpl.doBootAuthorization(Unknown Source)
at weblogic.security.service.CommonSecurityServiceManagerDelegateImpl.initialize(Unknown Source)
at weblogic.security.service.SecurityServiceManager.initialize(Unknown Source)
at weblogic.security.SecurityService.start(SecurityService.java:141)
at weblogic.t3.srvr.SubsystemRequest.run(SubsystemRequest.java:64)
Truncated. see log file for complete stacktrace
Possible root causes and solutions
Root cause #1
The Weblogic boot.properties file is corrupted or contains invalid principal and credentials
Solution >> boot.properties reset
·Backup and clear the cache and data directories under <WL Domain>/servers/<Admin & Managed server>
·Recreate boot.properties (put back your plain text username and password) under <WL Domain>/servers/<Admin & Managed server>/security directory and restart the affected server(s)
Root cause #2
The Weblogic boot.properties file is valid but the security realm is corrupted or in an invalid state
Solution >> Weblogic Admin username and password reset
·Backup your Weblogic server domain
·Rename or delete <WL Domain>/security/DefaultAuthenticatorInit.ldift
This post is to inform you that I currently offer free Thread Dump analysis via email or real-time case study from my Blog. I currently have 10 years of experience with Java EE and Thread Dump analysis from different JVM vendors (Sun, IBM, JRockit etc.).
A JVM Thread Dump provides crucial information on your Java EE server (Weblogic, JBoss, WAS etc.) and application Thread processing and health. Thread dump analysis is one of the most important skills to acquire for any Java EE production support person.
Why is my consulting service free? My goal is to simply create a solid knowledge base of Thread Dump problem patterns on my Blog and share with Java EE production support individuals.
Thread Dump analysis – a simple 3 steps process
1)Email me @phcharbonneau with your Thread Dump snapshot data (captured during your incident). Please also provide the specifications of your environment (Java EE server vendor & version, Java VM vendor and version etc.) and a high level description of your problem
2)Wait for my initial reply. I will do my best to provide you with a quick answer in less than 24 hours along with possible root causes
3)With your approval, I may create a Thread Dump problem pattern case study on my Blog to share your Java EE problem along with its resolution to my other readers with no detail on your application name, code or client information
Thank you and looking forward to hear from your problems!
This post is to inform you that I currently offer free Thread Dump analysis via email or real-time case study from my Blog. I currently have 10 years of experience with Java EE and Thread Dump analysis from different JVM vendors (Sun, IBM, JRockit etc.).
A JVM Thread Dump provides crucial information on your Java EE server (Weblogic, JBoss, WAS etc.) and application Thread processing and health. Thread dump analysis is one of the most important skills to acquire for any Java EE production support person.
Why is my consulting service free? My goal is to simply create a solid knowledge base of Thread Dump problem patterns on my Blog and share with Java EE production support individuals.
Thread Dump analysis – a simple 3 steps process
1)Email me @phcharbonneau with your Thread Dump snapshot data (captured during your incident). Please also provide the specifications of your environment (Java EE server vendor & version, Java VM vendor and version etc.) and a high level description of your problem
2)Wait for my initial reply. I will do my best to provide you with a quick answer in less than 24 hours along with possible root causes
3)With your approval, I may create a Thread Dump problem pattern case study on my Blog to share your Java EE problem along with its resolution to my other readers with no detail on your application name, code or client information
Thank you and looking forward to hear from your problems!
This article will provide you some background on the Java HotSpot PermGen space and a 4 steps process on how you can monitor and tune your PermGen space for your Weblogic Java EE server.
Java HotSpot PermGen overview
The HotSpot PermGen (Permanent Generation) is a memory space specific to the Sun Java HotSpot VM and also applicable for any other Java EE server using the HotSpot VM such as Red Hat JBoss or IBM WAS. It is a special memory space for Class descriptors and other internal JVM objects.
If you are facing any PermGen related problem such as OutOfMemoryError, please refer to my Blog article OutOfMemoryError PermGen patterns for description and troubleshooting of the most common Weblogic PermGen issues.
PermGen tuning step #1 – Identify your JVM vendor
First, please make sure that PermGen space is actually applicable for your Java EE environment. JVM’s such as IBM SDK and Oracle JRockit do not have an actual PermGen space. For those VM’s, the memory is allocated either to the Java Heap or the native C Heap.
** PermGen space is only applicable for the Sun Java HotSpot VM 1.4, 1.5 & 1.6. There are actually some requests for enhancements for JDK 1.7 that may actually eliminate the PermGen and use the native C memory instead for the VM's representation of the Java program (Class descriptors / code). **
PermGen tuning step #2 – Review your current PermGen space settings
The minimum and maximum PermGen space configuration is achieved via the following parameters:
## Example: PermGen space with a minimum size of 256 MB
-XX:PermSize=256m
## Example: PermGen space with a maximum size of 256 MB
-XX:MaxPermSize=512m
The default Weblogic PermGen space settings are normally found from the commEnv.sh / commEnv.cmd scripts and located under:
<WL11g HOME>/wlserver_10.3/common/bin/
When using Weblogic Nodemanager to start and stop your managed servers (used by most Weblogic production systems), PermGen settings can be updated via the Weblogic console under >>> Home / Managed server 1...n / Server Start / Arguments field:
PermGen tuning step #3 – Review your current PermGen runtime footprint
PermGen space capacity and footprint can be analyzed by turning on verbose GC and monitoring the Full GC collections on a daily basis. This will allow you to determine your current PermGen footprint and identify any potential issue such as PermGen memory leak.
Verbose GC for the HotSpot VM can be enabled by simply adding the parameters below to your start-up VM parameters:
## Example: Enable verbose GC and print the output to a custom log
You can also use the free open source tool GC Viewer to graphically review your Java Heap memory footprint from the GC output data.
A bigger PermGen space requirement is to be expected for applications using a significant amount of code and class-loader trees e.g. many EAR files and / or applications using frameworks that depend heavily on dynamic class-loading / Refection API.
PermGen tuning step #4 – Tune your PermGen space, if required
Increasing your maximum capacity of the PermGen space is often required but you first need to ensure that you are not facing a memory leak such as class-loader leak. Please review your verbose GC output very closely and ensure that it is not growing constantly on a daily basis. You may need several days to confirm a PermGen memory leak since applications such Weblogic portal using dynamic JSP reload may increase the PermGen utilization over time (until all your JSP’s are compiled; loaded to their class-loader).
In order to increase your PermGen space to let’s say 512 MB, simply edit the Weblogic commEnv.sh or commEnv.cmd script and change the default from 128m to 512m or add / modify the MaxPermSize to 512m (-XX:MaxPermSize=512m) via the Server Start arguments (as per step #2 Weblogic console snapshot) if using Nodemanager.
** A Weblogic managed server(s) restart will be required in order to activate your new PermGen space settings. **
This case study describes the complete root cause analysis and resolution of an Oracle 10gR2 database problem (ORA-01115) affecting a Weblogic Integration 9.2 Java EE application.
Environment specifications
-Java EE server: Oracle Weblogic Integration 9.2 MP2
-Database server: Oracle 10gR2
-Middleware OS: Sun Solaris 5.10
-Database server OS: IBM AIX 5.3 TL5
-JDK: Sun Java HotSpot(TM) Server VM Version 1.5.0_10-b03
An IO Exception error was observed from the Weblogic Integration server logs. This Exception was thrown from the remote Oracle database when trying to insert data within WLI internal tables.
Gathering and validation of facts
As usual, a Java EE problem investigation requires gathering of technical and non technical facts so we can either derived other facts and/or conclude on the root cause. Before applying a corrective measure, the facts below were verified in order to conclude on the root cause:
·What is the client impact? HIGH (full outage of our application)
·Recent change of the affected platform? No
·Any recent traffic increase to the affected platform? No
·Since how long this problem has been observed? This error suddenly appeared for the first time early in the morning
·What is the health of the Oracle database server? Our Oracle DBA did confirm a problem regarding access to one of our Oracle data file (/oracle/db/oracle_data01.dbf')
·Did a restart of the Weblogic Integration server resolve the problem? No
·Conclusion #1: The problem seems to be isolated on the Oracle database server side and related to a possible IO problem accessing one of the Oracle internal data files.
Weblogic log file analysis: ORA-01115!
A first analysis of the problem was done by reviewing the Weblogic Integration managed server log errors.
ORA-01110: data file 30: '/oracle/db/oracle_data01.dbf'
ORA-27091: unable to queue I/O
ORA-27072: File I/O error
IBM AIX RISC System/6000 Error: 110: Media surface error
Additional information: 7
Additional information: 255461
Additional information: -1
at weblogic.jdbc.base.BaseBlobOutputStream.write(Unknown Source)
at weblogic.jdbc.base.BaseBlobOutputStream.write(Unknown Source)
at com.bea.wli.knex.runtime.core.bean.BMPContainerBean$OracleTableAccess.doStoreByInsert(BMPContainerBean.java:860)
at com.bea.wli.knex.runtime.core.bean.BMPContainerBean.doInsert(BMPContainerBean.java:1835)
at com.bea.wli.knex.runtime.core.bean.BMPContainerBean.ejbStore(BMPContainerBean.java:1787)
at com.bea.wli.bpm.runtime.ProcessContainerBean.ejbStore(ProcessContainerBean.java:204)
at com.bea.wli.bpm.runtime.ProcessContainerBean.ejbStore(ProcessContainerBean.java:191)
at com.bea.wlwgen.PersistentContainer_sq78z1_Impl.ejbStore(PersistentContainer_sq78z1_Impl.java:191)
at weblogic.ejb.container.manager.ExclusiveEntityManager.beforeCompletion(ExclusiveEntityManager.java:609)
at weblogic.ejb.container.manager.ExclusiveEntityManager.beforeCompletion(ExclusiveEntityManager.java:580)
at weblogic.ejb.container.internal.TxManager$TxListener.beforeCompletion(TxManager.java:1028)
at weblogic.transaction.internal.ServerSCInfo$CallBeforeCompletionsAction.run(ServerSCInfo.java:1529)
at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:363)
at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:147)
at weblogic.transaction.internal.ServerSCInfo.doBeforeCompletion(ServerSCInfo.java:1161)
at weblogic.transaction.internal.ServerSCInfo.callBeforeCompletions(ServerSCInfo.java:1143)
at weblogic.transaction.internal.ServerSCInfo.startPrePrepareAndChain(ServerSCInfo.java:115)
at weblogic.transaction.internal.ServerTransactionImpl.localPrePrepareAndChain(ServerTransactionImpl.java:1288)
at weblogic.transaction.internal.ServerTransactionImpl.globalPrePrepare(ServerTransactionImpl.java:2100)
at weblogic.transaction.internal.ServerTransactionImpl.internalCommit(ServerTransactionImpl.java:259)
at weblogic.transaction.internal.ServerTransactionImpl.commit(ServerTransactionImpl.java:227)
at weblogic.ejb.container.internal.MDListener.execute(MDListener.java:463)
at weblogic.ejb.container.internal.MDListener.transactionalOnMessage(MDListener.java:335)
at weblogic.ejb.container.internal.MDListener.onMessage(MDListener.java:291)
at weblogic.jms.client.JMSSession.onMessage(JMSSession.java:4072)
at weblogic.jms.client.JMSSession.execute(JMSSession.java:3962)
at weblogic.jms.client.JMSSession$UseForRunnable.run(JMSSession.java:4490)
at weblogic.work.ServerWorkManagerImpl$WorkAdapterImpl.run(ServerWorkManagerImpl.java:518)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:209)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:181)
As you can see, the ORA-01115 error was thrown (along with ORA-01110 ORA-27091 ORA-27072) during the insert operation of WLI data. An AIX related error was also thrown (Media surface error).
-Conclusion #2: The problem seems to be related to a physical inaccessibility (OS and / or disk) of this critical Oracle data file. A logical file corruption problem was also kept in mind as a possible root cause.
OS problem replication: The media surface is damaged!
After the initial log review and problem isolation, an AIX system admin was engaged to further troubleshoot this problem. A simple copy of the affected Oracle data file was attempted:
:> cp /oracle/db/oracle_data01.dbf .
cp: /oracle/db/oracle_data01.dbf: The media surface is damaged.
This simple problem replication step did confirm that the AIX OS also had lost control of the file.
The following AIX command was also executed in order to review the AIX OS error report:
errpt –a
…………………………………………………………………………………..
Description
DISK OPERATION ERROR
Probable Causes
MEDIA
User Causes
MEDIA DEFECTIVE
Recommended Actions
FOR REMOVABLE MEDIA, CHANGE MEDIA AND RETRY
PERFORM PROBLEM DETERMINATION PROCEDURES
Failure Causes
MEDIA
DISK DRIVE
Recommended Actions
FOR REMOVABLE MEDIA, CHANGE MEDIA AND RETRY
PERFORM PROBLEM DETERMINATION PROCEDURES
…………………………………………………………………………………..
-Conclusion #3: At this point, it was determined that the problem was very likely related to a physical problem with the SAN / RAID cluster. A technician was dispatched to perform further analysis of our RAID (Redundant Array of Independent Disks) environment.
Root cause: RAID disk failure!
Health check of the RAID environment did confirm that one of the 7 available disks failed earlier in the morning; just a few minutes before the first occurrence of the ORA-01115 error in the Weblogic logs.
Unfortunately, the disk failure triggered corruption of the RAID parity; affecting one of our Oracle data file. New files could be written to the RAID (6 disks left) but some files were not available due to parity corruption.
A disk failure has negative impact on parity. A disk can fail for various reasons such as the disk may be overused and due to excess variation in the power supply, it may result in the failure of the disk. Failure to replace it immediately can cause adverse affects on the system performance and may even damage the disks which are used in backup. The RAID parity provides protection against drive failures but if it encounters a parity error, it will start writing a bad data on the drive.
In our case, the technician was able to resolve the parity corruption without having to restore any data from the backup tapes. Resolution of the parity automatically allowed Oracle to gain back control of the affected date file; without any Oracle listener or database restart.
A complete Weblogic restart was performed as the final resolution step in order to re-initialize the affected WLI JDBC Connection pools.
Conclusion and recommendations
-When facing ORA-01115 with Oracle, please ensure to do a fast and complete root cause analysis
-Avoid taking unnecessary and non efficient resolution steps such as an early Weblogic restart or Oracle database restart
P-H is a Java EE consultant currently working for CGI Inc. in Canada. He has 10 years+ of experience developing and troubleshooting Java EE systems. Email: phcharbonneau@hotmail.com.