February 2012 ~ Java EE Support Patterns

2.29.2012

HashMap.get High CPU – Case Study

This article will describe the complete root cause analysis and solution of a HashMap High CPU problem (infinite looping) affecting a Weblogic 10.0 environment running on the Java HotSpot VM 1.5.

This case study will again demonstrate the importance of mastering Thread Dump analysis skills and CPU correlation techniques such as Solaris prstat.

Environment specifications

-        Java EE server: Oracle Weblogic Portal 10.0
-        Middleware OS: Solaris 10
-        Java VM: Java HotSpot VM 1.5
-        Platform type: Portal application

Monitoring and troubleshooting tools

-        JVM Thread Dump (HotSpot format)
-        Solaris prstat (CPU contributors analysis)

Problem overview

-        Problem type: High CPU observed from our Weblogic production environment

A high CPU problem was observed on our Solaris physical servers hosting a Weblogic Portal 10 environment. Users were also reporting a major slowdown of the portal application.

Gathering and validation of facts

As usual, a Java EE problem investigation requires gathering technical and non-technical facts so we can either derive other facts and/or conclude on the root cause. Before applying a corrective measure, the facts below were verified in order to conclude on the root cause:

·        What is the client impact? HIGH
·        Recent change of the affected platform? No
·        Any recent traffic increase to the affected platform? Yes
·        How does this high CPU manifest itself? A sudden CPU increase was observed and it is not going down, even after the load drops to a near zero level.
·        Did a Weblogic server recycle resolve the problem? Yes, but the problem returns after a few hours or a few days (unpredictable pattern)

-        Conclusion #1: The high CPU problem appears to be intermittent rather than purely correlated with load
-        Conclusion #2: Since the high CPU remains after the load goes down, this typically indicates the presence of either infinitely looping or heavy-processing Threads

Solaris CPU analysis using prstat

Solaris prstat is a powerful OS command that allows you to obtain the CPU utilization per process and, more importantly, per Thread (LWP) within a process. As you can see below from our case study, the CPU utilization was confirmed to go as high as 100% (saturation level).

## PRSTAT (CPU per Java Thread analysis)
prstat -L -p 8223 1 1

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME   CPU PROCESS/LWPID
  8223 bea10    2809M 2592M sleep   59    0  14:52:59 38.6% java/494
  8223 bea10    2809M 2592M sleep   57    0  12:28:05 22.3% java/325
  8223 bea10    2809M 2592M sleep   59    0  11:52:02 28.3% java/412
  8223 bea10    2809M 2592M sleep   59    0   5:50:00  0.3% java/84
  8223 bea10    2809M 2592M sleep   58    0   2:27:20  0.2% java/43
  8223 bea10    2809M 2592M sleep   59    0   1:39:42  0.2% java/41287
  8223 bea10    2809M 2592M sleep   59    0   4:41:44  0.2% java/30503
  8223 bea10    2809M 2592M sleep   59    0   5:58:32  0.2% java/36116
……………………………………………………………………………………

As you can see from the above data, 3 Java Threads were found to be consuming, combined, the bulk of the CPU utilization (38.6% + 22.3% + 28.3% ≈ 89%).

For our root cause analysis, we focused on Thread #494 (decimal format), corresponding to 0x1ee (hexadecimal format).
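If you need to perform this decimal-to-hexadecimal conversion yourself when correlating prstat LWP ids with Thread Dump nid values, a minimal Java sketch such as the hypothetical helper below (my own illustration, not part of the tooling used in this case study) will do:

// Hypothetical helper: convert a prstat LWP id (decimal) to the
// hexadecimal "nid" format found in HotSpot Thread Dumps, and back.
public class LwpIdConverter {
    public static void main(String[] args) {
        int lwpId = 494; // LWP id of the top CPU consumer reported by prstat
        System.out.println("nid=0x" + Integer.toHexString(lwpId)); // prints nid=0x1ee
        System.out.println(Integer.parseInt("1ee", 16));           // prints 494
    }
}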

Thread Dump analysis and PRSTAT correlation

Once the culprit Threads were identified, the next step was to correlate this data with the Thread Dump data (which was captured exactly at the same time as prstat).

A quick search within the generated Thread Dump file revealed the Thread Stack Trace (Weblogic Stuck Thread #125) for nid=0x1ee, as per below.

"[STUCK] ExecuteThread: '125' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=1 tid=0x014c5030 nid=0x1ee runnable [0x536fb000..0x536ffc70]
       at java.util.HashMap.get(HashMap.java:346)
       at org.apache.axis.encoding.TypeMappingImpl.getClassForQName(TypeMappingImpl.java:715)
       at org.apache.axis.encoding.TypeMappingDelegate.getClassForQName(TypeMappingDelegate.java:170)
       at org.apache.axis.encoding.TypeMappingDelegate.getClassForQName(TypeMappingDelegate.java:160)
       at org.apache.axis.encoding.TypeMappingImpl.getDeserializer(TypeMappingImpl.java:454)
       at org.apache.axis.encoding.TypeMappingDelegate.getDeserializer(TypeMappingDelegate.java:108)
       at org.apache.axis.encoding.TypeMappingDelegate.getDeserializer(TypeMappingDelegate.java:102)
       at org.apache.axis.encoding.DeserializationContext.getDeserializer(DeserializationContext.java:457)
       at org.apache.axis.encoding.DeserializationContext.getDeserializerForType(DeserializationContext.java:547)
       at org.apache.axis.encoding.ser.BeanDeserializer.getDeserializer(BeanDeserializer.java:514)
       at org.apache.axis.encoding.ser.BeanDeserializer.onStartChild(BeanDeserializer.java:286)
       at org.apache.axis.encoding.DeserializationContext.startElement(DeserializationContext.java:1035)
       at org.apache.axis.message.SAX2EventRecorder.replay(SAX2EventRecorder.java:165)
       at org.apache.axis.message.MessageElement.publishToHandler(MessageElement.java:1141)
       at org.apache.axis.message.RPCElement.deserialize(RPCElement.java:236)
       at org.apache.axis.message.RPCElement.getParams(RPCElement.java:384)
       at org.apache.axis.client.Call.invoke(Call.java:2467)
       at org.apache.axis.client.Call.invoke(Call.java:2366)
       at org.apache.axis.client.Call.invoke(Call.java:1812)
…………………………………………………………………………………………………………………………………………

Thread Dump analysis – HashMap.get() infinite loop condition!

As you can see from the above Thread Stack Trace, the Thread is currently stuck in an infinite loop over a java.util.HashMap that originates from the Apache Axis TypeMappingImpl Java class.

This finding was quite revealing. The 2 other high CPU Threads also revealed an infinite looping condition within the same Apache Axis HashMap object.

Root cause: non Thread safe HashMap in Apache Axis 1.4

Additional research revealed a known defect affecting Apache Axis 1.4, which is the version that our application was using.

As you may already know, using a non Thread safe / non synchronized HashMap under concurrent Thread access is very dangerous and can easily lead to internal HashMap index corruption and/or infinite looping. This is also a golden rule for any middleware software such as Oracle Weblogic, IBM WAS and Red Hat JBoss, which rely heavily on HashMap data structures for various Java EE and caching services.

This best practice is also applicable to any open source third party API such as Apache Axis.

The most common solution is to use the ConcurrentHashMap data structure, which is designed for this type of concurrent Thread execution context.
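As a minimal illustration (this is a hypothetical cache sketch, not the actual Apache Axis source code), replacing a shared, unsynchronized HashMap with a ConcurrentHashMap removes the risk of internal corruption and infinite get() loops under concurrent access:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.namespace.QName;

// Hypothetical type-mapping cache shared by many worker Threads.
public class TypeMappingCache {

    // UNSAFE under concurrency: concurrent put() calls on a plain HashMap
    // can corrupt its internal bucket chains, after which get() may loop forever.
    // private final Map<QName, Class<?>> cache = new HashMap<QName, Class<?>>();

    // Thread safe alternative: designed for concurrent read / write access
    // without external synchronization.
    private final Map<QName, Class<?>> cache = new ConcurrentHashMap<QName, Class<?>>();

    public Class<?> getClassForQName(QName qname) {
        return cache.get(qname);
    }

    public void registerClass(QName qname, Class<?> clazz) {
        cache.put(qname, clazz);
    }
}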

Solution

Our team applied the proposed patch from Apache (synchronizing the non Thread safe HashMap), which resolved the problem. We are also currently looking at upgrading our application to a newer version of Apache Axis.

Conclusion

I hope this case study has helped you understand how to pinpoint the root cause of high CPU Threads and the importance of proper Thread safe data structures for highly concurrent applications.

Please don’t hesitate to post any comment or question.


2.24.2012

Java Heap Space – JRockit VM

This article will provide you with an overview of the JRockit VM Java Heap vs. the HotSpot VM. It will also provide you with some background on Oracle's future plans regarding JRockit & HotSpot.

Oracle JRockit VM Java Heap: 2 different memory spaces

The JRockit VM memory is split between 2 memory spaces:

-        The Java Heap (YoungGen and OldGen)
-        The Native memory space (Classes pool, C-Heap, Threads...)

Memory Space: Java Heap
- Start-up arguments and tuning: -Xmx (maximum Heap size), -Xms (minimum Heap size). Ex: -Xmx1024m -Xms1024m
- Monitoring strategies: verbose GC, JMX API, JRockit Mission Control tools suite
- Description: The JRockit Java Heap is typically split between the YoungGen (short-lived objects) and the OldGen (long-lived objects).

Memory Space: Native memory space
- Start-up arguments and tuning: Not configurable directly. For a 32-bit VM, the native memory space capacity = 2-4 GB – Java Heap (** process size limit of 2 GB, 3 GB or 4 GB depending on your OS **). For a 64-bit VM, the native memory space capacity = physical server total RAM & virtual memory – Java Heap.
- Monitoring strategies: total process size check on Windows and Linux, pmap command on Solaris & Linux, JRockit JRCMD tool
- Description: The JRockit Native memory space stores the Java Class metadata, Threads and objects such as library files, other JVM and third party native code objects.
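As an illustration only, a hypothetical JRockit start-up line combining the Java Heap arguments above with verbose GC logging could look like the following (the application jar name and the -Xverbose:gc switch are assumptions to verify against your JRockit documentation):

## Hypothetical JRockit start-up arguments (illustrative values only)
java -Xms1024m -Xmx1024m -Xverbose:gc -jar myPortalApp.jar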


Where is the PermGen space?

Similar to the IBM VM, there is no PermGen space for the JRockit VM. The PermGen space is only applicable to the HotSpot VM. The JRockit VM uses the Native Heap for Class metadata related data. Also, as you probably saw from my other article, Oracle / Sun is also starting to remove the PermGen space from the HotSpot VM.

Why is the JRockit VM Java process using more memory than HotSpot VM?

The JRockit VM tends to use more native memory in exchange for better performance. JRockit does not have an interpretation mode (it is compilation only), so due to its additional native memory needs the process size tends to be a couple of hundred MB larger than the equivalent Sun HotSpot JVM process. This should not be a big problem unless you are using a 32-bit JRockit with a large Java Heap requirement; in this scenario, the risk of OutOfMemoryError due to Native Heap depletion is higher for a JRockit VM (for a 32-bit VM, the bigger the Java Heap, the less memory is left for the Native Heap).
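As a quick worked example (hypothetical numbers), consider a 32-bit JRockit VM running on an OS with a 4 GB process size limit:

Native memory left ≈ 4096 MB (process size limit) - 2048 MB (-Xmx2048m Java Heap)
                   ≈ 2048 MB available for Threads, Class metadata, compiled code and other native allocations

Raising -Xmx to 3072m in the same scenario would leave only about 1 GB for the Native Heap, significantly increasing the risk of a native OutOfMemoryError.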

What is Oracle’s plan for JRockit?

Oracle's current JVM strategy is to merge the HotSpot and JRockit product lines into a single JVM project that will include the best features of each VM. This will also simplify JVM tuning since, right now, failure to understand the differences between these 2 VMs can lead to bad tuning recommendations and performance problems.

Please feel free to post any comment or question on the JRockit VM.

2.22.2012

Weblogic Free Consultation

This post is to inform you that I provide free Weblogic consulting services via this Blog's root cause analysis forum & email. I will do my very best to provide you with guidance and share with you my tips on how to resolve common Weblogic problems along with best practices.

My expertise background is as per below:

·        10 years of experience with development, support and hardening of Java EE systems and currently working as a full time employee at CGI Inc. Canada
·        Deep understanding and experience with Java EE containers such as Oracle Weblogic, Red Hat JBoss, Tomcat etc.
·        Deep understanding and experience of JVM tuning (Sun HotSpot, IBM JRE, Oracle JRockit)
·        Specialized in root cause analysis, performance tuning and advanced troubleshooting

Why is my IT consulting service from this Blog free? My goal is to create a solid knowledge base of Java EE support patterns on this Blog and share it with Java EE production support individuals across the world. You can simply reward me by sharing this Blog with your work colleagues and friends. With your approval, I might also use your problem case to create an article.
 
Weblogic consulting – a simple 2-step process

·        Email me @phcharbonneau with your captured performance data or download link (Thread Dump, app & server logs etc. captured during your incident). Please also provide the specifications of your environment (Java EE server vendor & version, Java VM vendor and version etc.) and a high level description of your problem.
·        Wait for my initial reply. I will do my best to provide you with a quick answer in less than 24 hours. An IM session could be scheduled, if required.
  
I'm looking forward to hearing from you!

Regards,
Pierre-Hugues Charbonneau 
http://javaeesupportpatterns.blogspot.com/ 

How to analyze Thread Dump – Part 3: HotSpot VM

This is part 3 of our Thread Dump analysis series, which will provide you with an overview of what a JVM Thread Dump is for the HotSpot VM and the different Threads that you will find in it. Details for the IBM VM Thread Dump format will be provided in part 4.

** UPDATE: Thread Dump analysis tutorial videos now available here.

Please note that you will find the Thread Dump sample used for this article in the root cause analysis forum.

JVM Thread Dump – what is it?

A JVM Thread Dump is a snapshot taken at a given time which provides you with a complete listing of all created Java Threads.

Each individual Java Thread found gives you information such as:

-        Thread name; often used by middleware vendors to identify the Thread Id along with its associated Thread Pool name and state (running, stuck etc.)

-        Thread type & priority ex: daemon prio=3 ** middleware software typically creates its Threads as daemon, meaning they run in the background, providing services to their user e.g. your Java EE application **

-        Java Thread ID ex: tid=0x000000011e52a800 ** This is the JVM internal Thread Id (a native pointer to the VM's internal Thread structure); note that it is not the same value as the one returned by java.lang.Thread.getId() **

-        Native Thread ID ex: nid=0x251c ** Crucial information, as this native Thread Id allows you to correlate, for example, which Threads from an OS perspective are using the most CPU within your JVM **

-        Java Thread State and detail ex: waiting for monitor entry [0xfffffffea5afb000] java.lang.Thread.State: BLOCKED (on object monitor)
** Allows you to quickly learn about the Thread state and its potential current blocking condition **

-        Java Thread Stack Trace; this is by far the most important data that you will find in the Thread Dump. This is also where you will spend most of your analysis time, since the Java Stack Trace provides you with 90% of the information that you need in order to pinpoint the root cause of many problem pattern types, as you will learn later in the training sessions

-        Java Heap breakdown; starting with HotSpot VM 1.6, you will also find at the bottom of the Thread Dump snapshot a breakdown of the HotSpot memory spaces utilization such as your Java Heap (YoungGen, OldGen) & PermGen space. This is quite useful when excessive GC is suspected as a possible root cause so you can do out-of-the-box correlation with Thread data / patterns found

Heap
 PSYoungGen      total 466944K, used 178734K [0xffffffff45c00000, 0xffffffff70800000, 0xffffffff70800000)
  eden space 233472K, 76% used [0xffffffff45c00000,0xffffffff50ab7c50,0xffffffff54000000)
  from space 233472K, 0% used [0xffffffff62400000,0xffffffff62400000,0xffffffff70800000)
  to   space 233472K, 0% used [0xffffffff54000000,0xffffffff54000000,0xffffffff62400000)
 PSOldGen        total 1400832K, used 1400831K [0xfffffffef0400000, 0xffffffff45c00000, 0xffffffff45c00000)
  object space 1400832K, 99% used [0xfffffffef0400000,0xffffffff45bfffb8,0xffffffff45c00000)
 PSPermGen       total 262144K, used 248475K [0xfffffffed0400000, 0xfffffffee0400000, 0xfffffffef0400000)
  object space 262144K, 94% used [0xfffffffed0400000,0xfffffffedf6a6f08,0xfffffffee0400000)
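Beyond the raw Thread Dump text, similar per-Thread data can also be captured programmatically via the standard java.lang.management JMX API (JDK 1.6+). The sketch below is my own minimal illustration, not part of the original article; it prints each live Thread's name, id, state and stack trace from within the running JVM:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Minimal sketch: print the name, state and stack trace of all live Threads
// of the current JVM, similar to the per-Thread data found in a Thread Dump.
public class ThreadDumpSketch {
    public static void main(String[] args) {
        ThreadMXBean threadMXBean = ManagementFactory.getThreadMXBean();
        ThreadInfo[] threadInfos = threadMXBean.dumpAllThreads(false, false);
        for (ThreadInfo info : threadInfos) {
            System.out.println("\"" + info.getThreadName() + "\" id=" + info.getThreadId()
                    + " state=" + info.getThreadState());
            for (StackTraceElement element : info.getStackTrace()) {
                System.out.println("\tat " + element);
            }
            System.out.println();
        }
    }
}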

Thread Dump breakdown overview

In order for you to better understand, find below a diagram showing you a visual breakdown of a HotSpot VM Thread Dump and its common Thread Pools found:


As you can see, there are several pieces of information that you can find in a HotSpot VM Thread Dump. Some of these pieces will be more important than others depending on your problem pattern (problem patterns will be simulated and explained in future articles).

For now, find below a detailed explanation for each Thread Dump section as per our sample HotSpot Thread Dump:

# Full thread dump identifier
This is basically the unique keyword that you will find in your middleware / standalone Java standard output log once you generate a Thread Dump (ex: via kill -3 <PID> on UNIX). It marks the beginning of the Thread Dump snapshot data.

Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.0-b11 mixed mode):

# Java EE middleware, third party & custom application Threads
This portion is the core of the Thread Dump and where you will typically spend most of your analysis time. The number of Threads found will depend on the middleware software that you use, on third party libraries (which might have their own Threads) and on your application (if it creates any custom Threads, which is generally not a best practice).

In our sample Thread Dump, Weblogic is the middleware used. Starting with Weblogic 9.2, a self-tuning Thread Pool is used with the unique identifier 'weblogic.kernel.Default (self-tuning)'.

"[STANDBY] ExecuteThread: '414' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=3 tid=0x000000010916a800 nid=0x2613 in Object.wait() [0xfffffffe9edff000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xffffffff27d44de0> (a weblogic.work.ExecuteThread)
        at java.lang.Object.wait(Object.java:485)
        at weblogic.work.ExecuteThread.waitForRequest(ExecuteThread.java:160)
        - locked <0xffffffff27d44de0> (a weblogic.work.ExecuteThread)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:181)

# HotSpot VM Thread
This is an internal Thread managed by the HotSpot VM in order to perform internal native operations. Typically you should not worry about this one unless you see high CPU (via Thread Dump & prstat / native Thread id correlation).

"VM Periodic Task Thread" prio=3 tid=0x0000000101238800 nid=0x19 waiting on condition


# HotSpot GC Thread
When using the HotSpot parallel GC (quite common these days on multi-core hardware), the HotSpot VM creates, by default or as per your JVM tuning, a certain number of GC Threads. These GC Threads allow the VM to perform its periodic GC cleanups in a parallel manner, leading to an overall reduction of the GC time, at the expense of increased CPU utilization.

"GC task thread#0 (ParallelGC)" prio=3 tid=0x0000000100120000 nid=0x3 runnable
"GC task thread#1 (ParallelGC)" prio=3 tid=0x0000000100131000 nid=0x4 runnable
………………………………………………………………………………………………………………………………………………………………

This is crucial data as well, since when facing GC related problems such as excessive GC, memory leaks etc., you will be able to correlate any high CPU observed at the OS / Java process level with these Threads using their native id value (e.g. nid=0x3). You will learn how to identify and confirm this problem in future articles.
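If needed, the number of parallel GC Threads can also be set explicitly via JVM tuning. The line below is an illustration only (the application jar name is hypothetical); -XX:+UseParallelGC and -XX:ParallelGCThreads are standard HotSpot flags:

## Hypothetical HotSpot start-up arguments controlling the parallel GC Thread count
java -Xms1024m -Xmx1024m -XX:+UseParallelGC -XX:ParallelGCThreads=4 -jar myApp.jar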

# JNI global references count
JNI (Java Native Interface) global references are basically Object references from native code to a Java object managed by the Java garbage collector. Their role is to prevent the collection of an object that is still in use by native code but technically has no "live" references in the Java code.

It is also important to keep an eye on JNI references in order to detect JNI related leaks. This can happen if your program uses JNI directly or uses third party tools, such as monitoring tools, which are prone to native memory leaks.

JNI global references: 1925

# Java Heap utilization view
This data was added back in JDK 1.6 and provides you with a short and fast view of your HotSpot Heap. I find it quite useful when troubleshooting GC related problems along with high CPU, since you get both the Thread Dump & Java Heap data in a single snapshot, allowing you to determine (or rule out) any pressure point in a particular Java Heap memory space along with the Thread processing being done at that time. As you can see in our sample Thread Dump, the Java Heap OldGen is maxed out!

Heap
 PSYoungGen      total 466944K, used 178734K [0xffffffff45c00000, 0xffffffff70800000, 0xffffffff70800000)
  eden space 233472K, 76% used [0xffffffff45c00000,0xffffffff50ab7c50,0xffffffff54000000)
  from space 233472K, 0% used [0xffffffff62400000,0xffffffff62400000,0xffffffff70800000)
  to   space 233472K, 0% used [0xffffffff54000000,0xffffffff54000000,0xffffffff62400000)
 PSOldGen        total 1400832K, used 1400831K [0xfffffffef0400000, 0xffffffff45c00000, 0xffffffff45c00000)
  object space 1400832K, 99% used [0xfffffffef0400000,0xffffffff45bfffb8,0xffffffff45c00000)
 PSPermGen       total 262144K, used 248475K [0xfffffffed0400000, 0xfffffffee0400000, 0xfffffffef0400000)
  object space 262144K, 94% used [0xfffffffed0400000,0xfffffffedf6a6f08,0xfffffffee0400000)
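For example, reading the PSOldGen line above:

used / total = 1400831K / 1400832K ≈ 99.9999% --> only about 1 KB of OldGen space left

The OldGen is effectively full, which typically translates into excessive (full) GC activity and is a strong candidate explanation for any high CPU observed at the same time.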

I hope this article has helped you understand the basic structure of a HotSpot VM Thread Dump.

Thread Dump Analysis Part 4 is now available and will provide you with a high level overview and breakdown of the IBM VM 1.6 Thread Dump format.

2.16.2012

Java Heap Space – IBM VM

This short article will provide you with a high level overview of the different Java memory spaces for the IBM VM. 

This understanding is quite important given the implementation & naming convention differences between HotSpot & IBM VM.

IBM VM: 2 different memory spaces

The IBM VM memory is split between 2 memory spaces:

-        The Java Heap (nursery and tenured spaces)
-        The Native Heap (C-Heap)

Memory Space: Java Heap
- Start-up arguments and tuning: -Xmx (maximum Heap size), -Xms (minimum Heap size). Ex: -Xmx1024m -Xms1024m. GC policy ex: -Xgcpolicy:gencon (enable the gencon GC policy)
- Monitoring strategies: verbose GC, JMX API, IBM monitoring tools
- Description: The IBM Java Heap is typically split between the nursery and tenured spaces (YoungGen, OldGen). The gencon GC policy (a combination of concurrent and generational GC) is typically used for Java EE platforms in order to minimize the GC pause time.

Memory Space: Native Heap (C-Heap)
- Start-up arguments and tuning: Not configurable directly. For a 32-bit VM, the C-Heap capacity = 4 GB – Java Heap. For a 64-bit VM, the C-Heap capacity = physical server total RAM & virtual memory – Java Heap.
- Monitoring strategies: svmon command
- Description: The C-Heap stores Class metadata and native objects such as library files, other JVM and third party native code objects.
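As an illustration only, a hypothetical IBM VM start-up line combining these arguments with standard verbose GC logging could look like the following (the application jar name is an assumption):

## Hypothetical IBM VM start-up arguments (illustrative values only)
java -Xms1024m -Xmx1024m -Xgcpolicy:gencon -verbose:gc -jar myApp.jar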

Where is the PermGen space?

This is by far the most typical question I get from Java EE support individuals supporting an IBM VM environment for the first time. The answer: there is no PermGen space for the IBM VM. The PermGen space is only applicable to the HotSpot VM. The IBM VM uses the Native Heap for Class metadata related data. Also, as you probably saw from my other article, Oracle / Sun is also starting to remove the PermGen space from the HotSpot VM.

The next article will provide you with a tutorial on how to enable and analyze verbose GC for an IBM VM. Please feel free to post any comment or question on the IBM VM.