Java EE Support Patterns

4.11.2018

Oracle WebLogic Native IO & Java Muxers

This article provides the complete root cause analysis and resolution of a Java performance problem that affected a legacy Oracle WebLogic 11g production environment and involved Socket Muxers.

This performance problem was identified during a workload migration and performance assessment of a WLS 11g environment targeting the Red Hat OpenShift container and PaaS platform.

While “Muxers” is an old concept, this post will demonstrate the importance of adequate knowledge of native IO configuration and runtime behavior within a Java EE container such as Oracle WebLogic.

Environment Specifications

  • Workload location: On-Premises Data Center
  • Business domain: Telecommunications
  • NLB & Web server: F5 NLB & Apache
  • Java EE container: Oracle WebLogic 11g
  • JDK/JRE: Oracle HotSpot JVM 1.7 64-bit
  • OS: Solaris 11

APM & Troubleshooting Tools

  • Cisco AppDynamics
  • WebLogic 11g Admin console & logs
  • JVM Thread Dump analysis

Problem & Observations

The problem was first reported by our production Ops team following recent end-user complaints about performance degradation under peak load. An initial root cause analysis revealed the following facts and observations:

  • Response time spikes were observed on a regular basis, especially under peak load.
  • An analysis of AppDynamics data exposed an unexpected delay for inbound traffic via HTTPS.
  • Processing time of the application web requests (after the body/payload was received) was found to be optimal at < 1 sec.
  • An initial review of the WebLogic Threads and JVM Thread Dump did not expose any bottleneck or contention within the application code.
  • Network packet analysis did not expose any network latency and isolated the response time delay to the WebLogic server tier.

JVM Thread Dump analysis – second pass

A second analysis iteration of the captured JVM Thread Dump data revealed that “Java Muxers” threads were being used for all WebLogic network I/O.

In general, enabling the Java Muxers is not recommended since they offer poor scalability and suboptimal performance compared with the native Muxers or the more recent NIO Muxers. Java Muxers block on reads until there is data to be read from a socket and do not scale well when dealing with a large influx of inbound web requests.
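
To illustrate the general idea behind non-blocking multiplexing, here is a minimal sketch based on the standard java.nio API. It is not WebLogic's muxer implementation; the class and method names (SelectorMuxSketch, serve, demo) are hypothetical. A single thread waits on all sockets through one Selector instead of dedicating a blocked thread to each socket read.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

final class SelectorMuxSketch {

    /** One thread waits on all sockets; returns the total bytes read once all clients have closed. */
    static int serve(ServerSocketChannel server, int expectedClients) throws IOException {
        int totalBytes = 0;
        int closedClients = 0;
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        try (Selector selector = Selector.open()) {
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            while (closedClients < expectedClients) {
                if (selector.select(5_000) == 0) {
                    break; // nothing became ready within 5 s: give up
                }
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        int n = client.read(buffer);
                        if (n < 0) {        // peer closed the connection
                            client.close(); // closing the channel also cancels its key
                            closedClients++;
                        } else {
                            totalBytes += n;
                        }
                    }
                }
            }
        }
        return totalBytes;
    }

    /** Runs the server against local connections that each send the given message. */
    static int demo(final int clients, final String message) {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
            final InetSocketAddress address = (InetSocketAddress) server.getLocalAddress();
            Thread sender = new Thread(() -> {
                for (int i = 0; i < clients; i++) {
                    try (SocketChannel c = SocketChannel.open(address)) {
                        c.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }
            });
            sender.start();
            int total = serve(server, clients);
            sender.join();
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes read by a single thread: " + demo(3, "hello"));
    }
}
```

The number of threads stays constant as the number of connections grows, which is the property the blocking, thread-per-read model lacks.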

The following Thread stacktrace can be found from the thread dump when using NIO (Oracle WebLogic 12.2.x).




Following the above finding, a review of the WebLogic 11g configuration was performed but did not reveal any problem (native IO was enabled in the configuration). The next phase of the RCA was to determine why WebLogic was enabling the Java Muxers on start-up.

Root Cause and Solution

The root cause was finally identified following a review of the WebLogic start-up logs.

<[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1248787500274> <BEA-000447> <Native IO Disabled. Using Java IO.>

As shown above, Native IO was disabled on start-up due to a problem with the “Performance Pack”, which includes the Native Muxers. WebLogic fell back on Java IO, which still allowed the server to start properly.
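
Because the server still starts normally, this fallback is easy to miss. A check for it could be automated along the following lines. This is a minimal sketch that only relies on the message ID visible in the log line above; the class name, method name and sample lines are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class NativeIoLogCheck {

    // Message ID taken from the start-up log line shown above.
    static final String NATIVE_IO_DISABLED_ID = "BEA-000447";

    /** Returns every log line that carries the "Native IO Disabled" message ID. */
    static List<String> findNativeIoDisabled(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            if (line.contains("<" + NATIVE_IO_DISABLED_ID + ">")) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Illustrative input; in practice the lines would be read from the server log file.
        List<String> sample = Arrays.asList(
            "<WLS Kernel> <> <> <1248787500274> <BEA-000447> <Native IO Disabled. Using Java IO.>",
            "<an unrelated log line>");
        System.out.println(findNativeIoDisabled(sample).size() + " matching line(s)");
    }
}
```

Running such a check after each start-up, for example as part of a deployment health check, would flag the condition before it shows up as a response time problem under load.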

Furthermore, it was identified that the JVM 1.7 start-up parameters did not include the “-d64” flag. Without it, WebLogic could not load the proper 64-bit Performance Pack library, so Native IO was disabled and the server fell back on the Java Muxers.

Following the implementation of the solution (restoration of the Native Muxers) in the production environment, we observed a significant improvement in application performance and scalability.
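
One way to confirm which data model a running JVM actually uses is to print a few system properties. The sketch below assumes a HotSpot JVM, since "sun.arch.data.model" is a HotSpot-specific property that other JVMs may not define; the class name is hypothetical.

```java
final class JvmDataModelCheck {

    /** Classifies the value of the HotSpot-specific "sun.arch.data.model" property. */
    static String describe(String dataModel) {
        if ("64".equals(dataModel)) {
            return "64-bit";
        }
        if ("32".equals(dataModel)) {
            return "32-bit";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // May be null on JVMs that do not define the property.
        String dataModel = System.getProperty("sun.arch.data.model");
        System.out.println("JVM data model: " + describe(dataModel));
        System.out.println("os.arch: " + System.getProperty("os.arch"));
        // Native libraries are resolved against this path.
        System.out.println("java.library.path: " + System.getProperty("java.library.path"));
    }
}
```

Running it with the same start-up parameters as the server shows whether the JVM data model matches the native libraries it is expected to load.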

10.17.2016

Oracle Open World and Java One 2016 summary later this week

This post is to inform you that later this week I will publish a summary of the highlights of my on-site visit to San Francisco last September, along with areas of focus for 2017.

I will also publish a few YouTube videos later this month to demonstrate certain trending technologies and some of the latest Java & JVM troubleshooting techniques.

Please stay tuned.

Thank you.
P-H

8.08.2016

Java 8 Performance Optimization - DZone Refcard Update

I am happy to inform you that I recently published an update to the existing refcard on Java Performance Optimization, which is now available from DZone. The updated material better reflects the Java 8 features and provides a dedicated section and guidelines on the JVM Metaspace.

I recommend that you download your FREE copy today.


For now, find below a small snippet:

“By default, the Metaspace memory space is unbounded and will use the available process and/or OS native memory available for dynamic expansions. The memory space is divided into chunks and allocated by the JVM via mmap. We recommend keeping the default, dynamic resize mode as a starting point for simpler sizing combined with close monitoring of your application metadata footprint over time for optimal capacity planning…”
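
As a starting point for the monitoring mentioned in the snippet, the Metaspace footprint can be read from inside the JVM through the standard java.lang.management API. This is a minimal sketch, assuming a HotSpot JVM (Java 8+) that reports a memory pool named "Metaspace"; the class and helper names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

final class MetaspaceMonitor {

    /** Returns the Metaspace pool, or null on JVMs that do not report one. */
    static MemoryPoolMXBean findMetaspacePool() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                return pool;
            }
        }
        return null;
    }

    /** Formats byte counts in KB; a negative max means no limit is defined. */
    static String format(long used, long committed, long max) {
        String maxText = (max < 0) ? "unbounded" : (max / 1024) + " KB";
        return "used=" + (used / 1024) + " KB, committed=" + (committed / 1024)
            + " KB, max=" + maxText;
    }

    public static void main(String[] args) {
        MemoryPoolMXBean pool = findMetaspacePool();
        MemoryUsage usage = (pool == null) ? null : pool.getUsage();
        if (usage == null) {
            System.out.println("No Metaspace usage reported by this JVM");
            return;
        }
        System.out.println(format(usage.getUsed(), usage.getCommitted(), usage.getMax()));
    }
}
```

Sampling these values periodically gives the metadata footprint trend over time that capacity planning needs.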

Thank you.
P-H

4.26.2016

DZone's Guide to Building and Deploying Applications on the Cloud

This post is to inform you that DZone has just released a great guide regarding Building and Deploying Applications on the Cloud. I recommend that you download your copy today!

Here is a snippet:

".....
Overhyped or not, the cloud has deeply changed how we build and run software—and not just because IaaSes make VMs trivial to spin up and PaaSes make environments easy to set up. As a user you know what’s changed, and you understand the concept “as a service” (well, ever since you started running *nix); and, thank goodness, you don’t really have to worry about the physical details that make those services run. 
..." 

9.08.2015

Java 8 - CPU Flame Graph

Brendan Gregg and Martin Spier from Netflix recently shared a very interesting article titled Java in Flames, describing their latest experimentation with a new JDK option (-XX:+PreserveFramePointer) that allowed them to create a full CPU consumers view as a "flame" graph. This article is an advanced read but extremely interesting for Java Performance enthusiasts.

This option is now included in the recently released JDK 8u60.
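
Before profiling, it can be useful to verify that a given JVM was actually started with this option. The sketch below checks the arguments reported by the standard RuntimeMXBean; the class name is hypothetical, and the check only sees options that the JVM reports as input arguments.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class FramePointerFlagCheck {

    static final String FLAG = "-XX:+PreserveFramePointer";

    /** True if the flag appears verbatim among the given JVM arguments. */
    static boolean isEnabled(List<String> jvmArguments) {
        return jvmArguments.contains(FLAG);
    }

    public static void main(String[] args) {
        List<String> jvmArguments = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println(FLAG + " present: " + isEnabled(jvmArguments));
    }
}
```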

We will create our own experiment shortly and post a video exploring this CPU profiling capability in real time vs. existing CPU profiling tools & techniques. As mentioned in the article, a clear added value would be to automate and visualize the CPU utilization delta (deviation from an established baseline) between releases or code changes.

This approach would allow fast detection of CPU bottlenecks or improvements following software changes, improving the overall performance and scalability of the production environment over the long run, as well as keeping the cloud or on-premise hardware cost under control.

Here is a small snippet from the original article:

"Java mixed-mode flame graphs provide a complete visualization of CPU usage and have just been made possible by a new JDK option: -XX:+PreserveFramePointer. We've been developing these at Netflix for everyday Java performance analysis as they can identify all CPU consumers and issues, including those that are hidden from other profilers..."

7.21.2015

JVM Buzzwords Java developers should understand

This article will share with you a few JVM "buzzwords" that are important for Java developers to understand and remember before performing any JVM performance and garbage collection tuning. A few tips are also provided including some high level performance tuning best practices at the end of the article. Further recommendations regarding the Oracle HotSpot concurrent GC collectors such as CMS and G1 will be explored in future articles.

Before reading any further, I recommend that you first get familiar with the JVM verbose GC logs. Acquiring this JVM data analysis skill is essential, especially when combined with more advanced APM technologies.

JVM Buzzwords

  • Allocation Rate: Java objects allocated to the YoungGen space, a.k.a. “short-lived” objects.
  • Promotion Rate: Java objects promoted from the YoungGen to the OldGen space.
  • LIVE Data: Java objects sitting in the OldGen space, a.k.a. “long-lived” objects.
  • Stop-the-world Collection: garbage collections, such as a Full GC, that cause a temporary suspension of your application threads until completed.


First Things First: JVM GC Logs
  • Provides out-of-the-box fine-grained details on the Java heap and GC activity.
  • Use tools such as GCMV (Garbage Collection and Memory Visualizer) to assess your JVM pause time and memory allocation rate vs. sizing the generations by hand.



Allocation & Promotion Rates
  • It is important to keep track of your application allocation and promotion rates for optimal GC performance.
  • Keep the GC adaptive size policy (HotSpot flag -XX:+UseAdaptiveSizePolicy) active, as part of the JVM ergonomics. Tune by hand only if required.
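
Both rates can be estimated from the occupancy figures found in the verbose GC logs. The following is a minimal sketch of that arithmetic; the class name, method names and sample figures are illustrative assumptions and do not come from a real log.

```java
final class GcRates {

    /** MB/s allocated in the YoungGen between two consecutive young collections. */
    static double allocationRate(double youngAfterPrevGcMb, double youngBeforeNextGcMb,
                                 double secondsBetween) {
        return (youngBeforeNextGcMb - youngAfterPrevGcMb) / secondsBetween;
    }

    /** MB/s promoted, based on the OldGen occupancy growth over the same interval. */
    static double promotionRate(double oldBeforeMb, double oldAfterMb, double secondsBetween) {
        return (oldAfterMb - oldBeforeMb) / secondsBetween;
    }

    public static void main(String[] args) {
        // Illustrative figures: YoungGen grows from 10 MB to 510 MB in 5 s,
        // while the OldGen grows from 200 MB to 220 MB.
        System.out.println("allocation rate (MB/s): " + allocationRate(10, 510, 5));
        System.out.println("promotion rate (MB/s): " + promotionRate(200, 220, 5));
    }
}
```

A promotion rate that is high relative to the allocation rate suggests that many objects survive the young collections and fill the OldGen faster, which brings major collections forward.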


LIVE Data Calculation
  • Your live application data corresponds to the OldGen occupancy after a Full GC.
  • It is essential that your OldGen capacity is big enough to hold your live data comfortably and to limit the frequency of major collections and impact on your application load throughput.
Recommendation: as a starting point, tune your Java Heap size in order to achieve an OldGen footprint or occupancy after Full GC of about 50%, allowing a sufficient buffer for certain higher load scenarios (fail-over, spikes, busy business periods...).

  • *Hot Spot*: watch for OldGen memory leaks!
  • What is a memory leak in Java? Constant increase of the LIVE data over time...
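
The sizing recommendation above can be turned into simple arithmetic. This is a rough sketch under stated assumptions: live data is measured in MB after a Full GC, the OldGen/YoungGen ratio follows the HotSpot NewRatio convention, and the leak hint is a deliberately naive heuristic. All names and figures are hypothetical.

```java
final class LiveDataSizing {

    /** OldGen capacity (MB) at which the live data fills targetOccupancyPercent of it. */
    static long oldGenCapacityMb(long liveDataMb, int targetOccupancyPercent) {
        // Ceiling division so the OldGen is never undersized.
        return (liveDataMb * 100 + targetOccupancyPercent - 1) / targetOccupancyPercent;
    }

    /** Total heap (MB) when OldGen / YoungGen = newRatio (2 gives a YoungGen of 1/3 of the heap). */
    static long heapSizeMb(long oldGenMb, int newRatio) {
        return oldGenMb + (oldGenMb + newRatio - 1) / newRatio;
    }

    /** Naive leak hint: the live data grew after every one of the sampled Full GCs. */
    static boolean growsAfterEveryFullGc(long[] liveDataMbAfterFullGc) {
        if (liveDataMbAfterFullGc.length < 3) {
            return false; // too few samples to judge
        }
        for (int i = 1; i < liveDataMbAfterFullGc.length; i++) {
            if (liveDataMbAfterFullGc[i] <= liveDataMbAfterFullGc[i - 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long oldGen = oldGenCapacityMb(1024, 50); // illustrative live data of 1024 MB
        System.out.println("OldGen: " + oldGen + " MB, heap: " + heapSizeMb(oldGen, 2) + " MB");
        System.out.println("leak suspected: "
            + growsAfterEveryFullGc(new long[] {900, 950, 1010, 1080}));
    }
}
```

With 1024 MB of live data and a 50% target, the OldGen comes out at 2048 MB and the total heap at 3072 MB. The result is only a starting point to validate under load.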


LIVE Data Deep Dive
  • JVM GC logs are great… but how can you inspect your live data?
  • Java Heap Histogram snapshots and JVM Heap Dump analysis are powerful and proven approaches to better understand your application live data.
  • Java profiler solutions and tools such as Oracle Java Mission Control and Java VisualVM provide advanced features for deep Java heap inspection and profiling, including tracking of your application memory allocations.

Stop-the-world Collections: GC Overhead
  • YoungGen collections are less expensive, but be careful with an excessive allocation rate.
  • It is recommended to initially size the YoungGen at 1/3 of the heap size (the JVM default).
  • Remember: both YoungGen and OldGen collections are stop-the-world events!
  • PermGen and Metaspace (JDK 1.8+) are collected during a Full GC, thus it is important to keep track of the class metadata footprint and GC frequency.
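
GC overhead can be approximated from inside the JVM as the share of uptime spent in collections. This is a minimal sketch using the standard GarbageCollectorMXBean API; the class name is hypothetical, and the reported collection time is an approximation whose meaning depends on the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {

    /** Percentage of the elapsed time that was spent in garbage collection. */
    static double overheadPercent(long gcTimeMs, long uptimeMs) {
        if (uptimeMs <= 0) {
            return 0.0;
        }
        return 100.0 * gcTimeMs / uptimeMs;
    }

    /** Sums the accumulated collection time reported by all collectors of this JVM. */
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.println("GC overhead (%): " + overheadPercent(totalGcTimeMs(), uptimeMs));
    }
}
```

Tracking this percentage over time, alongside the verbose GC logs, helps show whether a tuning change reduced or increased the cost of collections.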




Final Words & Recommendations

Best Practices
  • Optimal Java Performance is not just about Java…explore all angles.
  • Always rely on facts instead of guesswork.
  • Focus on global tuning items first vs. premature fine-grained optimizations.
  • Perform Performance & Load Testing when applicable.
  • Take advantage of proven tools and troubleshooting techniques available.
To Avoid
  • There are dozens of possible JVM parameters: don’t over-tune your JVM!
  • You always fear what you don’t understand: good application knowledge > no fear > better tuning recommendations.
  • Never assume that your application performance is optimal.
  • Don’t try to fix all problems at once; implement tuning incrementally.
  • Don’t get confused; keep the focus on the root cause of performance problems as opposed to the symptoms.
  • Excessive trial and error approach: symptom of guesswork.

7.09.2015

SSL SHA-2 and Oracle WebLogic

This post is to inform you that I will be releasing an article shortly on the industry adoption of SHA-2 SSL certificates and potential impact to your Java EE production environments. It will be especially useful if your secured application is still using an older version of Oracle WebLogic, packaged with the deprecated Certicom-based SSL implementation which does not support SHA-2 (SHA-256 signature algorithm).

In the meantime, I recommend that you consult the high-level SHA-2 migration guide from Entrust. It is a very good starting point and will help increase your awareness of this upcoming SHA-1 to SHA-2 upgrade.

5.21.2015

Java Application Scalability

Eric Smith from AppDynamics recently released a great article on application scalability.

Essentially, the main point is that the effectiveness of scaling your application vertically or horizontally depends on various factors that are more complex than just the OS CPU and memory utilization.

Proper usage of the right tools and capture of application-specific metrics are crucial in order to identify tuning opportunities. This approach will also help you determine the right initial and incremental infrastructure/middleware sizing for your on-premise or cloud production environment, reducing your client's long-term hardware/hosting cost and improving the ROI.

For example, consider your Java application LIVE data (OldGen footprint after a major collection). Some applications have LIVE data which depends mainly on the concurrent load and/or active users, e.g. session footprint and other long-lived cached objects. These applications will benefit from vertical or horizontal scaling as the load is split across more JVM processes and/or physical VMs, reducing pressure on JVM fundamentals such as the garbage collection process.

On the contrary, Java applications dealing with a large LIVE data footprint due to excessive caching, memory leaks etc. will scale poorly since this memory footprint is "cloned" entirely or partially across the new JVM processes or physical VMs. These applications will benefit significantly from an application and JVM optimization project which can improve both performance and scalability, thus reducing the need to "over-scale" your environment in the long term.
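
The contrast between the two application profiles can be expressed as a simple model. This is a rough sketch with made-up figures: it assumes the load-dependent part of the live data splits evenly across JVMs while the fixed part (caches, leaks) is fully duplicated in each one. The class and method names are hypothetical.

```java
final class LiveDataScalingModel {

    /**
     * Estimated live data per JVM: the load-dependent part is split across the JVMs,
     * the fixed part is duplicated in each of them.
     */
    static long liveDataPerJvmMb(long loadDependentMb, long fixedMb, int jvmCount) {
        return fixedMb + loadDependentMb / jvmCount;
    }

    public static void main(String[] args) {
        // Session-heavy application: most of the live data follows the load.
        System.out.println("session-heavy, 1 JVM:  " + liveDataPerJvmMb(1800, 200, 1) + " MB");
        System.out.println("session-heavy, 4 JVMs: " + liveDataPerJvmMb(1800, 200, 4) + " MB");
        // Cache-heavy application: most of the live data is cloned in every JVM.
        System.out.println("cache-heavy, 1 JVM:    " + liveDataPerJvmMb(200, 1800, 1) + " MB");
        System.out.println("cache-heavy, 4 JVMs:   " + liveDataPerJvmMb(200, 1800, 4) + " MB");
    }
}
```

With these figures, going from one to four JVMs cuts the per-JVM live data of the session-heavy application from 2000 MB to 650 MB, while the cache-heavy application only drops from 2000 MB to 1850 MB.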

5.09.2015

DevOps and Continuous Delivery - Weekly articles

I would like to inform my fellow readers that I am currently preparing a cluster of fresh articles on Java Performance following intense troubleshooting and performance tuning work over the past 12 months.

In the meantime, I recommend the following list of fresh DevOps related articles from Electric Cloud which offer different perspectives on this practice.

Please stay tuned for more updates...

Thanks.
P-H

2.10.2015

DevOps - 2015 Guide to Continuous Delivery from DZone

I would like to share with my fellow readers that DZone has published a great 2015 guide about Continuous Delivery, which is a core principle and goal of the DevOps methodology.

If you are part of an organization about to implement DevOps principles or emerging tools, or if you simply wish to improve your knowledge and awareness of Continuous Delivery, I highly recommend that you download your own copy today. It is FREE!


Thank you.

P-H
