Java Virtual Machine Performance Tuning
This page describes how to tune Java VM garbage collection for optimum performance and longevity as it frees dynamically allocated memory that is no longer referenced. This is a companion to my pages on Java and performance tuning.
Java Virtual Machine Specification, 2nd Edition by Tim Lindholm and Frank Yellin
Summary of Lessons Learned
I have learned several lessons in my experience with troubleshooting Java for stability and scalability:
Real Time Java
Programming for Garbage Collection
As method invocations complete, objects referenced only by their local variables are no longer considered "live" because they cannot be reached from anywhere in the running program. They become "garbage" because they no longer participate in the future course of program execution. There are three ways programmers can make objects eligible for garbage collection:
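The list of ways did not survive on this page, so here is only a minimal sketch of the reachability idea itself, assuming the most commonly cited case (dropping the last strong reference by assigning null). The class and method names are hypothetical, and System.gc() is only a request, so the second result is not guaranteed on every JVM.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    // While a strong reference is still held, the object is "live"
    // and the weak reference can still see it.
    static boolean reachableWhileReferenced() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        return weak.get() == strong;
    }

    // Drop the only strong reference, then ask (not force) the JVM to collect.
    static boolean collectedAfterRelease() throws InterruptedException {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        strong = null; // the object is now eligible for garbage collection
        for (int i = 0; i < 20 && weak.get() != null; i++) {
            System.gc();      // a hint only; the JVM may ignore it
            Thread.sleep(10);
        }
        return weak.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reachable while referenced: " + reachableWhileReferenced());
        System.out.println("collected after release:    " + collectedAfterRelease());
    }
}
```

A WeakReference is used here only as a probe: it does not keep the object alive, so it shows whether the collector has reclaimed it.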
Java doesn't require programmers to free memory explicitly, as C programmers are required to do (or to avoid by using libraries that emulate Java-like garbage collection as a replacement for the GCMM (Garbage Collection and Memory Management) library). But Java programmers can specify a finalize method to associate a cleanup method with an object type. Since it is generally not possible to predict when a finalize method will be run, ill-considered use of finalize can lead to unpredictable race conditions.
Eliminating the need for programmers to keep track of what memory has been allocated and then "finalize" it to release its memory space avoids many headaches. Automatic garbage collection also ensures system integrity because programmers cannot accidentally (or purposely) crash the JVM by incorrectly freeing memory.
Since Garbage Collection (GC) makes garbage space available again for new objects, a more accurate and up-to-date metaphor might be "memory recycling."
The garbage collector also combats fragmentation, which occurs when the free blocks of heap memory remaining between blocks occupied by live objects are too small (do not have enough contiguous free space) for new objects to use. This results in the need to extend the size of the heap for new objects, causing extra paging that degrades the performance of the executing program.
The jconsole executable that comes with Java 5 (in JDK_HOME/bin) uses JVM 1.5's JMX instrumentation agent to monitor either a local java processID, a remote host:port, or JMX agents of MBean servers. The J2SE 1.5 Monitoring & Management Console provides a formatted Summary.
Among Java Performance Documentation is Performance Documentation for the Java HotSpot VM.
"Urban performance legends, revisited" 27 Sep 2005 (article in the Java theory and practice series) by Brian Goetz starts by claiming that allocation in modern JVMs is far faster than the best performing malloc implementations.
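The numbers jconsole shows come from the JVM's platform MXBeans, which a program can also read for itself. The following is a minimal sketch using the standard java.lang.management API (available since Java 5); the class name GcSnapshot and its method names are my own, not part of any tool mentioned here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class GcSnapshot {
    // Bytes of heap currently in use, as reported by the memory MXBean.
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Sum of collection counts over every collector the JVM exposes.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the count is undefined
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("heap used (bytes): " + heapUsedBytes());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        System.out.println("total collections: " + totalCollections());
    }
}
```

The collector names printed depend on which garbage collector the JVM was started with, so the output differs between machines and flag settings.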
"Thanks for the Memory" 21 Apr 2009 article by IBM Engineer Andrew Hall is the best article I've seen on this subject.
Javaworld article by Bill Venners
Tuning Garbage Collection with the 1.4.2 Java[tm] Virtual Machine
other external sources for garbage collection documentation
Hints, tips, and myths about writing garbage collection-friendly classes by Brian Goetz
SCJP questions on Garbage Collection
The evolution of a high-performing Java virtual machine by W. Gu, et al. (August 23, 1999) describes how the team identified and isolated performance bottlenecks with execution profilers.
SAP Memory Analyzer [wiki] enables you to look into the Java heap to find big chunks of memory and identify what is keeping the memory alive. 32 and 64 bit versions of this Eclipse RCP application are available (for evaluation) on Windows, Mac OS X Carbon, Linux, and Solaris. It recognizes standard HPROF binary heap dumps from Sun, SAP, and HP JDK/JVM from version 1.4.2_12 and 5.0_7 and 6.0 upwards.
Java 6 -server fared surprisingly well on Intel Pentium 4 in the Computer Language Benchmarks Game
Big Table vendors HyperTable vs. HBase was won largely because of garbage collection issues.
Different JVMs
Keeping JVM Garbage Collection pause times low is becoming more and more important to provide predictable performance. Major middleware vendors provide their own JVM.
Microsoft .NET
Using the System.GC class, the Microsoft .NET Framework has a way to (in C# language) display counts of how many times each generation has been collected:
Console.WriteLine( GetTotalCollections() );
static int GetTotalCollections(){ return GC.CollectionCount(0) + GC.CollectionCount(1) + GC.CollectionCount(2); }
REFERENCE: All Methods in the .NET framework GC class
Microsoft's .NET languages provide the low-level System.Threading API for starting, stopping, and joining threads and the high-level System.Threading.Tasks API for concurrent and async programming.
BEA JRockit
Rob Levy, the CTO at BEA Systems, claims that BEA's JRockit JVM for BEA WebLogic Server 9.0 "continues to impress and set the industry benchmark as the world's fastest JVM for large-scale, mission-critical, server side applications."
BEA's JRockit 5.0 Memory Leak Detector hooks into Java 5.0 SP1 (not 1.4) apps started (with the -Xmanagement option) in real-time production mode (through default port 7091) because the trace tool can be connected and then disconnected dynamically from the Java MBeans (JMX) management server rather than being specified in JVM startup parameters. What a concept!
BEA's JRockit Runtime Analyzer hooks into Java 1.4.2+ apps. Plus, JRockit 5.0 R26 increases the maximum heap size on Windows to almost 3 GB.
BEA WebLogic Real Time (WLRT) (for Windows or Linux) uses JRockit 5.0 R26
Deterministic Garbage Collection (DetGC)
(JVM setting "
These timings were reported by Tom Barnes using a Grinder Jython client to measure his "Trader" app, which uses the Java Spring open source framework 1.2.6. BEA achieves this by applying QoS (Quality of Service) algorithms that limit the total number of pauses within a prescribed window. MBeans are managed on a separate Management Console server (installed by default to JRockit_JDK/bin).
"BEA WebLogic JMS Performance Guide" white paper by Tom Barnes, senior software engineer on BEA's WebLogic Performance Team.
Try This: Default values
If you are running SAP's Sherlok, use 1.1 GB. Instead of huge heaps, use additional server nodes.
On Unix machines, if you have 4GB RAM, start with this:
-Xmx3550m -Xms3550m -Xmn2g -Xss128k -XX:ParallelGCThreads=8 -XX:+UseParallelOldGC
Values for 64-bit OS are about 30 percent higher.
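After changing startup flags it is worth confirming that the JVM actually picked them up. This is a small sketch, assuming nothing beyond the standard RuntimeMXBean and Runtime APIs; the class name StartupFlags is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class StartupFlags {
    // Flags passed to this JVM at startup (for example -Xmx3550m),
    // excluding the arguments handed to main().
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // Largest heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        for (String flag : jvmArguments()) {
            System.out.println("flag: " + flag);
        }
        System.out.println("max heap (MB): " + maxHeapMegabytes());
    }
}
```

The reported maximum is usually a little below the -Xmx value because the JVM does not count every part of the heap (one survivor space, for instance) as usable.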
Command Line Argument Flags
Arguments with Java can be specified in two places:
(Total Heap Size)
In parentheses is the total JVM heap space, which is based on the value of flags such as -Xms3670k -Xmx1400m. This example specifies a minimum of about 3.6MB and a maximum of about 1.4GB. Numbers can include 'm' or 'M' for megabytes, 'k' or 'K' for kilobytes, and 'g' or 'G' for gigabytes. For example, 32k is the same as 32768 (32 * 1024).
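To make the suffix arithmetic concrete, here is a minimal sketch of a converter for such size values. It is my own illustration of the k/m/g rule described above, not the JVM's actual parser, and the class name HeapSize is hypothetical.

```java
class HeapSize {
    // Converts values such as "32k", "1400m" or "2G" to bytes,
    // treating each suffix as a power of 1024.
    static long parseBytes(String value) {
        String v = value.trim();
        if (v.isEmpty()) {
            throw new IllegalArgumentException("empty size");
        }
        char last = Character.toLowerCase(v.charAt(v.length() - 1));
        long multiplier;
        switch (last) {
            case 'k': multiplier = 1024L; break;
            case 'm': multiplier = 1024L * 1024L; break;
            case 'g': multiplier = 1024L * 1024L * 1024L; break;
            default:  return Long.parseLong(v); // no suffix: plain bytes
        }
        return Long.parseLong(v.substring(0, v.length() - 1)) * multiplier;
    }

    public static void main(String[] args) {
        System.out.println("32k   = " + parseBytes("32k") + " bytes");
        System.out.println("3670k = " + parseBytes("3670k") + " bytes");
        System.out.println("1400m = " + parseBytes("1400m") + " bytes");
    }
}
```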
The JVM automatically expands the space (from the Xms value up to the Xmx value) when the percentage of free space falls below the minimum defined by flag
To prevent shrinking, set the -XX:MaxHeapFreeRatio=<maximum> flag at or above the default 70 percent.
Alka Gupta, "Garbage Collection (GC) Analysis and Performance Tuning Using the GC Portal", July 2003
SAP's Sherlok
SAP developed "Sherlok" to determine the resource consumption of Java processes. Sherlok 1.4 can output to a telnet port (by default) or SAP Portal iViews (html frames) running on SAP J2EE Engine 6.20 with sherlok.so/.sl/.dll in J2EE Instance/cluster/server.
The most important parameters in the sherlok.properties file are displayed with
$35 Java Platform Performance: Strategies and Tactics (Addison-Wesley; May 31, 2000) by Steve Wilson, Jeff Kesselman
$50 Java Performance Tuning (2nd Edition) [on JVM 1.4] (O'Reilly; January 2003) by Jack Shirazi, of the Java Performance tuning website
Performance Planning for Managers (by Jack Shirazi 02/22/2001) presents a ten-point plan for managing adequate Java application performance.
Java Performance Tuning Strategy (by Jack Shirazi 11/09/2000)
VM Gear or a KL Group
Comparison of multiple CPUs on IBM vs. Sun JVM
High-Performance Client/Server (Nov. 1997) by Chris Loosley & Frank Douglas
From New/Young to Old/Tenured
The young generation is also called the new generation, as new allocations of memory are created for apps in the eden-space within the young generation. When eden runs out of space, a minor "scavenge" copies live objects to one of two alternating survivor spaces within the young generation. One survivor space is empty at any time, and serves as the destination of the next scavenge, so the two alternate between the to-space and from-space roles.
The allocation of space among the three areas within the new generation can be controlled by a flag such as
SAP recommends -XX:SurvivorRatio=2 -XX:TargetSurvivorRatio=90
The first option controls how big eden space is compared to survivor space (the latter contains two semispaces); the second option sets the desired percentage of the survivor space heap which must be used before objects are promoted to the old generation. The JVM aims to keep survivor spaces half full by recalculating at each garbage collection a threshold number of times an object can be copied before being tenured.
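As a rough worked example of what these two options mean in bytes, the sketch below assumes the usual HotSpot reading of SurvivorRatio (eden size divided by the size of one survivor semispace) and ignores the alignment rounding a real JVM applies. The class and method names are hypothetical.

```java
class YoungGenLayout {
    // Size of ONE survivor semispace: young = eden + 2 * survivor and
    // eden = ratio * survivor, so survivor = young / (ratio + 2).
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    // Eden gets whatever the two semispaces leave over.
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    // Occupancy the JVM aims for in a survivor space after a scavenge.
    static long desiredSurvivorBytes(long survivorBytes, int targetSurvivorRatio) {
        return survivorBytes * targetSurvivorRatio / 100;
    }

    public static void main(String[] args) {
        long young = 2L * 1024 * 1024 * 1024; // for example -Xmn2g
        long survivor = survivorBytes(young, 2);
        System.out.println("eden bytes:          " + edenBytes(young, 2));
        System.out.println("each survivor bytes: " + survivor);
        System.out.println("desired at 90%:      " + desiredSurvivorBytes(survivor, 90));
    }
}
```

With a 2GB young generation and SurvivorRatio=2, this gives a 1GB eden and two 512MB survivor semispaces, which is a much larger survivor area than the defaults provide.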
Adding flag
Desired survivor size 131072 bytes, new threshold 31 (max 31)
- age 1: 5496 bytes, 5496 total
- age 13: 760 bytes, 6256 total
- age 14: 760 bytes, 7016 total
- age 17: 760 bytes, 7776 total
- age 20: 14720 bytes, 22496 total
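The "new threshold" in that output can be reproduced from the age table. The sketch below is a simplified version of the idea as I understand it (walk the ages, accumulate bytes, stop when the running total exceeds the desired survivor size); it is not the JVM's source, and the class name TenuringThreshold is hypothetical.

```java
class TenuringThreshold {
    // bytesByAge[i] holds the bytes occupied by objects of age i + 1.
    // Returns the age at which the running total first exceeds the desired
    // survivor size, capped at maxThreshold.
    static int compute(long[] bytesByAge, long desiredSurvivorBytes, int maxThreshold) {
        long total = 0;
        int age = 1;
        while (age <= bytesByAge.length) {
            total += bytesByAge[age - 1];
            if (total > desiredSurvivorBytes) {
                break;
            }
            age++;
        }
        return Math.min(age, maxThreshold);
    }

    public static void main(String[] args) {
        long[] ages = new long[31];
        ages[0] = 5496;   // age 1
        ages[12] = 760;   // age 13
        ages[13] = 760;   // age 14
        ages[16] = 760;   // age 17
        ages[19] = 14720; // age 20
        System.out.println("threshold: " + compute(ages, 131072, 31));
    }
}
```

With the figures shown above, the running total reaches only 22496 bytes, well under the desired 131072, so the threshold stays at its maximum of 31 and nothing is tenured early.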
Since most objects have short lifetimes, objects are copied between survivor spaces until they become old enough to be copied to the Tenured Generation, also called the old generation, a larger area collected less often. The tenured generation also receives objects when the to-space is full. Garbage in the tenured generation is collected during a major collection.
Concurrency for Big Machines
Add flag
The SAP J2EE Startup Framework doesn't accept "-server" or "-client" command line parameters. Instead, set entry jstartup/vm/type=server in the instance profile.
-XX:+UseParallelGC is the default for Sun's GC for 1.5 JVM.
-XX:+UseParallelOldGC option was added with J2SE 1.5.0_06.
These settings conflict with -Xms and -Xmx options because they instruct the JVM to push memory use to the limit: the overall heap is more than 3850MB, the allocation area of each thread is 256K, the memory management policy defers collection as long as possible, and (beginning with J2SE 1.3.1_02) some GC activity is done in parallel. Because it makes the JVM greedy with memory resources, it is not appropriate for many programs and makes it easier to run out of memory. For example, by using most of the 4GB address space for heap, some programs will run out of space for stack memory. In such cases -Xss may sometimes help by reducing stack requirements.
Since only one collector can work at a time, the flags described above should not be combined with flag -Xincgc, which selects the incremental train collector of the Solaris Exact VM 1.3.0 (obsoleted by the Java HotSpot VM).
ParNew
-XX:+UseParNewGC enables the parallel version of the copying collector (newly available with JVM 1.4.2) rather than the default single threaded young generation copying collector. But UseParNewGC doesn't work (and therefore shouldn't be used) with JDK 1.4.2_07 or older on Windows2003/IA64 platform (see SAP note 716604).
Its companion is flag -XX:+UseConcMarkSweepGC to specify use of the Concurrent Mark Sweep (CMS) collector, also referred to as the "concurrent low pause collector" because it aims to minimize pauses due to garbage collection by doing tenured garbage collection simultaneously with application threads.
This works with flag -XX:+CMSParallelRemarkEnabled and
With the IBM 1.4.2 JVM, flag
Concurrent marking is set with flag
However, if collection happens too quickly, the heap can become fragmented. JNI objects are pinned and cannot be moved until unpinned by JNI.
A fragmented heap can cause allocation requests to fail even though a lot of free space is available. The IBM 1.4.2 JVM lists pinned and dosed objects when run with the flag
If the concurrent collector is unable to finish before the tenured generation fills up, all application threads are paused for a non-concurrent Full GC, also called STW (Stop-The-World) because it suspends all other threads while it works. Such collections are a sign that some adjustments need to be made to the concurrent collection parameters.
A Full GC can also be invoked from within an application by calling System.gc() or Runtime.getRuntime().gc(). This is not advisable. Flag -XX:+DisableExplicitGC makes the GC ignore such calls.
The JVM issues an out-of-memory error only when a Full GC fails to recover usable space.
??? "floating debris" may be created
Techniques used in the concurrent collector (for the collection of the tenured generation)
With the Mark-Sweep-Compact (MSC) algorithm:
See this Java applet simulating a garbage-collected heap, described in
A Generational Mostly-concurrent Garbage Collector [Printezis/Detlefs, ISMM2000] |
The Permanent Generation
If an OutOfMemory error is raised even though there is a lot of available heap space, it's very likely caused by a shortage of permanent generation space. The default 32MB maximum is usually not enough for complex programs, programs that dynamically generate and load many classes (such as some implementations of JSPs), or servers where several applications are deployed. The permanent generation is sized independently from the other generations because it's where the JVM allocates classes, methods, and other "reflection" objects.
These objects are not really permanent unless you use the
Specify flag
When running on servers with multiple CPUs, multiply the memory by the number of CPUs. For example, to run a 4-way CPU using the default:
-XX:MaxPermSize=256m -XX:PermSize=256m is SAP's recommendation for SAP 7.0 on 32-bit Windows (Note 723909).
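To see whether the permanent generation (rather than the ordinary heap) is the area that is filling up, a program can list every memory pool the JVM exposes. This is a minimal sketch using the standard MemoryPoolMXBean API; the class name MemoryPools is hypothetical, and pool names vary by JVM and version (newer HotSpot releases report a "Metaspace" pool where older ones reported a "Perm Gen" pool).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    // One line per memory pool: name, type (HEAP or NON_HEAP), used and max bytes.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage == null) {
                continue;
            }
            lines.add(pool.getName() + " [" + pool.getType() + "]"
                    + " used=" + usage.getUsed()
                    + " max=" + usage.getMax()); // max is -1 when undefined
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```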
Java application performance tuning IV: Using JVMTI, BCI and JMX by Joseph Coha
Other Issues
JVM specifications do not define how the heap or the garbage collector must work. The designer of each JVM must decide how to implement garbage collection.
If your application uses a lot of threads, see if contention for heap locks is reduced with flag -XX:+UseTLAB (Thread-Local Allocation Buffers), which SAP recommends. On Solaris and HP-UX this switch is set by default. On Windows and Linux you have to set it explicitly.
-Djava.awt.headless=true should be added if your app is "headless" (a back-end services app that interacts with operators only through the command line and not graphically). This saves the memory otherwise spent loading graphics libraries that are not needed.
-Xbatch ???
-XX:SoftRefLRUPolicyMSPerMB=1 is recommended by SAP Note 723909.
-Dsun.io.useCanonCaches=false is required by SAP Note 723909 for its Knowledge Management servers to guarantee functional correctness of CM repositories and file system repositories with file system persistence.
The .NET Framework provides several Perfmon counters described in these notes:
.NET CLR / # of Exceps Thrown
.NET Memory / % Time in GC
.NET Memory / # of pinned objects
Rico Mariani's .NET Performance Tidbits describes how to use tasklist to get the process id for the VADump (Virtual Address Dump) command-line tool, which shows the type, size, state, and protection of each segment of virtual address space. It's used to make sure virtual address space is not over-allocated.
Also described there is windbg (Windows Debugger) with the mscorwks.dll loaded from the same place that holds SOS.
Use CLRProfiler to see the allocation profile of .NET managed applications. It provides a histogram of allocated types, allocation and call graphs, a time line showing GCs of various generations and the resulting state of the managed heap after those collections, and a call tree showing per-method allocations and assembly loads.
Java Benchmarks
MS.NET Benchmarks
mscorsvr.dll for server machines.
Graphing GC logs
Each JVM manufacturer (Sun, IBM, BEA) outputs GC logs in their own format. IBM provides a Memory Dump Diagnostic for Java (MDD for Java) which analyzes heap dumps. It can also compare two heap dumps to detect a memory leak. MDD for Java requires IBM Support Assistant (ISA) to provide extra tools and components for troubleshooting as well as providing a place to write problems (PMRs).