Causes of Java PermGen Memory Leaks

Friday, May 9th, 2008

I’ve been hunting down PermGen memory leaks lately with the YourKit profiler. It’s been an interesting experience; tracking down these little buggers is frequently harder than you might think.

To save others the pain, I’ve documented all the problems I’ve found, and how I solved them, in this blog entry. I’ll keep updating this as I find new ones.

Java Bean Introspection

The Java bean introspector keeps a cache that doesn’t get flushed. It’s hard to know if this is a problem for you without profiling – your application probably doesn’t use the introspector, but some libraries that you use might.

To fix this problem you need to create a ServletContextListener in your app and add the following clean-up code to the “contextDestroyed” method.

[sourcecode language='java']
Introspector.flushCaches();
[/sourcecode]

If you’re using Spring, you should add the IntrospectorCleanupListener to your web.xml instead (this calls Introspector.flushCaches() for you, as well as doing some additional cleanup).

Commons Pool Eviction Timer

In old versions of commons-pool, the eviction timer was not cleaned up when the webapp shut down. This issue has been fixed in version 1.4 of commons-pool, so if you use commons-pool, make sure you’re using at least version 1.4.

As a side note, if you’re using commons-dbcp (which uses commons-pool) you should also make sure you’re using at least version 1.2.2, as it fixes a swag of issues, some of which we’ve seen in our production systems.

MySQL Connector/J Statement Cancellation Timer

Version 5.1.6 (and earlier) of the MySQL JDBC driver (Connector/J) has a problem whereby the statement cancellation timer in the ConnectionImpl class is never cancelled, resulting in the timer thread hanging around even after you’ve unloaded your app. If you use the MySQL JDBC driver this will be a problem for you.

I’ve raised a bug report with the MySQL folks. In the meantime you can create a ServletContextListener in your app and add the following clean-up code to the “contextDestroyed” method.

[sourcecode language='java']
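// Needs java.lang.reflect.Field and java.util.Timer, plus the driver's ConnectionImpl class
try {
    // Only touch the timer if the driver was loaded by this webapp's ClassLoader
    if (ConnectionImpl.class.getClassLoader() == getClass().getClassLoader()) {
        Field f = ConnectionImpl.class.getDeclaredField("cancelTimer");
        f.setAccessible(true);
        Timer timer = (Timer) f.get(null);
        timer.cancel();
    }
}
catch (Exception e) {
    System.out.println("Exception cleaning up MySQL cancellation timer: " + e.getMessage());
}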
[/sourcecode]
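If you want to try the reflective clean-up outside a container, here is a minimal, self-contained sketch of the same idea. LeakyDriver, its cancelTimer field and the helper names are hypothetical stand-ins invented for this example; they are not the MySQL driver’s own classes. It relies on the fact that a java.util.Timer starts its background thread when it is constructed and keeps it until cancel() is called.

```java
import java.lang.reflect.Field;
import java.util.Timer;

final class TimerCleanupSketch {

    /** Hypothetical stand-in for a driver class that keeps a static timer. */
    static final class LeakyDriver {
        // Daemon timer so this demo never blocks JVM exit.
        private static final Timer cancelTimer = new Timer("demo-cancel-timer", true);
    }

    /**
     * Reflectively cancels a static Timer field on the given class.
     * Returns true if a Timer was found and cancelled, false otherwise.
     */
    static boolean cancelStaticTimer(Class<?> owner, String fieldName) {
        try {
            Field f = owner.getDeclaredField(fieldName);
            f.setAccessible(true);
            Object value = f.get(null); // static field, so no instance is needed
            if (value instanceof Timer) {
                ((Timer) value).cancel();
                return true;
            }
            return false;
        } catch (ReflectiveOperationException | RuntimeException e) {
            System.err.println("Could not cancel timer: " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(cancelStaticTimer(LeakyDriver.class, "cancelTimer"));
    }
}
```

Unlike the snippet above, this variant quietly returns false when the field is missing or holds no Timer, which may or may not be what you want in a real listener.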

JDBC Driver Manager

The JDBC Driver Manager is notorious for causing PermGen memory leaks. When a JDBC driver starts up it gets registered with the global DriverManager, but never de-registered. If you are using any sort of JDBC driver, this will be a problem for you.

To fix this problem you need to create a ServletContextListener in your app and add the following clean-up code to the “contextDestroyed” method.

[sourcecode language='java']
try {
    for (Enumeration e = DriverManager.getDrivers(); e.hasMoreElements(); ) {
        Driver driver = (Driver) e.nextElement();
        // Only deregister drivers that were loaded by this webapp's ClassLoader
        if (driver.getClass().getClassLoader() == getClass().getClassLoader()) {
            DriverManager.deregisterDriver(driver);
        }
    }
}
catch (Throwable e) {
    System.out.println("Unable to clean up JDBC driver: " + e.getMessage());
}
[/sourcecode]
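The same idea can be written as a standalone helper that takes the ClassLoader as a parameter, which makes it easy to exercise outside a container. This is only a sketch: the class and method names (JdbcDriverCleanup, deregisterDriversLoadedBy) are made up for this example, and it assumes Java 5 or later for the generic types.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

final class JdbcDriverCleanup {

    private JdbcDriverCleanup() {
    }

    /**
     * Deregisters every JDBC driver whose class was loaded by the given
     * ClassLoader and returns how many drivers were deregistered.
     */
    static int deregisterDriversLoadedBy(ClassLoader loader) {
        int count = 0;
        // Snapshot the drivers first so deregistering does not disturb the enumeration.
        List<Driver> drivers = Collections.list(DriverManager.getDrivers());
        for (Driver driver : drivers) {
            if (driver.getClass().getClassLoader() == loader) {
                try {
                    DriverManager.deregisterDriver(driver);
                    count++;
                } catch (SQLException e) {
                    System.err.println("Unable to clean up JDBC driver: " + e.getMessage());
                }
            }
        }
        return count;
    }
}
```

From a listener’s “contextDestroyed” method you would pass in the webapp’s own loader, for example getClass().getClassLoader().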

DOM4J ThreadLocal leaks

Older versions of DOM4J store data in ThreadLocal variables that never get cleaned up. This problem has been fixed in recent versions of DOM4J. If you’re using DOM4J make sure you use at least version 1.6.1.

Xerces XML Libraries

Tomcat uses the SAXParser to read in configuration files. If you include Xerces in your webapp’s WEB-INF/lib directory Tomcat will use this rather than the one in the JDK. This is bad because Tomcat maintains a reference to this parser, which will prevent your webapp’s ClassLoader from being collected.

The fix is to remove the xerces and XML parser API libs from your webapp. If you’re running Java 5 or better these are included in the JRE so you don’t need them. If you’re using an older JRE, put the jars into the container’s shared library folder (“common/lib” in Tomcat).

Quartz Scheduler ShutdownHook

The Quartz ShutdownHookPlugin registers a hook with the Java runtime that doesn’t get cleaned up if you simply unload your webapp (it will only get cleaned up when you shut down the JVM).

The solution is to not use the ShutdownHookPlugin at all. If you’re using the Spring Quartz wrappers, you simply configure the Spring bean factory to clean up Quartz when the bean factory is destroyed. Otherwise it’s a simple job to add a ServletContextListener to clean up Quartz when your application shuts down.
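To see why a shutdown hook can pin a webapp, it helps to look at the underlying JDK mechanism: a thread passed to Runtime.addShutdownHook stays registered with the runtime until the JVM exits or the same thread is passed to removeShutdownHook. The sketch below only illustrates that mechanism with a hypothetical class of my own; it is not how Quartz or Spring do their clean-up.

```java
final class ShutdownHookSketch {

    // The hook thread is created here and would be referenced by the
    // runtime for as long as it stays registered.
    private final Thread hook = new Thread(new Runnable() {
        public void run() {
            System.out.println("cleaning up at JVM exit");
        }
    }, "demo-shutdown-hook");

    /** Registers the hook with the JVM. */
    void register() {
        Runtime.getRuntime().addShutdownHook(hook);
    }

    /** Returns true if the hook was registered and has now been removed. */
    boolean unregister() {
        return Runtime.getRuntime().removeShutdownHook(hook);
    }
}
```

A hook that is never removed keeps its thread, and whatever that thread references, reachable after the webapp has been unloaded.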

Tomcat and Commons Logging

Earlier versions of Tomcat had a problem with leaking PermGen memory when a webapp included commons-logging in its WEB-INF/lib. Later versions fixed this problem, so make sure you’re running at least version 5.5.16.

Other Resources

Here are some other resources that you might find useful in finding/fixing your PermGen leaks:

Good luck!

Optus Music Store is compatible with Firefox… NOT!

Thursday, May 8th, 2008

I was doing a review of legal Australian music download services tonight, and somehow landed on the Optus Zoo Mobile Music page. Apparently Optus and MTV signed a deal a while back allowing Optus to distribute MTV music to their mobile subscribers.

So I clicked through to the music store and I’m greeted by the following page:

Optus MTV Music Store

Curious as to what I would possibly have to do to make Firefox work (after all, it is one of the most standards-compliant browsers on the planet) I followed the link… and to my disbelief I was told I had to install the “IE Tab” Firefox extension to make the site “work in Firefox”. (This extension essentially allows you to embed an IE instance right inside Firefox.) Don’t believe me? If you’re using Firefox or Safari, go and have a look at it for yourself.

Now I don’t know who their web developers are, but they should fire them immediately and hire some that know what they’re doing. This is NOT Firefox compatibility! This is lazy, sloppy, amateur software development, pure and simple, and I’m amazed (and frustrated) that there are big companies still producing shonky stuff like this.

There is simply no excuse for only supporting IE these days. Non-IE browsers account for over 20% of the market, and if you can’t support them, you don’t deserve their business.

I sent them some feedback to this effect… I’m not expecting a response. :)

Update: I found an associated site which explains a little further:

The content owners from whom we obtain our music and videos require that we use digital rights management (DRM), therefore our database and system run on Windows Media DRM.

Using this technology, delivery of the license and content of the music files are combined and delivered to your PC at once.

This DRM technology works only with Internet Explorer 6.0 and above, and Mozilla Firefox (with IE Tab installed). Other browsers are unfortunately incompatible.

So I guess I’ll forgive the developers… and blame Microsoft instead (after all, everybody loves to hate Microsoft)! Actually no, I’ll blame the record labels – just another example of the ridiculous lengths that they go to to prevent people from buying their products.

Finding PermGen Memory Leaks with YourKit

Monday, May 5th, 2008

Pretty much anyone who has written a Java web application has grappled with the dreaded “java.lang.OutOfMemoryError: PermGen space” errors when reloading web applications multiple times. It’s a problem so long-running, low-level and common that you would think it would have been solved for good by now… but alas Java developers still come up against it daily.

Frank Kieviet of Sun has written some good blog posts describing the problem and how to go about finding and fixing these leaks. The latter article describes a method of finding PermGen leaks using the standard JDK jmap (Memory Map) and jhat (Heap Analysis Tool) tools, stating that other existing profilers were not able to find these leaks.

Well, the article is a bit old now, and I’ve discovered that version 7 of the excellent (but boringly named) YourKit commercial profiler is now able to hunt down these leaks very quickly indeed. There’s not too much specific info on how to do this though, so here’s how…

1. Enable profiling in your container

This is pretty straightforward, but how you do it depends on which JVM, OS and container you’re using, and how you’re running it. It’s explained in detail on the YourKit site, but generally speaking you just need to:

  1. Add the YourKit binaries directory to your system path
  2. Add “-agentlib:yjpagent” to your JVM startup parameters. I’m using Tomcat on Windows (standalone, not through my IDE), so this simply meant adding this to my JAVA_OPTS environment variable.

You can verify that all this is working by starting your container and attempting to connect to it with YourKit (Select “Connect to locally running profiled application…” on the “Welcome” screen).

2. Start, use and undeploy your web application

Simply load up and use your web application as normal. Once you’ve exercised it a little, undeploy it from your container in the usual way. I prefer to undeploy it completely rather than simply stopping it to make sure that the container properly disposes of all associated resources.

3. Capture a memory snapshot

Connect YourKit to your application if you haven’t already, then simply click the “Capture Memory Snapshot” button in the YourKit toolbar:

Capturing a YourKit Memory Snapshot

When prompted, open the resulting memory dump in YourKit.

4. Find “Classes Without Instances”

On the “Inspections” tab of the memory dump, choose “Classes without Instances” and then “Run This Inspection Only”:

YourKit “Classes without Instances” Inspection

5. Find an application class that should have been unloaded

The results for the inspection will show a list of Java ClassLoaders that have loaded classes that now have no instances. The trick now is to find the ClassLoader that loaded your application (if you’re running Tomcat this will be an instance of org.apache.catalina.loader.WebappClassLoader). You can usually tell by expanding a few of the ClassLoader entries and looking for classes that specifically belong to your application.

Once you’ve found one of your classes, right click on it and choose “Paths from GC Roots”. In the example below I’ve found a MySQL JDBC connection that was used by my application that should have been unloaded.

YourKit Inspection Results

6. Analyse the GC paths

This is where things get interesting. The “GC Roots” screen shows you all the paths from the various Garbage Collector roots to the class in question. Consider the example below:

YourKit GC Roots

This screen shows the shortest GC path to the JDBC connection class we were looking at in the previous step. We can see that the GC path starts from the threadLocals property of one of the container threads (unremoved ThreadLocals are notorious for causing PermGen leaks). Moving up the chain we can see that the dom4j library is involved, appearing just before the ClassLoader itself. The ClassLoader in turn references the JDBC connection class.

From this I would guess that the dom4j library is inserting some data into a ThreadLocal at some point, but never cleaning it up. This results in the container thread (which never dies) holding on to a dom4j class definition that was loaded by the web application’s ClassLoader. This in turn prevents the ClassLoader from being garbage collected, and hence none of the classes loaded by the ClassLoader are collected either!

A bit of investigation revealed this to be the case, and the solution was simply to upgrade dom4j to version 1.6.1.
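The ThreadLocal mechanism is easy to reproduce in isolation. The sketch below is my own illustration, not dom4j’s code: a single-threaded executor stands in for a long-lived container thread, and CACHE stands in for a library’s ThreadLocal. In a real leak the stored value would be an instance of a class loaded by the webapp’s ClassLoader; a plain Object is used here to keep the example small.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalLeakSketch {

    // Hypothetical stand-in for a ThreadLocal owned by a library.
    private static final ThreadLocal<Object> CACHE = new ThreadLocal<Object>();

    /**
     * Runs one "request" that stores a value in the ThreadLocal, then a
     * second one on the same thread, and returns what the second one sees.
     */
    static Object valueSeenByNextRequest(final boolean cleanUp) {
        ExecutorService containerThread = Executors.newSingleThreadExecutor();
        try {
            containerThread.submit(new Runnable() {
                public void run() {
                    CACHE.set(new Object());
                    if (cleanUp) {
                        CACHE.remove(); // the clean-up a leaky library forgets
                    }
                }
            }).get();
            return containerThread.submit(new java.util.concurrent.Callable<Object>() {
                public Object call() {
                    return CACHE.get();
                }
            }).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            containerThread.shutdown();
        }
    }
}
```

Without the remove() call, the second request still sees the value left behind by the first, which is exactly the kind of lingering reference that shows up as a threadLocals GC root.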

Unfortunately it’s not always obvious from these GC roots exactly what the problem is. A bit of investigation and sometimes guesswork is required, but with a little patience and experience you should get the hang of it. One pattern that I have noticed is that the class/library just before the ClassLoader in the reference chain is frequently the cause of the problem (though this is not always the case).

Frustratingly, finding the solution is frequently just as challenging as finding the cause. Solutions range from upgrading libraries and adding explicit shutdown code to getting in touch with library developers for bug fixes. I’ll cover some specific solutions that I have come across in a future entry.

Good luck!