Analyzing Java Heap Dumps

When a WebSphere/Java process terminates unexpectedly, it creates several artifacts:


1. A Portable Heap Dump file named in the format heapdump.YYYYMMDD.TIME.processid.phd.  This file is a snapshot of all the objects in memory at the time of the crash. It is a binary file, so it must be opened in an analysis tool before you can get any information out of it.  It is usually large, typically 500MB - 2GB.


2. A Thread Dump file named in the format javacore.YYYYMMDD.TIME.processid.txt.  This is a text file that contains information on the JVM parameters and the threads that were running at the time of the crash.  Searching this file for "Current Thread" will show which thread was active at the time of the crash, although that thread is not always the cause of the crash.
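
When a crash leaves many of these files behind, it can help to sort them by script before opening anything. The following is a minimal sketch, not a definitive tool. It assumes the TIME field is six digits (HHMMSS) and tolerates an optional sequence number before the extension, which is an assumption about how some JVMs name their dumps. The helper names parse_dump_name and find_current_thread_lines are made up for this example. The thread search is case-insensitive because the exact capitalization in the javacore may differ.

```python
import re
from datetime import datetime
from typing import List, NamedTuple, Optional, Tuple

# Pattern follows the naming described above. The six-digit TIME field and
# the optional sequence number before the extension are assumptions.
DUMP_NAME = re.compile(
    r"^(?P<kind>heapdump|javacore)\."
    r"(?P<date>\d{8})\.(?P<time>\d{6})\."
    r"(?P<pid>\d+)(?:\.(?P<seq>\d+))?\."
    r"(?P<ext>phd|txt)$"
)


class DumpFile(NamedTuple):
    kind: str          # "heapdump" or "javacore"
    created: datetime  # timestamp taken from the file name
    pid: int           # process id taken from the file name


def parse_dump_name(filename: str) -> Optional[DumpFile]:
    """Split a dump file name into its parts, or return None if it does not match."""
    match = DUMP_NAME.match(filename)
    if match is None:
        return None
    created = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return DumpFile(match["kind"], created, int(match["pid"]))


def find_current_thread_lines(javacore_text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that mention the current thread."""
    hits = []
    for number, line in enumerate(javacore_text.splitlines(), start=1):
        if "current thread" in line.lower():
            hits.append((number, line.strip()))
    return hits
```

Feeding the directory listing through parse_dump_name makes it easy to pair each heap dump with the javacore that has the same process id and timestamp.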


To analyze the Heap Dump (#1), I currently prefer to use Eclipse Memory Analyzer with the IBM DTFJ Plugins. 


1. Download and install Eclipse Memory Analyzer from http://www.eclipse.org/mat/downloads.php .  I prefer to extract it directly to the c:\ drive.



2.  Once extracted, I edit the MemoryAnalyzer.ini file and raise the maximum heap size (-Xmx) from the default of 1024 MB to 4096 MB, or to whatever the memory on my system allows.  With the default, you will often get out-of-memory (OOM) errors when trying to analyze heap dumps.



3.  If you try to open a PHD file with the stock Memory Analyzer, you'll get an error.



4. To resolve this, install the IBM Diagnostic Tool Framework for Java (DTFJ) ( https://www.ibm.com/developerworks/java/jdk/tools/dtfj.html ).  In Memory Analyzer, click Help > Install New Software....



5. Click Add



6. Add a Repository with the URL: http://public.dhe.ibm.com/ibmdl/export/pub/software/websphere/runtimes/tools/dtfj/



7. Click "Select All" and click Next



8. Click Finish


9. Click Yes



10.  Eclipse Memory Analyzer will now be able to analyze .phd files
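
The memory change in step 2 is a one-line edit, but if you set up Memory Analyzer on several machines it can be scripted. Here is a rough sketch, assuming the ini file keeps its JVM options after a -vmargs line and that the heap option looks like -Xmx1024m. The function name set_max_heap is hypothetical and not part of Memory Analyzer.

```python
def set_max_heap(ini_text: str, megabytes: int) -> str:
    """Return the ini text with the JVM -Xmx option set to the given size in MB."""
    new_option = f"-Xmx{megabytes}m"
    lines = [line.strip() for line in ini_text.splitlines()]

    # Assumption: JVM options are only read after the -vmargs marker line,
    # so add the marker if the file does not have one yet.
    if "-vmargs" not in lines:
        lines.append("-vmargs")

    start = lines.index("-vmargs") + 1
    for i in range(start, len(lines)):
        if lines[i].startswith("-Xmx"):
            lines[i] = new_option  # replace the existing heap limit
            break
    else:
        lines.append(new_option)   # no heap limit was set, so add one

    return "\n".join(lines) + "\n"
```

You would read MemoryAnalyzer.ini, pass its contents through set_max_heap(text, 4096), and write the result back. Keep a copy of the original file in case Memory Analyzer refuses to start with the new value.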


Analyzing a PHD file takes quite a bit of time.  Once the analysis is complete, you will be able to access different views to help determine what the problem is.
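
Because the parse is slow, it can be convenient to start it from a script and come back later. The sketch below assumes the standalone Memory Analyzer download includes the ParseHeapDump script (ParseHeapDump.bat on Windows, ParseHeapDump.sh elsewhere) and that the leak suspects report id is org.eclipse.mat.api:suspects. The install directory and dump path passed in are hypothetical, and the DTFJ plugins still need to be installed for .phd files. The function names are made up for this example.

```python
import subprocess
from pathlib import Path
from typing import List


def build_parse_command(mat_dir: str, dump_path: str, windows: bool) -> List[str]:
    """Build the command line for a headless parse plus a leak suspects report."""
    # Assumption: the script sits in the top-level Memory Analyzer directory.
    script = "ParseHeapDump.bat" if windows else "ParseHeapDump.sh"
    return [str(Path(mat_dir) / script), dump_path, "org.eclipse.mat.api:suspects"]


def run_parse(mat_dir: str, dump_path: str, windows: bool) -> None:
    """Run the parse and raise CalledProcessError if the script exits with an error."""
    subprocess.run(build_parse_command(mat_dir, dump_path, windows), check=True)
```

Opening the same dump in the Memory Analyzer GUI afterwards should be much quicker if the index files written during the parse are left next to the dump.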