It’s the afternoon. You just had lunch, and life looks good.
And then, of course, you get a text message saying that the system has slowed to a crawl. You feel queasy and you don’t know if it was that spicy, extra cheesy burrito, or the prospect of getting back to the office to deal with the user revolt.
You start to ask questions: Is the system stuck? Or just too slow? The CPU doesn’t even seem to be that busy. Could it be the changes that were deployed yesterday? Pepto or Seltzer?
This article will be very handy in the following situations:
- The system seems to be slow
- The CPU doesn’t seem to be very busy
- The system seems to get stuck sometimes, for example when you restart the server, and it seems to freeze for a minute or more
- We want to know what’s going on in the system, or what isn’t (perhaps we deployed a process that may not even be running!)
- We need to monitor the application for long periods of time
Although this is a relatively advanced technique, I will try to make this article beginner-friendly and easy to follow. We will focus on Linux, since it’s the most popular operating system used to run Alfresco. Another article dedicated to Windows will come up soon.
You need jstacks!
I call jstacks the output of the Java utility jstack. This handy utility produces a list, or “dump,” of all the threads running inside a Java application.
The Java application may be Alfresco, Ephesoft, Solr, or any other application running within a Java-based web application server. The most common web application server used to run Alfresco is Tomcat.
Why are multiple thread dumps useful?
It’s the difference between looking at a movie and a single picture. Looking at just one snapshot will not tell you what is moving and what is stuck.
Analogously, if we take multiple snapshots (or thread dumps, in this case) we can tell whether a Java thread is stuck or taking too long to complete.
How many? How often?
Just a few dumps may not be enough. On the other hand, it may be difficult to analyze many jstacks.
If the dumps are generated every second, we may need dozens of them to get a proper sense of what’s going on. If we generate them only once a minute, we may miss short-lived transactions entirely.
Why jstack?
Because:
- It’s free (I know I got you at “free” but keep reading)
- If you don’t have the jstack utility already, it’s easy to get (information below)
- You don’t have to stop the web application server
- You don’t have to install a special agent or add any new parameters to your Java command line
- It’s minimally invasive and lightweight
The only option that’s easier and lighter weight than jstack would be the command kill -3 or kill -QUIT. Needless to say, be extra cautious when using the command kill! Besides the obvious risks of a typo, the problem with kill is that it will send the dump to the standard output of the web application, which means that you will have to figure out how to extract the dump from the log.
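If you do go the kill -3 route, the extraction step can be sketched as follows. This is only a rough sketch, assuming a HotSpot or OpenJDK JVM, whose dumps begin with a line starting with "Full thread dump". The function name last_dump_from_log and the log path in the usage comment are hypothetical; use whatever file captures your server's standard output.

```shell
#!/bin/sh
# Print everything from the LAST "Full thread dump" header to the end of a log file.
# Assumption: the JVM writes that header at the start of a line (HotSpot/OpenJDK do).
# Any log lines written after the dump are printed too and must be trimmed by hand.
last_dump_from_log() {
    awk '/^Full thread dump/ { buf = ""; found = 1 }
         found { buf = buf $0 "\n" }
         END { printf "%s", buf }' "$1"
}

# Usage on a live system (hypothetical log path):
#   kill -3 14129
#   last_dump_from_log /path/to/tomcat/logs/catalina.out > my-dump.txt
```

Keeping only the last dump avoids mixing in older dumps that may already be sitting in the same log.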
Despite all its advantages, jstack is limited in comparison with other tools. If you can afford the time, money, expertise, memory, or CPU, you may want to look at other methods to produce thread dumps.
How do we get the jstack utility?
You may already have it! Try the following command:
jstack
If that command returns an error, you may only have the Java Runtime Environment (JRE) provided by Alfresco, or the Java Development Kit (JDK) is not in the execution path. The command jstack is part of the JDK, so you will have to get it and make sure that it is in the execution path. Hopefully, your Linux distribution will provide an easy method to get it (like apt-get, yum, etc.).
Potential Yikes! Make sure that the version of the JDK is the same as, or higher than, the version of the JRE.
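A quick way to check the execution path is sketched below. The helper name check_tool is made up for this illustration; it relies only on the standard shell built-in command -v.

```shell
#!/bin/sh
# Report whether a tool is in the execution path; print its location if found.
check_tool() {
    if path=$(command -v "$1"); then
        echo "$1 found at $path"
    else
        echo "$1 not found in PATH" >&2
        return 1
    fi
}

# jstack and jps both ship with the JDK, not with a bare JRE.
check_tool jstack || echo "Install a JDK or add its bin directory to PATH."
check_tool jps || echo "Install a JDK or add its bin directory to PATH."
```

To compare versions, run java -version with the JRE that runs your application and again with the java binary that sits next to jstack.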
How do we generate multiple thread dumps?
Step 1: Find the process ID
You may use the command jps to list the Java processes along with the main class:
jps -l
Your results will look like the following:
14129 org.apache.catalina.startup.Bootstrap
29993 sun.tools.jps.Jps
My Alfresco process is 14129, which shows the main Java class used to run Tomcat (org.apache.catalina.startup.Bootstrap). If unsure, another option is using the ps command, which produces more details:
ps aux | grep java
Potential Yikes! Make sure that you have enough authority to see the process. If you don’t see it listed, try using sudo.
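If you want to pick the process ID out of the jps listing in a script, a small filter like the one below may help. It is a sketch only: the function name pid_for_class is hypothetical, and it assumes the two-column "PID main.Class" layout shown above.

```shell
#!/bin/sh
# Print the PID(s) whose main class matches the given name.
# Reads jps -l style output ("PID main.Class") on standard input.
pid_for_class() {
    awk -v cls="$1" '$2 == cls { print $1 }'
}

# Usage on a live system (needs jps from the JDK):
#   jps -l | pid_for_class org.apache.catalina.startup.Bootstrap
```

If several Tomcat instances run on the same machine, this prints one PID per line, so check which one is yours before going on.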
Step 2: Generate one thread dump
Armed with the process ID, let’s try the jstack command (ready for the data dump?):
jstack -l PROCESS_ID
Potential Yikes! The “-l” option provides a more detailed listing, so don’t forget it! You may need to use sudo as well.
That was interesting, but you will typically want to redirect the output. For example, I will send my Alfresco thread dump to the file my-jstack.txt:
jstack -l 14129 > my-jstack.txt
The dump lists the Java stack trace of every thread. The dump is not easy to analyze as there may be hundreds of threads running at the same time. We will discuss some options at the end of this article.
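For a first impression of a single dump, it can help to count the threads by state. The sketch below assumes the usual jstack layout, in which each thread's stack is preceded by a line such as "java.lang.Thread.State: RUNNABLE". The function name count_states is made up for this example.

```shell
#!/bin/sh
# Count the threads in one jstack dump, grouped by thread state,
# with the most common state first.
count_states() {
    grep 'java.lang.Thread.State:' "$1" | awk '{ print $2 }' | sort | uniq -c | sort -rn
}

# Usage:
#   count_states my-jstack.txt
```

A large number of BLOCKED threads is usually the first thing worth a closer look.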
Step 3: Execute jstack multiple times
You’ll want to execute the jstack command several times to generate multiple snapshots of the application. Although you can do this manually (see the video), it is easier and more accurate to use a script like the following:
#!/bin/bash
# Generate N thread dumps of the process PID with an INTERVAL between each dump.
# Note: the C-style for loop below requires bash, not plain sh.
if [ $# -ne 3 ]; then
    echo "Generates Java thread dumps using the jstack command."
    echo
    echo "usage: $0 process_id repetitions interval"
    exit 1
fi
PID=$1
N=$2
INTERVAL=$3
for ((i=1; i<=N; i++))
do
    # Timestamp each file so that the dumps sort in the order they were taken
    d=$(date +%Y%m%d-%H%M%S)
    dump="threaddump-$PID-$d.txt"
    echo "$i of $N: $dump"
    jstack -l "$PID" > "$dump"
    sleep "$INTERVAL"
done
Here’s how to run the script to generate five thread dumps, ten seconds apart, with my process ID:
./dmpt.sh 14129 5 10s
The script writes one file per dump, named with the process ID and a timestamp. A sample listing:
-rw-rw-r--. 1 luis luis 108220 May 2 16:33 threaddump-12780-20180502-163359.txt
-rw-rw-r--. 1 luis luis 108220 May 2 16:34 threaddump-12780-20180502-163400.txt
-rw-rw-r--. 1 luis luis 108194 May 2 16:34 threaddump-12780-20180502-163402.txt
-rw-rw-r--. 1 luis luis 108220 May 2 16:34 threaddump-12780-20180502-163403.txt
-rw-rw-r--. 1 luis luis 108220 May 2 16:34 threaddump-12780-20180502-163405.txt
Potential Yikes! Did you note the “s” in “10s”? That of course means 10 seconds. The time unit is optional, with “seconds” as the default, but you can use other units (for example, “m” for minutes or “h” for hours).
Step 4: Analyze!
Now what do we do with the files? You can analyze each dump separately:
- Open and analyze each dump with an editor (not fun, but instructive!)
- Load them one by one into Spotify’s Online Java Thread Dump Analyzer
However, to analyze the changes across multiple dumps, we need a TDA (Thread Dump Analyzer). Some useful examples are:
- Samurai (source code)
- fastThread (I have not tried it yet)
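Before reaching for a full TDA, a rough first pass across several dumps can be done in the shell. The sketch below lists the threads that are BLOCKED in every dump passed to it, which hints at threads that are stuck rather than merely busy. The function names thread_states and blocked_in_all are hypothetical, and the parsing assumes the usual jstack layout: a header line starting with the quoted thread name, followed by a java.lang.Thread.State line.

```shell
#!/bin/sh
# Print "thread name<TAB>state" for every thread in one jstack dump.
thread_states() {
    awk '/^"/ { split($0, parts, "\""); name = parts[2] }
         /java.lang.Thread.State:/ { print name "\t" $2 }' "$1"
}

# Print the names of the threads that are BLOCKED in every dump given.
blocked_in_all() {
    for f in "$@"; do
        thread_states "$f" | awk -F'\t' '$2 == "BLOCKED" { print $1 }' | sort -u
    done | awk -v n="$#" '{ seen[$0]++ }
                          END { for (t in seen) if (seen[t] == n) print t }' | sort
}

# Usage:
#   blocked_in_all threaddump-*.txt
```

Matching by thread name is only a hint: a thread that shows up as BLOCKED in each snapshot may have been released and blocked again in between, so confirm the finding by reading its stack trace in the dumps.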
Conclusions
As a free, easy-to-use, easy-to-implement tool that diagnoses some performance problems, jstack becomes even more powerful when multiple dumps are generated.