Reading the thread profile data: synchronization issues (2)
On the previous page, we discussed some locking options to allow our profile thread and another reader thread to access
the data correctly. We decided on an implementation using AtomicInteger, although
we mentioned that synchronized or a ConcurrentHashMap could provide
(slightly heavier-handed) solutions.
So back to our AtomicInteger which will represent our lock. When it has
a value of 0, that means that no thread is using the profile data. When it has a
value of 1, that means that the data is in use and we need to wait for it to return
to zero. So our profiler run() method looks something like this:
private AtomicInteger lock = new AtomicInteger(0);
...
while (!terminateRequest) {
long[] threadIds = threadBean.getAllThreadIds();
ThreadInfo[] infs = threadBean.getThreadInfo(threadIds, maxDepth);
while (!lock.compareAndSet(0, 1)) {
Thread.sleep(1);
}
try {
// ... process infs ...
} finally {
lock.set(0);
}
Thread.sleep(profileInterval);
}
In the vast majority of cases, acquiring the lock will more or less boil down to a simple
CAS operation. If there really is contention (because another thread is reading the
profile data), we know that it will be done with it fairly quickly, but
probably not so quickly that we want to re-try the compareAndSet() immediately.
So we sleep for the smallest amount of time we can1 before trying again.
This isn't a fantastic situation– "a millisecond is a long time in computing"– but
without knowing that the other thread will finish in the time it takes to do a
small number of further CAS operations (and thinking that it won't), it's the most
appropriate thing we can do2. Other options would be:
- use Thread.yield(), with the proviso that its precise operation varies from system to system and even between Hotspot versions (see the aforementioned link for more details);
- allow the reading thread to be notified when the profiling data is ready: for our
purposes— in principle, an elegant solution, but probably overengineering for a situation that will rarely occur.
In terms of locking, the method that reads the data (called in a
different thread) will perform a similar operation, sleeping for 1 millisecond if it
can't acquire the lock immediately. So now let's consider how we'll read the data
and present it back to the caller.
And now for... actually reading the data...!
We need to make sure that the caller getting the profile data sees a consistent "view".
Although we've acquired a lock on the data map overall, if we simply pass back
mutable objects (such as instances of our ProfileInfo class)
from that map to the caller, then the caller risks seeing a
changing or corrupted version of a particular piece of data (count). Apart from just
being plain wrong, this will create unpredictable effects when, say, trying to
order the profile data by count.
A better solution is to return a copy of the data, and in so doing we
can actually make the task of ordering the data easier. So we create a
comparable class called LineAndCount:
public static class LineAndCount implements Comparable<LineAndCount> {
private final String className, methodName;
private final int lineNo, tickCount;
private LineAndCount(StackTraceElement lineInfo, ProfileInfo count) {
this.className = lineInfo.getClassName();
this.methodName = lineInfo.getMethodName();
this.lineNo = lineInfo.getLineNumber();
this.tickCount = count.tickCount;
}
public String getClassName() { return className; }
// ... plus other accessor methods ...
public int compareTo(LineAndCount o) {
return (tickCount < o.tickCount) ? -1 :
((tickCount > o.tickCount) ? 1 : 0);
}
public String toString() {
return String.format("%5d %s.%s:%d", tickCount, className,
methodName, lineNo);
}
}
Now our profiler can provide a method that returns a list of
LineAndCount objects. We put this method inside our profiler class,
ready to be called from another thread that wants to read the profiling
data:
public List getProfile() {
while (!lock.compareAndSet(0, 1)) {
try {
Thread.sleep(1);
} catch (InterruptedException iex) {
Thread.currentThread().interrupt();
throw new RuntimeException("Unexpected interruption waiting for lock");
}
}
List res;
try {
res = new ArrayList(lineCounts.size());
for (Map.Entry e : lineCounts.entrySet()) {
StackTraceElement line = e.getKey();
ProfileInfo profCnt = e.getValue();
res.add(new LineAndCount(line, profCnt));
}
} finally {
lock.set(0);
}
Collections.sort(res);
return res;
}
And now the caller, to display a list of the 40 most time-consuming points
of execution, can call:
List<ThreadProfiler.LineAndCount> counts = prof.getProfile();
int sz = counts.size();
int noCounts = Math.min(40, sz);
for (int i = 1; i <= noCounts; i++) {
LineAndCount lc = counts.get(sz-i);
System.out.println(toString());
}
And there we have a very basic profiler. Some enhancements to think about:
- when deciding on the "interesting" stack trace element, we could skip past
classes in the API library (i.e. not ones in our application's outer package);
- we could take into account CPU time, and list a profile showing methods/lines
by actual CPU time consumed as opposed to "raw" time.
- we wouldn't want the map of profiling information to grow unbounded
(yes, I have crashed my own server by letting it do this). Possible solutions are
to dump and clear the data every now and then, or check for the map getting
to a certain size and then clearing out lines with counts below a certain
threshold.
1. As of Java 5, it is possibly to request a sleep of less than a second,
but as far as I'm aware no current mainstream OS can grant less than a second. On the
other hand, Windows and Linux generally can grant a sleep of one second.
For more details, see the section on Thread.sleep().
2. Interestingly, and depending on the platform,
a CAS operation may be quite a bit slower than a "normal" data read.
So just possibly we may actually be better off repeating the CAS a few times before sleeping. I've not
actually profiled this and am essentially guestimating that sleeping for a second is
the appropriate action, given that iterating through a large-ish set of profiling data
isn't completely trivial. This is a case where a sophisticated synchronized
implementation that did such profiling on the fly could possibly work in our favour.
If you enjoy this Java programming article, please share with friends and colleagues. Follow the author on Twitter for the latest news and rants.
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.