Synchronization and thread safety in Java
As soon as we start using concurrent threads, we need to think about various
issues that fall under the broad description of thread safety.
Generally, we need to take steps to
make sure that different threads don't interact in negative ways:
- if one thread is operating on some data or structure, we don't want another thread
to simultaneously operate on that same data/structure and corrupt the results;
- when Thread A writes to a variable that Thread B accesses, we need to
make sure that Thread B will actually see the value written by Thread A;
- we don't want one thread to hog, take or lock for too long
a resource that other threads need in order to make progress.
The rest of this section gives an overview of techniques used to achieve
these goals. More detailed information is available in the sections referred to
under each heading below.
Synchronization and locking
A key element of thread safety is locking access to shared data while
it is being operated on by a thread. Perhaps the simplest way of doing this
(though not always the most versatile) is via the synchronized keyword.
The essential idea is that if we declare a method synchronized,
then while one thread is executing it, no other thread can execute that or
any other synchronized method on the same object. For example, this class
implements a thread-safe random number generator:
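public class RandomGenerator {
    // Seed from the clock (a seed of zero would make every result zero)
    private long x = System.nanoTime();

    public synchronized long randomNumber() {
        // One XORShift step: the three updates to x must run as a single unit
        x ^= (x << 21);
        x ^= (x >>> 35);
        x ^= (x << 4);
        return x;
    }
}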
If multiple threads concurrently try to call randomNumber(),
only one thread at a time will actually execute it; the others will effectively
"queue up" until it's their turn [1]. This is because the thread actually
executing the method at any given time owns a "lock" on the RandomGenerator
object it is being called on. For the generator to work, we need all three operations
on x to happen "atomically", i.e. without concurrent modifications to x
from another thread. Without the synchronization, we would have a data race:
two concurrent calls to randomNumber() could interfere with each other's
calculation of x.
Synchronizing also crucially means that the result of x calculated
in one thread is visible to other threads calling the method: see below.
It is also possible to synchronize any arbitrary
block of code on any given object: for more details, see the section on the
Java synchronized keyword.
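As a rough sketch of the block form, the following synchronizes on a private lock object rather than declaring whole methods synchronized. The HitCounter class and its runDemo() helper are hypothetical names invented for this illustration; they are not part of any library.

```java
// Hypothetical example class: the names are illustrative only.
class HitCounter {
    // Private lock object: code outside this class cannot synchronize on it.
    private final Object lock = new Object();
    private long hits; // only read or written while holding 'lock'

    void record() {
        synchronized (lock) {
            hits++; // read-modify-write, performed as one unit under the lock
        }
    }

    long total() {
        synchronized (lock) { // readers take the same lock as writers
            return hits;
        }
    }

    // Starts several threads that each call record() a fixed number of times,
    // waits for them all to finish, then returns the final count.
    static long runDemo(int threads, int callsPerThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    counter.record();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.total();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10_000)); // prints 40000
    }
}
```

Because every increment happens while holding the lock, no updates are lost. If the synchronized blocks were removed, the total could come out lower than expected.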
Explicit locks
The built-in synchronization mechanism has some limitations. For example, a thread
will potentially block forever waiting to acquire the lock on an object (e.g. if the thread
owning the lock gets into an infinite loop or blocks for some other reason). For more
fine-grained control over synchronization, Java 5 introduced explicit
Lock objects.
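One thing an explicit lock allows is giving up after a timeout instead of blocking indefinitely. The sketch below uses ReentrantLock and tryLock() from java.util.concurrent.locks; the TimedAccount class, its methods and the timeout values are assumptions made up for this example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class: the names are illustrative only.
class TimedAccount {
    private final Lock lock = new ReentrantLock();
    private long balance; // only read or written while holding 'lock'

    // Returns false if the lock could not be acquired within the timeout.
    boolean deposit(long amount, long timeoutMillis) {
        boolean acquired;
        try {
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
        if (!acquired) {
            return false; // gave up rather than blocking forever
        }
        try {
            balance += amount;
            return true;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    long balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    // Holds the lock in the calling thread while a second thread attempts a
    // deposit with a short timeout; returns that second thread's result.
    static boolean depositWhileLockHeld() {
        TimedAccount account = new TimedAccount();
        boolean[] result = new boolean[1];
        account.lock.lock();
        try {
            Thread other = new Thread(() -> result[0] = account.deposit(10, 50));
            other.start();
            other.join(); // after join(), the write to result[0] is visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            account.lock.unlock();
        }
        return result[0];
    }
}
```

Unlike synchronized, an explicit lock is not released automatically when the block exits, hence the try/finally around every locked region.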
Cooperation
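Whether via the synchronized keyword or an explicit Java Lock,
data sharing with object locking is a cooperative process: that is, all
code accessing (reading or writing) the dependent data must synchronize,
and must do so on the same lock. A lock does nothing to stop a thread that
accesses the data without acquiring it.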
Publication and data visibility
A common misconception is that thread safety is just about data races.
A less well understood issue is visibility. Ordinarily, writing
a value to some variable from Thread A doesn't guarantee that the new value will be immediately
visible from Thread B, or even visible at all. And accessing multiple variables doesn't
necessarily happen in "program order".
For various reasons
such as compiler optimisations and CPU memory cache behaviour
(see the section on processor architecture
and synchronization), we need to
explicitly deal with data visibility when data is to be accessed by
multiple threads and/or when order of access is important,
whether the access is concurrent or not. For example, we need to
take steps when:
- an object or value is created by one thread then used by another
(even where we don't expect the accesses to be concurrent);
- one thread uses a variable such as a flag to signal to another thread;
- we read from variable A, then write to variable B, and we strictly expect the first to
"happen before" the second.
In general, correct synchronization solves the visibility problem,
and many of the bugs that occur are when people think they've found a clever (but broken)
way to avoid synchronization.
For more details on visibility, see the section on
variable synchronization.
When Thread A exits a block synchronized on Object X,
and Thread B subsequently enters a block also synchronized on Object X, Thread B will see
the data as it was visible to Thread A when it exited the block.
Implicit in this description is that the synchronized blocks guarantee ordering:
data accesses from Thread A will strictly "happen before" data accesses in Thread B.
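To make this guarantee concrete, here is a minimal sketch in which one thread publishes a value and another reads it, with both sides synchronizing on the same object. The Handoff class and its method names are hypothetical, and the busy-wait loop is only there to keep the example short.

```java
// Hypothetical example class: the names are illustrative only.
class Handoff {
    private final Object lock = new Object();
    private String message; // only accessed while holding 'lock'
    private boolean ready;  // only accessed while holding 'lock'

    // Called by the writing thread ("Thread A").
    void publish(String text) {
        synchronized (lock) {
            message = text;
            ready = true;
        }
    }

    // Called by the reading thread ("Thread B"). Because it synchronizes on
    // the same lock, once it sees ready == true it also sees the message.
    String poll() {
        synchronized (lock) {
            return ready ? message : null;
        }
    }

    // Starts a reader that polls until a message arrives, publishes one from
    // the calling thread, and returns what the reader received.
    static String runDemo() {
        Handoff handoff = new Handoff();
        String[] received = new String[1];
        Thread reader = new Thread(() -> {
            String m;
            while ((m = handoff.poll()) == null) {
                Thread.yield(); // simple busy-wait, for illustration only
            }
            received[0] = m;
        });
        reader.start();
        handoff.publish("hello");
        try {
            reader.join(); // after join(), the write to received[0] is visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return received[0];
    }
}
```

If poll() read the fields without synchronizing, there would be no guarantee that the reader ever sees ready become true, or that it sees the message written before it.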
Java provides two other keywords, volatile and final,
that in the right circumstances can be used to guarantee visibility:
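- if all of the fields on an object are final, then that object
can be safely read from any thread without synchronization (provided that a
reference to it does not leak to other threads before its constructor completes,
and that any objects its fields refer to are not modified afterwards);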
- if a variable is declared volatile, then this signals that the variable
will be accessed by multiple threads, and also gives visibility guarantees:
for more details, see the section on the
Java volatile keyword, plus
our discussion of when to use volatile.
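As an illustration of the volatile case, the sketch below uses a volatile boolean as a stop flag that one thread sets and another polls. The StoppableWorker class, its method names and the five-second join timeout are assumptions made for this example.

```java
// Hypothetical example class: the names are illustrative only.
class StoppableWorker implements Runnable {
    // volatile: a write by one thread becomes visible to reads in another
    private volatile boolean stopRequested;
    private long iterations; // only ever touched by the worker thread

    @Override
    public void run() {
        while (!stopRequested) { // re-reads the flag on every pass
            iterations++;
        }
    }

    // Called from a different thread to ask the worker to finish.
    void requestStop() {
        stopRequested = true;
    }

    // Starts a worker, requests a stop, and reports whether it terminated
    // within the (arbitrarily chosen) timeout.
    static boolean runDemo() {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker);
        thread.start();
        worker.requestStop();
        try {
            thread.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }
}
```

Without volatile on the flag, the worker would not be guaranteed to see the update, and the loop could in principle run forever. Note that volatile alone would not make a compound operation such as iterations++ safe if several threads performed it.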
1. The word "turn" isn't quite appropriate: synchronized
does not provide "fairness" (i.e. the thread waiting longest isn't necessarily
the next one to acquire the lock).
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.