The synchronized keyword in Java
What does "synchronization with main memory" mean?
For the sake of keeping descriptions short, I'm going to refer a few times
to "synchronizing" cached copies of variables with "main memory". Firstly,
by "main memory" we mean 'the Java heap, as seen by the JVM'. We don't mean–
and don't need to refer to– anything more technical, such as physical
RAM as opposed to a CPU cache. We make a distinction between this main memory
and other places where we can put values, notably (a) processor registers,
in native code produced by a JIT compiler; (b) the set of 'local variable space'
that is allocated to every method call; (c) other areas of working memory, not
part of the Java heap, that may be allocated locally to a particular thread or
thread stack. Now, as mentioned,
under normal circumstances the JVM can do a couple of interesting things with
variables. Chapter 17 of the Java Language Specification states these and
related conditions in more formal terms, albeit in a profoundly incomprehensible way.
I'll try to summarise them informally here:
- The JVM is generally free to work on a local copy of a variable. For
example, a JIT compiler could create code that loads the value of a Java variable
into a register and then works on that register. If this happens, other threads
will never see the updated value sitting in that register unless we tell the JVM
that they need to. (A minimal sketch of this problem follows after this list.)
- A JIT compiler (or, for that matter, the bytecode compiler) is generally free
to re-order bytecodes or instructions for optimisation purposes, provided that, as far
as the thread executing the code can tell, the overall logic of the program is not
affected. So, for example, it could delay writing the value from a register back to
the "main" copy of a variable belonging to a Java object.
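To make the first of these behaviours concrete, here is a minimal sketch of the kind of
visibility problem that register caching can cause when no synchronization is used (the
class and field names are hypothetical, invented for this illustration):

public class VisibilityExample {
    private boolean stopRequested = false;   // plain field: no visibility guarantee

    public void runWorker() {
        // Without synchronization (or volatile), the JIT is free to read
        // 'stopRequested' once into a register and re-use that cached value on
        // every iteration, effectively turning this into an infinite loop.
        while (!stopRequested) {
            doSomeWork();
        }
    }

    public void requestStop() {
        stopRequested = true;   // the worker thread may never observe this write
    }

    private void doSomeWork() { /* ... */ }
}

If both the read in the loop and the write in requestStop() were performed inside
synchronized blocks on the same lock (or, for a simple flag like this, if the field were
declared volatile), the JVM would be obliged to make the write visible to the reading
thread.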
The JVM specification effectively says that entering and exiting synchronized
blocks and methods have to act as a "safe barrier" to these caching and re-ordering
operations. If different threads read and write variables inside synchronized blocks,
we always expect Thread 1 to see the value set by Thread 2; just seeing
a locally cached copy in a register isn't correct. So on entry to and exit from a
synchronized block, the relevant reads/writes to main memory have to take place,
and they have to take place in the correct sequence. We can't re-order the
write to take place after we exit the synchronized block, and we can't
re-order the read to take place before we enter. In other words,
the JVM is not allowed to do this:
LOAD R0, [address of some Java variable] ; Cache a copy of the variable
enter-synchronized-block
ADD R0, #1 ; Do something with the (cached) copy of the variable
or this:
enter-synchronized-block
LOAD R0, [address of some Java variable] ; Cache a copy of the variable
MUL R0, #2 ; Do something with it
leave-synchronized-block
STORE R0, [address of variable] ; Write the new value back to the variable
It's possible to say all this in a very obtuse way (as I say, see Chapter 17 of
the language spec). But at the end of the day it's kind of common sense:
if the whole point of synchronization is to make sure all threads see the updated
"master" copy of variables, it's no use updating them after you've left the
synchronized block.
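In Java source terms, the barrier gives us something like the following (a minimal sketch;
the class, lock and field names are invented for this illustration). Because the read, the
update and the write-back of count all happen between entering and leaving the synchronized
block, the JVM must write the new value back to the "master" copy before the lock is
released, and must re-read it when the same lock is next acquired:

public class Counter {
    private final Object lock = new Object();
    private int count;

    public void increment() {
        synchronized (lock) {
            count = count + 1;   // read and write both happen inside the block
        }
    }

    public int get() {
        synchronized (lock) {    // the reader synchronizes too, so it sees the latest value
            return count;
        }
    }
}

Note that increment() and get() synchronize on the same lock: the visibility guarantees
only hold between blocks that use the same monitor.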
In some of the descriptions that follow, we'll
refer to "synchronizing cached variables with main memory" and sometimes refer to this
as being the source of an overhead. But in fact, as we've just seen, some of the overhead
is more subtle than this, and comes from the synchronization "barrier" preventing
optimisation (code re-ordering). We use the notion of "synchronization with main memory"
essentially to keep our descriptions shorter, but it's important to have seen what's
really going on. Looking at these details also shows why, without understanding them,
we might think that removing synchronization in certain cases will work when it's
actually incorrect.
A common, but incorrect, "optimisation" is to synchronize when writing to a
variable but not when reading it. This fails because, without synchronization on the read:
(a) the reading thread
is not guaranteed to refresh its working copy of the variable from main memory, so
may never actually see an updated value of the variable;
and (b) even if it does read from main memory, there is nothing to stop it reading
while the writing method is still in the middle of executing, before it has flushed the
new value back to main memory.
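A minimal sketch of this broken pattern (again, the class and field names are hypothetical):

public class BrokenCounter {
    private int count;

    public synchronized void increment() {
        count++;        // guarded write: fine on its own...
    }

    public int get() {
        return count;   // ...but this unguarded read has no visibility guarantee
    }
}

The usual fix is simply to declare get() as synchronized too; for a simple flag or
reference, declaring the field volatile would also restore visibility, though volatile
does not provide the mutual exclusion needed for compound actions such as count++.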
On the next page, we continue by looking at declaring a method as synchronized.