Mapped buffers (file mapping)
The NIO framework provides the facility to map sections of a file
to a ByteBuffer. Once the mapping is set up, reading from the buffer
automatically reads data from the corresponding section of the file, and
writing to the buffer similarly results in the contents of the file being
updated accordingly.
This facility to create a mapped byte buffer can be useful in cases
where one or more of the following are true:
- you need random access to sections of the file and are going
to read/write data of different types (in other words, where the
various get/put methods of ByteBuffer are convenient for accessing
the data in the file);
- at the time of setting up the mapping, there's a chance that only
a small portion of the mapped area (or indeed none of it) will be accessed;
- you don't want file data to reside on the Java heap and prefer to
take advantage of the virtual memory system.
Note that there isn't necessarily any benefit in using a mapped byte
buffer in terms of "raw" read/write performance. For example, if you are simply
reading or writing serially through a file, then in terms of performance, you
generally may as well use a plain old FileInputStream or
FileOutputStream (with appropriate buffering, of course). Similarly,
accessing a file via a mapped buffer doesn't appear, at least under Windows,
to have any adverse impact on the performance of file accesses by other means.
For example, if you (a) map the contents of a file, (b) read the data via
the mapped buffer, then (c) read that same file via a normal FileInputStream,
the "normal" read in (c) will generally benefit from the caching that
occurs in (b).
Therefore, the decision to use a mapped byte buffer should really be made
on the basis of the functionality you require: is accessing the file data
via a ByteBuffer the most convenient means for your application?
Is it desirable for your application to take advantage of the
virtual memory system (mapped buffers, not being part of the Java heap,
are eligible to be paged in/out as memory permits), or is more consistent
access time beneficial to your application?
How to use a mapped ByteBuffer
If you decide that a mapped buffer suits your application, then to set one up,
you first need to obtain a FileChannel. Recall that to do so,
you first open the file via one of the regular means (RandomAccessFile,
FileInputStream or FileOutputStream), then call the
getChannel() method. Finally, you call FileChannel.map()
as follows:
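File f = ...
// Open in "rw" mode so that the channel supports a READ_WRITE mapping
RandomAccessFile raf = new RandomAccessFile(f, "rw");
FileChannel fc = raf.getChannel();
// MapMode is the nested class FileChannel.MapMode; offset and len are in bytes
ByteBuffer fileArea = fc.map(FileChannel.MapMode.READ_WRITE, offset, len);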
Now, you can essentially use the get/put methods of fileArea just
as you would with any ByteBuffer.
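As a rough illustration of the random-access, mixed-type case, here is a minimal sketch. The record layout (an int id followed by a double value), the class name MappedRecordDemo and the temporary file are all assumptions made for this example rather than anything prescribed by NIO. The sketch writes one record through a READ_WRITE mapping and reads it back through a second, READ_ONLY mapping of the same region.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

class MappedRecordDemo {
    // Hypothetical fixed-size record for this sketch: int id followed by double value.
    static final int RECORD_SIZE = Integer.BYTES + Double.BYTES;

    /** Writes one record through a READ_WRITE mapping, then reads it back via a second mapping. */
    static double writeThenRead(Path file, int recordCount, int index, int id, double value) {
        long len = (long) recordCount * RECORD_SIZE;
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
             FileChannel fc = raf.getChannel()) {
            MappedByteBuffer area = fc.map(FileChannel.MapMode.READ_WRITE, 0, len);
            int pos = index * RECORD_SIZE;
            // Absolute puts: jump straight to the record without moving the buffer position.
            area.putInt(pos, id);
            area.putDouble(pos + Integer.BYTES, value);
            area.force(); // ask for the changes to be flushed to the file

            // A second mapping of the same region, made within the same process.
            MappedByteBuffer view = fc.map(FileChannel.MapMode.READ_ONLY, 0, len);
            if (view.getInt(pos) != id) {
                throw new IllegalStateException("unexpected id in mapped record");
            }
            return view.getDouble(pos + Integer.BYTES);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Runs the sketch against a temporary file (an assumption of this example). */
    static double demo() {
        try {
            Path tmp = Files.createTempFile("mapped-records", ".bin");
            // Deletion is best-effort: it may fail while a mapping of the file still exists.
            tmp.toFile().deleteOnExit();
            return writeThenRead(tmp, 10, 3, 42, 3.5);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The absolute get/put variants used here take an explicit byte index, which tends to suit random access better than the relative variants that advance the buffer's position.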
Instead of MapMode.READ_WRITE, you can specify one of two other
mapping modes: MapMode.READ_ONLY or MapMode.PRIVATE.
A read-only mapping is self-explanatory; a private mapping is writable,
but writes are reflected only in that MappedByteBuffer's contents (and
are not written to the file). If you request a writable
mapping (including a private one!),
then the means by which the file was opened
must also support writing (so, for example, you won't get a writable mapping
from a channel opened via a FileInputStream!).
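The difference between the modes can be sketched as follows. This is only an illustration: the class name MapModeDemo, the four-byte test file and the temporary file names are assumptions of the example. The first method writes through a PRIVATE mapping and then checks what is actually in the file; the second requests a writable mapping from a channel obtained from a FileInputStream, which map() is documented to reject with a NonWritableChannelException.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.NonWritableChannelException;
import java.nio.file.Files;
import java.nio.file.Path;

class MapModeDemo {
    /** Creates a small temporary file containing the bytes A, B, C, D. */
    static Path sampleFile() throws IOException {
        Path tmp = Files.createTempFile("map-mode", ".bin");
        tmp.toFile().deleteOnExit(); // best-effort clean-up
        Files.write(tmp, new byte[] {'A', 'B', 'C', 'D'});
        return tmp;
    }

    /** Writes 'X' via a PRIVATE mapping and returns the first byte as stored in the file. */
    static byte firstByteAfterPrivateWrite() {
        try {
            Path file = sampleFile();
            try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw");
                 FileChannel fc = raf.getChannel()) {
                MappedByteBuffer priv = fc.map(FileChannel.MapMode.PRIVATE, 0, 4);
                priv.put(0, (byte) 'X'); // visible in this buffer only
                if (priv.get(0) != 'X') {
                    throw new IllegalStateException("private write not visible in buffer");
                }
                return Files.readAllBytes(file)[0]; // what the file itself holds
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if a READ_WRITE mapping is refused on a channel opened for reading only. */
    static boolean writableMappingRejected() {
        try {
            Path file = sampleFile();
            try (FileInputStream in = new FileInputStream(file.toFile());
                 FileChannel fc = in.getChannel()) {
                fc.map(FileChannel.MapMode.READ_WRITE, 0, 4);
                return false;
            } catch (NonWritableChannelException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println((char) firstByteAfterPrivateWrite());
        System.out.println(writableMappingRejected());
    }
}
```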
Characteristics of mapped ByteBuffers
Here are some other characteristics of mapped buffers that you should
consider (these statements are true under Windows; I welcome feedback about
other operating systems):
- multiple mappings of the same section of a given file will generally
be consistent with one another within your Java application (i.e. within
your application's process), but not necessarily across processes (see note 1);
- mapped buffers will not necessarily be consistent with
modifications made by other means (e.g. via regular stream access) (see note 2);
- mapping a section of a file beyond its length then writing to the
buffer will result in the file growing accordingly;
- mappings continue to exist even when the parent channel is closed;
- the ByteBuffer returned by the FileChannel.map()
method is actually a MappedByteBuffer, which provides a couple of
additional calls (see below);
- a mapping remains in place until its MappedByteBuffer is
garbage collected;
- after a mapping is set up, the corresponding data isn't necessarily
paged into physical memory until it is explicitly accessed;
the MappedByteBuffer.load() method touches just enough of the
data to ensure that it is all paged in;
- writes made to a writable buffer are not necessarily written immediately
to the file; the MappedByteBuffer.force() method flushes any changes
(and any corresponding metadata changes) to the file.
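A few of the points above (the mapping outliving its channel, and the load() and force() calls) can be sketched in a short example. The class name MappedLifecycleDemo, the 4096-byte region and the temporary file are assumptions chosen for illustration, and whether load() actually brings pages into memory depends on the operating system.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

class MappedLifecycleDemo {
    /** Maps a region, closes the channel, then keeps using the mapping. */
    static int demo() {
        try {
            Path tmp = Files.createTempFile("mapped-lifecycle", ".bin");
            tmp.toFile().deleteOnExit(); // best-effort clean-up

            MappedByteBuffer buf;
            try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "rw");
                 FileChannel fc = raf.getChannel()) {
                // The file starts empty, so this maps a region beyond its current length.
                buf = fc.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            }
            // The channel and file are now closed, but the mapping is still usable.

            buf.load();          // request that the mapped pages be brought into memory
            buf.putInt(0, 1234); // write through the mapping
            buf.force();         // request that the change be flushed to the file
            return buf.getInt(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that isLoaded(), the companion to load(), is only a hint: by the time it returns, the operating system may already have paged some of the data out again.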
1. Windows actually provides a facility for mappings to be
made consistent across processes (by using the same handle returned by
CreateFileMapping from the different processes),
but Java provides no access to this facility. I imagine that the designers
judged that this facility is not necessarily available on other OSs and
probably rarely needed.
2. Hitchens, R. (2002), Java NIO (O'Reilly) states that: "Calling
get() will fetch data from the disk file, and this data reflects the
current content of the file, even if the file has been modified by an external
process since the mapping was established" (p. 80). However, this seems to
contradict both the point made in footnote 1 and
the documentation of the Windows CreateFileMapping API call, which
states quite clearly that "A mapped file and a file that is accessed
by using the input and output (I/O) functions (ReadFile and WriteFile) are not necessarily coherent."
Editorial page content written by Neil Coffey. Copyright © Javamex UK 2021. All rights reserved.