Java file memory mapped file

JAVA NIO – Memory-Mapped File

In Java’s earlier version, It uses FileSystem conventional API to access system file. Behind the scene JVM makes read () and write () system call to transfer data from OS kernel to JVM. JVM utilize it memory space to load and process file that causes trouble on large data file processes. File page convert to JVM system before processing the file data because OS handle file in page but JVM uses byte stream that is not compatible with page. In JDK version 1.4 and above Java provides MappedByteBuffer which help to establish a virtual memory mapping from JVM space to filesystem pages. This removes the overhead of transferring and coping the file’s content from OS kernel space to JVM space. OS uses Virtual Memory to cache the file in outside of Kernel space that could be sharable with other non-kernel process. Java maps the File pages to MappedByteBuffer directly and process these file without loading into JVM.

Belo diagram show how JVM mapped the file with MappedByteBuffer and process the file without loading the file in JVM.

JAVA NIO

MappedByteBuffer directly map with open file in Virtual Memory by using map method in FileChannel. The MappedByteBuffer object work like buffer but its data stored in a file on Virtual Memory. The get() method on MappedByteBuffer fetch the data from file which will represent current file data stored inside disk. In similar way put () method update the content directly on disk and modified content will be visible to other readers of the file. Processing file through MappedByteBuffer has big advantage because it doesn’t make any system call to read/write on file that improve the latency. Apart from that File in Virtual Memory cache the memory pages that will directly access by MappedByteBuffer and doesn’t consumes JVM space. The only drawback is the it throw page fault if requested page is not in Memory.

Читайте также:  You have one in your php

Below some of the key benefits of MappedByteBuffer:

  1. The JVM directly process on Virtual Memory hence it will avoid system read () and write() call.
  2. JVM doesn’t load file in its memory besides it uses Virtual Memory that bring the ability to process large data in efficient manner.
  3. OS mostly take care of reading and writing from shared VM without using JVM
  4. It could be used more than one process based on locking provided by OS. We will be discussing locking in later.
  5. This also provides ability map a region or part of file.
  6. The file data always mapped with disk file data without using buffer transfer.

Note: JVM invokes the shared VM, page fault generated that push the file data from OS to shared memory. The other benefit of memory mapping is that operating system automatically manages VM space.

public abstract class FileChannel extends AbstractInterruptibleChannel implements SeekableByteChannel, GatheringByteChannel, ScatteringByteChannel < public abstract MappedByteBuffer map(MapMode mode, long position, long size) throws IOException; public static class MapMode < >> public static final MapMode READ_ONLY public static final MapMode READ_WRITE public static final MapMode PRIVATE

As per above code map () method on FileChannel pass the arguments as mode, position and size and return MappedByteBuffer for part of file. We can also map entire file by using as below

buffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileChannel.size());

If you increase the size larger than file size then file will also become large to match the size of MappedByteBuffer in write mode but throw IOException in read mode because file can not modified in read mode. There are three type of mapping mode in map() method

  • READ_ONLY: Read only
  • READ_WRITE: Read and update the file
  • PRIVATE: Change in MappedByteBuffer will not reflect to File.
Читайте также:  METANIT.COM

MappedByteBuffer will also not be visible to other programs that have mapped the same file; instead, they will cause private copies of the modified portions of the buffer to be created.

Map()method will throw NonWritableChannelException if it try to write on MapMode.READ_WRITE mode. NonReadableChannelException will be thrown if you request read on channel not open on read mode.

Below sample code show how to use MappedByteBuffer

public class MemoryMappedSample < public static void main(String[] args) throws Exception < int count = 10; RandomAccessFile memoryMappedFile = new RandomAccessFile("bigFile.txt", "rw"); MappedByteBuffer out = memoryMappedFile.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, count); for (int i = 0; i < count; i++) < out.put((byte) 'A'); >for (int i = 0; i < count; i++) < System.out.print((char) out.get(i)); >> >

Источник

Java MappedByteBuffer

Learn about Java memory-mapped files and learn to read and write content from a memory mapped file with the help of RandomAccessFile and MemoryMappedBuffer.

If you know how java IO works at lower level, then you will be aware of buffer handling, memory paging and other such concepts. For conventional file I/O, in which user processes issue read() and write() system calls to transfer data, there is almost always one or more copy operations to move the data between these filesystem pages in kernel space and a memory area in user space. This is because there is not usually a one-to-one alignment between filesystem pages and user buffers.

There is, however, a special type of I/O operation supported by most operating systems that allows user processes to take maximum advantage of the page-oriented nature of system I/O and completely avoid buffer copies. This is called memory-mapped I/O and we are going to learn few things here around memory-mapped files.

2. Java Memory-Mapped Files

Memory-mapped I/O uses the filesystem to establish a virtual memory mapping from user space directly to the applicable filesystem pages. With a memory-mapped file, we can pretend that the entire file is in memory and that we can access it by simply treating it as a very large array. This approach greatly simplifies the code we write in order to modify the file.

To do both writing and reading in memory mapped files, we start with a RandomAccessFile , get a channel for that file. Memory mapped byte buffers are created via the FileChannel.map() method. This class extends the ByteBuffer class with operations that are specific to memory-mapped file regions.

A mapped byte buffer and the file mapping that it represents remain valid until the buffer itself is garbage-collected. Note that you must specify the starting point and the length of the region that you want to map in the file; this means that you have the option to map smaller regions of a large file.

Example 1: Writing to a memory mapped file

import java.io.RandomAccessFile; import java.nio.MappedByteBuffer; import java.nio.channels.FileChannel; public class MemoryMappedFileExample < static int length = 0x8FFFFFF; public static void main(String[] args) throws Exception < try(RandomAccessFile file = new RandomAccessFile("howtodoinjava.dat", "rw")) < MappedByteBuffer out = file.getChannel() .map(FileChannel.MapMode.READ_WRITE, 0, length); for (int i = 0; i < length; i++) < out.put((byte) 'x'); >System.out.println("Finished writing"); > > >

The file created with the above program is 128 MB long, which is probably larger than the space your OS will allow. The file appears to be accessible all at once because only portions of it are brought into memory, and other parts are swapped out. This way a very large file (up to 2 GB) can easily be modified.

Like conventional file handles, file mappings can be writable or read-only.

  • The first two mapping modes, MapMode.READ_ONLY and MapMode.READ_WRITE, are fairly obvious. They indicate whether you want the mapping to be read-only or to allow modification of the mapped file.
  • The third mode, MapMode.PRIVATE, indicates that you want a copy-on-write mapping. This means that any modifications you make via put() will result in a private copy of the data that only the MappedByteBuffer instance can see. No changes will be made to the underlying file, and any changes made will be lost when the buffer is garbage collected. Even though a copy-on-write mapping prevents any changes to the underlying file, you must have opened the file for read/write to set up a MapMode.PRIVATE mapping. This is necessary for the returned MappedByteBuffer object to allow put()s.

You’ll notice that there is no unmap() method. Once established, a mapping remains in effect until the MappedByteBuffer object is garbage collected.

Also, mapped buffers are not tied to the channel that created them. Closing the associated FileChannel does not destroy the mapping; only disposal of the buffer object itself breaks the mapping.

A MemoryMappedBuffer has a fixed size, but the file it’s mapped to is elastic. Specifically, if a file’s size changes while the mapping is in effect, some or all of the buffer may become inaccessible, undefined data could be returned, or unchecked exceptions could be thrown.

Be careful about how files are manipulated by other threads or external processes when they are memory-mapped.

4. Benefits of Memory Mapped Files

Memory-Mapped IO have several advantages over normal I/O:

  1. The user process sees the file data as memory, so there is no need to issue read() or write() system calls.
  2. As the user process touches the mapped memory space, page faults will be generated automatically to bring in the file data from disk. If the user modifies the mapped memory space, the affected page is automatically marked as dirty and will be subsequently flushed to disk to update the file.
  3. The virtual memory subsystem of the operating system will perform intelligent caching of the pages, automatically managing memory according to system load.
  4. The data is always page-aligned, and no buffer copying is ever needed.
  5. Very large files can be mapped without consuming large amounts of memory to copy the data.

5. How to read a Memory-Mapped File

To read a file using memory mapped IO, use below code template:

import java.io.File; import java.io.RandomAccessFile; import java.nio.MappedByteBuffer; import java.nio.channels.FileChannel; public class MemoryMappedFileReadExample < private static String bigExcelFile = "bigFile.xls"; public static void main(String[] args) throws Exception < try (RandomAccessFile file = new RandomAccessFile(new File(bigExcelFile), "r")) < //Get file channel in read-only mode FileChannel fileChannel = file.getChannel(); //Get direct byte buffer access using channel.map() operation MappedByteBuffer buffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, 0, fileChannel.size()); // the buffer now reads the file as if it were loaded in memory. System.out.println(buffer.isLoaded()); //prints false System.out.println(buffer.capacity()); //Get the size based on content size of file //You can read the file from this buffer the way you like. for (int i = 0; i < buffer.limit(); i++) < System.out.print((char) buffer.get()); //Print the content of file >> > >

6. How to write into a Memory-Mapped File

To write data into a file using memory mapped IO, use below code template:

import java.io.File; import java.io.RandomAccessFile; import java.nio.MappedByteBuffer; import java.nio.channels.FileChannel; public class MemoryMappedFileWriteExample < private static String bigTextFile = "test.txt"; public static void main(String[] args) throws Exception < // Create file object File file = new File(bigTextFile); //Delete the file; we will create a new file file.delete(); try (RandomAccessFile randomAccessFile = new RandomAccessFile(file, "rw")) < // Get file channel in read-write mode FileChannel fileChannel = randomAccessFile.getChannel(); // Get direct byte buffer access using channel.map() operation MappedByteBuffer buffer = fileChannel.map(FileChannel.MapMode.READ_WRITE, 0, 4096 * 8 * 8); //Write the content using put methods buffer.put("howtodoinjava.com".getBytes()); >> >

Drop me your comments and thoughts in the comments section.

Источник

Power of Java MemoryMapped File

In JDK 1.4 an interesting feature of Memory mapped file was added to Java, which allows to map any file to OS memory for efficient reading. A memory mapped file can be used to develop an IPC type of solution. This article is an experiment with memory mapped file to create IPC.

Some details about Memory Mapped File, definition from WIKI

A memory-mapped file is a segment of virtual memory which has been assigned a direct byte-for-byte correlation with some portion of a file or file-like resource. This resource is typically a file that is physically present on-disk, but can also be a device, shared memory object, or other resource that the operating system can reference through a file descriptor. Once present, this correlation between the file and the memory space permits applications to treat the mapped portion as if it were primary memory.

Sample Program

Below we have two Java programs, one is a writer and the other is a reader. The writer is the producer and tries to write to Memory Mapped file, the reader is the consumer and it reads messages from the memory mapped file. This is just a sample program to show you the idea, it does’t handle many edge cases but it is good enough to build something on top of a memory mapped file.

MemoryMapWriter

import java.io.File; import java.io.FileNotFoundException; import java.io.IOException; import java.io.RandomAccessFile; import java.nio.MappedByteBuffer; import java.nio.channels.FileChannel; public class MemoryMapWriter < public static void main(String[] args) throws FileNotFoundException, IOException, InterruptedException < File f = new File("c:/tmp/mapped.txt"); f.delete(); FileChannel fc = new RandomAccessFile(f, "rw").getChannel(); long bufferSize=8*1000; MappedByteBuffer mem =fc.map(FileChannel.MapMode.READ_WRITE, 0, bufferSize); int start = 0; long counter=1; long HUNDREDK=100000; long startT = System.currentTimeMillis(); long noOfMessage = HUNDREDK * 10 * 10; for(;;) < if(!mem.hasRemaining()) < start+=mem.position(); mem =fc.map(FileChannel.MapMode.READ_WRITE, start, bufferSize); >mem.putLong(counter); counter++; if(counter > noOfMessage ) break; > long endT = System.currentTimeMillis(); long tot = endT - startT; System.out.println(String.format("No Of Message %s , Time(ms) %s ",noOfMessage, tot)) ; > >

MemoryMapReader

import java.io.File; import java.io.FileNotFoundException; import java.io.IOException; import java.io.RandomAccessFile; import java.nio.MappedByteBuffer; import java.nio.channels.FileChannel; public class MemoryMapReader < /** * @param args * @throws IOException * @throws FileNotFoundException * @throws InterruptedException */ public static void main(String[] args) throws FileNotFoundException, IOException, InterruptedException < FileChannel fc = new RandomAccessFile(new File("c:/tmp/mapped.txt"), "rw").getChannel(); long bufferSize=8*1000; MappedByteBuffer mem = fc.map(FileChannel.MapMode.READ_ONLY, 0, bufferSize); long oldSize=fc.size(); long currentPos = 0; long xx=currentPos; long startTime = System.currentTimeMillis(); long lastValue=-1; for(;;) < while(mem.hasRemaining()) < lastValue=mem.getLong(); currentPos +=8; >if(currentPos < oldSize) < xx = xx + mem.position(); mem = fc.map(FileChannel.MapMode.READ_ONLY,xx, bufferSize); continue; >else < long end = System.currentTimeMillis(); long tot = end-startTime; System.out.println(String.format("Last Value Read %s , Time(ms) %s ",lastValue, tot)); System.out.println("Waiting for message"); while(true) < long newSize=fc.size(); if(newSize>oldSize) < oldSize = newSize; xx = xx + mem.position(); mem = fc.map(FileChannel.MapMode.READ_ONLY,xx , oldSize-xx); System.out.println("Got some data"); break; >> > > > >

Observation

Using a memory mapped file can be a very good option for developing Inter Process communication, throughput is also reasonably well for both produce & consumer. Performance stats by run producer and consumer together:

Each message is one long number

Produce – 10 Million message – 16(s)

Consumer – 10 Million message 0.6(s)

A very simple message is used to show you the idea, but it can be any type of complex message, but when there is complex data structure then serialization can add to overhead. There are many techniques to get over that overhead. More in next blog.

Reference: Power of Java MemoryMapped File from our JCG partner Ashkrit Sharma at the Are you ready blog.

Источник

Оцените статью