This document discusses Memory Mapped File I/O.

When a user starts an application, the OS creates a new process to execute it; in every modern OS, the process has its own virtual address space (VAS). In 32-bit architectures, the VAS of any process is limited to 2^32 bytes, i.e. 4GB. In 64-bit architectures the theoretical limit is 2^64 bytes, around 17 million TB; the practical limits, however, are currently much smaller: 2^48 bytes (256TB) on Linux and 2^44 bytes (16TB) on Windows, for example. Each individual address in a process VAS can point to a byte value, although initially none of them do.
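The limits quoted above are all powers of two, so they can be checked with a few lines of arithmetic. The following is only an illustrative sketch in Python; the helper name `vas_size` is hypothetical, and it assumes the figures are expressed in binary units (1 GB = 2^30 bytes, 1 TB = 2^40 bytes).

```python
GIB = 2 ** 30  # bytes in one binary gigabyte
TIB = 2 ** 40  # bytes in one binary terabyte


def vas_size(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


print(vas_size(32) // GIB, "GB")  # 4 GB: the 32-bit limit
print(vas_size(48) // TIB, "TB")  # 256 TB: the Linux figure quoted above
print(vas_size(44) // TIB, "TB")  # 16 TB: the Windows figure quoted above
print(vas_size(64) // TIB, "TB")  # 16777216 TB: the theoretical 64-bit limit
```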
It is useful to think of the memory available to the process in terms of its VAS, and not of the system's physical memory or page file. A system may have xxx MB of physical system memory and a yyy GB page file, but what matters is that the application process has a fixed-size VAS. The values addressed by the VAS are not ultimately stored in RAM; rather, they always originate from byte values in a file on disk. This is the crucial feature of virtual memory systems and of memory-mapped file I/O. The OS transparently and (hopefully) efficiently manages the mapping between the VAS and the files that hold its values, usually on a page-by-page basis.
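To make "page-by-page" concrete, the sketch below splits an address into a page number and an offset within that page. It is only an illustration: `page_of` is a hypothetical helper, and the page size is whatever Python's `mmap.PAGESIZE` reports on the machine at hand (often 4096 bytes, but that is not guaranteed).

```python
import mmap

PAGE = mmap.PAGESIZE  # page size reported by the OS


def page_of(address):
    """Return (page number, offset within the page) for a virtual address."""
    return divmod(address, PAGE)


# Two addresses on the same page share a page number and differ only in offset.
print(page_of(0))
print(page_of(PAGE + 5))
```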
Physical memory comes in various flavors: system ROM, on-chip cache, off-chip cache, L3 cache, and RAM. As far as a running process is concerned, RAM is just another level of cache used by the OS: an L4 cache, so to speak. Thus, available RAM has a lot to do with performance, but nothing to do with how much memory is seen by a process, the latter being dependent on the VAS and nothing else. Physical memory is used by the OS to quickly map values coming from file bytes into VAS addresses. Process memory is VAS memory, not physical memory.
When a process starts, none of the addresses in its VAS refer to a value. Reading or writing any address in the VAS would immediately cause a memory exception. In the following diagram, the VAS goes from 0 through # (which represents the architecture's addressing limit):
    address  0                                              #
    VAS      |----------------------------------------------|
First the application's executable file “app1” is mapped into the VAS. The OS maps addresses in the process VAS to bytes in the executable file, represented as v's in the diagram. The remaining -'s in the VAS line have no values assigned to them (they're truly null pointers, so to speak):
    address  0                                              #
    VAS      |---vvvv---------------------------------------|
    mapping      ||||
    disk         app1
Then the required library files (DLLs on Windows) are mapped. This includes private user space libraries, such as face.dll in this example, as well as public system libraries such as kernel32.dll:
    address  0                                              #
    VAS 1    |---vvvv-------vvvvvv---vvvv-------------------|
    mapping      ||||       ||||||   ||||
    disk         app1       kernel   face
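On Linux, a process can inspect these mappings itself by reading /proc/self/maps, where each line describes one mapped address range and, if there is one, the file behind it. The sketch below is a minimal illustration, not a complete parser: the helper names `parse_maps_line` and `mapped_files` are hypothetical, and on systems without /proc the second function simply returns an empty set.

```python
import os


def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into (start, end, perms, path)."""
    # Field layout: address-range perms offset dev inode [pathname]
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return start, end, fields[1], path


def mapped_files(maps_path="/proc/self/maps"):
    """Return the set of file paths mapped into this process (Linux only)."""
    if not os.path.exists(maps_path):
        return set()  # no /proc here, e.g. on Windows or macOS
    with open(maps_path) as f:
        paths = (parse_maps_line(line)[3] for line in f)
        return {p for p in paths if p.startswith("/")}


# On Linux this typically lists the interpreter binary and its shared libraries.
for path in sorted(mapped_files()):
    print(path)
```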
The process then begins running the code in the executable file. The only way the running code in the process can read or write values in its VAS is to have them mapped to bytes from a file. This is a crucial point. Allocating memory with C's malloc, for example, implicitly maps bytes of the system page file into the VAS: this is the most common way to use memory on an OS with virtualized memory addressing (practically all modern OSes fall into this category).
The page file is usually a single file, but multiple sets of contiguous bytes can be mapped into different segments of a VAS:
    address  0                                              #
    VAS 1    |---vvvv-------vvvvvv---vvvv----------v---v-vv-|
    mapping      ||||       ||||||   ||||          |   | ||
    disk         app1       kernel   face          page_file
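The same kind of mapping can be requested explicitly. In Python, passing -1 as the file descriptor to `mmap.mmap` asks the OS for an anonymous region, that is, memory with no named file behind it; when and whether its pages are written to the page file (or swap) is left to the OS. This is a sketch of the idea only, not of how malloc is implemented.

```python
import mmap

# Two independent anonymous regions, each one page long.
a = mmap.mmap(-1, mmap.PAGESIZE)
b = mmap.mmap(-1, mmap.PAGESIZE)

# Each region behaves like a mutable byte array living in the process VAS.
a[0:5] = b"hello"
b[0:5] = b"world"

first, second = a[0:5], b[0:5]
untouched = a[5]  # anonymous memory starts out zero-filled

a.close()
b.close()

print(first, second, untouched)  # b'hello' b'world' 0
```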
Different segments of the system page file usually map into the VAS of multiple concurrent processes:
    address  0                                              #
    VAS 1    |---vvvv-------vvvvvv---vvvv----------v---v-vv-|
    mapping      ||||       ||||||   ||||          |   | ||
    disk         app1 app2  kernel   face          page_file
    mapping           ||||  ||||||                  ||  |
    VAS 2    |--------vvvv--vvvvvv------------------vv--v---|
A process can also explicitly map a data file into its VAS. This is what is called memory-mapping a file. Once this is done, the running code in the process can access the bytes in the file as if they were in memory, without needing to execute fread, fseek, or fwrite calls. Changes made to bytes in the VAS will be reflected in the mapped file on disk under the responsibility of the OS virtual memory system, which is a thoroughly tested piece of code, highly optimized for speed and reliability, in both Linux and Windows Server OS.
Note that this does not imply that the whole file is loaded into RAM. In fact, it is likely that no part of the file has been loaded into RAM yet: the OS will load into RAM only those pages that are touched (i.e., referenced) by the process code:
    address  0                                              #
    VAS 1    |---vvvv-------vvvvvv---vvvv--vvvv----v---v-vv-|
    mapping      ||||       ||||||   ||||  ||||    |   | ||
    disk         app1 app2  kernel   face  data    page_file
    mapping           ||||  ||||||                  ||  |
    VAS 2    |--------vvvv--vvvvvv------------------vv--v---|
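A minimal sketch of memory-mapping a data file, using Python's standard `mmap` module, might look as follows. The file name data.bin and its contents are hypothetical and are created here only so that the example is self-contained. The mapped bytes are read and written with ordinary slicing, and the change is then visible through a normal file read.

```python
import mmap
import os
import tempfile

# Hypothetical data file, created only for this example.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"0123456789" * 100)

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:  # length 0 maps the whole file
        head = m[:10]      # read bytes as if they were in memory
        m[0:4] = b"ABCD"   # write bytes the same way
        m.flush()          # ask the OS to write the changed pages back

# An ordinary read of the file now sees the change made through the mapping.
with open(path, "rb") as f:
    on_disk = f.read(10)

print(head)     # b'0123456789'
print(on_disk)  # b'ABCD456789'
```

The explicit `flush` is there only to make the write-back point visible in the example; the OS would otherwise write the changed pages back on its own schedule.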
If another process maps the same file into its own VAS, then the contents of the mapped file will be shared among all the processes which map it. This is a very straightforward way to achieve shared memory among concurrent processes:
    address  0                                              #
    VAS 1    |---vvvv-------vvvvvv---vvvv--vvvv----v---v-vv-|
    mapping      ||||       ||||||   ||||  ||||    |   | ||
    disk         app1 app2  kernel   face  data    page_file
    mapping           ||||  ||||||         ||||     ||  |
    VAS 2    |--------vvvv--vvvvvv---------vvvv-----vv--v---|
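The sharing can be sketched with two independent mappings of the same file. To keep the example short and self-contained, both mappings are created in a single process here, standing in for the two processes in the diagram; the file name shared.bin is hypothetical. Bytes written through one mapping are read back through the other without any explicit file I/O.

```python
import mmap
import os
import tempfile

# Hypothetical shared file, one page of zero bytes.
path = os.path.join(tempfile.mkdtemp(), "shared.bin")
with open(path, "wb") as f:
    f.write(bytes(mmap.PAGESIZE))

# Two separate file handles and two separate mappings of the same file.
f1 = open(path, "r+b")
f2 = open(path, "r+b")
writer = mmap.mmap(f1.fileno(), 0)
reader = mmap.mmap(f2.fileno(), 0)

writer[0:5] = b"hello"  # store through the first mapping
seen = reader[0:5]      # load through the second mapping

for resource in (writer, reader, f1, f2):
    resource.close()

print(seen)
```

In a real multi-process setup the processes would still need some agreed way to coordinate their reads and writes, which this sketch does not attempt to show.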
If the files are stored on a SAN, then processes on different computers can share memory without further ado:
    address  0                                              #
    VAS 1    |---vvvv-------vvvvvv---vvvv--vvvv----v---v-vv-|
    mapping      ||||       ||||||   ||||  ||||    |   | ||
    disk         app1 app2  kernel   face  data    page_file
    SAN               ||||  ||||||         ||||     ||  |
    network  ................................................
    mapping           ||||  ||||||         ||||     ||  |
    VAS 2    |--------vvvv--vvvvvv---------vvvv-----vv--v---|