Pagefile
From ProPHOTO WIKI
Introduction
Nearly all modern operating systems use a virtual memory architecture, which provides a level of abstraction between software and the physical memory it runs in. This technique provides a number of benefits; the one relevant to this article is the ability of the OS to transparently deal with situations where it needs more memory than is physically available. It does this by swapping data between RAM and a special file on the hard drive called a page file.
Given the memory required by contemporary image processing software, the operation of this system is often a key determinant of overall system performance. Hard drives are a much slower storage medium than RAM, so swapping incurs a significant performance penalty whenever it occurs. As such, for many types of image processing one must be careful to select the right components to ensure that this penalty is as small as possible.
Virtual Memory
The detailed operation of page files is beyond the scope of this article, but the basic theory is relatively simple. The first principle to understand is that software doesn't typically have direct access to the data stored in memory. Instead, a program uses a set of virtual addresses to locate the data that it is working with. Each time the program requests a piece of data, the system uses a lookup table to translate that virtual address into the physical address where the data is actually stored.
This technique provides a number of advantages. One is that the operating system can change the physical location of data without the involvement of the software. Further, if there isn't enough RAM to store all of the data in use, the OS can shuffle blocks of data to other locations (such as the hard drive). This gives the system the flexibility to optimize the way its resources are used, and makes it possible to work with much larger data sets than would otherwise fit in memory.
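The lookup described above can be illustrated with a minimal sketch. The page size, the table contents, and the names `PAGE_SIZE`, `page_table`, and `translate` are all assumptions made for illustration; real systems perform this translation in hardware with more elaborate tables.

```python
PAGE_SIZE = 4096  # bytes per page; 4 KiB is a common size, assumed here

# Hypothetical lookup table: virtual page number -> physical frame number
page_table = {0: 5, 1: 2, 2: 9}


def translate(virtual_address: int) -> int:
    """Translate a virtual address into a physical address."""
    # Split the address into a page number and an offset within that page.
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    # Look up where that page currently lives in physical memory.
    frame_number = page_table[page_number]  # KeyError if the page is unmapped
    return frame_number * PAGE_SIZE + offset


# Virtual address 4100 is page 1, offset 4; page 1 maps to frame 2.
physical = translate(4100)  # 2 * 4096 + 4 = 8196
```

Because the program only ever sees the virtual address, the OS is free to change the table entry (and so the physical location) without the program noticing.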
Operation of the Pagefile
As noted above, when there is insufficient RAM to store all of the data that the computer must work with, it will temporarily move some of this data to the hard drive. This data is stored in a special system file called the pagefile. As the paging process is a significant bottleneck for many photographers, the capabilities of the physical disk that houses this file can have a major impact on overall performance. To understand these ramifications, however, it is important to understand how this system works.
As shown in figure 1, when a program requests a block of data that has been moved out of memory, it is interrupted and the OS then swaps the data back into memory. Once this process is complete, the OS then passes the program the requested data and it continues on as expected. As this process is performed transparently, the program doesn't require any special code to handle this scenario.
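The sequence described above can be sketched as a toy model. The class `TinyPager`, its least-recently-used eviction choice, and the use of dictionaries to stand in for RAM and the pagefile are assumptions made for illustration, not a description of how any particular OS is implemented.

```python
from collections import OrderedDict


class TinyPager:
    """Toy model: a fixed number of RAM frames backed by a 'pagefile'."""

    def __init__(self, ram_frames: int):
        self.ram_frames = ram_frames
        self.ram = OrderedDict()  # page -> data, least recently used first
        self.pagefile = {}        # pages that have been moved out of RAM
        self.faults = 0           # times a paged-out page had to be reloaded

    def write(self, page: int, data: str) -> None:
        self._make_resident(page)
        self.ram[page] = data

    def read(self, page: int) -> str:
        # The caller never sees whether the page had to be swapped in.
        self._make_resident(page)
        return self.ram[page]

    def _make_resident(self, page: int) -> None:
        if page in self.ram:
            self.ram.move_to_end(page)  # mark as most recently used
            return
        if page in self.pagefile:
            self.faults += 1  # the requested data had been paged out
        data = self.pagefile.pop(page, None)
        if len(self.ram) >= self.ram_frames:
            # RAM is full: move the least recently used page to the pagefile.
            victim, victim_data = self.ram.popitem(last=False)
            self.pagefile[victim] = victim_data
        self.ram[page] = data


pager = TinyPager(ram_frames=2)
pager.write(0, "a")
pager.write(1, "b")
pager.write(2, "c")      # RAM is full, so page 0 moves to the pagefile
value = pager.read(0)    # page 0 is swapped back in; page 1 is moved out
```

The calling code simply reads and writes pages; all of the swapping happens inside `_make_resident`, which mirrors the transparency described above.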
The problem, however, is that the hard drive takes much longer than main memory to retrieve this data. As such, when the above process takes place, the program accessing this data will see a significant loss of performance. Because of this, paging often becomes the primary bottleneck in photographic workflows, and it is a major factor to consider when designing a machine.
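To get a feel for the size of this penalty, the average cost of a memory access can be estimated as a weighted mix of fast RAM accesses and slow accesses that have to go to the pagefile. The figures below (100 ns for RAM, 8 ms to service a request from disk) are round illustrative assumptions, not measurements, and the function name is hypothetical.

```python
def effective_access_ns(fault_rate: float,
                        ram_ns: float = 100.0,
                        fault_ns: float = 8_000_000.0) -> float:
    """Average access time when a fraction of accesses must go to disk.

    ram_ns and fault_ns are assumed round figures for illustration only.
    """
    return (1.0 - fault_rate) * ram_ns + fault_rate * fault_ns


# If just 1 access in 1,000 has to be fetched from the pagefile:
average_ns = effective_access_ns(0.001)  # about 8,100 ns
slowdown = average_ns / 100.0            # roughly 81 times slower than RAM alone
```

Under these assumptions, even a very small fraction of accesses hitting the pagefile dominates the average, which is why paging is felt so strongly.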
Performance Considerations
Hard Drive Selection
Naturally, the most obvious factor in the performance of the pagefile is the overall performance of the disk(s) on which it is stored. The slower the drive, the more significant the performance penalty of the paging process will be. Unlike loading images, however, paging consists of loading and storing large numbers of small blocks of data. As such, factors like seek time can be as significant as bandwidth in many circumstances.
Spindle Speed and Seek Time
Hard drives consist of a number of mechanical components that must be repositioned to get data from a different area of the drive. This process naturally takes time, so when accessing small blocks of data distributed over the drive (which often occurs during paging) it can easily consume a significant percentage of the time taken to service each request.
Two main variables determine how long it takes to reach a given block: the time required to reposition the heads over a specific track (the seek itself), and the time it takes for the disc to rotate to the point where the required data on that track is located (the rotational latency). As the time taken to reposition the heads is relatively consistent, the primary variable that determines the overall delay is the rotational speed of the drive.
Typical desktop hard drives operate at 7,200 RPM, whereas high-end drives operate at between 10,000 and 15,000 RPM. Notebook hard drives often operate at 4,000-5,400 RPM, with high-end offerings operating at 7,200 RPM. As such, the use of these higher-performance drives can be a significant asset when used for pagefiles.
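The effect of spindle speed can be estimated with simple arithmetic: on average the disc has to turn half a revolution before the required data passes under the heads. The sketch below computes that average delay; the function name is hypothetical, and the figures cover only the rotational component, not head movement.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the data to rotate under the heads, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution  # on average, half a revolution


for rpm in (5_400, 7_200, 10_000, 15_000):
    print(f"{rpm:>6} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
# 5,400 RPM is about 5.56 ms, 7,200 RPM about 4.17 ms,
# 10,000 RPM is 3.00 ms, and 15,000 RPM is 2.00 ms.
```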
RAID 0
The use of a RAID 0 drive array for a pagefile can provide an improvement in overall performance by increasing the bandwidth available to the system. This bandwidth allows the OS to move pages in and out of memory faster than it would otherwise be able to do with a single drive.
It is important to note, however, that RAID 0 arrays provide no improvement in seek times (and, with some controllers, can actually make them worse), so the improvement will be nowhere near a doubling of pagefile performance. Further, if the array is used for anything other than the pagefile, it is important to understand the caveats of using these arrays (see the RAID 0 article).
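A rough model shows why the gain is limited for paging. Each request costs a fixed positioning delay plus the time to transfer the data, and an ideal two-drive RAID 0 array is assumed to double only the transfer rate. The 12 ms positioning delay, 60 MB/s transfer rate, and 64 KiB block size are illustrative assumptions, and the function name is hypothetical.

```python
def request_ms(size_bytes: int, bandwidth_mb_s: float,
               access_ms: float = 12.0) -> float:
    """Time to service one request: positioning delay plus transfer time."""
    transfer_ms = size_bytes / (bandwidth_mb_s * 1_000_000) * 1000.0
    return access_ms + transfer_ms


block = 64 * 1024                 # one small paging request (assumed size)
single = request_ms(block, 60.0)  # single drive
striped = request_ms(block, 120.0)  # ideal two-drive RAID 0: double bandwidth
speedup = single / striped        # about 1.04, far from 2

big = 256 * 1024 * 1024           # a large sequential transfer, for contrast
big_speedup = request_ms(big, 60.0) / request_ms(big, 120.0)  # close to 2
```

Under these assumptions the positioning delay dominates small requests, so doubling the bandwidth barely changes the total; only large sequential transfers approach the full benefit.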
Drive Contention
The other major variable in pagefile performance is contention for the hard drive used to store the file. If other processes are using the drive while the pagefile is active, the disk must split its resources between the two tasks. When this happens regularly, the overall performance of the system will be significantly lower than it would be under normal circumstances.
In addition to having to split the available bandwidth, contention also means that the heads must continually move back and forth between the two (or more) locations. This means that two processes competing for the same drive will often see less than half of the performance that they'd have on their own.
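The "less than half" effect can be illustrated with a simple model in which the drive alternates between processes, paying a head movement each time it switches. The 60 MB/s transfer rate, 12 ms repositioning delay, and 1 MB chunk size are assumptions chosen for illustration, and the function name is hypothetical.

```python
def shared_throughput_mb_s(bandwidth_mb_s: float, chunk_mb: float,
                           seek_ms: float, processes: int = 2) -> float:
    """Per-process throughput when a drive alternates between processes."""
    transfer_ms = chunk_mb / bandwidth_mb_s * 1000.0
    # One round serves a chunk for every process, with a seek before each.
    round_ms = processes * (seek_ms + transfer_ms)
    return chunk_mb / (round_ms / 1000.0)


alone = 60.0                                      # MB/s with the drive to itself
shared = shared_throughput_mb_s(60.0, 1.0, 12.0)  # about 17.4 MB/s each
fair_half = alone / 2                             # 30 MB/s if sharing were free
```

In this model each process receives well under the 30 MB/s that an even split of bandwidth would suggest, because part of every round is spent moving the heads rather than transferring data.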
Photoshop Scratch Disk
The primary cause of contention with the pagefile is the Photoshop scratch disk. The scratch disk performs a similar function to the pagefile, so when Photoshop is being pushed to its limit both files will generally be under heavy load at the same time. As such, if both are placed on the same physical disk, they will have to fight for the same set of resources. Placing these two files on separate disks is often one of the simplest things that can be done to improve Photoshop performance.
See Also
- Computers Portal - General articles on various computer related topics.
- Computer Optimization - Article covering the overall performance of computer systems designed for photographers.
- Memory - Article covering the details of different forms of system memory.
- Hard Drives - Details on evaluating hard drives and the meaning of the various component specifications.
