UVA Solution: What is page thrashing?

Some operating systems (such as UNIX, or Windows in 386 enhanced mode) use virtual memory. Virtual memory is a technique for making a machine behave as if it had more memory than it really has, by using disk space to simulate RAM (random-access memory). The 80386 and higher Intel CPU chips, and most other modern microprocessors (such as the Motorola 68030, SPARC, and PowerPC), contain a piece of hardware called the Memory Management Unit, or MMU.

The MMU treats memory as if it were composed of a series of “pages.” A page of memory is a block of contiguous bytes of a fixed size, usually 4096 or 8192 bytes. For each running program, the operating system sets up and maintains a table called the Process Memory Map, or PMM (more commonly known as a page table). This table lists all the pages of memory that the program can access and where each one is really located.
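To make the idea of pages concrete, here is a minimal sketch of how an address splits into a page number and an offset within that page. It assumes the 4096-byte page size mentioned above; the name `split_address` is made up for illustration and is not part of any real MMU interface.

```python
PAGE_SIZE = 4096  # bytes; one of the typical page sizes mentioned above


def split_address(virtual_address, page_size=PAGE_SIZE):
    """Return (page_number, offset) for a virtual address.

    Hypothetical helper: real hardware does this with bit shifts and masks,
    which gives the same result when the page size is a power of two.
    """
    return divmod(virtual_address, page_size)


page, offset = split_address(0x12345)
print(hex(page), hex(offset))  # prints: 0x12 0x345
```

With 4096-byte pages, the low 12 bits of the address are the offset and the remaining bits select the page.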

Every time your program accesses any portion of memory, the address (called a “virtual address”) is processed by the MMU. The MMU looks in the PMM to find out where the memory is really located (called the “physical address”). The physical address can be any location in memory or on disk that the operating system has assigned for it. If the location the program wants to access is on disk, a “page fault” occurs: the page containing it must be read from disk into memory, and the PMM must be updated to reflect this action.
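The lookup described above can be sketched as a toy model. This is only an illustration under simplifying assumptions: the class `ToyMMU` and its fields are hypothetical, the "disk read" is not simulated at all, and a page that is not yet in RAM is simply given the next free frame.

```python
PAGE_SIZE = 4096  # assumed page size, as in the text


class ToyMMU:
    """Toy model of address translation; not how any real MMU is programmed."""

    def __init__(self):
        self.page_table = {}      # page number -> frame number, for pages in RAM
        self.next_free_frame = 0  # assumption: RAM never runs out in this sketch
        self.page_faults = 0

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page not in self.page_table:
            # Page fault: the page is not in RAM. A real operating system
            # would read it from disk here, then update the table.
            self.page_faults += 1
            self.page_table[page] = self.next_free_frame
            self.next_free_frame += 1
        frame = self.page_table[page]
        return frame * PAGE_SIZE + offset


mmu = ToyMMU()
print(hex(mmu.translate(0x5010)))  # first touch of page 5 causes a page fault
print(hex(mmu.translate(0x5020)))  # same page again, no new fault
print(mmu.page_faults)             # prints: 1
```

The second access lands on a page that is already mapped, so it costs only a table lookup. That difference between a hit and a fault is what the rest of this answer is about.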

Hope you’re still with me, because here’s the tricky part. Because accessing the disk is so much slower than accessing RAM, the operating system tries to keep as much of the virtual memory as possible in RAM. If you’re running a large enough program (or several small programs at once), there might not be enough RAM to hold all the memory used by the programs, so some of it must be moved out of RAM and onto disk (this action is called “paging out”).

The operating system tries to guess which areas of memory aren’t likely to be used for a while (usually based on how the memory has been used in the past). If it guesses wrong, or if your programs are accessing lots of memory in lots of places, many page faults will occur in order to read in the pages that were paged out. Because all of RAM is being used, every page that is read in forces another page to be paged out. This can lead to more page faults, because now a different page of memory has been moved to disk. The problem of many page faults occurring in a short time, called “page thrashing,” can drastically cut the performance of a system.

Programs that frequently access many widely separated locations in memory are more likely to cause page thrashing on a system. So is running many small programs that all continue to run even when you are not actively using them.

To reduce page thrashing, you can run fewer programs simultaneously. Or you can try changing the way a large program works to make it easier for the operating system to guess which pages won’t be needed. You can achieve this effect by caching values or changing lookup algorithms in large data structures, or sometimes by changing to a memory allocation library that provides an implementation of malloc() that allocates memory more efficiently. Finally, you might consider adding more RAM to the system to reduce the need to page out.
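A small simulation can show why the access pattern matters so much. The sketch below assumes the operating system pages out the least recently used (LRU) page, which is only one possible policy and not necessarily what any particular system does. The function name `count_page_faults` and the numbers of pages and frames are made up for the example.

```python
from collections import OrderedDict


def count_page_faults(accesses, num_frames):
    """Count page faults for a sequence of page numbers, assuming LRU replacement."""
    resident = OrderedDict()  # pages currently in RAM, least recently used first
    faults = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
            continue
        faults += 1  # page is not in RAM: page fault
        if len(resident) >= num_frames:
            resident.popitem(last=False)  # page out the least recently used page
        resident[page] = True
    return faults


FRAMES = 4  # assumed amount of RAM, in pages

# Poor locality: keep cycling through 5 pages when only 4 fit in RAM.
scattered = [0, 1, 2, 3, 4] * 20

# Better locality: the same 100 accesses, but finish with one page
# before moving on to the next.
clustered = [page for page in range(5) for _ in range(20)]

print(count_page_faults(scattered, FRAMES))  # prints: 100
print(count_page_faults(clustered, FRAMES))  # prints: 5
```

Both runs touch the same five pages the same number of times. In the first, every single access is a page fault, because the page needed next is always the one that was just paged out. Reordering the work, or adding one more frame of RAM, brings the count down to five.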

Monday, May 16, 2011
