30 Virtual Memory – Windows NT, Solaris 2

Mary Anitha Rajam

 

30.1 Introduction

 

The concept of virtual memory is implemented in different ways in different operating systems. In this module, the implementation of virtual memory in Windows NT and in Solaris 2 is discussed. In addition to the page-replacement algorithms and frame-allocation methods discussed in the earlier modules, this module discusses other considerations for paging.

 

30.2 Virtual Memory – Windows NT 

 

In Windows NT, demand paging is implemented with clustering. With clustering, when a page fault occurs, several pages surrounding the faulting page are brought in along with it. That is, if there is a page fault for page number 20, pages 19 and 21 (and possibly more) are also brought into main memory. This reduces the number of page faults likely to occur in the near future.

 

Any process, when created, is assigned two values: a working-set minimum and a working-set maximum. The working-set minimum is the minimum number of page frames the process is guaranteed to have in memory. The working-set maximum is normally the maximum number of page frames the process may hold; it may be exceeded if there are plenty of free frames.

 

A component called the virtual memory manager maintains a list of free page frames. If a page fault occurs when the number of page frames held by the process is below its working-set maximum, the manager allocates the process a frame from the free-frame list. If a page fault occurs when the process is already at its working-set maximum, a page is selected for replacement. The replacement policy is local: one of the pages of the faulting process itself is selected for replacement. Since the process already holds its maximum number of page frames, it is not allocated more.

 

A threshold value is associated with the list of free page frames. If the number of free page frames falls below this threshold, the manager starts a procedure called automatic working-set trimming. The manager examines each process and checks whether it holds more page frames than its working-set minimum. If so, the manager removes pages from the process until its working-set minimum is reached. Later, if sufficient free frames become available, a process that has been trimmed to its minimum can again be allocated more frames.

 

30.3 Virtual Memory – Solaris 2 

 

In Solaris 2, a list of free frames is maintained. When a thread incurs a page fault, a frame is assigned from this list. Three threshold values are associated with the list of free frames: minfree, desfree and lotsfree (minfree < desfree < lotsfree).

 

The value of lotsfree is typically 1/64 of the size of physical memory. Four times per second, the kernel checks whether the number of free frames is less than lotsfree. When it falls below lotsfree, a procedure called pageout is started. This procedure is a variant of the second-chance (two-handed-clock) algorithm discussed in the previous module. The front hand of the clock scans the pages in memory and clears the reference bit of each page. The back hand scans the pages again; if a page's reference bit is still 0, the page is added to the free list. Pages are scanned at a scanrate, measured in pages per second, which ranges from slowscan to fastscan.

 

When free memory falls below lotsfree, scanning starts at slowscan pages per second and increases toward fastscan pages per second as free memory shrinks. The default value of slowscan is 100 pages per second; the default value of fastscan is (total physical pages)/2. The distance (in pages) between the two hands of the clock is determined by a parameter called handspread. The time between the front hand clearing a reference bit and the back hand checking it is determined by the scanrate and the handspread. If the scanrate is 100 pages per second and the handspread is 1024 pages, there is a gap of about 10 seconds (1024/100) between the two hands.

 

If free memory falls below desfree, the pageout procedure runs 100 times per second. If pageout is unable to keep the number of free frames at desfree, processes are swapped: all the pages of selected processes are swapped out, with processes that have been idle for long periods chosen first. If free memory falls below minfree, pageout is called on every request for a new page.

 

30.4 Prepaging 

 

In demand paging, when a process starts, none of its pages is present in main memory. Hence, initially, memory references generate a burst of page faults until the initial locality of the process has been brought into memory. (A locality is the set of pages of a process that are actively used together at a particular time.) The same high paging activity occurs when a swapped-out process is restarted. Prepaging was introduced to avoid this burst of page faults.

 

In prepaging, all the pages that are expected to be needed are brought into memory at one time. When a process is swapped out, the list of pages in its working set is remembered. When the process is swapped back in and resumed, the entire working set is brought in, so that instead of faulting once per page, the process gets its pages in a single batch. But is the cost of prepaging less than the cost of servicing the corresponding page faults? On the negative side, many of the prepaged pages may never be used, and the time spent transferring them is wasted. Prepaging is therefore advantageous only if most of the pages brought in are actually used.

 

30.5 Page Size

 

Selecting a page size is another issue to be considered for improving performance. Page sizes typically vary from 2^12 bytes (4 KB) to 2^22 bytes (4 MB). The choice of page size affects several factors, discussed below.

 

30.5.1 Size of the page table 

 

When the page size is decreased, the number of pages increases, which increases the number of entries in the page table and hence its size. Since each process has its own page table, the page tables of all processes have to be stored in memory. This argues for a large page size. But larger pages lead to more internal fragmentation, so memory is better utilized with small pages.

 

30.5.2 Time required to read or write a page 

 

I/O time comprises seek time, latency and transfer time. Transfer time is proportional to the amount transferred, that is, to the page size, so a smaller page requires less transfer time. Total I/O is also reduced, since locality is improved with a smaller page size: a small page matches the program's locality more accurately, so only the relevant information is brought into memory.

 

30.5.3 Number of page faults 

 

To minimize the number of page faults, a larger page size is needed. In the extreme, if the page size were 1 byte, a page fault could occur for every byte referenced; with larger pages, the number of page faults decreases. We have already seen that servicing a page fault involves a large overhead and takes a long time. The trend in recent systems has been toward larger page sizes.

 

30.6 TLB Reach 

 

When a translation look-aside buffer (TLB) is used, the hit ratio is the percentage of virtual address translations that are resolved in the TLB rather than in the page table. The larger the TLB, the more hits occur and the better the hit ratio. Ideally, the working set of a process should fit in the TLB: when the TLB holds entries for the pages of the process's working set, most translations hit in the TLB. Otherwise there are many misses, and accessing a page takes more time.

 

Another metric related to the TLB is the TLB reach: the amount of memory accessible from the TLB, that is, the number of TLB entries multiplied by the page size. For example, if the page size is 1024 bytes (1 KB) and the TLB has 10 entries, the amount of memory reachable through the TLB is 1024 x 10 = 10240 bytes (10 KB). If the number of entries is increased to 20, the TLB reach becomes 20 KB. Thus, increasing the number of TLB entries increases the TLB reach. Increasing the page size also increases the TLB reach; for example, going from 1 KB to 4 KB pages quadruples it. But a larger page size leads to more internal fragmentation.

 

Supporting multiple page sizes can also increase the TLB reach. Solaris uses two page sizes, 8 KB and 4 MB. The first 4 MB of kernel code and data are stored in two 4 MB pages. For most applications, 8 KB pages are sufficient; using the small page size for small applications reduces fragmentation. For kernel code and data, whose pages are fully occupied, the large page size is used.

 

30.7 Program Structure 

 

System performance can be improved if the programmer is aware of the underlying demand paging. For example, consider the following 128 x 128 array. Assume that the page size is 128 words.

 

int A[128][128];

In memory, the elements are stored row by row, in the order A[0][0], A[0][1], A[0][2], …, A[0][127], A[1][0], …, A[1][127], …, A[127][127]. Each row of the array has 128 elements; hence each row is stored in exactly one page.

 

Now, consider the following code snippet:

for (j = 0; j < 128; j++)
    for (i = 0; i < 128; i++)
        A[i][j] = 0;

 

Here the elements are accessed in the order A[0][0], A[1][0], A[2][0], …, A[127][0], A[0][1], A[1][1], …, A[127][127]. That is, the first word of each page is accessed first, then the second word of each page, and so on. If the process is allocated fewer than 128 frames, each page is replaced before it is needed again, and the loop generates 128 x 128 = 16,384 page faults.

 

Rather, if the code is written in the following manner, the number of page faults reduces.

 

for (i = 0; i < 128; i++)
    for (j = 0; j < 128; j++)
        A[i][j] = 0;

 

Here, all the words in the first page are accessed first, then all the words in the second page, and so on, so there are only 128 page faults. A programmer who is aware of how the elements of an array are laid out in memory can thus reduce the number of page faults.

 

30.8 Summary

 

In this module, we learnt how virtual memory is implemented in Windows NT and Solaris 2. Prepaging brings a set of pages into memory at once rather than bringing in only the faulting page. A larger page size decreases the size of the page table and the number of page faults, but causes internal fragmentation; I/O time and locality are better with a smaller page size. TLB reach can be increased with larger pages, and supporting multiple page sizes is also useful. When a programmer is aware of the underlying demand-paging scheme, page faults can be minimized.

 

 
