Aside from being too slow, the page tables are also too big, and consumes much memory. We’re making it smaller.

Simple solution: bigger pages

Setting the pages bigger can easily reduce table size, but to reduce internal fragmentation, most systems tend to use small page sizes (~4KB).

Hybrid approach: paging and segments

Instead of allocating pages to the whole program, allocate pages to code, heap and stack respectively. In this way, we can prevent many invalid pages.

Traditional address space w/ paging

Traditional address space w/ paging

We’re offering a page table per segment. Here, we use the base register in MMU to hold the physical address of the page table, and the bound register to hold the number of valid pages. Now, the virtual address is like:

Hybrid virtual address

Hybrid virtual address

The top two bits are used to indicate the segment. Here’s how we get the address of PTE:

SN = (VirtualAddress & SEG_MASK) >> SN_SHIFT
VPN = (VirtualAddress & VPN_MASK) >> VPN_SHIFT
AddressOfPTE = Base[SN] + (VPN * sizeof(PTE))

If VPN is beyond limitation in bound register, an exception is likely to be produced.

However, segmentation itself is not very flexible. If the heap is used sparsely, the problem of external fragmentation will rise again. The size of page tables are arbitrary now, making finding them difficult.

Multi-level page tables

It’s “page tables of page tables”. First, divide the page table into page-sized units (a.k.a. the sum of size of PTEs = per page size)

The second level of page table is page directory. It tells which page of pages is all invalid, and if not, where they are.

Linear (classic) and multi-level page tables

Linear (classic) and multi-level page tables

With page directory, we can release the space of page tables not allocated. PDBR (Page Directory Base Register) is similar to PTBR in the classic page table.

A PDE (Page Directory Entry) at least consists of a valid bit, and the PFN (of a certain page of page table). If at least one contained PTE is valid, then the PDE is valid.

There is a cost of multi-level page tables. 2 memory loads would be performed if there were a TLB miss, and it’s also more complicated. So this is another time-space tradeoff.