12/11/20

## Page Tables Optimisations

##### Memory Organisation

* The **root page table** is always maintained in main memory.
* Page tables themselves are maintained in **virtual memory** due to their size.
* Assume a **fetch** from main memory takes $T$ time - a single page table access now costs $2\cdot T$, and an access through **two** page table levels costs $3 \cdot T$.
* Some optimisation needs to be done, otherwise memory access will create a bottleneck that limits the speed of the computer.

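The cost model above can be sketched in a few lines; the value of $T$ here is a hypothetical figure, not from the lecture:

```python
# Illustrative cost model for page-table walks: one memory fetch per
# page-table level, plus one fetch for the data itself.
T = 100  # ns, assumed main-memory fetch time (hypothetical)

def access_time(levels: int) -> int:
    """Total fetch time for an access through `levels` page-table levels."""
    return (levels + 1) * T

print(access_time(1))  # single-level page table: 2*T = 200
print(access_time(2))  # two-level page table:    3*T = 300
```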
### Translation Look Aside Buffers

* Translation look aside buffers (TLBs) are usually located inside the memory management unit.
* They **cache** the most frequently used page table entries.
* As they're stored in cache, lookups are very quick.
* They can be searched in **parallel**.
* The principle behind TLBs is similar to other types of **caching in operating systems**. They normally store anywhere from 16 to 512 entries.
* Remember: **locality** states that processes make a large number of references to a small number of pages.


The split arrows going into the TLB represent searching in parallel.

* If the TLB gets a hit, it just returns the frame number.
* However, if the TLB misses:
    * We have to account for the time it took to search the TLB.
    * We then have to look in the page table to find the frame number.
    * The worst-case scenario is a page fault (which takes the longest), where we have to retrieve a page table from secondary memory before it can be searched.

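The hit/miss flow above can be sketched as follows; the dict-as-TLB, the page table contents, and the page numbers are all invented for illustration:

```python
# Toy model of the TLB hit/miss path: a dict plays the TLB and a list
# plays a single-level page table (all values are hypothetical).
tlb = {3: 7}                      # page number -> frame number (cached)
page_table = [5, 4, 9, 7, None]   # index = page number; None = not resident

def translate(page: int) -> int:
    if page in tlb:               # TLB hit: return the frame immediately
        return tlb[page]
    frame = page_table[page]      # TLB miss: fall back to the page table
    if frame is None:             # worst case: page fault
        raise RuntimeError("page fault: fetch from secondary memory")
    tlb[page] = frame             # cache the translation for next time
    return frame

print(translate(3))   # TLB hit  -> 7
print(translate(1))   # TLB miss -> 4 (entry is now cached in the TLB)
```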
> Quick maths:
>
> * Assume a single-level page table
> * Assume a 20ns associative **TLB lookup time**
> * Assume a 100ns **memory access time**
>
> * **TLB hit** => 20 + 100 = 120ns
> * **TLB miss** => 20 + 100 + 100 = 220ns
>
> Performance evaluation of TLBs:
>
> * For an 80% hit rate, the estimated access time is:
> $$
> 120\cdot 0.8 + 220\cdot (1-0.8)=140ns
> $$
> (a **40% slowdown** relative to absolute addressing)
>
> * For a 98% hit rate, the estimated access time is:
> $$
> 120\cdot 0.98 + 220\cdot (1-0.98)=122ns
> $$
> (a **22% slowdown**)
>
> NOTE: **page tables** can be **held in virtual memory** => a **further slowdown** due to **page faults**.

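The quick maths above can be wrapped in a small helper; the 20ns and 100ns figures are the lecture's assumptions:

```python
TLB_LOOKUP = 20   # ns, assumed associative TLB lookup time
MEM_ACCESS = 100  # ns, assumed main-memory access time

def estimated_access_time(hit_rate: float) -> float:
    hit_cost = TLB_LOOKUP + MEM_ACCESS       # 120ns: lookup + data access
    miss_cost = TLB_LOOKUP + 2 * MEM_ACCESS  # 220ns: extra page-table read
    return hit_cost * hit_rate + miss_cost * (1 - hit_rate)

print(estimated_access_time(0.80))  # ~140 ns (40% slowdown)
print(estimated_access_time(0.98))  # ~122 ns (22% slowdown)
```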
### Inverted Page Tables

A **normal page table's size** is proportional to the number of pages in the virtual address space => this can be prohibitive for modern machines.

> An **inverted page table's size** is **proportional** to the size of **main memory**
>
> * The inverted table contains one **entry for every frame** (not for every page) and it **indexes entries by frame number**, not by page number.
> * When a process references a page, the OS must search the entire inverted page table for the corresponding entry (which could be too slow).
>     * It does save memory, as there are fewer frames than pages.
> * To find whether your page is in main memory, you need to iterate through the entire list.
> * *Solution*: use a **hash function** that transforms page numbers (*n* bits) into frame numbers (*m* bits) - remember *n* > *m*.
>     * The hash function turns a page number into a potential frame number.



So when looking for the page's frame location, we have to sequentially search through the table until we hit a match; we then get the frame number from the index - in this case 4.

#### Inverted Page Table Entry

> * The **frame number** is the index into the inverted page table.
> * Process Identifier (**PID**) - the process that owns this page.
> * Virtual Page Number (**VPN**)
> * **Protection** bits (Read/Write/Execute)
> * **Chaining Pointer** - this field points to the next frame whose VPN hashes to the same value. We need this to resolve collisions.




Due to the hash function, we now only have to look through the chained entries for **VPN**: 1 instead of all the entries.

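A toy version of the hashed inverted table with chaining might look like this; the frame count, hash function, and entries are invented for illustration:

```python
NUM_FRAMES = 8                 # m = 3 bits of frame number; pages use n > m bits
frames = [None] * NUM_FRAMES   # one entry per frame: (pid, vpn) or None
frames[4] = (1, 12)            # process 1, virtual page 12
frames[6] = (1, 20)            # page 20 hashes to the same slot as page 12
chain = {4: 6}                 # chaining pointer: frame 4 -> next colliding frame

def lookup(pid: int, vpn: int) -> int:
    """Hash the VPN, then follow the chain until (pid, vpn) matches."""
    slot = vpn % NUM_FRAMES     # assumed hash: page number mod frame count
    while slot is not None:
        if frames[slot] == (pid, vpn):
            return slot         # the frame number is the index we found
        slot = chain.get(slot)  # collision: try the next chained frame
    raise RuntimeError("page fault: page not in main memory")

print(lookup(1, 12))  # -> 4 (direct hash hit)
print(lookup(1, 20))  # -> 6 (reached via the chain from frame 4)
```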
#### Advantages

* The OS maintains a **single inverted page table** for all processes.
* It **saves lots of space** (especially when the virtual address space is much larger than physical memory).

#### Disadvantages

* Virtual-to-physical **translation becomes much slower**.
* Hash tables eliminate the need to search the whole inverted table, but we have to handle collisions (which also **slows down translation**).
* TLBs are necessary to improve their performance.

### Page Loading

* Two key decisions have to be made when using virtual memory:
    * Which pages are **loaded**, and when
        * Predictions can be made to reduce page faults
    * Which pages are **removed** from memory, and when
        * **page replacement algorithms**

#### Demand Paging

> Demand paging starts the process with **no pages in memory**
>
> * The first instruction will immediately cause a **page fault**.
> * **More page faults** will follow, but they will **stabilise over time** until the process moves to the next **locality**.
> * The set of pages that is currently being used is called its **working set** (same as the resident set).
> * Pages are only **loaded when needed** (i.e. after **page faults**).

#### Pre-Paging

> When the process is started, all pages expected to be used (the working set) are **brought into memory at once**
>
> * This **reduces the page fault rate**
> * Retrieving multiple (**contiguously stored**) pages **reduces transfer times** (seek time, rotational latency, etc.)
>
> **Pre-paging** loads as many pages as possible **before page faults are generated** (a similar method is used when processes are **swapped in and out**)


*ma*: memory access time, *p*: page fault rate, *pft*: page fault time

**Effective access time** is given by: $T_{a} = (1-p)\cdot ma + p\cdot pft$

NOTE: this doesn't take TLBs into account.

When page faults are taken into account, the expected access time is **proportional to the page fault rate**:

$$
T_{a} \propto p
$$

* Ideally, all pages would be loaded ahead of time, without demand paging.

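Plugging assumed numbers into the formula shows why even a tiny fault rate dominates; the 8ms page fault service time is an assumption, not from the lecture:

```python
ma = 100          # ns, assumed memory access time
pft = 8_000_000   # ns (8 ms), assumed page fault service time

def effective_access_time(p: float) -> float:
    """T_a = (1 - p) * ma + p * pft, as in the formula above."""
    return (1 - p) * ma + p * pft

print(effective_access_time(0.0))     # no faults: 100.0 ns
print(effective_access_time(0.001))   # one fault per 1000 accesses: ~8100 ns
```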
### Page Replacement

> * The OS must choose a **page to remove** when a new one is loaded.
> * This choice is made by **page replacement algorithms**, which **take into account**:
>     * When the page was **last used** or is **expected to be used again**
>     * Whether the page has been **modified** (a modified page must be written back).
> * Replacement choices have to be made **intelligently** to **save time**.

#### Optimal Page Replacement

> * In an **ideal** world:
>     * Each page is labelled with the **number of instructions** that will be executed (or the length of time) before it is used again.
>     * The page that is **not going to be referenced** for the **longest time** is the optimal one to remove.
> * The **optimal approach** is **not possible to implement**, since it requires knowledge of the future.
>     * It can be used for post-execution analysis.
>     * It provides a **lower bound** on the number of page faults (used for comparison with other algorithms).

#### FIFO

> * FIFO maintains a **linked list** of pages, and **new pages** are added at the end of the list.
> * The **oldest page, at the head** of the list, is **evicted when a page fault occurs**.
>
> This is a pretty bad algorithm <s>unsurprisingly</s>



Explanation at 53:40.

Shaded squares on the top row are page faults. Shaded squares in the grid show when a new page is brought into memory.

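As a sketch, FIFO replacement can be simulated on a reference string; the string and frame counts below are arbitrary examples:

```python
from collections import deque

def fifo_faults(refs, num_frames):
    """Count page faults under FIFO replacement with `num_frames` frames."""
    queue = deque()      # oldest page sits at the head
    resident = set()     # pages currently in memory
    faults = 0
    for page in refs:
        if page not in resident:
            faults += 1
            if len(queue) == num_frames:          # memory full: evict oldest
                resident.discard(queue.popleft())
            queue.append(page)
            resident.add(page)
    return faults

refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
print(fifo_faults(refs, 3))  # 9 faults
print(fifo_faults(refs, 4))  # 10 faults - more frames, *more* faults
```

On this particular reference string FIFO exhibits Belady's anomaly: adding a fourth frame actually increases the number of faults, one reason FIFO is considered a poor algorithm.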