notes/docs/lectures/osc/15_file_systems1.md at 1442cbb300319fc417e417c3984cb841fff16fc2

John Gatward 4280451f12 test

2026-03-25 12:29:00 +00:00

19/11/20

Disk Scheduling

Hard Drives

Construction of Hard Drives

Disks are constructed as multiple aluminium/glass platters covered with magnetisable material

Read/Write heads fly just above the surface and are connected to a single disk arm controlled by a single actuator

Data is stored on both sides

Hard disks rotate at a constant speed

A hard disk controller sits between the CPU and the drive

Hard disks are currently about 4 orders of magnitude slower than main memory.

Low Level Format

Disks are organised in:

Cylinders: a collection of tracks in the same relative position to the spindle

Tracks: a concentric circle on a single platter side

Sectors: segments of a track - usually have an equal number of bytes in them, consisting of a preamble, data and an error correcting code (ECC).

The number of sectors on each track increases from the innermost track to the outer tracks.

Organisation of hard drives

Disks usually have a cylinder skew, i.e. an offset is added to sector 0 in adjacent tracks to account for the seek time, so that sector 0 of the next track has not already passed the head when the arm arrives.

In the past, consecutive disk sectors were interleaved to account for transfer time (of the read/write head)

NOTE: disk capacity is reduced due to preamble & ECC
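One way to size the cylinder skew described above is to count how many sectors pass under the head while the arm moves to the adjacent track. The sketch below assumes that rule of thumb; the function name and the 1 ms track-to-track seek are hypothetical, while 3600 rpm and 32 sectors per track are the figures used later in these notes.

```python
import math

def cylinder_skew_sectors(rpm, sectors_per_track, track_seek_ms):
    """Sectors that pass under the head during one track-to-track seek, rounded up."""
    # Time for one sector to pass under the head, in milliseconds.
    sector_time_ms = 60_000 / rpm / sectors_per_track
    return math.ceil(track_seek_ms / sector_time_ms)

# Hypothetical 1 ms track-to-track seek on a 3600 rpm, 32 sectors/track disk.
print(cylinder_skew_sectors(3600, 32, 1.0))
```

With these numbers one sector takes about 0.52 ms to pass, so the offset comes out at 2 sectors.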

Access times

Access time = seek time + rotational delay + transfer time

Seek time: time needed to move the arm to the cylinder
Rotational delay (latency): time before the sector appears underneath the read/write head (on average, it is half a rotation)
Transfer time: time to transfer the data

Multiple requests may be happening at the same time (concurrently), so access time may be increased by queuing time

In this scenario, dominance of seek time leaves room for optimisation by carefully considering the order of read operations.

The estimated seek time (i.e. the time to move the arm from one track to another) is approximated by:


T_{s} = n \times m + s

In which T_{s} denotes the estimated seek time, n the number of tracks to be crossed, m the crossing time per track and s any additional startup delay.
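As a minimal sketch, the seek time estimate can be written as a one-line function. The parameter values in the example call are hypothetical and only chosen to make the arithmetic easy to follow.

```python
def seek_time_ms(n_tracks, per_track_ms, startup_ms):
    """Estimated seek time T_s = n * m + s, with all times in milliseconds."""
    return n_tracks * per_track_ms + startup_ms

# Hypothetical drive: 0.5 ms per track crossed, 2 ms startup delay, 40 tracks to cross.
print(seek_time_ms(40, 0.5, 2.0))  # 22.0
```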

Let us assume a disk that rotates at 3600 rpm

One rotation = 16.7 ms

The average rotational latency T_{r} is then 8.3 ms

Let b denote the number of bytes transferred, N the number of bytes per track, and rpm the rotation speed in rotations per minute. The transfer time T_{t} is then given by:

T_{t} = \frac b N \times \frac {ms\space per\space minute}{rpm}

Reading the N bytes of one track takes 1 revolution, i.e. \frac {ms\space per\space minute}{rpm} = \frac{60000}{3600} ms \approx 16.7 ms at 3600 rpm.

b contiguous bytes take \frac{b}{N} revolutions.
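The rotation and transfer formulas above can be checked with a short sketch. The function names are illustrative, and the 512-byte sector size in the example call is an assumption that does not come from the notes.

```python
MS_PER_MINUTE = 60_000

def rotation_time_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return MS_PER_MINUTE / rpm

def avg_rotational_latency_ms(rpm):
    """On average the target sector is half a revolution away."""
    return rotation_time_ms(rpm) / 2

def transfer_time_ms(b, bytes_per_track, rpm):
    """T_t = (b / N) * (ms per minute / rpm)."""
    return (b / bytes_per_track) * rotation_time_ms(rpm)

print(round(rotation_time_ms(3600), 1))           # 16.7
print(round(avg_rotational_latency_ms(3600), 1))  # 8.3
# One 512-byte sector on a track of 32 such sectors (sector size is an assumption).
print(round(transfer_time_ms(512, 512 * 32, 3600), 2))  # 0.52
```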

Read a file of size 256 sectors with:

T_{s} = 20 ms (average seek time)

32 sectors per track

Suppose the file is stored as compactly as possible (it is stored contiguously), so it occupies 256 / 32 = 8 tracks

The first track takes: seek time + rotational delay + transfer time = 20 + 8.3 + 16.7 = 45ms

Assuming no cylinder skew and neglecting small seeks between tracks, each of the remaining 7 tracks only needs rotational delay + transfer time = 8.3 + 16.7 = 25ms

The total time is 45+7\times 25 = 220ms = 0.22s

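If the sectors are accessed at random rather than sequentially, every sector pays the full seek time and rotational delay:

T_{t} = 16.7\times \frac {1}{32} \approx 0.5 ms

Time per sector = T_{s}+T_{r}+T_{t} = 20+8.3+0.5 = 28.8ms

The total time is 256\times 28.8 \approx 7373ms \approx 7.37s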
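The sequential and random cases can be compared with a small sketch that uses the figures from this example (20 ms seek, 3600 rpm, 32 sectors per track, 256 sectors). The function names are illustrative. Because the code does not round T_{r} and T_{t}, the random case comes out slightly above 256 times the rounded 28.8 ms.

```python
SEEK_MS = 20.0
RPM = 3600
SECTORS_PER_TRACK = 32
FILE_SECTORS = 256

ROTATION_MS = 60_000 / RPM   # one revolution, about 16.7 ms
LATENCY_MS = ROTATION_MS / 2  # average rotational delay, about 8.3 ms

def sequential_read_ms():
    """Contiguous file: one seek, then latency plus a full rotation per track."""
    tracks = FILE_SECTORS // SECTORS_PER_TRACK
    first_track = SEEK_MS + LATENCY_MS + ROTATION_MS
    other_tracks = (tracks - 1) * (LATENCY_MS + ROTATION_MS)
    return first_track + other_tracks

def random_read_ms():
    """Scattered file: every sector pays seek, latency and a one-sector transfer."""
    per_sector = SEEK_MS + LATENCY_MS + ROTATION_MS / SECTORS_PER_TRACK
    return FILE_SECTORS * per_sector

print(round(sequential_read_ms()))  # 220
print(round(random_read_ms()))      # 7387
```

The random layout is more than 30 times slower for the same data, which is why sector placement matters.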

It is important to position the sectors carefully and avoid disk fragmentation

Disk Scheduling

The OS must use the hardware efficiently:

The file system can position/organise files strategically
Having multiple disk requests in a queue allows us to minimise the arm movement

Note that every I/O operation goes through a system call, allowing the OS to intercept the request and re-sequence it.

If the drive is free, the request can be serviced immediately; if not, the request is queued.

In a dynamic situation, several I/O requests will be made over time that are kept in a table of requested sectors per cylinder.

Disk scheduling algorithms determine the order in which disk events are processed

First-Come First-Served

Process the requests in the order that they arrive

Consider the following sequence of disk requests

11 1 36 16 34 9 12

The total length is: |11-1|+|1-36|+|36-16|+|16-34|+|34-9|+|9-12|=111
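The FCFS total can be reproduced with a few lines. This is a sketch that assumes the head starts at cylinder 11 (the first number in the sequence) and that the remaining numbers are the queued requests; the function name is illustrative.

```python
def fcfs_distance(start, requests):
    """Total head movement when requests are served strictly in arrival order."""
    total, pos = 0, start
    for cylinder in requests:
        total += abs(pos - cylinder)
        pos = cylinder
    return total

print(fcfs_distance(11, [1, 36, 16, 34, 9, 12]))  # 111
```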

Shortest Seek Time First

Selects the request that is closest to the current head position to reduce head movement

This typically reduces head movement by ~50% compared with FCFS

Total length is: |11-12|+|12-9|+|9-16|+|16-1|+|1-34|+|34-36|=61
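A minimal SSTF sketch, assuming the head starts at cylinder 11 and all six requests are already queued. Ties are broken by arrival order here, which is an arbitrary choice; the function names are illustrative.

```python
def sstf_order(start, requests):
    """Repeatedly pick the pending request closest to the current head position."""
    pending = list(requests)
    pos, order = start, []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order

def total_distance(start, order):
    """Sum of head movements when visiting cylinders in the given order."""
    total, pos = 0, start
    for cylinder in order:
        total += abs(pos - cylinder)
        pos = cylinder
    return total

order = sstf_order(11, [1, 36, 16, 34, 9, 12])
print(order)                      # [12, 9, 16, 1, 34, 36]
print(total_distance(11, order))  # 61
```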

Disadvantages:

Could result in starvation:

Under heavy load the arm tends to stay in the middle of the disk, so edge cylinders are poorly served (the strategy is biased)

Continuously arriving requests for the same location could starve other regions

SCAN

Keep moving in the same direction until the end of the disk is reached

It continues in the current direction, servicing all pending requests as it passes over them

When it gets to the last cylinder, it reverses direction and services pending requests

Total length: |11-12|+|12-16|+|16-34|+|34-36|+|36-9|+|9-1|=60
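The SCAN total above matches a sweep that starts at cylinder 11 moving towards higher cylinders, on a disk whose last cylinder is 36. Both of those are assumptions read off the arithmetic, not stated in the notes. In the sketch below the last cylinder is a parameter, so the extra travel on a larger disk is visible.

```python
def scan_distance(start, requests, last_cylinder):
    """Head movement for SCAN, sweeping upwards first and then back down."""
    up = sorted(c for c in requests if c >= start)
    down = sorted((c for c in requests if c < start), reverse=True)
    total, pos = 0, start
    for cylinder in up:
        total += cylinder - pos
        pos = cylinder
    if down:
        # SCAN runs to the end of the disk before reversing
        # (this sketch only does so if requests remain behind the head).
        total += last_cylinder - pos
        pos = last_cylinder
        for cylinder in down:
            total += pos - cylinder
            pos = cylinder
    return total

requests = [1, 36, 16, 34, 9, 12]
print(scan_distance(11, requests, last_cylinder=36))  # 60
print(scan_distance(11, requests, last_cylinder=39))  # 66
```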

Properties:

The upper limit on the waiting time is 2\space\times number of cylinders, so no request starves

Disadvantages:

The middle cylinders are favoured if the disk is heavily used.

C-SCAN

Once the outer/inner side of the disk has been reached, the requests at the other end of the disk have been waiting the longest

SCAN can be improved by using a circular scan => C-SCAN

When the disk arm gets to the last cylinder of the disk, it reverses direction but does not service requests on the return.

It is fairer and equalises response times on the disk

Total length: |11-12|+|12-16|+|16-34|+|34-36|+|36-1|+|1-9|=68
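A sketch of the C-SCAN order that follows the arithmetic above: the head starts at 11 moving upwards, and after the highest request it returns to the lowest pending request and continues upwards. Like the total above, it counts the return movement and jumps straight to the lowest request rather than to cylinder 0; both are simplifications. Function names are illustrative.

```python
def cscan_order(start, requests):
    """Serve requests at or above the head in ascending order, then wrap around."""
    ahead = sorted(c for c in requests if c >= start)
    wrapped = sorted(c for c in requests if c < start)
    return ahead + wrapped

def total_distance(start, order):
    """Sum of head movements, including the return sweep."""
    total, pos = 0, start
    for cylinder in order:
        total += abs(pos - cylinder)
        pos = cylinder
    return total

order = cscan_order(11, [1, 36, 16, 34, 9, 12])
print(order)                      # [12, 16, 34, 36, 1, 9]
print(total_distance(11, order))  # 68
```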

LOOK-SCAN

LOOK-SCAN only moves as far as the cylinder containing the first or last pending request (as opposed to the first/last cylinder on the disk, like SCAN)

However, seeks are cylinder by cylinder and one cylinder contains multiple tracks

The arm may therefore "stick" to a cylinder while requests for that cylinder keep arriving
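A minimal LOOK sketch, using the request sequence from the earlier sections and assuming the head starts at cylinder 11 moving upwards. The arm reverses at the highest pending request instead of travelling to the edge of the disk; function names are illustrative.

```python
def look_order(start, requests):
    """Sweep upwards to the highest pending request, then back down."""
    up = sorted(c for c in requests if c >= start)
    down = sorted((c for c in requests if c < start), reverse=True)
    return up + down

def total_distance(start, order):
    """Sum of head movements when visiting cylinders in the given order."""
    total, pos = 0, start
    for cylinder in order:
        total += abs(pos - cylinder)
        pos = cylinder
    return total

order = look_order(11, [1, 36, 16, 34, 9, 12])
print(order)                      # [12, 16, 34, 36, 9, 1]
print(total_distance(11, order))  # 60
```

The result is independent of the disk size, since the arm never moves past cylinder 36 in this example.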

N-Step SCAN

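Services at most N requests per sweep; requests that arrive during a sweep wait for the next one, so the arm cannot stick to one cylinder.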
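One possible reading of N-step SCAN is sketched below: the queue is cut into batches of N requests in arrival order, and each batch is served with a LOOK-style sweep before the next batch is touched. The batching rule, the direction handling and the function name are assumptions made for illustration, not a definitive description of the algorithm.

```python
def n_step_scan_order(start, requests, n):
    """Serve the queue in batches of n, sweeping over each batch in turn."""
    order, pos, going_up = [], start, True
    for i in range(0, len(requests), n):
        batch = requests[i:i + n]
        if going_up:
            ahead = sorted(c for c in batch if c >= pos)
            behind = sorted((c for c in batch if c < pos), reverse=True)
        else:
            ahead = sorted((c for c in batch if c <= pos), reverse=True)
            behind = sorted(c for c in batch if c > pos)
        if behind:
            # The arm had to turn around to finish this batch.
            going_up = not going_up
        order += ahead + behind
        pos = order[-1]
    return order

print(n_step_scan_order(11, [1, 36, 16, 34, 9, 12], n=3))
# [16, 36, 1, 9, 12, 34]
```

Requests 34, 9 and 12 are not served until the first batch is finished, even though the arm passes over them, which is what bounds the delay caused by a busy cylinder.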

[jay@diablo lecture_notes]$ cat /sys/block/sda/queue/scheduler 
[mq-deadline] kyber bfq none

# Older (single-queue) kernels listed: noop deadline cfq
# noop: FCFS
# deadline: N-step-SCAN
# cfq: Completely Fair Queuing (from linux)
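The scheduler file lists the available schedulers on one line and marks the active one with square brackets. A small sketch for pulling the two apart, using the output shown above as input (the function name is hypothetical; on a real Linux machine the string would be read from /sys/block/<device>/queue/scheduler):

```python
def parse_scheduler_line(line):
    """Return (active, available) from the contents of a queue/scheduler file."""
    names = line.split()
    # The active scheduler is the entry wrapped in square brackets.
    active = next(
        (n.strip("[]") for n in names if n.startswith("[") and n.endswith("]")),
        None,
    )
    available = [n.strip("[]") for n in names]
    return active, available

print(parse_scheduler_line("[mq-deadline] kyber bfq none"))
# ('mq-deadline', ['mq-deadline', 'kyber', 'bfq', 'none'])
```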

Driver caching

For current drives, the time required to seek to a new cylinder is greater than the rotational time.

It makes sense to read more sectors than actually required
- Read sectors during the rotational delay (the sectors that happen to pass under the read/write head)
- Modern controllers read multiple sectors when asked for the data from one sector (track-at-a-time caching).

Scheduling on SSDs

SSDs have no T_{seek} or rotational delay, so we can use FCFS (SSTF, SCAN etc. may reduce performance, since there is no head to move).
