20/11/20

## File System Views

### User View

A file system can be described from two perspectives:

* A **user view** defines the file system in terms of the **abstractions** that the operating system provides
* An **implementation view** defines the file system in terms of its **low-level implementation**

**Important aspects of the user view**

> * The **file abstraction**, which **hides** implementation details from the user
> * File **naming policies** and user file **attributes** (size, protection, owner etc)
>   * There are also **system attributes** for files (e.g. non-human readable, archive flag, temp flag)
> * **Directory structures** and organisation
> * **System calls** to interact with the file system
>
> The user view defines how the file system looks to regular users and relates to **abstractions**.

#### File Types

Many operating systems support several types of file. Both Windows and Unix have regular files and directories:

* **Regular files** contain user data in **ASCII** or **binary** format
* **Directories** group files together (but are themselves files at the implementation level)

Unix also has character and block special files:

* **Character special files** are used to model **serial I/O devices** (keyboards, printers etc)
* **Block special files** are used to model drives

### System Calls

File Control Blocks (FCBs) are kernel data structures: they are protected and only accessible in kernel mode.

* Allowing user applications to access them directly could compromise their integrity
* System calls enable a **user application** to **ask the OS** to carry out an action on its behalf (in kernel mode)
* There are **two different categories** of **system calls**
  * **File manipulation**: `open()`, `close()`, `read()`, `write()` ...
  * **Directory manipulation**: `create()`, `delete()`, `rename()`, `link()` ...
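
The system calls listed above can be tried from user space. The sketch below is only an illustration, not part of the original notes: it assumes a POSIX-like system and uses Python's `os` module, whose `os.open`, `os.write`, `os.read`, `os.close`, `os.rename` and `os.unlink` functions are thin wrappers around the corresponding system calls. The helper name `roundtrip` and the file names are hypothetical.

```python
import os
import tempfile


def roundtrip(directory: str, payload: bytes) -> bytes:
    """Write payload to a file, rename it, read it back, then delete it."""
    path = os.path.join(directory, "demo.txt")  # hypothetical file name

    # open(): ask the kernel to create the file; we only get back a file
    # descriptor (a small integer), never the kernel's own data structures.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)  # write(): the kernel copies the bytes for us
    finally:
        os.close(fd)           # close(): release the descriptor

    # rename(): a directory manipulation call; the file data is untouched.
    renamed = os.path.join(directory, "renamed.txt")
    os.rename(path, renamed)

    fd = os.open(renamed, os.O_RDONLY)
    try:
        data = os.read(fd, len(payload))  # read(): up to len(payload) bytes
    finally:
        os.close(fd)

    os.unlink(renamed)  # delete the directory entry again
    return data


with tempfile.TemporaryDirectory() as tmp:
    print(roundtrip(tmp, b"hello file system"))
```

The user program only ever holds the integer descriptor; everything behind it stays in kernel space, which is the protection the FCB bullet points describe.
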
### File Structures

**Single level**: all files are in the same directory (good enough for basic consumer electronics)

**Two or multiple level directories**: tree structures

* **Absolute path name**: starts from the root of the file system
* **Relative path name**: uses the current working directory as the starting point

**Directed acyclic graph (DAG)**: allows files to be shared (through links to files or sub-directories), but **cycles are forbidden**

**Generic graph structure**: links and cycles can exist.

The use of **DAG** and **generic graph structures** results in **significant complications** in the implementation

* A tree is a DAG with the additional restriction that each child has only one parent.

When searching the file system:

* Cycles can result in **infinite loops**
* Sub-trees can be **traversed multiple times**
* Files can have **multiple absolute file names**
* Deleting files becomes a lot more complicated
  * Links may no longer point to a file
  * Inaccessible cycles may exist
* A garbage collection scheme may be required to remove files that are no longer accessible from the file system tree.

#### Directory Implementations

Directories contain a list of **human readable file names** that are mapped onto **unique identifiers** and **disk locations**

* They provide a mapping of the logical file onto the physical location

Retrieving a file comes down to **searching the directory file** as fast as possible:

* A **simple unordered list of directory entries** might be insufficient (search time grows linearly with the number of entries)
* Indexes or **hash tables** can be used to speed up the search.
* Directory entries can either store all **file related attributes** themselves (e.g. file name, disk address), as in Windows, or **contain a pointer** to the data structure that holds the details of the file, as in Unix

![directory files](/lectures/osc/assets/b2.png)

##### System Calls

Similar to files, **directories** are manipulated using **system calls**

* `create`/`delete`: a new directory is created/deleted.
* `opendir`, `closedir`: add/free the directory to/from internal tables
* `readdir`: return the next entry in the directory file

**Directories** are **special files** that **group files** together and whose **structure is defined** by the **file system**

* A bit is set to indicate that they are directories
* In Linux, when you create a directory, two entries are created in it automatically that the user has no control over. They are represented as `.` and `..`
  * `.` - refers to the directory itself
  * `..` - represents the parent directory (`cd ..`)

##### Implementation

> Regardless of the type of file system, a number of **additional considerations** need to be made
>
> * **Disk partitions**, **partition tables**, **boot sectors** etc
> * Free **space management**
> * System wide and per process **file tables**
>
> **Low level formatting** writes sectors to the disk
>
> **High level formatting** imposes a file system on top of this (using **blocks** that can cover multiple **sectors**)

### Partitions

Disks are usually divided into **multiple partitions**

* An independent file system may exist on each partition

**Master Boot Record**

* Located at the start of the entire drive
* Used to boot the computer (BIOS reads and executes MBR)
* Contains the **partition table** at its end, which marks the **active partition**
* One partition is listed as **active**; it contains a boot block to load the operating system.

![master boot record](assets/b3.png)

#### Unix Partition

> The partition contains
>
> * The partition **boot block**:
>   * Contains code to boot the OS
>   * Every partition has a boot block - even if it does not contain an OS
> * **Super block**: contains the partition's details e.g. partition size, number of blocks, I-node table etc
> * **Free space management**: contains a bitmap or linked list that indicates the free blocks.
>   * A linked list of disk blocks (also known as grouping)
>     * Free blocks are used to hold the **numbers of the free blocks**.
>       Since the free list shrinks as the disk fills up, this is not wasted space
>     * **Blocks are linked together**. The size of the list **grows with the size of the disk** and **shrinks with the size of the blocks**
>   * Linked lists can be modified by **keeping track of the number of consecutive free blocks** for each entry (known as counting)
> * **I-Nodes**: an array of data structures, one per file, describing each file
> * **Root directory**: the top of the file-system tree
> * **Data**: files and directories

![unix partition composition](/lectures/osc/assets/b4.png)

![Free block management](/lectures/osc/assets/b5.png)

Free space management with linked list (on the left) and bitmaps (on the right)

**Bitmaps**

* Require extra space
* Keeping the bitmap in main memory is possible, but only for small disks

**Linked lists**

* No wasted disk space
* Only one block of pointers needs to be kept in memory (a new block is loaded when needed)

Apart from the in-memory free space tables, there are a number of key data structures stored in memory:

* An in-memory mount table (a table of the partitions that have been mounted)
* An in-memory directory cache of recently accessed directory information
* A **system-wide open file table**, containing a copy of the FCB for every currently open file in the system, including location on disk, file size and **open count** (number of processes that use the file)
* A **per-process open file table**, containing pointers to the corresponding entries in the system-wide open file table.

![file tables](/lectures/osc/assets/b6.png)

![opening & reading a file](/lectures/osc/assets/b7.png)
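
The relationship between the two open file tables can be sketched in a few lines. This is a simplified model under stated assumptions, not how any particular kernel implements it: the names `SystemEntry`, `OpenFileTables` and the `directory` dictionary (standing in for the on-disk FCBs) are hypothetical, and a real system tracks far more state per entry.

```python
from dataclasses import dataclass


@dataclass
class SystemEntry:
    """One row of the system-wide table: a copy of the FCB plus an open count."""
    name: str
    disk_location: int
    size: int
    open_count: int = 0


class OpenFileTables:
    def __init__(self, directory):
        # Assumption: name -> (disk_location, size), standing in for on-disk FCBs.
        self.directory = directory
        self.system_table = {}    # name -> SystemEntry (one per open file)
        self.process_tables = {}  # pid -> {fd: SystemEntry}

    def open(self, pid, name):
        entry = self.system_table.get(name)
        if entry is None:
            # First open anywhere in the system: copy the FCB into memory.
            location, size = self.directory[name]
            entry = SystemEntry(name, location, size)
            self.system_table[name] = entry
        entry.open_count += 1

        # The per-process table only stores a reference to the shared entry.
        table = self.process_tables.setdefault(pid, {})
        fd = 0
        while fd in table:  # pick the lowest unused descriptor number
            fd += 1
        table[fd] = entry
        return fd

    def close(self, pid, fd):
        entry = self.process_tables[pid].pop(fd)
        entry.open_count -= 1
        if entry.open_count == 0:
            # Last user is gone: the in-memory FCB copy can be dropped.
            del self.system_table[entry.name]
```

If two processes open the same file, each gets its own descriptor, but both descriptors refer to a single system-wide entry whose open count is 2. The entry is only removed once both processes have closed the file.
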