skip to content

Designing a File System

Updated: at 12:00 AM

We’re designing an in-memory file system for a single user. It supports hierarchical directories, files, and common file operations.

To keep the scope tight, it will run entirely in memory with no persistence across restarts. Permissions, concurrency, symbolic or hard links are ignored. It will support absolute UNIX-style path only (eg: /a/b/c)

The filesystem API is:

Let’s look at some initial ideas that you might have and see why that shouldn’t be the design you go for

Flat Map of path → content

{
  "/a/b/file.txt": "content"
}

The problem with this approach is:

What Must Always Be True

Before designing classes, let’s define what the requirements for correctness are.

  1. The structure is a rooted tree.
  2. Every node except the root has exactly one parent.
  3. Only directories can contain children.
  4. Child names inside a directory are unique.
  5. No cycles.
  6. Moving a directory into its own subtree is forbidden.
  7. Path resolution must fail if any intermediate component is a file.

State Model

class Node:
  name: str
  parent: Directory | None

class Directory:
  children: Dict[str, Node]

class File:
  content: str

Node

Abstract representation of any file system entry.

Directory

Represents internal tree nodes. O(1) lookup and enforces unique names on children using a dictionary.

File

Represents leaves. Files will never have child elements.

Core Operations

mkdir(path)

createFile(path)

write(path, content)

read(path)

ls(path)

delete(path)

move(src, dest)

Design Patterns

Minimal Class Diagram

classDiagram
    class Node {
        name: str
        parent: Directory
        is_directory()
    }

    class File {
        content: str
    }

    class Directory {
        children: Dict[str, Node]
    }

    class FileSystem {
        root: Directory
        _resolve(path)
        _resolve_parent(path)
    }

    Node <|-- File
    Node <|-- Directory
    FileSystem o-- Directory

Implementation (Python)

from typing import Dict, Optional, List


class Node:
    def __init__(self, name: str, parent: Optional["Directory"] = None):
        self.name = name
        self.parent = parent


class File(Node):
    def __init__(self, name: str, parent: Optional["Directory"]):
        super().__init__(name, parent)
        self._content = ""

    def read(self) -> str:
        return self._content

    def write(self, content: str) -> None:
        self._content = content


class Directory(Node):
    def __init__(self, name: str, parent: Optional["Directory"]):
        super().__init__(name, parent)
        self._children: Dict[str, Node] = {}

    def has_child(self, name: str) -> bool:
        return name in self._children

    def get_child(self, name: str) -> Node:
        if name not in self._children:
            raise FileNotFoundError(name)
        return self._children[name]

    def add_child(self, node: Node) -> None:
        if node.name in self._children:
            raise FileExistsError(node.name)
        self._children[node.name] = node
        node.parent = self

    def remove_child(self, name: str) -> Node:
        if name not in self._children:
            raise FileNotFoundError(name)
        node = self._children.pop(name)
        node.parent = None
        return node

    def list_children(self) -> List[str]:
        return sorted(self._children.keys())

    def is_ancestor_of(self, other: "Directory") -> bool:
        curr: Optional[Directory] = other
        while curr is not None:
            if curr is self:
                return True
            curr = curr.parent
        return False


class PathResolver:
    @staticmethod
    def _split(path: str) -> List[str]:
        if path == "/":
            return []
        return [p for p in path.split("/") if p]

    def resolve(self, path: str, root: Directory) -> Node:
        parts = self._split(path)
        curr: Node = root

        for part in parts:
            if not isinstance(curr, Directory):
                raise NotADirectoryError(path)
            curr = curr.get_child(part)

        return curr

    def resolve_parent(self, path: str, root: Directory) -> tuple[Directory, str]:
        parts = self._split(path)
        if not parts:
            raise ValueError("Cannot operate on root")

        parent_path = "/" + "/".join(parts[:-1]) if len(parts) > 1 else "/"
        name = parts[-1]

        parent = self.resolve(parent_path, root)
        if not isinstance(parent, Directory):
            raise NotADirectoryError(parent_path)

        return parent, name


class FileSystem:
    def __init__(self):
        self._root = Directory("/", None)
        self._resolver = PathResolver()

    def mkdir(self, path: str) -> None:
        parent, name = self._resolver.resolve_parent(path, self._root)
        parent.add_child(Directory(name, None))

    def createFile(self, path: str) -> None:
        parent, name = self._resolver.resolve_parent(path, self._root)
        parent.add_child(File(name, None))

    def write(self, path: str, content: str) -> None:
        node = self._resolver.resolve(path, self._root)
        if not isinstance(node, File):
            raise IsADirectoryError(path)
        node.write(content)

    def read(self, path: str) -> str:
        node = self._resolver.resolve(path, self._root)
        if not isinstance(node, File):
            raise IsADirectoryError(path)
        return node.read()

    def ls(self, path: str) -> List[str]:
        node = self._resolver.resolve(path, self._root)
        if isinstance(node, Directory):
            return node.list_children()
        return [node.name]

    def delete(self, path: str) -> None:
        if path == "/":
            raise ValueError("Cannot delete root")
        parent, name = self._resolver.resolve_parent(path, self._root)
        parent.remove_child(name)

    def move(self, src_path: str, dest_path: str) -> None:
        if src_path == "/":
            raise ValueError("Cannot move root")

        src_node = self._resolver.resolve(src_path, self._root)
        dest_node = self._resolver.resolve(dest_path, self._root)

        if not isinstance(dest_node, Directory):
            raise NotADirectoryError(dest_path)

        if isinstance(src_node, Directory) and src_node.is_ancestor_of(dest_node):
            raise ValueError("Cannot move directory into its own subtree")

        src_parent, src_name = self._resolver.resolve_parent(src_path, self._root)
        src_parent.remove_child(src_name)
        dest_node.add_child(src_node)

How to Extend

Large directories

Maintain an ordered structure instead of sorting on demand.

Permissions

Add owner and mode bits to the node

Thread Safety

Introduce global lock or per-directory fine-grained locks

Persistence

Snapshot via DFS serialization or a journal log