Kelly Innes dot com    About

Documenting Unix

Unix obviously has a really strong legacy: it’s the ancestor of Linux and Darwin, the operating system beneath both OS X and iOS, which means even something as bleeding-edge as the Apple Watch is really a “Unix Super Computer On Your Wrist.”

Old Unix documentation is pretty tremendous, too. Paul Ford’s essay “The Great Works of Software” describes Unix documentation as “very simple and plain-spoken, if pretty wonky.” Another Ford piece calls the book The UNIX Programming Environment “an exemplar of plain technical diction, sort of a Strunk & White of computing.”

I just started reading another old Unix book which is similarly lucid, The Design of the UNIX Operating System. The book describes how the Unix kernel works, and it was influential enough to be placed in the “Operating systems” category of Wikipedia’s “List of important publications in computer science.” Since the book’s approach centers on the kernel (The UNIX Programming Environment sticks mostly with the shell), it delves into hardware and – especially – memory management. Even so, it’s surprisingly readable for a book about an operating system.

One example – exemplary, but maybe not extraordinary in comparison with the rest of it – is the book’s explanation of how the Unix kernel translates a pathname typed into the shell into an inode to return the correct file from the file system. Think of an inode as the kernel’s index for a file, a bundle of metadata which (in part) associates human-readable filenames with machine-readable blocks of memory.

(In Unix, everything’s a file, directories included.)

When you type something like

cd ~/desktop/ruby_projects

or

cd /etc/passwd

at the $ prompt in the shell, the kernel uses an algorithm to parse the pathname into the correct inode, returning the inode in the form of the pathname you requested: ~/desktop/ruby_projects, /etc/passwd, or whatever.

It’s apparently called the namei algorithm, and Design presents it in C-like pseudocode which I’ll rewrite to be Ruby-flavored:

def namei(pathname)
  working_inode = current_directory_inode
  
  if pathname begins at root
    working_inode = root_inode
  end

  until the algorithm reaches the end of the pathname
    temp_pathname = next pathname component from input
    return an error if permissions don't allow access to the file
    
    if working_inode == root && temp_pathname == ".."
      restart until loop
    end

    working_directory = list of files at working_inode
    if working_directory includes temp_pathname
      working_inode = inode of temp_pathname
    else
      return # since the inode wasn't found!
    end
  end

  return working_inode # found it!
end

Here’s how The Design of the UNIX Operating System explains namei (it’s on page 76 of the 1986 version, if you want to see it for yourself):

suppose a process wants to open the file “/etc/passwd”. When the kernel starts parsing the filename, it encounters “/” and gets the system root inode. Making root its current working inode, the kernel gathers in the string “etc”. After checking that the current inode is that of a directory (“/”) and that the process has the necessary permissions to search it, the kernel searches root for a file whose name is “etc”: It accesses the data in the root directory block by block and searches every block one entry at a time until it locates an entry for root (algorithm iput) and allocates the inode for “etc” (algorithm iget) according to the inode numbers of the entry just found. After ascertaining that “etc” is a directory and that it has the requisite search permissions, the kernel searches “etc” block by block for a directory structure entry for the file “passwd”. … On finding it, the kernel releases the inode for “etc”, allocates the inode for “passwd”, and – since the path name is exhausted – returns that inode.

I like it because it’s complex and clear: it requires attention, but with patience it’s easy to follow. It’s almost pseudocode, albeit formatted in a paragraph instead of a block. I’m looking forward to reading the rest of the book.

more about Unix & Linux: