NFS

CSCI 333 :: Storage Systems

Spring 2021

Network File System (NFS)

NFS is a protocol developed by Sun Microsystems. NFS uses the client-server model, where one powerful server is accessed by potentially many clients. This model has several advantages, but also presents interesting implementation challenges. When thinking about NFS, it is important to remember that when it was designed in the 1980s, “the cloud” was not the omnipresent force that it is today; most data only ever lived on local storage devices.

The motivating use case for NFS is more or less a mid-sized office building (not exactly inspiring, but it was an incredibly important use case). That being said, you can easily map the NFS “target environment” to our computer science department’s lab:

Although NFS is less exciting than it once was, it was an incredibly important step forward. We will reference NFS design points in many of our later discussions this semester, even if just as an example of what not to do.

Learning Objectives

The client-server model

NFS is an open protocol designed by Sun, essentially to create a market in which Sun could sell powerful storage servers. There are many benefits to the client/server model.

In addition to the model itself, NFSv2 was developed around some interesting design principles. It was an influential model that continues to be used to this day (albeit with a slightly evolved design).

How the model works

Clients that use NFS operate as if they were using a local file system. Under normal operation, the fact that client machines are communicating over a network is transparent: clients are presented with the illusion of a local file system.

NFS Design Goals

Since the server is the single most important machine in the protocol (many clients connect to the server, and if the server goes down, the clients are largely unusable), the NFS design prioritizes simple and efficient server recovery. Two design elements that I find particularly elegant are that the protocol is stateless and that its common operations are idempotent.

NFS implementation

File Handles. To make NFS a stateless protocol, clients package their context into an NFS file handle. File handles contain a volume ID, an inode number, and a generation number.

The file handle is an essential part of almost every NFS protocol message.
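To make the role of the file handle concrete, here is a minimal sketch in Python. It is a toy model, not the real NFS wire format: the names `FileHandle`, `ToyServer`, and `StaleHandle` are hypothetical, and the only detail taken from the description above is that a handle carries a volume ID, an inode number, and a generation number. The sketch assumes the generation number is bumped whenever an inode number is reused, so that old handles can be detected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHandle:
    """Hypothetical handle layout: the three fields named in the text."""
    volume_id: int
    inode: int
    generation: int


class StaleHandle(Exception):
    """Raised when a handle no longer matches the file stored on the server."""


class ToyServer:
    """Stores file data only; keeps no per-client state (no open files, no offsets)."""

    def __init__(self, volume_id):
        self.volume_id = volume_id
        self.inodes = {}  # inode number -> (generation, contents)

    def create(self, inode, data):
        # Assumption: reusing an inode number bumps its generation,
        # which invalidates any handle issued for the old file.
        old_generation = self.inodes.get(inode, (0, b""))[0]
        generation = old_generation + 1
        self.inodes[inode] = (generation, data)
        return FileHandle(self.volume_id, inode, generation)

    def read(self, handle, offset, count):
        # The request itself carries all the context needed to serve it.
        entry = self.inodes.get(handle.inode)
        if (entry is None
                or handle.volume_id != self.volume_id
                or handle.generation != entry[0]):
            raise StaleHandle(handle)
        return entry[1][offset:offset + count]


server = ToyServer(volume_id=1)
fh = server.create(inode=7, data=b"hello, nfs")
first = server.read(fh, 0, 5)
retry = server.read(fh, 0, 5)  # resending the same request is harmless
```

Because each read names the file, the offset, and the count explicitly, the server in this sketch could restart between `first` and `retry` (keeping only its stored files) and still answer the retried request the same way.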

Client side caching. NFS clients are separated from the server by a network. The latency of accessing a remote machine over a network is higher than the latency of accessing a local device. Since many file system operations are translated into multiple messages, each of which must cross the network in a round trip, NFS clients cache data when possible. At a high level:

This caching introduces a challenge: consistency. Multiple clients can open the same file. If they each cache parts of their data locally, their view of a given file’s contents can become out of sync in the presence of updates.
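The consistency problem can be reproduced in a few lines. The sketch below is an illustration under stated assumptions, not a description of any real NFS client: `ToyFileServer`, `CachingClient`, and the integer `version` (standing in for a file's modification time) are all hypothetical names. It contrasts a read served blindly from the local cache with a read that first asks the server whether the file has changed.

```python
class ToyFileServer:
    """Holds file contents plus a version counter per file."""

    def __init__(self):
        self.data = {}
        self.version = {}  # stand-in for a modification time

    def write(self, name, contents):
        self.data[name] = contents
        self.version[name] = self.version.get(name, 0) + 1

    def get_version(self, name):
        return self.version.get(name, 0)

    def read(self, name):
        return self.data[name]


class CachingClient:
    """Caches whole files locally, tagged with the version seen at fetch time."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # name -> (version, contents)

    def _fetch(self, name):
        self.cache[name] = (self.server.get_version(name), self.server.read(name))

    def read_cached(self, name):
        # Fast path: never revalidates, so it can return stale data.
        if name not in self.cache:
            self._fetch(name)
        return self.cache[name][1]

    def read_validated(self, name):
        # One small round trip to compare versions; refetch only on change.
        cached = self.cache.get(name)
        if cached is None or cached[0] != self.server.get_version(name):
            self._fetch(name)
        return self.cache[name][1]


server = ToyFileServer()
server.write("notes.txt", b"v1")

client_a = CachingClient(server)
before = client_a.read_cached("notes.txt")

server.write("notes.txt", b"v2")  # an update made by some other client

stale = client_a.read_cached("notes.txt")      # still sees the old contents
fresh = client_a.read_validated("notes.txt")   # notices the version change
```

The validated read trades one extra message for a fresher view. How often a client performs that check is the design knob: checking on every read defeats much of the cache's benefit, while never checking leaves clients out of sync.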

Server side caching. NFS servers must persist data, but they may also want to buffer updates for performance reasons.
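The tension between persisting and buffering can be sketched with a toy server that has two kinds of storage: a memory buffer that is lost on a crash and stable storage that survives. Everything here is hypothetical (`ToyStorageServer`, the `flush_before_ack` flag, and the in-memory dictionaries that stand in for a disk); the point is only to show what a client can observe when a server acknowledges a write it has merely buffered.

```python
class ToyStorageServer:
    """Simulates a server with a volatile buffer and crash-surviving storage."""

    def __init__(self, flush_before_ack):
        self.flush_before_ack = flush_before_ack
        self.memory_buffer = {}   # lost on a crash
        self.stable_storage = {}  # survives a crash

    def write(self, block, data):
        self.memory_buffer[block] = data
        if self.flush_before_ack:
            self.flush()
        return "ok"  # the acknowledgement sent back to the client

    def flush(self):
        # Move all buffered updates to stable storage.
        self.stable_storage.update(self.memory_buffer)
        self.memory_buffer.clear()

    def crash_and_restart(self):
        # Only the volatile buffer is wiped.
        self.memory_buffer.clear()

    def read(self, block):
        if block in self.memory_buffer:
            return self.memory_buffer[block]
        return self.stable_storage.get(block)


buffering = ToyStorageServer(flush_before_ack=False)
careful = ToyStorageServer(flush_before_ack=True)

for srv in (buffering, careful):
    ack = srv.write(0, b"important")  # both servers reply "ok"
    srv.crash_and_restart()

lost = buffering.read(0)  # acknowledged, then silently lost
kept = careful.read(0)    # acknowledged only after it was persisted
```

In this sketch the buffering server is faster per write but breaks the client's assumption that an acknowledged write is durable, which is why a server that buffers updates needs some way to make them safe before (or despite) a crash.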

Future Versions

The NFS protocol has had several iterations. The textbook focuses on NFSv2 and omits discussion of NFSv3 and NFSv4. The later versions implement some optimizations to minimize the number of “round trips” between the client and server for common operations, and to improve client performance by rolling back some of the design principles of v2. To summarize:

Questions:

1. How do idempotent operations simplify the design of how clients respond to server crashes/disconnections and recovery?
2. How does statelessness simplify NFS design (e.g., what are some challenges that would be added to the server if it needed to maintain state)?
3. How might removing statelessness improve NFS performance?
4. What are some alternatives to close-to-open consistency? What would be the costs of implementing those alternatives in NFS?