Assignment 2

CSCI 333 : Storage Systems

Spring 2021

Learning Objectives

This assignment serves several purposes. The main goal is to build comfort with FUSE so that we may implement our own FFS-like file system in upcoming weeks. Specifically, by the end of this lab you should:

Assignment Overview

In this assignment, you will be writing a pseudo-filesystem that slightly extends the basic functionality of the “Hello world” file system example from the FUSE repository: https://github.com/libfuse/libfuse/tree/fuse_2_9_5.

You are encouraged to explore online FUSE resources, including source code from sample file systems. Please share any good resources that you find. But you should never seek out or view source code for solutions to these labs. Any attempt to do so would be considered a violation of the honor code.

Assignment Logistics

Repositories & Starter Code

For this assignment, you are strongly encouraged to work in a group. If your unique situation makes collaboration impossible, please let me know.

Each student will be given a private repository under the Williams-CS GitHub organization, and only you will have read/write access to that repository. This repository will contain a copy of the starter code for the lab. You may commit and push code to this repository if that is helpful, but the main purpose of your private repository is so that you may write your own Eval.md (each group member must submit a private Eval.md).

After you form groups, I will create shared repositories for your team’s code development. You should commit your code to this repository as you make progress. I highly recommend that you commit code early and often: it will only help because the teaching staff can view your code and easily answer questions using the GitHub interface.

Collaboration

Since your team will be submitting a single repository with shared code, teammates may collaborate without restriction. In fact, I strongly encourage you to pair program whenever possible (Zoom screen sharing makes this much easier to schedule). The code you will write for this lab is significantly shorter than the code you will write in the next lab, so it is important that you understand and are involved in each function.

Tips for pair programming:

You may also discuss high-level questions with other classmates. High-level questions include:

The rule of thumb is that you should never view any classmate’s code if they are not on your team.

Getting Started

The purpose of this “Getting Started” section is to help you become familiar with much of FUSE, learn how to return errors, and develop a framework that will be useful when we begin our FUSE Simple FS implementation.

For this lab, we will consult two starter files from FUSE version 2.9.5’s examples/ directory (later FUSE versions broke backwards compatibility; API version 26 is the one installed on our Lab’s Ubuntu machines). You will also find copies of these in your repositories: fusexmp.c and hello.c

To quote the README from the FUSE repository:

FUSE comes with several example file systems in the examples directory. For example, the fusexmp example mirrors the contents of the root directory under the mountpoint. Start from there and adapt the code!

So let’s do that. After cloning your private repository, start by copying fusexmp.c to my-hello.c. We will use this as the “skeleton” to build our “new and improved” hello.c implementation.

An obvious question is: why would we start from fusexmp.c instead of hello.c when we are extending the hello.c functionality? Honestly, either approach is possible, but here is my reasoning: hello.c includes a subset of the necessary functions, while fusexmp.c includes a superset of them; your my-hello.c will fall somewhere in between. It is easy to copy the relevant parts of the hello.c functions when needed, but it is nice to have the whole “code skeleton” as a starting point.

A good next step is to replace the code in my-hello.c functions with some debugging information. For example, in every function defined in struct fuse_operations xmp_oper (i.e., xmp_getattr, xmp_access, xmp_readlink, …) you could replace the function body with:

   printf("my-hello <functionname>\n");
   return -ENOSYS;

The goal of this substitution is to create a FS that will compile, but that will not support any functions. If we compile our FS and mount it, any operations that we try to perform on our FS will fail. But we will see exactly which functions are called in response to different actions, and hopefully better understand the FUSE framework.
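To keep the stubs uniform, the substitution can be wrapped in a small macro. The following is only a sketch: the STUB macro is a hypothetical helper (it is not part of FUSE or of the starter code), and only two of the fusexmp.c callbacks are shown, using the signatures that FUSE 2.x (API 26) declares for getattr and truncate.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: log the name of the enclosing function, then fail
 * with "function not implemented". stderr is unbuffered by default, so the
 * message appears as soon as the callback runs. */
#define STUB() (fprintf(stderr, "my-hello: %s\n", __func__), -ENOSYS)

/* Same signature as the getattr callback in fusexmp.c. */
static int xmp_getattr(const char *path, struct stat *stbuf)
{
    (void) path;    /* unused while the function is only a stub */
    (void) stbuf;
    return STUB();
}

/* Same signature as the truncate callback in fusexmp.c. */
static int xmp_truncate(const char *path, off_t size)
{
    (void) path;
    (void) size;
    return STUB();
}
```

Because __func__ expands inside whichever function uses the macro, each stub reports its own name without any per-function editing. The messages are only visible if the file system stays attached to the terminal, for example by mounting with the -f (foreground) option.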

Try that out: compile, mount, and explore.

The next step is to copy the useful hello.c functionality into our skeleton. This will give you a starting point for your my-hello.c spec. But which parts are relevant? Answering that requires understanding the hello.c code as well as the assignment spec (next section).

Assignment Tasks

File System Requirements

In this lab, you will develop a filesystem with the following characteristics:

You should also write tests to verify that you’ve met this File System specification.

Testing

After last lab, it may seem that the easiest way to test your writing is to redirect program output to the file:

 $ echo "Here is some new text for my file." > testdir/cs333.txt

However, executing this command may result in several FS-related system calls, not just write()! Tracing through your code using gdb will help you to understand which calls are made and with which arguments. It would be desirable to write tests that isolate just one behavior.

In our previous lab, we wrote tests using bash scripts. That same approach is reasonable when testing significant portions of this lab’s required functionality, but you should also consider writing some tests in C; when you write your own small C program, you can call the exact libc functions that you wish, rather than relying on a shell program that does multiple things at once. Once written, you can call your small C program from your bash script with the appropriate arguments (take advantage of the fact that the return value from your C program’s main function is communicated back to the shell, accessible with $?).
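As one possible shape for such a helper, the sketch below issues a single truncate(2) call and nothing else, so a test built on it exercises only the truncate path of the file system. The function name truncate_only and the choice of 0 and 1 as status values are assumptions made for this illustration.

```c
#define _XOPEN_SOURCE 700   /* expose truncate(2) under strict -std=c99 */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Set the size of `path` to exactly `length` bytes using only truncate(2).
 * Returns 0 on success and 1 on failure, so the value can be used directly
 * as a process exit status. */
int truncate_only(const char *path, off_t length)
{
    if (truncate(path, length) == -1) {
        fprintf(stderr, "truncate(%s, %lld): %s\n",
                path, (long long) length, strerror(errno));
        return 1;
    }
    return 0;
}
```

A main function that parses its arguments and returns truncate_only(...) would make the result visible to a bash script through $?: zero means the call succeeded, and anything else means the test should report a failure.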

Development Environment

You can do your development inside a virtual machine or on a “real” system if you have access to one. If you would like, the department has a small number of machines, which we affectionately call “panics” in honor of the “kernel panic” (the name for an error in your OS that crashes your machine), that we can set up for you to have sudo privileges: you will be able to install any system software that you need/want, and you can completely customize your environment.

However, your final code should be runnable on an Ubuntu 20.04 machine using FUSE version 2.9.9 (the default version in Ubuntu 20.04’s repository).

FUSE

As linked from the readings, Professor Kuenning of Harvey Mudd College maintains a web page with FUSE reference materials. This page has the most complete documentation that I have been able to find.

Running & Debugging

The above FUSE documentation page contains instructions on how to run and debug FUSE programs in general. ** NOTE: GDB is useful for understanding code EVEN IF YOU DON’T HAVE A BUG **

Compiling

Compiling a FUSE program requires a slightly complicated command:

 $ gcc -g -Og my-hello.c -o my-hello `pkg-config fuse --cflags --libs`

A better approach, however, is to use make. I have placed a minimal Makefile in your repository that will compile the single C file, my-hello.c, using the appropriate commands. I encourage you to further develop this Makefile to fit your needs.

Evolving Advice (check back for updates)

After completing the steps in the Getting Started section, you should have a file system that compiles, mounts, and implements all of the functionality that the hello.c example file system implements. In addition, when you attempt to execute operations on your FUSE file system that are not implemented, you should see debugging print messages saying which function was called, and a return value of -ENOSYS (an error).
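Returning -ENOSYS is one instance of a more general convention: a FUSE callback reports failure by returning a negated errno value, and most callbacks return 0 on success (read and write return a byte count instead). The sketch below shows the pattern in isolation, in the style fusexmp.c uses when it forwards a request to the underlying file system. The name passthrough_getattr is hypothetical, and this is not what my-hello.c itself needs to do for getattr.

```c
#define _XOPEN_SOURCE 700   /* expose lstat(2) under strict -std=c99 */
#include <errno.h>
#include <sys/stat.h>

/* Forward a getattr-style request to lstat(2), translating failure into
 * the negated-errno form that FUSE callbacks are expected to return. */
static int passthrough_getattr(const char *path, struct stat *stbuf)
{
    if (lstat(path, stbuf) == -1)
        return -errno;   /* for example, -ENOENT when path does not exist */
    return 0;
}
```

The same shape applies when the error originates in your own code rather than in a system call: pick the errno constant that best describes the failure and return its negation.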

Using this strategy should help you to identify which “missing” functions need to be implemented. However, there are also things you’ll need to change in order to adapt the read-only hello.c file system’s functionality to support writes. The best ways to understand what functionality is needed are to:

Advice for Functions

Here are some notes with things to keep in mind for some of the functions.

Advice for Testing

The most “straightforward” way to test some functionality is to overwrite the file and then read it back to verify that the data has changed:

 $ echo "new data" > mnt/cs333.txt
 $ cat mnt/cs333.txt
 new data
 $

However, you have hopefully observed that the first command doesn’t just overwrite the existing data: it first truncates the file to length zero, then makes a write request starting at offset 0.
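In C terms, the redirection behaves roughly like the sequence sketched below. This is an approximation for illustration, not a trace of what any particular shell executes: the function name overwrite_like_shell is hypothetical, and the exact set of FUSE callbacks that results is best confirmed with your own debugging output.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Roughly what `echo "new data" > path` asks the kernel to do.
 * Returns 0 on success and -1 on failure. */
int overwrite_like_shell(const char *path, const char *text)
{
    /* O_TRUNC discards the old contents before any write is issued. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (fd == -1)
        return -1;

    /* The write therefore starts at offset 0 of a now-empty file. */
    size_t len = strlen(text);
    ssize_t n = write(fd, text, len);
    int ok = (n == (ssize_t) len);

    if (close(fd) == -1)
        ok = 0;
    return ok ? 0 : -1;
}
```

Because the old contents are gone before the write arrives, this sequence never exercises a write that lands in the middle of existing data or extends past the current end of the file.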

So while this is a very useful test, it doesn’t handle all of the cases that we need to consider.

To cover those remaining cases, I suggest using a small C program that does three things:

  1. Opens a file for writing.
  2. Calls the write (or pwrite) system call.
  3. Closes the file.

Then, we could use that C program inside a bash script to test the various cases.

Here’s an example (feel free to use/adapt it). The program takes 3 arguments:
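A minimal sketch of such a program follows, assuming (hypothetically) that the three arguments are the path of the file, a byte offset, and the text to write at that offset. The logic lives in a function named write_test, also a hypothetical name, so that it can be called from a one-line main.

```c
#define _XOPEN_SOURCE 700   /* expose pwrite(2) under strict -std=c99 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical usage: write-test <path> <offset> <text>
 * Returns 0 on success, 1 if a system call fails, 2 on bad arguments. */
int write_test(int argc, char *argv[])
{
    if (argc != 4) {
        fprintf(stderr, "usage: %s <path> <offset> <text>\n",
                argc > 0 ? argv[0] : "write-test");
        return 2;
    }

    char *end;
    errno = 0;
    long long offset = strtoll(argv[2], &end, 10);
    if (errno != 0 || end == argv[2] || *end != '\0' || offset < 0) {
        fprintf(stderr, "bad offset: %s\n", argv[2]);
        return 2;
    }

    /* 1. Open for writing. No O_TRUNC or O_CREAT: existing data stays put. */
    int fd = open(argv[1], O_WRONLY);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    /* 2. Write the text at the requested offset. */
    size_t len = strlen(argv[3]);
    ssize_t n = pwrite(fd, argv[3], len, (off_t) offset);
    int status = 0;
    if (n == -1) {
        perror("pwrite");
        status = 1;
    } else if ((size_t) n != len) {
        fprintf(stderr, "short write: %zd of %zu bytes\n", n, len);
        status = 1;
    }

    /* 3. Close the file; a failed close also counts as a failed test. */
    if (close(fd) == -1) {
        perror("close");
        status = 1;
    }
    return status;
}
```

Adding `int main(int argc, char *argv[]) { return write_test(argc, argv); }` turns this into a complete program. Under these assumptions, a call with offset 6 and the text CS333 on a file containing "Hello World!\n" should leave "Hello CS333!\n" behind, and offsets at or beyond the end of the file test the cases that extend it.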

Then, after compiling it to an executable named write-test, I can use it to write tests by:

  1. Creating a reference file outside my my-hello file system that has the same initial contents as my default file’s contents (e.g., "Hello World!\n").
  2. Performing the same operations on the reference file as I do on my default file.
  3. Comparing the files.

Evaluation

In your repository, you will find a file called Eval.md. In it, you should assess your my-hello.c FUSE file system’s correctness (it should implement the behavior as described above), code clarity (Did you write small modular functions that you compose to complete the program’s task? Did you sufficiently document your programs so that you could understand the code if you were to revisit it a year from now? Did you choose good variable names, consistently indent, and define your variables in the appropriate scope (e.g., using return values to communicate across functions rather than updating global variables)?), and error-handling (Did you check the return value of all non-printing functions, and handle success/failure appropriately?). You should also comment on the quality and comprehensiveness of your testing.

In addition to your code, you should assess your process. Think about your contribution to your group. Everyone brings different backgrounds and experiences to a partnership. Did you contribute a fair amount of effort towards the assignment? Did you or your CS background grow during the project as a result of working with your partner(s)? Did you help your partner(s) grow? Are there things you would do differently in your next partnership?

However, this class is not just about the final product. There are often things that do not show up in your git commit history. Did you take the time to get comfortable with reading man pages? Did you spend time building your gdb skills? Did you explore bash syntax to help you write creative tests? Did you overcome any challenging bugs or situations that you are proud of? Doing this takes time, and these are investments that you should be rewarded for making. (Hopefully the promise of making your life easier down the line is a pay-off, but your efforts should also be acknowledged now). So at the end of your Eval.md, you should document your experience, including the ups and downs, and reflect on how you spent your time. Convince me and convince yourself that you spent the time to learn the material.

The format of the Eval.md that I provide is a suggestion. Ultimately, how you reflect on your Lab Assignment is up to you.

Submitting Your Work

When you have completed the lab, submit your individual repository as well as your team’s code using the appropriate git commands, such as:

  $ git status
  $ git add ...
  $ git commit -m "final submission"
  $ git push

Verify that your changes appear on GitHub by navigating to your repositories using the web interface. They should be available at https://github.com/williams-cs/cs333lab2-{USERNAME} and https://github.com/williams-cs/cs333lab2-{USERNAME}_{USERNAME}_{USERNAME}. You should see all changes reflected in the various files that you submit. If not, go back and make sure you committed and pushed. I will be retrieving all lab code from GitHub, so if your changes are not visible to you on GitHub, they will not be visible to me either. I want to make sure everyone receives credit for their work!


Assignment Credit

This lab (and the subsequent FUSE FS labs) was influenced by similar assignments created by Geoff Kuenning at Harvey Mudd College as part of his CS137 course materials.