Spring 2021
This assignment serves several purposes. The main goal is to build comfort with FUSE so that we may implement our own FFS-like file system in upcoming weeks. Specifically, by the end of this lab you should:
In this assignment, you will be writing a pseudo-filesystem that slightly extends the basic functionality of the “Hello world” file system example from the FUSE repository (https://github.com/libfuse/libfuse/tree/fuse_2_9_5).
You are encouraged to explore online FUSE resources, including source code from sample file systems. Please share any good resources that you find. But you should never seek out or view source code for solutions to these labs. Any attempt to do so would be considered a violation of the honor code.
For this assignment, you are strongly encouraged to work in a group. If your unique situation makes collaboration impossible, please let me know.
Each student will be given a private repository under the Williams-CS GitHub organization, and only you will have read/write access to that repository. This repository will contain a copy of the starter code for the lab. You may commit and push code to this repository if that is helpful, but the main purpose of your private repository is so that you may write your own Eval.md (each group member must submit a private Eval.md).
After you form groups, I will create shared repositories for your team’s code development. You should commit your code to this repository as you make progress. I highly recommend that you commit code early and often: the teaching staff can then view your code and easily answer questions using the GitHub interface.
Since your team will be submitting a single repository with shared code, teammates may collaborate without restriction. In fact, I strongly encourage you to pair program whenever possible (Zoom screen sharing makes this much easier to schedule). The code you will write for this lab is significantly shorter than the code you will write in the next lab, so it is important that you understand and are involved in each function.
Tips for pair programming:
You may also discuss high-level questions with other classmates. High-level questions include:
The rule of thumb is that you should never view any classmate’s code if they are not on your team.
The purpose of this “Getting Started” section is to help you become familiar with FUSE, learn how to return errors, and develop a framework that will be useful when we begin our FUSE Simple FS implementation.
For this lab, we will consult two starter files from FUSE version 2.9.5’s examples/ directory (later FUSE versions broke backwards compatibility; API version 26 is the one installed on our Lab’s Ubuntu machines). You will also find copies of these in your repositories: fusexmp.c and hello.c.
To quote the README from the FUSE repository:
FUSE comes with several example file systems in the examples directory. For example, the fusexmp example mirrors the contents of the root directory under the mountpoint. Start from there and adapt the code!
So let’s do that. After cloning your private repository, start by copying fusexmp.c to my-hello.c. We will use this as the “skeleton” to build our “new and improved” hello.c implementation.
An obvious question is: why would we start from fusexmp.c instead of hello.c when we are extending the hello.c functionality? Honestly, taking either approach is possible, but here is my reasoning: hello.c includes a subset of the necessary functions, while fusexmp.c includes a superset of the necessary functions; your my-hello.c will fall somewhere in between. It is easy to copy the relevant parts of the hello.c functions when needed, but it is nice to have the whole “code skeleton” as a starting point.
A good next step is to replace the code in the my-hello.c functions with some debugging information. For example, in every function defined in struct fuse_operations xmp_oper (i.e., xmp_getattr, xmp_access, xmp_readlink, …) you could replace the function body with:
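One possible stub is sketched below for xmp_getattr. It is an illustration rather than the starter code's exact text: it assumes the getattr signature from the FUSE 2.9 API, prints the name of the callback that was invoked, and returns -ENOSYS. The same body could be pasted into each of the other callbacks; only the signature changes.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/* Debugging stub: report which callback FUSE invoked, then fail.
 * The signature is assumed to match the getattr entry of
 * struct fuse_operations in the FUSE 2.9 API. */
static int xmp_getattr(const char *path, struct stat *stbuf)
{
    (void)stbuf;  /* unused until the real implementation is written */
    fprintf(stderr, "%s called on path %s\n", __func__, path);
    return -ENOSYS;  /* "function not implemented" */
}
```

The messages go to stderr, so they are only visible if the file system stays attached to your terminal, for example by mounting it with the -f (foreground) option.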
The goal of this substitution is to create a FS that will compile, but that will not support any functions. If we compile our FS and mount it, any operations that we try to perform on our FS will fail. But we will see exactly which functions are called in response to different actions, and hopefully better understand the FUSE framework.
Try that out: compile, mount, and explore.
The next step is to copy the useful hello.c functionality into our skeleton. This will give you a starting point for your my-hello.c spec. But which parts are relevant? Answering that requires understanding the hello.c code as well as the assignment spec (next section).
In this lab, you will develop a filesystem with the following characteristics:
There is a single regular file, with a name of your choice (I’ll use “cs333.txt” as an example, but you may substitute whatever file name you choose in place of “cs333.txt”).
ls and ls -l should work on the root directory, and they should return plausible values for “.” (dot), “..”, and “cs333.txt”. This means that you must make sure that “cs333.txt” always has the correct length, even after modification.
The access() system call should work and it should correctly indicate that “cs333.txt” can be both read and written by its owner.
“cs333.txt” (or whatever you name your file) should initially contain a short string of your choosing (e.g., "Teamwork: when other people do the work for you."). For cosmetic reasons, your initial string should be terminated by a newline, but do not include a trailing null byte.
When read, “cs333.txt” (or whatever you name your file) should return its contents (following the appropriate behavior of read()).
Writes to “cs333.txt” (or whatever you name your file) should overwrite the contents of the file, possibly extending the string if the data extends beyond the current file’s size.
Implementing write() correctly requires that you also implement truncate(). The truncate() system call can be used to shorten OR lengthen a file. If you extend the length of the file, make sure to “zero-fill” bytes beyond the previous file length so that you do not have uninitialized data in your file.
Other operations are up to you. Any unimplemented function should return an error code. The error code ENOSYS is appropriate for unimplemented system calls. The complete list of error codes can be found in /usr/include/asm-generic/errno*.h; the most common error codes have values below ~50.
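The write and truncate requirements above can be sketched against an in-memory buffer. This is a minimal illustration, not a required design: struct mem_file, mem_truncate, and mem_write are hypothetical names, and the sketch assumes that the file's contents live in a single heap allocation. Both helpers follow the FUSE convention of returning a negative errno value on failure.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical in-memory backing store for the single file. */
struct mem_file {
    char  *data;  /* heap buffer holding the contents (NULL when empty) */
    size_t size;  /* current file length in bytes */
};

/* Resize the file to new_size bytes, zero-filling any newly exposed bytes.
 * Returns 0 on success or a negative errno value. */
static int mem_truncate(struct mem_file *f, off_t new_size)
{
    if (new_size < 0)
        return -EINVAL;
    size_t n = (size_t)new_size;
    if (n == 0) {              /* avoid realloc(ptr, 0), whose result varies */
        free(f->data);
        f->data = NULL;
        f->size = 0;
        return 0;
    }
    char *p = realloc(f->data, n);
    if (p == NULL)
        return -ENOMEM;        /* the old buffer is still valid and unchanged */
    if (n > f->size)
        memset(p + f->size, 0, n - f->size);  /* zero-fill the extension */
    f->data = p;
    f->size = n;
    return 0;
}

/* Copy size bytes from buf into the file at offset, growing it if needed.
 * Returns the number of bytes written or a negative errno value. */
static int mem_write(struct mem_file *f, const char *buf, size_t size,
                     off_t offset)
{
    if (offset < 0)
        return -EINVAL;
    if (size == 0)
        return 0;              /* a zero-length write changes nothing */
    size_t end = (size_t)offset + size;
    if (end > f->size) {
        int rc = mem_truncate(f, (off_t)end);  /* also zero-fills any gap */
        if (rc != 0)
            return rc;
    }
    memcpy(f->data + offset, buf, size);
    return (int)size;
}
```

A truncate or write callback could check the path and then delegate to helpers like these, returning their result directly. Because the length is tracked in one place, getattr can report f->size and stay correct after every modification.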
You should also write tests to verify that you’ve met this File System specification.
After last lab, it may seem that the easiest way to test your writing is to redirect program output to the file:
However, executing this command may result in several FS-related system calls, not just write()! Tracing through your code using gdb will help you to understand which calls are made and with which arguments. It would be desirable to write tests that isolate just one behavior.
In our previous lab, we wrote tests using bash scripts. That same approach is reasonable when testing significant portions of this lab’s required functionality, but you should also consider writing some tests in C; when you write your own small C program, you can call the exact libc functions that you wish, rather than relying on a shell program that does multiple things at once. Once written, you can call your small C program from your bash script with the appropriate arguments (take advantage of the fact that the return value from your C program’s main function is communicated back to the shell, accessible with $?).
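As a sketch of what such a small C test might look like, the helper below calls access() and nothing else, and turns the result into a value suitable for use as an exit status. The name check_rw_access is hypothetical, and the choice to test read and write permission together is an assumption based on the spec's access() requirement.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Return 0 if path is both readable and writable by the caller, 1 otherwise.
 * The only libc call made on the path is access(); the kernel may still
 * issue other requests (such as attribute lookups) to resolve the path. */
static int check_rw_access(const char *path)
{
    if (access(path, R_OK | W_OK) != 0) {
        fprintf(stderr, "access(%s): %s\n", path, strerror(errno));
        return 1;
    }
    return 0;
}
```

A main function that returns check_rw_access(argv[1]) would let a bash script run the program on the mounted file and then inspect $? to decide whether the test passed.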
You can do your development inside a virtual machine or on a “real” system if you have access to one. If you would like, the department has a small number of machines, which we affectionately call “panics” in honor of the “kernel panic” (the name for an error in your OS that crashes your machine), that we can set up for you to have sudo privileges: you will be able to install any system software that you need/want, and you can completely customize your environment.
However, your final code should be runnable on an Ubuntu 20.04 machine using FUSE version 2.9.9 (the default version in Ubuntu 20.04’s repository).
As linked from the readings, Professor Kuenning of Harvey Mudd College maintains a web page with FUSE reference materials. This page has the most complete documentation that I have been able to find.
The above FUSE documentation page contains instructions on how to run and debug FUSE programs in general. NOTE: GDB is useful for understanding code EVEN IF YOU DON’T HAVE A BUG.
Compiling a FUSE program requires a slightly complicated command:
A better approach, however, is to use make. I have placed a minimal Makefile in your repository that will compile the single C file, my-hello.c, using the appropriate commands. I encourage you to further develop this Makefile to fit your needs.
After completing the steps in the Getting Started section, you should have a file system that compiles, mounts, and implements all of the functionality that the hello.c example file system implements. In addition, when you attempt to execute operations on your FUSE file system that are not implemented, you should see debugging print messages saying which function was called, and a return value of -ENOSYS (an error).
Using this strategy should help you to identify which “missing” functions need to be implemented. However, there are also things you’ll need to change in order to adapt the read-only hello.c file system’s functionality to support writes. The best ways to understand what functionality is needed are to:
Here are some notes with things to keep in mind for some of the functions.
open(const char* path, struct fuse_file_info* fi):
If the path argument does not match your file’s path, then you should return -ENOENT.
access(const char* path, int mask):
The bits of the mask arg signify the requested permissions.
Return success (0) when asked for any combination of read and write permissions. The symbolic constants R_OK and W_OK should be used to make your code more readable and portable.
… / should be X_OK.
truncate(const char* path, off_t size):
One possible strategy is that “cs333.txt” (or equivalent) is not actually stored on disk, but is instead stored in a malloc()-allocated region of memory. If this is your strategy, truncate() bears some resemblance to the realloc() function that you may have implemented in your CS237 memory allocator lab.
write(const char* path, const char *buf, size_t size, off_t offset, struct fuse_file_info* fi):
… write() function failed.
The most “straightforward” way to test some functionality is to overwrite the file and then read it back to verify that the data has changed:
However, you have hopefully observed that the first command doesn’t just overwrite the existing data: it first truncates the file to length zero, then makes a write request starting at offset 0.
So while this is a very useful test, it doesn’t handle all of the cases that we need to consider.
To do this, I suggest using a small C program that does three things:
… write (or pwrite) system call.
Then, we could use that C program inside a bash script to test the various cases.
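A minimal sketch of the core of such a program is shown below. The helper name write_at and its parameters (a path, a string to write, and a byte offset) are assumptions made for illustration. It opens the file without O_TRUNC, so no truncate request precedes the write, and it uses pwrite so that the data lands at an explicit offset.

```c
#define _XOPEN_SOURCE 700  /* request the pwrite() declaration */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the string data into the existing file at path, starting at the
 * given byte offset. The file is opened without O_TRUNC or O_CREAT.
 * Returns 0 on success and 1 on any failure. */
static int write_at(const char *path, const char *data, off_t offset)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    size_t len = strlen(data);
    ssize_t n = pwrite(fd, data, len, offset);
    if (n < 0) {
        perror("pwrite");
        close(fd);
        return 1;
    }
    if (close(fd) != 0) {
        perror("close");
        return 1;
    }
    return (size_t)n == len ? 0 : 1;  /* treat a short write as failure */
}
```

A main function could parse its command-line arguments (for example with strtol for the offset) and return the result of write_at, so that a bash script can check $? and then compare the file's contents against the expected string.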
Here’s an example (feel free to use/adapt it). The program takes 3 arguments:
… /mnt/hello)
Then, after compiling it to an executable named write-test, I can use it to write tests by:
… my-hello file system that has the same initial contents as my default file’s contents (e.g., "Hello World!\n").
However, this class is not just about the final product. There are often things that do not show up in your git commit history. Did you take the time to get comfortable with reading man pages? Did you spend time building your gdb skills? Did you explore bash syntax to help you write creative tests? Did you overcome any challenging bugs or situations that you are proud of? Doing this takes time, and these are investments that you should be rewarded for making. (Hopefully the promise of making your life easier down the line is a pay-off, but your efforts should also be acknowledged now.) So at the end of your Eval.md, you should document your experience, including the ups and downs, and reflect on how you spent your time. Convince me and convince yourself that you spent the time to learn the material.
The format of the Eval.md that I provide is a suggestion. Ultimately, how you reflect on your Lab Assignment is up to you.
In your repository, you will find a file called Eval.md. In it, you should assess your my-hello.c FUSE file system’s correctness (it should implement the behavior as described above), code clarity (Did you write small modular functions that you compose to complete the program’s task? Did you sufficiently document your programs so that you could understand the code if you were to revisit it a year from now? Did you choose good variable names, consistently indent, and define your variables in the appropriate scope (e.g., using return values to communicate across functions rather than updating global variables)?), and error-handling (Did you check the return value of all non-printing functions, and handle success/failure appropriately?). You should also comment on the quality and comprehensiveness of your testing.
In addition to your code, you should assess your process. Think about your contribution to your group. Everyone brings different backgrounds and experiences to a partnership. Did you contribute a fair amount of effort towards the assignment? Did you or your CS background grow during the project as a result of working with your partner(s)? Did you help your partner(s) grow? Are there things you would do differently in your next partnership?
When you have completed the lab, submit your individual repository as well as your team’s code using the appropriate git commands, such as:
Verify that your changes appear on GitHub by navigating to your repositories using the web interface. They should be available at https://github.com/williams-cs/cs333lab2-{USERNAME} and https://github.com/williams-cs/cs333lab2-{USERNAME}_{USERNAME}_{USERNAME}. You should see all changes reflected in the various files that you submit. If not, go back and make sure you committed and pushed. I will be retrieving all lab code from GitHub, so if your changes are not visible to you on GitHub, they will not be visible to me either. I want to make sure everyone receives credit for their work!
This lab (and the subsequent FUSE FS labs) was influenced by similar assignments created by Geoff Kuenning at Harvey Mudd College as part of his CS137 course materials.