The Operating Systems Course

Exercise 9: The File System

Goals

  1. Learn the system calls for handling regular files, symbolic links and directories in the Unix file system.
  2. Write a program that can back up and restore an entire directory tree.

The Program

Compiling your exercise using a Makefile (that you write) should create two executables named hbackup and hrestore. The two programs take the following command-line arguments:

hbackup <archive_file> <files_for_backup_list>

hrestore <archive_file> <restore_location_base>

The hbackup utility saves into the archive_file all the files in the files_for_backup_list - a white-space separated list of files. The archive_file is a regular file that will contain the backed-up files on success. The archive file should be created when the program runs; if a file with the same name already exists, it should not be overwritten - instead, an error message should be printed and the program should exit.
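One way to get the "do not overwrite" behaviour is to let open(2) enforce it. The sketch below is only an illustration: the helper name create_archive and the permission mode 0644 are assumptions made here, not requirements of the exercise.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create a new archive file for writing.
 * Returns a file descriptor, or -1 after reporting the error. */
int create_archive(const char *path)
{
    /* O_CREAT together with O_EXCL makes open(2) fail with EEXIST when
     * the file already exists, so an existing file is never overwritten.
     * The mode 0644 is an assumption; the exercise does not specify one. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd == -1)
        perror(path);
    return fd;
}
```

In hbackup, a return value of -1 from such a helper would be treated as a fatal error.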

Backing up a regular file simply copies the file data, along with its name and permissions. Backing up a symbolic link means that we only save the link - not the file the link points to. Therefore, use readlink(2) and not read(2) to back up symbolic links, and use lstat(2) to determine file types. Backing up a directory means backing up the directory itself (name and permissions), including all the files in it recursively. If you encounter a file of another type (such as a socket, pipe or device), print an error message saying that it can't be backed up, and continue to the next file (this is a non-fatal error, see below).
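The type check and the link read described above could look roughly like this. The enum entry_kind and the function names classify and read_link_target are invented for this sketch. Note that readlink(2) does not NUL-terminate the buffer, so the sketch does it by hand.

```c
#define _XOPEN_SOURCE 700   /* ask for POSIX declarations such as lstat() */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical tags for the kinds of file the exercise distinguishes. */
enum entry_kind {
    KIND_REGULAR, KIND_SYMLINK, KIND_DIRECTORY, KIND_UNSUPPORTED, KIND_ERROR
};

/* Classify a path without following symbolic links. */
enum entry_kind classify(const char *path)
{
    struct stat st;

    /* lstat() describes the link itself; stat() would follow it. */
    if (lstat(path, &st) == -1) {
        perror(path);
        return KIND_ERROR;
    }
    if (S_ISREG(st.st_mode)) return KIND_REGULAR;
    if (S_ISLNK(st.st_mode)) return KIND_SYMLINK;
    if (S_ISDIR(st.st_mode)) return KIND_DIRECTORY;
    return KIND_UNSUPPORTED;    /* socket, pipe, device, ... */
}

/* Copy the target of a symbolic link into buf as a C string.
 * Returns the target length, or -1 on error. */
ssize_t read_link_target(const char *path, char *buf, size_t bufsize)
{
    ssize_t n;

    if (bufsize == 0)
        return -1;
    /* Leave one byte free: readlink() does not add a terminating NUL. */
    n = readlink(path, buf, bufsize - 1);
    if (n == -1) {
        perror(path);
        return -1;
    }
    buf[n] = '\0';
    return n;
}
```

A target longer than bufsize - 1 bytes is silently truncated by readlink(2), so a real program would want a buffer sized from the st_size field of the lstat result, or a retry loop.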

The hrestore utility is used to restore files from an archive. To make life simpler, you only have to implement restoring an entire archive, and not parts of it. The program takes exactly two arguments: the name of an archive file containing backed-up files, and the name of a directory. All files in the archive file should be recreated in this directory. If a directory was backed up, it should be recreated as a sub-directory of the restore_location_base. Files should be restored with their original names and access permissions.
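Restoring the original access permissions needs a little care, because the mode passed to open(2) is filtered through the process umask. A minimal sketch, assuming a helper called create_with_mode (a name made up here) and assuming that an existing file at the restore location may be truncated, which the exercise does not specify:

```c
#define _XOPEN_SOURCE 700   /* ask for POSIX declarations such as fchmod() */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a restored file with exactly the saved
 * permission bits. Returns a file descriptor, or -1 on error. */
int create_with_mode(const char *path, mode_t mode)
{
    mode_t perms = mode & 07777;    /* keep only the permission bits */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, perms);

    if (fd == -1) {
        perror(path);
        return -1;
    }
    /* open() applies the umask to the requested mode, so set the saved
     * permissions explicitly once the file exists. */
    if (fchmod(fd, perms) == -1) {
        perror(path);
        close(fd);
        return -1;
    }
    return fd;
}
```

The returned descriptor stays writable even when the saved mode has no write bit, because permissions are checked when the file is opened, not on each write.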

Both hbackup and hrestore should give an indication of progress: On completing a file read or write, both programs should print a line to the standard output containing the full name of the file and its size. See the output example in the next section.

Error Handling

Error handling should be done as follows. You must check all system call return values and every other possibility of failure, and report each error by printing an informative message to the screen. Use the perror(3) function for printing system error messages. An example of proper error handling and the use of perror(3) is in both example files, in the 'Help' section.

There are two types of errors in this exercise: fatal and non-fatal. Fatal errors prevent you from continuing to back up or restore: wrong arguments, inability to access the archive file, and so forth. Non-fatal errors are an inability to read or write one file along the way (because of permissions or a special file type, for example). On a fatal error, print an error message and terminate the program with an exit code of 1. On a non-fatal error while working on a file, print an error message starting with the file name (see the 'Access denied' line in the example below), and continue to the next file as if nothing happened. All error messages must be sent to standard error, and not standard output - the example programs demonstrate this.
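The difference between the two error types can be sketched as follows. The names die, process_file and process_all are hypothetical, and process_file is only a stand-in that calls lstat(2) and prints the progress line; the real per-file work is up to you.

```c
#define _XOPEN_SOURCE 700   /* ask for POSIX declarations such as lstat() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>

#define EXIT_FATAL 1

/* Fatal error: report it on standard error and terminate with code 1. */
void die(const char *what)
{
    perror(what);
    exit(EXIT_FATAL);
}

/* Stand-in for the real per-file work (an assumption of this sketch).
 * Returns 0 on success, -1 on failure with errno left set. */
static int process_file(const char *name)
{
    struct stat st;

    if (lstat(name, &st) == -1)
        return -1;
    printf("%s %lld\n", name, (long long)st.st_size);
    return 0;
}

/* Non-fatal errors: report each one and carry on with the next file.
 * Returns the number of files that failed. */
int process_all(char *const names[], int count)
{
    int failures = 0;

    for (int i = 0; i < count; i++) {
        if (process_file(names[i]) == -1) {
            /* perror() writes "<name>: <system message>" to stderr. */
            perror(names[i]);
            failures++;
        }
    }
    return failures;
}
```

Because perror(3) prefixes the system message with the string it is given, passing the file name produces the "file name, colon, message" format asked for above.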

This is an example of running the programs. We create an archive file called 'backupex9' that should contain a regular file called 'reg_file' and the directory 'ex9', all from the current directory:

> hbackup backupex9 reg_file ex9
ex9 4096
ex9/hbackup.c 7758
ex9/hrestore.c 6494
ex9/test.c: Access denied
ex9/README 1490
ex9/test 4096
ex9/test/my_data 45912
> hrestore backupex10 ~
open: No such file or directory
>

For each file that is successfully saved, its name and size are printed to the standard output. On error (such as with 'ex9/test.c'), the file name followed by a colon and the system's error message are printed to the standard error. However, this is a non-fatal error, so we move on to the next file. Note that we back up all of the ex9 directory, and move recursively into the 'ex9/test' directory as well. The program exits with an exit code 0. On the other hand, hrestore is called with an archive file that doesn't exist, which is a fatal error. An error message is printed, and the program terminates with an exit code 1. Note that some of these instructions contradict those in the exercise guidelines - when in doubt, follow the instructions here.

Bonus

There are two bonuses offered for this exercise. Implementing both can give you a maximum grade of 120.

A ten-point bonus will be given for including an integrity check in the archive file. The hbackup program should place some signature of each archived file along with its other data, and when the file is restored this signature will be checked. If the archive file has been damaged or changed, the program should report the error (as a non-fatal error). To get the bonus, invent a signature scheme and add it to both the backup and restore programs.
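To give an idea of what a signature might be, here is a sketch of one well-known checksum, Adler-32, written so that it can be fed a file one buffer at a time. This is only one possible scheme and not necessarily the one expected; the function name adler32_update is chosen here for illustration. A checksum of this kind detects accidental damage, not deliberate tampering.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u    /* largest prime below 2^16 */

/* Fold len bytes of data into a running Adler-32 value.
 * Start with the value 1 and pass each result back in for the next
 * buffer; the final value is the checksum of all the bytes seen. */
uint32_t adler32_update(uint32_t adler, const unsigned char *data, size_t len)
{
    uint32_t a = adler & 0xffffu;           /* running sum of bytes */
    uint32_t b = (adler >> 16) & 0xffffu;   /* running sum of the a values */

    for (size_t i = 0; i < len; i++) {
        a = (a + data[i]) % ADLER_MOD;
        b = (b + a) % ADLER_MOD;
    }
    return (b << 16) | a;
}
```

hbackup could store the final value next to each file's data, and hrestore could recompute it while writing the file and compare the two.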

Another ten-point bonus will be given for correctly handling the issue of multiple hard links to the same file. It is possible that two different regular files that you back up will be two hard links to the same i-node. In this case it is of course only necessary to save the file data once, and let both hard links point to it. In restoring, the data should be written to disk only once as well. You are allowed to ignore this issue and back up every file, but if you do handle it and get it right, the bonus is yours. The i-node number a file name refers to is returned by the lstat(2) system call.
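Detecting a repeated i-node comes down to remembering which ones have already been archived. The sketch below assumes a small fixed-size table (MAX_SEEN, struct seen_table and seen_before are all names invented here). It compares the device number together with the i-node number, since i-node numbers are only unique within one file system.

```c
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

#define MAX_SEEN 1024   /* hypothetical capacity, chosen for the sketch */

/* The (device, i-node) pairs of the files archived so far. */
struct seen_table {
    dev_t  dev[MAX_SEEN];
    ino_t  ino[MAX_SEEN];
    size_t count;
};

/* Returns 1 if the file described by st was recorded earlier,
 * 0 if it is new (and records it), or -1 if the table is full. */
int seen_before(struct seen_table *t, const struct stat *st)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->dev[i] == st->st_dev && t->ino[i] == st->st_ino)
            return 1;
    }
    if (t->count == MAX_SEEN)
        return -1;
    t->dev[t->count] = st->st_dev;
    t->ino[t->count] = st->st_ino;
    t->count++;
    return 0;
}
```

The lookup can be skipped for a regular file whose st_nlink field is 1, because no other name refers to that i-node.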

Miscellaneous

Your submission must include a README, a Makefile, and the source code of the programs. See the exercise guidelines about README and make files. You may include as many source files as you wish. To submit, place all files in a directory called "~/os/ex9". Copying all these files to a new directory and running 'make' should create exactly two executables - hbackup and hrestore.

You can save some coding if you place functions that both programs use in a common header/source file pair, and then #include the header file to use it. You are expected to do a smart job of design - duplicate code, unreadable C tricks and not using constants, for example, can hurt your grade just as badly as a runtime bug, as can failing to check return values or ignoring them.

Since this is the first Makefile you submit, it will be checked as well. A good Makefile will cause 'make' to compile only the necessary files when a file changes. In order to test your Makefile, you can use the 'touch' utility (see 'man touch'). Touching a file does not change its contents, but updates its last modification time so that it seems modified. The Makefile for this exercise is a simple variation of what we've done in class.

If you choose to implement the bonus, include an exact description of the algorithm(s) you used in the README file. You might lose some of the bonus points for not being able to describe your accomplishment well.

Help

The following two programs demonstrate some of the system calls you'll need. Most of their code has cut-and-paste value to you. They also demonstrate how to deal with both system and non-system error messages, and give a general idea of the style and amount of documentation required. Smaller issues such as reading command-line arguments and using printf(3) and fprintf(3) are also shown.

file_copy.c - This program copies a regular file to a new file name. Shows how to use open, read, write and close.

list_dir.c - This program prints the name, size and type of each file in a given directory. Shows how to use opendir, readdir, closedir and lstat.

Before beginning to write anything, make sure to read the man page of every system call you use - we cannot cover, and have not covered, every tiny detail in class. Remember that system calls are in section 2 of the man pages, so you need to write 'man 2 open' or 'man 2 lstat' to read them.