CSS 430
Final Project: File System

Due Date: See the syllabus

1. Purpose

In this project, you will build a Unix-like file system on the ThreadOS. Through the use of the file system, user programs will now be able to access persistent data on disk by way of stream-oriented files rather than the more painful direct access to disk blocks with rawread() and rawrite().

2. Collaboration

Given the size of this project, you are encouraged to form a team of two or three students. In working with your partner(s), the workload should divided equally across all members. This project requires much more time to be completed than the previous four assignments. Schedule and pace your design and coding work accordingly. If you are very confident of working independently you may do so.

3. What Files to Use

Use ThreadOS' original files, (i.e., the original classes) except for Kernel.java. For Kernel.java, use the one you got in the assignment 4, (i.e., Kernel_org.java). For simplicity, you should just re-download all readable files from the ThreadOS directory to get started.

4. Interface

The file system should provide user threads with the system calls that will allow them to format, to open, to read from , to write to, to update the seek pointer of, to close, to delete, and to get the size of their files.

For simplicity, the file system being created will consiste of a single level. The "/" root directory is predefined by the file system and permanently available for user threads to store their files. No other directories are provided by the system and created by users dynamically.

Each user thread needs to keep track of all files it has opened. For this purpose, it should maintain a table of those open files in its TCB. This table is called a user file descriptor table. It has 32 entries. Whenever a thread opens a file, it must allocate to this file a new table entry, termed a file descriptor. Each file descriptor includes the file access mode and the reference to the corresponding file (structure) table entry. The file access mode indicates "read only", "write only", "read/write", or "append". The file (structure) table is a system-maintained table shared among all user threads, each entry of which maintains the seek pointer and the inode number of a file. Depending on the access mode, the seek pointer is set to the first or the tail of the file, and keeps track of a next position to read from and to write to the file. It is entirely possible for one thread to open the same file many times, thus having several entries in the corresponding TCB's user file descriptor table. Although each of these user file descriptor table entries refer to a different file (structure) table entry with its own seek pointer, all of them eventually points to the same inode.

The file system you will implement must provide the following eight system calls.

  1. int SysLib.format( int files );
    Formats the disk (Disk.java's data contents). The parameter files specifies the maximum number of files to be created (the number of inodes to be allocated) in your file system. The return value is 0 on success, otherwise -1.
  2. int fd = SysLib.open( String fileName, String mode );
    Opens the file specified by the fileName string in the given mode (where "r" = ready only, "w" = write only, "w+" = read/write, "a" = append). The call allocates a new file descriptor, fd to this file. The file is created if it does not exist in the mode "w", "w+" or "a". SysLib.open must return a negative number as an error value if the file does not exist in the mode "r". Note that the file descriptors 0, 1, and 2 are reserved as the standard input, output, and error, and therefore a newly opened file must receive a new descriptor numbered in the range between 3 and 31. If the calling thread's user file descriptor table is full, SysLib.open should return an error value. The seek pointer is initialized to zero in the mode "r", "w", and "w+", whereas initialized at the end of the file in the mode "a".
  3. int read( int fd, byte buffer[] );
    Reads up to buffer.length bytes from the file indicated by fd, starting at the position currently pointed to by the seek pointer. If bytes remaining between the current seek pointer and the end of file are less than buffer.length, SysLib.read reads as many bytes as possible, putting them into the beginning of buffer. It increments the seek pointer by the number of bytes to have been read. The return value is the number of bytes that have been read, or a negative value upon an error.
  4. int write( int fd, byte buffer[] );
    Writes the contents of buffer to the file indicated by fd, starting at the position indicated by the seek pointer. The operation may overwrite existing data in the file and/or append to the end of the file. SysLib.write increments the seek pointer by the number of bytes to have been written. The return value is the number of bytes that have been written, or a negative value upon an error.
  5. int seek( int fd, int offset, int whence );
    Updates the seek pointer corresponding to fd as follows:
  6. If the user attempts to set the seek pointer to a negative number you must clamp it to zero. If the user attempts to set the pointer to beyond the file size, you must set the seek pointer to the end of the file. The offset loction of the seek pointer in the file is returned from the call to seek.
  7. int close( int fd );
    Closes the file corresponding to fd, commits all file transactions on this file, and unregisters fd from the user file descriptor table of the calling thread's TCB. The return value is 0 in success, otherwise -1.
  8. int delete( String fileName );
    Deletes the file specified by fileName. All blocks used by file are freed. If the file is currently open, it is not deleted and the operation returns a -1. If successfully deleted a 0 is returned.
  9. int fsize( int fd );
    Returns the size in bytes of the file indicated by fd.

Unless specified otherwise, each of the above system calls returns -1 when detecting an error.

5. Implementation

Superblock

The first disk block, block 0, is called the superblock. It is used to describe

  1. The number of disk blocks.
  2. The number of inodes.
  3. The block number of the head block of the free list.
It is the OS-managed block. No other information must be recorded in and no user threads must be able to get access to the superblock.

Inodes

Starting from the blocks after the superblock will be the inode blocks. Each inode describes one file. Our inode is a simplified version of the Unix inode. It includes 12 pointers of the index block. The first 11 of these pointers point to direct blocks. The last pointer points to an indirect block. In addition, each inode must include (1) the length of the corresponding file, (2) the number of file (structure) table entries that point to this inode, and (3) the flag to indicate if it is unused (= 0), used(= 1), or in some other status (= 2, 3, 4, ...). 16 inodes can be stored in one block.

You will need a constructor that retrieves an existing inode from the disk into the memory. Given an inode number, termed inumber, this constructor reads the corresponding disk block, locates the corresponding inode information in that block, and initializes a new inode with this information.

The system must avoid any inode inconsistency among different user threads. There are two solutions to maintain the inode consistency:

  1. Before an inode in memory is updated, check the corresponding inode on disk, read it from the disk if the disk has been updated by another thread. Thereafter, you should write back its contents to disk immediately. Note that the inode data to be written back include int length, short count, short flag, short direct[11], and short indirect, thus requiring a space of 32 bytes in total. For this write-back operation, you will need the toDisk method that saves this inode information to the iNumber-th inode in the disk, where iNumber is given as an argument.
  2. Create a Vector< Inode> object that maintains all inode on memory, is shared among all threads, and is exclusively access by each thread.

Root Directory

The "/" root directory maintains each file in a different directory entry that contains its file name (maximum 30 characters; 60 bytes in Java) and the corresponding inode number. The directory receives the maximum number of inodes to be created, (i.e., thus the max. number of files to be created) and keeps track of which inode numbers are in use. Since the directory itself is considered as a file, its contents are maintained by an inode, specifically inode 0. This can be located in the first 32 bytes of the disk block 1.

Upon booting ThreadOS, the file system instantiates the Directory class as the root directory through its constructor, reads the file from the disk that can be found through the inode 0 at 32 bytes of the disk block 1, and initializes the Directory instnace with the file contents. Prior to shutdown, the file system must write back the Directory information onto the disk. The methods bytes2directory( ) and directory2bytes will initialize the Directory instance with a byte array read from the disk and converts the Directory instance into a byte array that will be thereafter written back to the disk.

File (Structure) Table

The file system maintains the file (structure) table shared among all user threads. When a user thread opens a file, it follows the sequence listed below:

  1. The user thread allocates a new entry of the user file descriptor table in its TCB. This entry number itself becomes a file descriptor number. The entry maintains a reference to a file (structure) table entry.
  2. The user thread then requests the file system to allocate a new entry of the system-maintained file (structure) table. This entry includes the seek pointer of this file, a reference to the inode corresponding to the file, the inode number, the count to maintain #threads sharing this file (structure) table, and the access mode. The seek pointer is set to the front or the tail of this file depending on the file access mode.
  3. The file system locates the corresponding inode and records it in this file (structure) table entry.
  4. The user thread finally registers a reference to this file (structure) table entry in its file descriptor table entry of the TCB.

The file (structure) table entry should be:

The file (structure) table is defined as follows:

File Descriptor Table

Each user thread maintains a user file descriptor table in its own TCB. Every time it opens a file, it allocates a new entry table including a reference to the corresponding file (structure) table entry. Whenever a thread spawns a new child thread, it passes a copy of its TCB to this child which thus has a copy of its parent's user file descriptor table. This in turn means the both the parent and the child refer to the same file (structure) table entries and eventually share the same files.

Note that the file descriptors 0, 1, and 2 are reserved for the standard input, output, and error, and thus those entires must remain null.

6. Statement of Work

Design and implement the file system in ThreadOS as specified above. Run Test5.java which is found in the ThreadOS directory. This test program will test the functionality of every feature and system call listed above and check the data consistency of your file system. Some of the things that are tested include:

  1. Does the filesystem allow one to read the same data from a file that was previously written?
  2. Are malicious operations handled appropriately and proper error codes returned? Those operations include file accesses with a negative seek pointer, too many files to be opened at a time per a thread and over the system, etc.
  3. Does the file system synchronize all data on memory with disk before a shutdown and reload those data from the disk upon the next boot?

7. What to Turn In

Softcopy (file uploads):

gradeguide5.txt is the grade guide for the final project.

8. FAQ

This website could answer your questions. Please click here before emailing the professor :-).

9. Additional Materials

Click here.