CSS 430 for 2016 Summer
Final Project: File System

Due Date: See the syllabus

1. Purpose

In this project, you will build a Unix-like file system on the ThreadOS. Through the use of the file system, user programs will now be able to access persistent data on disk by way of stream-oriented files rather than the more painful direct access to disk blocks with rawread() and rawrite().

2. Collaboration

Given the size of this project, you are encouraged to form a team of four or fewer students. In working with your partner(s), the workload should divided equally across all members. This project requires much more time to be completed than the previous four assignments. Schedule and pace your design and coding work accordingly. If you are very confident of working independently you may do so.

3. What Files to Use

If you're using the files provided to you via the Linux Lab: Use ThreadOS' original files, (i.e., the original classes) except for Kernel.java. For Kernel.java, use the one you got in the assignment 4, (i.e., Kernel_org.java). For simplicity, you should just re-download all readable files from the ThreadOS directory to get started.

If you're using the IntelliJ starter project then all the files you need are in that project (and please disregard the above paragraph)

4. Interface

The file system should provide user threads with the system calls that will allow them to format, to open, to read from , to write to, to update the seek pointer of, to close, to delete, and to get the size of their files.

For simplicity, the file system being created will consiste of a single level. The "/" root directory is predefined by the file system and permanently available for user threads to store their files. No other directories are provided by the system and created by users dynamically.

Each user thread needs to keep track of all files it has opened. For this purpose, it should maintain a table of those open files in its TCB. This table is called a user file descriptor table. It has 32 entries. Whenever a thread opens a file, it must allocate to this file a new table entry, termed a file descriptor. Each file descriptor includes the file access mode and the reference to the corresponding file (structure) table entry. The file access mode indicates "read only", "write only", "read/write", or "append". The file (structure) table is a system-maintained table shared among all user threads, each entry of which maintains the seek pointer and the inode number of a file. Depending on the access mode, the seek pointer is set to the first or the tail of the file, and keeps track of a next position to read from and to write to the file. It is entirely possible for one thread to open the same file many times, thus having several entries in the corresponding TCB's user file descriptor table. Although each of these user file descriptor table entries refer to a different file (structure) table entry with its own seek pointer, all of them eventually points to the same inode.

The file system you will implement must provide the following eight system calls.

  1. int SysLib.format( int files );
    Formats the disk (Disk.java's data contents). The parameter files specifies the maximum number of files to be created (the number of inodes to be allocated) in your file system. The return value is 0 on success, otherwise -1.
  2. int fd = SysLib.open( String fileName, String mode );
    Opens the file specified by the fileName string in the given mode (where "r" = ready only, "w" = write only, "w+" = read/write, "a" = append). The call allocates a new file descriptor, fd to this file. The file is created if it does not exist in the mode "w", "w+" or "a". SysLib.open must return a negative number as an error value if the file does not exist in the mode "r". Note that the file descriptors 0, 1, and 2 are reserved as the standard input, output, and error, and therefore a newly opened file must receive a new descriptor numbered in the range between 3 and 31. If the calling thread's user file descriptor table is full, SysLib.open should return an error value. The seek pointer is initialized to zero in the mode "r", "w", and "w+", whereas initialized at the end of the file in the mode "a".
  3. int read( int fd, byte buffer[] );
    Reads up to buffer.length bytes from the file indicated by fd, starting at the position currently pointed to by the seek pointer. If bytes remaining between the current seek pointer and the end of file are less than buffer.length, SysLib.read reads as many bytes as possible, putting them into the beginning of buffer. It increments the seek pointer by the number of bytes to have been read. The return value is the number of bytes that have been read, or a negative value upon an error.
  4. int write( int fd, byte buffer[] );
    Writes the contents of buffer to the file indicated by fd, starting at the position indicated by the seek pointer. The operation may overwrite existing data in the file and/or append to the end of the file. SysLib.write increments the seek pointer by the number of bytes to have been written. The return value is the number of bytes that have been written, or a negative value upon an error.
  5. int seek( int fd, int offset, int whence );
    Updates the seek pointer corresponding to fd as follows:
  6. If the user attempts to set the seek pointer to a negative number you must clamp it to zero. If the user attempts to set the pointer to beyond the file size, you must set the seek pointer to the end of the file. In both cases, you should return success.
  7. int close( int fd );
    Closes the file corresponding to fd, commits all file transactions on this file, and unregisters fd from the user file descriptor table of the calling thread's TCB. The return value is 0 in success, otherwise -1.
  8. int delete( String fileName );
    Destroys the file specified by fileName. If the file is currently open, it is not destroyed until the last open on it is closed, but new attempts to open it will fail.
  9. int fsize( int fd );
    Returns the size in bytes of the file indicated by fd.

Unless specified otherwise, each of the above system calls returns -1 when detecting an error.

5. Implementation

Superblock

The first disk block, block 0, is called the superblock. It is used to describe

  1. The number of disk blocks.
  2. The number of inodes.
  3. The block number of the head block of the free list.
It is the OS-managed block. No other information must be recorded in and no user threads must be able to get access to the superblock.

Inodes

Starting from the blocks after the superblock will be the inode blocks. Each inode describes one file. Our inode is a simplified version of the Unix inode. It includes 12 pointers of the index block. The first 11 of these pointers point to direct blocks. The last pointer points to an indirect block. In addition, each inode must include (1) the length of the corresponding file, (2) the number of file (structure) table entries that point to this inode, and (3) the flag to indicate if it is unused (= 0), used(= 1), or in some other status (= 2, 3, 4, ...). 16 inodes can be stored in one block.

You will need a constructor that retrieves an existing inode from the disk into the memory. Given an inode number, termed inumber, this constructor reads the corresponding disk block, locates the corresponding inode information in that block, and initializes a new inode with this information.

The system must avoid any inode inconsistency among different user threads. There are two solutions to maintain the inode consistency:

  1. Before an inode in memory is updated, check the corresponding inode on disk, read it from the disk if the disk has been updated by another thread. Thereafter, you should write back its contents to disk immediately. Note that the inode data to be written back include int length, short count, short flag, short direct[11], and short indirect, thus requiring a space of 32 bytes in total. For this write-back operation, you will need the toDisk method that saves this inode information to the iNumber-th inode in the disk, where iNumber is given as an argument.
  2. Create a Vector< Inode> object that maintains all inode on memory, is shared among all threads, and is exclusively access by each thread.

Root Directory

The "/" root directory maintains each file in a different directory entry that contains its file name (maximum 30 characters; 60 bytes in Java) and the corresponding inode number. The directory receives the maximum number of inodes to be created, (i.e., thus the max. number of files to be created) and keeps track of which inode numbers are in use. Since the directory itself is considered as a file, its contents are maintained by an inode, specifically inode 0. This can be located in the first 32 bytes of the disk block 1.

Upon booting ThreadOS, the file system instantiates the Directory class as the root directory through its constructor, reads the file from the disk that can be found through the inode 0 at 32 bytes of the disk block 1, and initializes the Directory instnace with the file contents. Prior to shutdown, the file system must write back the Directory information onto the disk. The methods bytes2directory( ) and directory2bytes will initialize the Directory instance with a byte array read from the disk and converts the Directory instance into a byte array that will be thereafter written back to the disk.

File (Structure) Table

The file system maintains the file (structure) table shared among all user threads. When a user thread opens a file, it follows the sequence listed below:

  1. The user thread allocates a new entry of the user file descriptor table in its TCB. This entry number itself becomes a file descriptor number. The entry maintains a reference to a file (structure) table entry.
  2. The user thread then requests the file system to allocate a new entry of the system-maintained file (structure) table. This entry includes the seek pointer of this file, a reference to the inode corresponding to the file, the inode number, the count to maintain #threads sharing this file (structure) table, and the access mode. The seek pointer is set to the front or the tail of this file depending on the file access mode.
  3. The file system locates the corresponding inode and records it in this file (structure) table entry.
  4. The user thread finally registers a reference to this file (structure) table entry in its file descriptor table entry of the TCB.

The file (structure) table entry should be:

The file (structure) table is defined as follows:

File Descriptor Table

Each user thread maintains a user file descriptor table in its own TCB. Every time it opens a file, it allocates a new entry table including a reference to the corresponding file (structure) table entry. Whenever a thread spawns a new child thread, it passes a copy of its TCB to this child which thus has a copy of its parent's user file descriptor table. This in turn means the both the parent and the child refer to the same file (structure) table entries and eventually share the same files.

Note that the file descriptors 0, 1, and 2 are reserved for the standard input, output, and error, and thus those entires must remain null.

6. Statement of Work

Work for this project is divided up into three sections.  The goal here is to help y'all start work on this early (i.e., to divide the work up into chunks and do each chunk in good time, rather than trying to do the entire project a day or two before it's due)(NOTE: Trying to do the entire project 1 or 2 days before it's due usually results in poor code that earns a low grade or even outright failiure to complete the work).  Each section is refered to as a 'phase', and each phase has it's own deadline (please see the Syllabus and/or Canvas for the due dates)

6.1: Planning Phase

For this phase you will find a group to work with (PLEASE don't try this on your own!) and figure out what you need to do for this massive project. 

What to turn in for this phase: Each group will hand in a single document containing the work for this phase.  Each group's document must contain:

  1. A list of team members
    Please list members alphabetically by last name to make it easy to see who's in the group.
  2. A clear description of the work that your group needs to do.
    In order to fulfill the requirements of this phase you'll need to list out each of the classes that you'll implement, each of the methods that you're planning on coding up, and some comment and very brief pseudocode that explain what each method should do.  You're welcome to write this up in a Word document, in a PDF, or in a text file (which could include a .JAVA file - note that the file should compile but doesn't have to do anything (i.e., put all your notes in comments)  ).
    The purpose of this requirement is to make sure that you've got a good idea of what you need to do and how you're going to do it.
    It is expected that you will change things around as you work through the next two phases, and there will be no point penalty for doing so.
  3. A list of work items that your group will complete for the next phase.
    For each work item write down the name of the person (or people) who intends to complete that work.
    While it's recommended that you aim to finish about half of the overall project for each of the next two phases you're welcome to split up the work however you want.  The only restriction is that you must do some work for the next phase.

6.2:  Initial Work

For this phase you will complete some of the work for the project.  You will finish the project during the Completing The Project phase.

What To Turn In For This Phase: Each group should upload a single copy of their work so far, and an updated list of work items.
The work should include whatever .JAVA files you've worked on.  Everything you hand in should both compile and execute.  While it doesn't have to accomplish anything in particular at this stage it's good to include 'test code' that verifies that your individual pieces are working. 
The updated list of work items should list all the work items for the project, should clearly indicate which items have been completed, and should list the name(s) of the people assigned to each item.  It's expected that this will have changed since the prior phase (which is why you're being asked to turn in an updated version)

What you hand in during this phase does not necessarily have to be the same as the list of work items that you handed in during the previous phase.  As long as you've gotten a 'reasonable amount' of work done there will be no point penalty. 

One of the goals for this phase is for you to realize how much work this project is, and to schedule more time to complete this project before the final due date.  In other words, one goal for this phase is for you to (potentially) be overwhelmed and not finish everything on your list for this phase so that you'll make changes to your life and then successfully finish the final project on time!

VERY IMPORTANT NOTE: NO flexibility will be provided for the next phase.   If you don't hand anything in (or you hand in junk code that does poorly) there will be no extra chances to revise your work, and no adjustments made to the due date of the Project Completion phase.

6.3:  Project Completion

Before you read anything else about this phase, please re-read the VERY IMPORTANT NOTE at the end of the prior phase.

At the end of this phase each group will hand in a fully complete implementation of the file system for ThreadOS.  Each group should upload a single copy for the entire group.

The technical goal for this phase is to finish the design and implementation of the file system in ThreadOS as specified above. Run Test5.java which is found in the ThreadOS directory. This test program will test the functionality of every feature and system call listed above and check the data consistency of your file system. Some of the things that are tested include:

  1. Does the filesystem allow one to read the same data from a file that was previously written?
  2. Are malicious operations handled appropriately and proper error codes returned? Those operations include file accesses with a negative seek pointer, too many files to be opened at a time per a thread and over the system, etc.
  3. Does the file system synchronize all data on memory with disk before a shutdown and reload those data from the disk upon the next boot?

What To Turn In For This Phase:

gradeguide5.txt is the grade guide for the final project.

7. Note

Visit the Canvas page about general assignment information in order to double-check your working environment and turn-in procedure.

8. FAQ

This website could answer your questions. Please click here before emailing the professor :-).

9. Additional Materials