CSS 430 - Program 3: Synchronization

CSS 430
Program 3: Synchronization
Due date: See the syllabus

1. Purpose

This assignment provides you an opportunity to learn more about monitors by implementing a monitor in Java to manage disk I/O within ThreadOS.

While the original Java monitors have only an ability of randomly waking up one or all sleeping threads (via notify() or notifyAll()), our Java monitor will allow threads to sleep on and wake up from a specific condition. This new monitor class, SyncQueue (which you will define in SyncQueue.java), will prevent threads from busy waiting on disk read and write operations, wherein I/O-bound threads might waste considerable CPU time. Instead, ThreadOS will preempt the current thread if it executes a disk read or write operation, put it into the SyncQueue list, and select another thread for execution. This will be accomplished, in part, by implementing the Kernel interrupt handling for the SysLib.join() and SysLib.exit() ThreadOS system calls, which threads can use to wait for or be notified upon the termination of one of their child threads.

The implementation of the Kernel interrupt handling code for SysLib.join() and SysLib.exit() has been available in Kernel.class since the start of the course; we've used them in previous assignments. SysLib.join() calls Kernel.interrupt passing it the Kernel.WAIT commmand; SysLib.exit() calls Kernel.interrupt passing it the Kernel.EXIT commmand; The careful reader may have noticed the following lines in the Kernel.java file provided in the distribution directory:

  public static int interrupt( int irq, int cmd, int param, Object args ) {
    TCB myTcb;
    switch( irq ) {
      case INTERRUPT_SOFTWARE: // System calls
        switch( cmd ) { 
          // ...
          case WAIT:
            // get the current thread id
            // let the current thread sleep in waitQueue under the 
            // condition = this thread id
            // System.out.print( " * " );
            return OK; // return a child thread id who woke me up
          case EXIT:
            // get the current thread's parent id
            // search waitQueue for and wakes up the thread under the
            // condition = the current thread's parent id
            // tell the Scheduler to delete the current thread (since it is exiting)
            return OK;
          // ...

[Note: the last comment above, about deleting the current thread, is not yet part of the distribution source file.]

You will now have an opportunity to replace those comments with Java code that implements the operations they describe.

2. `Disk.java`

We will be learning more about how an operating system manages operations on disks in chapters 10 and 11. This section will provide you with an overview of all you need to know [for now] about such operations.

ThreadOS uses buffered I/O in reading and writing to the disk. Accessing the disk requires system calls, and accessing data on a physical disk is many orders of magnitude slower than accessing data in main memory. When accessing a file, rather than reading or writing a single byte at a time directly to or from the disk, a larger block of data is read from or written to the disk in a single system call, using a buffer in main memory as an intermediary. Executing a file read operation (via SysLib.cin()) involves reading from the buffer, and executing a file write operation (via SysLib.cout() or SysLib.cerr()) involves writing to the buffer. When a file read operation is executed on a file for which data has not yet been loaded into the buffer, a disk read operation (or what we will be calling a RAWREAD operation) is used to transfer the contents of a data block on the disk into the buffer. When the buffer has been filled by file write statements, a disk write operation (or what we will be calling a RAWWRITE operation) is used to transfer the contents of the buffer out to a data block on the disk. A disk sync operation (SYNC) can be used to write whatever data is in the buffer out to a data block on the disk, regardless of whether the buffer is full.

In ThreadOS, a Disk class simulates a slow disk device with 1000 blocks, each containing 512 bytes. The disk is composed of 10 tracks, each of which thus includes 100 blocks. This simulated disk is represented by the file, DISK in the ThreadOS directory, and can execute only one disk operation or command at a time. The disk commands are

IDLE: no disk operation is being executed
READ: disk executes a read() operation
WRITE: disk executes a write() operation
SYNC: disk executes a sync() operation

The Disk class provides support for these commands via the following operations (all of which return a boolean value):

Operations	Behaviors
`read( int blockId, byte buffer[] )`	reads data into `buffer[]` from the disk block specified by `blockId`
`write( int blockId, byte buffer[] )`	writes the data in `buffer[]` to the disk block specified by `blockId`
`sync()`	writes back all logical disk data in `buffer[]` to the simulated physical `DISK` file
`testReady()`	returns the disk status (`readyBuffer`), which indicates whether the disk is ready to execute a command; primarily used to see if the most recent disk command - `read()`, `write()` or `sync()` - has been completed or not
`testAndResetReady()`	tests the disk status (`readyBuffer`), and if it is `true`, resets it to `false`; primarily used to see if the most recent disk command - `read()`, `write()` or `sync()` - has been completed and if it has, starts the next command

Upon an instantiation, a new Disk object checks if the current directory includes the file named DISK. If such a file exists, Disk reads data from this file into its private array (buffer[]). Otherwise, Disk creates a new DISK file in your current directory (and the time required to create this file is the primary reason for the slow response the first time you launch ThreadOS). Every time it accepts a SYNC command, Disk writes all its internal disk blocks (in buffer) into the DISK file. When ThreadOS Kernel is shut down - when the 'q' command is typed to the Loader, it automatically issues a SYNC command to Disk (via SysLib.sync()), so that Disk can be repopulated with data from the DISK file next time ThreadOS is booted. If you want to start ThreadOS with a clean disk, erase the DISK file from your directory.

All Disk access methods - read(), write() and sync() - immediately return a boolean value indicating if Disk has accepted the command or not. Executing these commands takes a certain amount of time to be completed. If a new command is issued before the disk has completed the current command, Disk returns false, indicating it did not accept the command. Therefore, these commands have to be repeatedly executed until a return value of true is received.

The read() and write() methods do not guarantee that the buffer[] data have been read from or written to a specific disk block until the disk command completion status (readyBuffer) is set to true. Similarly, the sync() method does not guarantee that all disk blocks have been written back to the DISK file until the disk command completion status (readyBuffer) is set to true. This completion status is a private variable, whose value can be checked by two methods: testReady() and testAndResetReady(). If the previous command - a READ, WRITE or SYNC - has been completed, those testing methods return true, otherwise they will return false. The main difference between these two methods is that testAndResetReady() resets the completion status to false if the status had been true (simulating the launching of a new disk command after the last one has completed), while testReady() does not modify the status.

All Disk operations must be requested via system calls from user threads to the ThreadOS Kernel. The following code is a portion of Kernel.java related to disk operations.

public class Kernel {
  // Interrupt requests
  public final static int INTERRUPT_SOFTWARE = 1;  // System calls
  public final static int INTERRUPT_DISK     = 2;  // Disk interrupts
  public final static int INTERRUPT_IO       = 3;  // Other I/O interrupts

  // System calls
  public final static int BOOT    =  0; // SysLib.boot()
  // ...
  public final static int RAWREAD =  5; // SysLib.rawread(int blk, byte b[])
  public final static int RAWWRITE=  6; // SysLib.rawwrite(int blk, byte b[])
  public final static int SYNC    =  7; // SysLib.sync()

  // Return values
  public final static int OK = 0;
  public final static int ERROR = -1;

  // System thread references
  private static Scheduler scheduler;
  private static Disk disk;

  // The heart of Kernel
  public static int interrupt( int irq, int cmd, int param, Object args ) {
    TCB myTcb;
    switch( irq ) {
      case INTERRUPT_SOFTWARE: // System calls
        switch( cmd ) { 
          case BOOT:
            // instantiate and start a scheduler
            scheduler = new Scheduler(); 
            scheduler.start();
    
            // instantiate and start a disk
            disk = new Disk( 1000 );
            disk.start();
            return OK;
          //...
          case RAWREAD: // read a block of data from disk
            while ( disk.read( param, ( byte[] )args ) == false )
              ;     // busy wait
            while ( disk.testAndResetReady() == false )
              ;     // another busy wait
            return OK;
          case RAWWRITE: // write a block of data to disk
            while ( disk.write( param, ( byte[] )args ) == false )
              ;     // busy wait
            while ( disk.testAndResetReady() == false )
              ;     // another busy wait
            return OK;
          case SYNC:     // synchronize disk data to a real file
            while ( disk.sync() == false )
              ;     // busy wait
            while ( disk.testAndResetReady() == false )
              ;     // another busy wait
            return OK;
          // ...
          }
        return ERROR;
      case INTERRUPT_DISK: // Disk interrupts
        // ...
      case INTERRUPT_IO:   // other I/O interrupts (not implemented)
        // ...
    }
    return OK;
  }
}

3. `wait()` and `notify()`

The above code example has two severe performance problems. First of all, a user thread must repeatedly request a disk operation until the disk accepts its request. Second, a user thread needs to repeatedly check if its request has been satisfied or completed. This kind of busy wait loop (or spinlock) simply causes the user thread to waste CPU time until it either relinquishes the CPU upon a context switch or receives a response from the disk. A more efficient way to manage disk I/O would be to place any thread waiting for disk service into an I/O queue, where it can be awakened when the disk is available.

We can implement a monitor in Java by defining a class using synchronized methods. A thread can put itself to sleep inside a monitor object by calling the object's wait() method, and can wake up another thread waiting in the monitor by calling the object's notify() method. However, wait() and notify() do not distinguish different conditions for which a thread might be waiting (e.g., waiting for the disk to accept a request vs. waiting for the disk to complete a request). Therefore, in order to wake up a thread waiting for a specific condition, we will implement a more refined monitor - or, more specifically, a collection of monitors (one per condition) - which will be managed by a SyncQueue, with the following data and methods:

Private/Public Methods/Data Descriptions

private QueueNode[] queue maintains an array of QueueNode objects, each representing a different condition and enqueuing all threads that wait for this condition. The size of the queue array should be determined via the SyncQueue constructor, whose spec is given below.

public SyncQueue() SyncQueue( int condMax ) constructors that create a queue and allow threads to wait for a condMax number of condition/event types (default number is 10)
public enqueueAndSleep( int condition ) enqueues the calling thread and puts it to sleep until a given condition is satisfied
public void dequeueAndWakeup( int condition ) dequeues and wakes up a thread waiting for a given condition. If there are two or more threads waiting for the same condition, only one thread is dequeued and resumed. The FCFS (first-come-first-served) order does not matter.

[Note that while there is a QueueNode.class file in the ThreadOS distribution directory, there is no QueueNode.java file. You will have to implement your own QueueNode class as part of this execise.]

With our new SyncQueue, we can now implement more efficient disk operations. We must first create a new SyncQueue object during the boot process of ThreadOS. Actually, we will create two: an ioQueue to manage disk I/O, and a waitQueue to manage waits (joins) and exits (the latter SyncQueue will be described in the next section).

          case BOOT:
            // instantiate and start a scheduler
            scheduler = new Scheduler(); 
            scheduler.start();
          
            // instantiate and start a disk
            disk = new Disk( 1000 );
            disk.start();
            
            // instantiate a cache memory
            cache = new Cache( disk.blockSize, 10 );
            
            // instantiate synchronized queues
            ioQueue = new SyncQueue();
            waitQueue = new SyncQueue( scheduler.getMaxThreads() );
            return OK;

We can then replace the original, busy-waiting RAWREAD operations:

          case RAWREAD: // read a block of data from disk
            while ( disk.read( param, (byte[]) args ) == false )
              ;     // busy wait
            while ( disk.testAndResetReady() == false )
              ;     // another busy wait
            return OK;

with a new, more CPU-efficient sequence of operations:

          case RAWREAD: // read a block of data from disk
            while ( disk.read( param, (byte[]) args ) == false )
              ioQueue.enqueueAndSleep( COND_DISK_REQ ); // relinquish CPU to another ready thread
            while ( disk.testAndResetReady() == false )
              ioQueue.enqueueAndSleep( COND_DISK_FIN ); // relinquish CPU to another ready thread
            // now you can access data in buffer
            return OK;

In the above example, there are two conditions specified. COND_DISK_REQ is used to represent the case when a thread is waiting for the disk to accept a request; COND_DISK_FIN represents the case when a thread is waiting for the disk to complete (or finish) a request, (i.e., waiting for the buffer[] array to be read or written).

The remaining problem is who will wake up a thread sleeping on this ioQueue under one of these two conditions. Interrupts indicating the completion of Disk I/O are handled by the ThreadOS Kernel. The following code shows the portion of Kernel.java that wakes up a sleeping thread upon receiving an INTERRUPT_DISK interrupt (the calls to ioQueue.dequeueAndWakeup() are commented out in the distribution version of the Kernel source file).

  public static int interrupt( int irq, int cmd, int param, Object args ) {
    TCB myTcb;
    switch( irq ) {
      case INTERRUPT_SOFTWARE: // System calls
        // ...
      case INTERRUPT_DISK: // Disk interrupts
        // wake up the thread waiting for a service completion
        ioQueue.dequeueAndWakeup( COND_DISK_FIN );

        // wake up the thread waiting for a request acceptance
        ioQueue.dequeueAndWakeup( COND_DISK_REQ );
        return OK;
      // ...
    }
    return OK;
  }

4. `SysLib.join()` and `SysLib.exit()`

In Java, you can use the join( Thread t ) method to have the calling thread wait for the termination of the thread pointed to by the argument t. However, if an application program starts two or more threads and it simply wants to wait for all of them to complete, waiting for a particular thread is not an optimal solution. In Unix/Linux, the wait() system call is based on this idea. When it is called with no arguments - and thus no specification of a PID for which to wait - it simply waits for the next one of its child processes to terminate and returns the PID of the child that has woken up the calling process.

Now, consider thread synchronizations in our ThreadOS. We should not allow a user thread to directly access another thread's reference. Otherwise, a user thread can take over thread management such as suspending, resuming and killing other threads. These operations are the sole purview of ThreadOS. Therefore, ThreadOS should not allow a user thread to directly call the Java join( Thread t ) method. It should follow the same synchronization semantics as Unix/Linux. For such synchronization, ThreadOS provides two system calls: SysLib.join() and SysLib.exit(). SysLib.join() permits the calling thread to sleep until one of its child threads calls SysLib.exit() upon its termination, and returns the ID of the child thread that woke up the calling thread.

We will use a SyncQueue for implementing those two system calls. First of all, the Kernel should instantiate a new SyncQueue named waitQueue (as shown in the previous section) which will use each thread ID as an independent waiting condition.

   SyncQueue waitQueue = new SyncQueue( scheduler.getMaxThreads() );

When SysLib.join() is invoked by a parent thread, the Kernel receives a Kernel.WAIT interrupt request. The Kernel puts the current thread to sleep in waitQueue under a condition represented by that thread's thread ID. When SysLib.exit() is invoked by the child thread, the Kernel receives a Kernel.EXIT interrupt request. The Kernel then searches waitQueue for - and wakes up - the thread waiting under the condition represented by the current thread's parent ID.

          case WAIT:
            // get the current thread id
            // let the current thread sleep in waitQueue under the 
            // condition = this thread id
            // System.out.print( " * " );
            return OK; // return a child thread id who woke me up
          case EXIT:
            // get the current thread's parent id
            // search waitQueue for and wakes up the thread under the
            // condition = the current thread's parent id
            // tell the Scheduler to delete the current thread (since it is exiting)
            return OK;

SysLib.join() must return to the calling thread the ID of the child thread that has woken it up. For this purpose, we need to redefine SyncQueue's enqueueAndSleep() and dequeueAndWakeup() methods as follows:

Private/Public	Methods/Data	Descriptions
`public`	`int enqueueAndSleep( int condition )`	enqueues the calling thread and puts it to sleep until a given `condition` is satisfied, returning the ID of a child thread that has woken up the calling thread
`public`	`dequeueAndWakeup( int condition ) dequeueAndWakeup( int condition, int tid )`	dequeues and wakes up a thread waiting for a given `condition`. If there are two or more threads waiting for the same `condition`, only one thread is dequeued and resumed. The FCFS (first-come-first-served) order does not matter. The new version of this function can receive the calling thread's ID, (`tid`) as the 2nd argument, which will will be passed to the thread that has been woken up from `enqueueAndSleep()`. If no 2nd argument is given, `tid` will have the value 0.

5. Statement of Work

Part 1: Implementing SysLib.join() and SysLib.exit()

Make a copy of the current Kernel, e.g., cp Kernel.java Kernel_0.java

Design and code SyncQueue.java according to the above specification.
Design and code QueueNode.java according to the above specification.

Modify the code of the WAIT and the EXIT cases for INTERRUPT_SOFTWARE in Kernel.java using a SyncQueue, so that threads can wait for one of their child threads to be terminated. Note that a SyncQueue object should be instantiated as the variable waitQueue upon a boot, (i.e., in the BOOT case):
```
  waitQueue = new SyncQueue( scheduler.getMaxThreads() );
```
Compile your SyncQueue.java and Kernel.java, and thereafter run Test2e from the Shell (i.e., after loading Shell.class via "l Shell" after booting ThreadOS) to confirm:
1. Test2e waits for the termination of all its five child threads, (i.e., the TestThread2 threads).
2. Shell waits for the termination of Test2e, i.e., it should not display its prompt until Test2e has completed its execution.
3. Loader waits for the termination of Shell, i.e., Loader should not display its prompt ("-->") until you type exit from the Shell prompt.
[Note: use Shell.class from the ThreadOS distribution directory, rather than your MyShell.class from the earlier programming assignment.]

Part 2: Implementing Asynchronous Disk I/Os

Save a copy of your Kernel.java from Part 1 as Kernel_1.java.

Modify Kernel.java to use a SyncQueue in the three case statements - RAWREAD, RAWWRITE and SYNC - so that disk accesses will no longer cause spinlock (busy waiting). Remember to instantiate a SyncQueue object as the variable ioQueue upon a boot, (i.e., in the BOOT case).
```
  case BOOT:
    // ...
    ioQueue = new SyncQueue();
    // ...
```

Write a user-level test thread called Test3a.java which spawns and waits for the completion of X pairs of threads (where X = 1 ~ 4); in each pair, one thread should conduct only numerical computation (CPU-bound) and the other thread should call only SysLib.rawread() and SysLib.rawwrite() randomly across the Disk. These two types of threads may be defined separately in TestThread3a.java and TestThread3b.java or defined in a single file, TestThread3a.java, that takes a single argument to specify whether computation or disk accesses should be performed. In addition to starting and waiting for threads, Test3a.java should measure the time elapsed from the start to the termination of its child threads.

Measure the running time of Test3a.java when running it on your ThreadOS. Then, replace Kernel.java with the older version, Kernel_1.java, which does not permit threads to sleep on disk operations. Compile and run this older version to see the running time of Test3a.java. Do you see any performance improvement in your newer version? Discuss the results in your report.

To verify your own test program, you may run the distribution version, Test3.class that receives one integer, (i.e., X) and spawns X pairs of CPU- and Disk I/O-intensive threads. You should note any significant differences between the behavior of Test3 (distribution version) and Test3a (your version). Since UW1-320 machines include multi-cores, you need to increase X up to 3 or 4 to observe a clear difference between interruptible and busy-wait I/O approaches.
```
[css430@uw1-320-20 ThreadOS]$ java Boot
threadOS ver 1.0:
Type ? for help
threadOS: a new thread (thread=Thread[Thread-3,2,main] tid=0 pid=-1)
-->l Test3 3
l Test3 3
...
elapsed time = 85162 msec.
-->q
```

5. What to Turn In

Submit the following files:

Part 1:
- Kernel_1.java (which you have modified for Part 1)
- Test2e.output (results when running Test2e from Shell)
- SyncQueue.java
- QueueNode.java
Part 2:
- Kernel.java (which you have modified for Part 2)
- Test3a.java
- TestThread3a.java [and TestThread3b.java, if applicable]
- Test3a.output:
  - Performance results when running Test3a [or Test3] on your Kernel_1.java (implemented in Part 1)
  - Performance result when running Test3a [or Test3] on your Kernel.java (implemented in Part 2)
Part 3: Documentation
- Specifications and descriptions of your Test3a.java and TestThread3?.java programs;
- Discussion about performance results, especially the differences
- Discussion of any significant differences between the behavior of Test3 (distribution version) and Test3a (your version)
- Discussion of the data structures you considered for implementing QueueNode and the collection of QueueNodes inside SyncQueue and why you used the data structures you selected

gradeguide3b.txt is the grade guide for the assignment 3

6. Note

Visit CSS430 Operating Systems Assignments Home Page in order to reassure your working environment and turn-in procedure.

7. FAQ

This website could answer your questions. Please click here before emailing the professor :-).