CSS 434 - Group Discussion

Instructor: Munehiro Fukuda
Discussion dates: see the syllabus

1. Purpose

Group discussions are intended to help you understand other students' paper-review presentations and the class materials. Three or four students form a group, discuss one topic given by the professor, and present the results of their discussion.

2. Topics

No.	Discussions	Grade	Topics
1	Process Migration, Time and Global States	1%
Group 1, 6: Consider how to move a mobile agent to a remote site and resume its execution from a new function there. Where and how do you think agent execution platforms should use serialization, RMI, class loaders, and multithreading?
Group 2, 7: Consider three distributed snapshot algorithms: Chandy-Lamport's, Samadi's, and Mattern's. Discuss their pros and cons.
Group 3, 8: An NTP server B receives server A's message at 16:34:23.480, bearing the timestamp 16:34:13.430, and replies to it. A receives the reply at 16:34:15.725, bearing B's timestamp 16:34:25.7. Estimate the offset between B and A and the accuracy of the estimate.
Group 4: Compare Timewarp and SPEEDS in terms of performance, process creation/termination, dynamic memory allocation, and I/O handling.
Group 5: Solve the questions in the following charts.
2	Distributed Shared Memory	1%
Group 3: Compare Ivy and Dash in terms of consistency models, shared-data granularity, HW/SW implementation, and false sharing. What types of applications benefit from Ivy and from Dash, respectively?
Group 2: Is the memory underlying the following execution of two processes sequentially consistent (assuming that, initially, all variables are set to zero)?

    P1:             P2:
    1 <- R(x);      W(x, 1);
    2 <- R(x);      1 <- R(y);
    W(y, 1);        W(x, 2);

Group 1: Compare distributed shared memory and message passing (such as MPI) in terms of programmability and performance. Discuss their pros and cons using two types of applications: computer graphics such as 3D ray tracing and spatial simulation such as Heat2D.
Group 8: Show that the following history is not causally consistent.

    P1:             P2:             P3:
    W(a, 0)         1 <- R(a)       2 <- R(b)
    W(a, 1)         W(b, 2)         0 <- R(a)

Group 7, 5: Consider two processes, P0 and P1, each running on a different computing node with its own DRAM, that can access each other's memory through DSM support. Assume that System.SharedMemory( ) is available to allocate a zero-initialized shared space. For instance, the following example allocates a 4K-byte shared space as a byte array.

    byte[] byteArray = (byte [])System.SharedMemory( 4096 );

Processes P0 and P1 execute the following program at the same time.

    public class SharedMemory {
        public static void main( String[] args ) {
            int pid = Integer.parseInt( args[0] ); // args[0] should be 0 or 1 as process ID.

            // allocate a 4096-byte zero-initialized shared memory
            byte[] byteArray = (byte [])System.SharedMemory( 4096 );

            while ( true ) {
                // every 100-time increment of byteArray[0] by P0 or byteArray[32] by P1
                if ( ++byteArray[ pid * 32 ] % 100 == 0 ) {
                    while ( byteArray[64] != pid ); // wait for my turn
                    byteArray[64] = (byte)( ( pid + 1 ) % 2 ); // then, let the other process go
                }
            }
        }
    }

Assume that this DSM is a Dash-like hardware-based multiprocessor with release consistency whose cache line size is 32 bytes (not 16 bytes as in the real Dash). What problem occurs? To address this problem, how should you change this program?
Group 6, 4: Consider Ivy's ownership manager algorithm. The following figure shows a given state where process 4 is page 0's actual owner, while the other processes consider their right neighbor as page 0's probable owner. Now, assume that process 0 reads page 0, and thereafter process 2 reads the same page. Show page 0's new probable owner as seen from each process by drawing new thick arrows.
3	Distributed File Systems	1%
Group 8: Compare NFS (ver 3) and AFS in terms of file-accessing models, file-sharing semantics, modification propagation (write-through or delayed write), server-side/client-side caching, client/server-initiated validation, and stateful/stateless servers.
Group 7: Discuss the pros and cons of backward transaction and forward transaction. State the pros and cons of server-side caching. Also, state the pros and cons of client-side caching.
Group 6, 5: Consider two extreme cases when using PVFS. Case 1 is to let an MPI program read many small files. Case 2 is to let an MPI program read a single large file.
Group 4, 3: Given the following code, implement a client-initiated invalidation in FileClient.java. Note that the client can cache only one file. Assume that FileServer.java has already been implemented.

    import java.io.Serializable;
    import java.rmi.Naming;
    import java.rmi.Remote;
    import java.rmi.RemoteException;
    import java.rmi.server.UnicastRemoteObject;
    import java.util.Date;
    import java.util.Scanner;

    public class FileContents implements Serializable {
        private byte[] contents;
        private Date time;

        public FileContents( byte[] contents ) {
            this.contents = contents;
        }
        public byte[] get( ) {
            return contents;
        }
        public Date timestamp( ) { // the timestamp when downloaded from the server
            return time;
        }
    }

    public interface ServerInterface extends Remote {
        // download a given file from the server
        public FileContents download( String filename ) throws RemoteException;
        // upload a given file's contents to the server
        public boolean upload( String filename, FileContents contents ) throws RemoteException;
        // check the timestamp of a given file
        public Date timestamp( String filename ) throws RemoteException;
    }

    public interface ClientInterface extends Remote {
        // invalidate the client cache
        public void invalidate( ) throws RemoteException;
        public FileContents writeback( ) throws RemoteException;
    }

    public class FileClient extends UnicastRemoteObject implements ClientInterface {
        private String cachedFilename = null;     // the name of the file cached at the client
        private FileContents fileContents = null; // the file contents cached at the client
        private String cachedFileStatus = null;   // "r" or "w"

        public static void main( String[] args ) throws RemoteException {
            new FileClient( args );
        }

        public FileClient( String[] args ) throws RemoteException {
            try {
                Scanner input = new Scanner( System.in ); // keyboard input
                ServerInterface server                    // access a file server
                    = ( ServerInterface ) Naming.lookup( "rmi://" + args[0] + ":" + args[1] + "/fileserver" );

                while ( true ) {
                    System.out.print( "FileClient: file [r/w]: " );
                    String filename = input.next( ); // read a file name to operate
                    String r_w = input.next( );      // check the read/write mode of this file

                    // Client-initiated file validation before accessing a file
                    if ( cachedFilename != null ) { // some file cached at the client
                        if ( !filename.equals( cachedFilename ) ) { // need a different file
                            if ( cachedFileStatus.equals( "w" ) ) { // the client wants to write
                                // ITEM 4 and 5:
                                // Write the current cached file back to the server before downloading a different file
                                // Implement it (Both CLIENT and SERVER-INVALIDATION)
                            }
                        }
                        // ITEM 4: CLIENT-INITIATED INVALIDATION
                        // Need to check the timestamp here, right?
                    }
                    else // no file is cached at the client
                        fileContents = server.download( filename );

                    // start a text editor to open the file.
                    cachedFilename = filename;
                    cachedFileStatus = r_w;
                    runTextEditor( fileContents, cachedFileStatus );
                }
            } catch ( Exception e ) { }
        }

        public void invalidate( ) {
            // ITEM 5: SERVER-INITIATED INVALIDATION
            // Implement it
        }

        public FileContents writeback( ) {
            // ITEM 5: SERVER-INITIATED INVALIDATION
            // Implement it
        }

        private void runTextEditor( FileContents file, String r_w ) {
            // assume that it has been already implemented
        }
    }

Group 2, 1: Given the above code, implement a server-initiated invalidation in FileClient.java. Note that the client can cache only one file. Assume that FileServer.java has already been implemented.
4	Fault Tolerance	1%
Group 1, 5: Discuss Gossip and Coda. Could the gossip architecture be used for a distributed computer game as described below? The players move figures around a common scene. The state of the game is replicated at the players' workstations and at a server, which contains services controlling the game overall, such as collision detection. Updates are multicast to all replicas. The quorum-based replication protocol can address network partition problems. Why didn't Coda use this protocol? Explain the reason. What features does Coda have to cope with the network partition problem?
Group 2, 6: The following state transition diagram describes the two-phase commit protocol. Assume that worker1 crashed when the coordinator sent a doCommit message. Trace this diagram. To be specific, mark the appropriate arrows as thick, red arrows with your pen or pencil. When worker1 resumes, what does it have to do?
Group 3, 7: Discuss Hadoop (HDFS). Why does HDFS maintain only one primary and only one secondary name server? Why not multiple name nodes? Explain how HDFS maintains replicas. On top of HDFS, a MapReduce program can run in parallel. For fault tolerance, how does MapReduce rerun a crashed job?
Group 4, 8: The following MPI program wants to checkpoint each rank's computation as a snapshot in every for-loop iteration, anticipating a crash in the middle of the entire computation. When resumed after a crash, this MPI program wants to continue the ongoing computation from the latest snapshot rather than restart from scratch. Consider checkpoint( ) and resume( ) at the bottom of this code and implement them as accurately as possible.

    import mpi.*;

    public class FT_MPI {
        private static int rank = 0;
        private static int size = 0;
        private static int[] data = new int[1];
        private static int i;

        public static void main( String[] args ) throws MPIException {
            MPI.Init( args );
            rank = MPI.COMM_WORLD.Rank( );
            size = MPI.COMM_WORLD.Size( );
            data[0] = 0;

            resume( );

            for ( i = 0; i < 1000000; i++ ) {
                // communication from the left neighbor
                if ( rank > 0 )
                    MPI.COMM_WORLD.Recv( data, 0, 1, MPI.INT, rank - 1, 0 );

                // do some dummy computation
                data[0] += rank + 1;

                if ( rank < size - 1 )
                    MPI.COMM_WORLD.Send( data, 0, 1, MPI.INT, rank + 1, 0 ); // communication to the right neighbor

                if ( rank == size - 1 )
                    System.out.println( data[0] );

                checkpoint( );
            }
            MPI.Finalize( );
        }

        private static void checkpoint( ) { // any arguments?
            // what data to save into a snapshot file
        }

        private static void resume( ) { // any arguments?
            // what data to resume from a snapshot file
        }
    }
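
As a starting point for the NTP question in discussion 1 (Group 3, 8), the sketch below applies the usual four-timestamp estimate: offset = ((T2 - T1) + (T3 - T4)) / 2 and round-trip delay = (T4 - T1) - (T3 - T2), with the estimate taken to be accurate to within half the delay. The class and method names (NtpEstimate, offsetMillis, delayMillis) are illustrative assumptions, and the timestamps are expressed as milliseconds past 16:34:00 so that integer arithmetic stays exact.

```java
public class NtpEstimate {
    // t1: A sends, t2: B receives, t3: B replies, t4: A receives (all in ms).
    // Returns the estimated offset of B's clock relative to A's clock.
    public static double offsetMillis(long t1, long t2, long t3, long t4) {
        return ((t2 - t1) + (t3 - t4)) / 2.0;
    }

    // Returns the total network delay of the two messages,
    // i.e. the round trip minus B's processing time.
    public static long delayMillis(long t1, long t2, long t3, long t4) {
        return (t4 - t1) - (t3 - t2);
    }

    public static void main(String[] args) {
        // Timestamps from the question, as ms past 16:34:00.
        long t1 = 13_430; // A's send time, by A's clock
        long t2 = 23_480; // B's receive time, by B's clock
        long t3 = 25_700; // B's send time, by B's clock
        long t4 = 15_725; // A's receive time, by A's clock

        double offset = offsetMillis(t1, t2, t3, t4);
        long delay = delayMillis(t1, t2, t3, t4);

        System.out.printf("offset = %.4f s%n", offset / 1000.0);
        System.out.printf("accuracy = +/- %.4f s%n", delay / 2000.0);
    }
}
```

With these inputs the sketch gives an offset of 10.0125 s (B ahead of A) and a delay of 0.075 s, so the estimate is accurate to within about 0.0375 s. Whether this matches the model used in class is worth checking during the discussion.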

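For the sequential-consistency question in discussion 2 (Group 2), one way to check an answer is to brute-force every interleaving that preserves each process's program order and test whether all reads return the most recent write. The sketch below does this for two processes. It assumes that the history reads column-wise as P1: R(x)=1, R(x)=2, W(y,1) and P2: W(x,1), R(y)=1, W(x,2), and that all variables start at zero. The class name SeqConsistencyCheck and its helpers are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SeqConsistencyCheck {
    // One memory operation: a write of value to var,
    // or a read of var that returned value.
    public static final class Op {
        final boolean isWrite;
        final String var;
        final int value;
        Op(boolean isWrite, String var, int value) {
            this.isWrite = isWrite;
            this.var = var;
            this.value = value;
        }
    }

    public static Op w(String var, int value) { return new Op(true, var, value); }
    public static Op r(String var, int value) { return new Op(false, var, value); }

    // True if some interleaving that keeps both program orders lets
    // every read return the latest written value (initially 0).
    public static boolean isSequentiallyConsistent(Op[] p1, Op[] p2) {
        return search(p1, 0, p2, 0, new HashMap<String, Integer>());
    }

    private static boolean search(Op[] p1, int i, Op[] p2, int j,
                                  Map<String, Integer> mem) {
        if (i == p1.length && j == p2.length) return true; // all ops placed
        if (i < p1.length) { // try P1's next operation
            Map<String, Integer> next = apply(p1[i], mem);
            if (next != null && search(p1, i + 1, p2, j, next)) return true;
        }
        if (j < p2.length) { // try P2's next operation
            Map<String, Integer> next = apply(p2[j], mem);
            if (next != null && search(p1, i, p2, j + 1, next)) return true;
        }
        return false;
    }

    // Returns the memory state after op, or null if op is a read
    // whose recorded value does not match the current memory.
    private static Map<String, Integer> apply(Op op, Map<String, Integer> mem) {
        if (op.isWrite) {
            Map<String, Integer> next = new HashMap<>(mem);
            next.put(op.var, op.value);
            return next;
        }
        return mem.getOrDefault(op.var, 0) == op.value ? mem : null;
    }

    public static void main(String[] args) {
        Op[] p1 = { r("x", 1), r("x", 2), w("y", 1) };
        Op[] p2 = { w("x", 1), r("y", 1), w("x", 2) };
        System.out.println(isSequentiallyConsistent(p1, p2));
    }
}
```

Exhaustive search is only practical for short histories like this one, but it makes the definition concrete: the history is sequentially consistent exactly when at least one legal interleaving exists.
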
3. Discussion and Presentation

Approximately 20 minutes will be given for each group discussion. The professor will assign each group a different discussion item, which should be summarized in a Google Doc. A group representative presents the discussion to the other students and receives 0.1 points of extra credit.