CSS 434
Paper Review

Professor: Munehiro Fukuda
Group presentation dates: see the syllabus


0. Teamwork

Each paper review and presentation will be done by a team of two students. Please choose a review topic from the following list of reading assignments, and work with your partner on the assigned paper review and in-class presentation.

1. Purpose

This reading assignment is intended to give you experience with the very first step of research activity, i.e., reading research papers. Unlike when reading textbooks, you are not supposed to memorize well-known facts; instead, you are expected to summarize the key idea of each paper you read and to discuss the contributions and drawbacks of the research it presents.

Each group is expected to pick a notable research/commercial project, to review one or more related papers, and to present the group's understanding of the chosen project.

2. Reading Assignment

There are five research areas, each including several projects whose results have already been published in research papers. The following is the list of possible papers and web pages you should read. They are accessible on the web or retrievable through sftp from cssmpi1h-12h.uwb.edu: /home/NETID/css434/papers/

A. Distributed Synchronization

A-1. Timewarp
Focus on optimistic synchronization
  1. David Jefferson, Brian Beckman, Fred Wieland, Leo Blume, Mike DiLoreto, Phil Hontalas, Pierre Laroche, Kathy Sturdevant, Jack Tupman, Van Warren, John Wedel, Herb Younger, and Steve Bellenot, "Distributed Simulation and the Time Warp Operating System", Technical Report, UCLA, August 1987 (available from /home/NETID/css434/papers/)
  2. Jefferson, D.R., "Virtual Time", ACM Transactions on Programming Languages and Systems, Vol.7 No.3, 1985, pages 404-425 (available from /home/NETID/css434/papers/)
A-2. SPEEDES
Focus on breathing time bucket (as compared to Timewarp)
  1. Jeff Steinman, "The Event Horizon", Technical Report, Jet Propulsion Laboratory California Institute of Technology, JPL D-10029, November 1992 (available from /home/NETID/css434/papers/)
  2. Jeff S. Steinman, "Discrete-event simulation and the event horizon", ACM SIGSIM Simulation Digest, Vol.24 No.1, pages 39-49, July 1994 (available from /home/NETID/css434/papers/)
A-3. Distributed Snapshots (Chandy and Lamport / Samadi and Mattern)
Compare the two algorithms and explain their differences
  1. K. Chandy and L. Lamport, "Distributed Snapshots: Determining Global States of Distributed Systems", ACM Transactions on Computer Systems, Vol. 3, No. 1, February 1985, pages 63-75 (available from /home/NETID/css434/papers/)
  2. B. Samadi, Distributed Simulation, Algorithms and Performance Analysis. PhD thesis, UCLA, 1985. (available from /home/NETID/css434/papers/)
  3. Mattern, F., "Efficient Algorithms for Distributed Snapshots and Global Virtual Time Approximation", Journal of Parallel and Distributed Computing, Vol.18, No.4, 1993, pages 425-434 (available from /home/NETID/css434/papers/)

B. Distributed Shared Memory / Cache

B-1. Ivy
Focus on centralized, fixed, and dynamic distributed managers
  1. Li, K. and Hudak, P., "Memory Coherence in Shared Virtual Memory Systems", ACM Transactions on Computer Systems, Vol.7, No.4, 1989, pages 321-359 (available from /home/NETID/css434/papers/)
  2. George Coulouris, Jean Dollimore, and Tim Kindberg, "Sequential Consistency and Ivy", Section 18.3, In Book of Distributed Systems: Concepts and Design, 4th Ed., Addison-Wesley, 2005, pages 763-771 (Our textbook. If you choose this topic, your review should be more than the textbook's scope.)
B-2. Dash
Focus on physical architecture and cache coherence protocol
  1. D. Lenoski, J. Laudon, K. Gharachorloo, W. Weber, A. Gupta, J. Hennessy, M. Horowitz, and M. Lam, "The Stanford DASH multiprocessor", IEEE Computer, Vol.25 No.3, 1992, pages 63-79 (available from /home/NETID/css434/papers/)
  2. Lenoski, D., Laudon, J., Joe, T., Nakahira, D., Stevens, L., Gupta, A., and Hennessy, J., "The DASH Prototype: Logic Overhead and Performance", IEEE Transactions on Parallel and Distributed Systems, Vol.4, No.1, 1993, pages 41-61 (available from /home/NETID/css434/papers/)
B-3. Java Cache
Compare Linda, JavaSpace, Hazelcast, and Oracle's Coherence JCache
  1. Gelernter, D. and Carriero, N., "Coordination Languages and Their Significance", Communications of the ACM, Vol.35 No.2, 1992, pages 97-107 (available from /home/NETID/css434/papers/)
  2. Carriero, N., and Gelernter, D., "Linda in Context", Communications of the ACM, Vol.32, No.4, 1989, pages 444-458 (available from /home/NETID/css434/papers/)
  3. Getting Started With JavaSpaces Technology: Beyond Conventional Distributed Programming Paradigms
  4. JSR107 API and SPI 1.1.1 API - javax.cache
  5. Hazelcast JCache (You need to explain Hazelcast itself, too.)
  6. Oracle, "Ch 33. Introduction to Coherence JCache", in Coherence 12.2.1.4.0
B-4. Spark
Focus on resilient distributed dataset (RDD), transformations, actions, broadcast variables, and accumulators
  1. Apache Spark - Unified Analytics Engine for Big Data

C. Distributed File Systems

C-1. Sun NFS
Focus on client caching, client-initiated invalidation, file locking, and open delegation
  1. Andrew S. Tanenbaum and Maarten van Steen, "SUN Network File System", Section 10.1, In Book of Distributed Systems: Principles and Paradigms, Prentice Hall, 2002, pages 576-603 (available from /home/NETID/css434/papers/)
  2. George Coulouris, Jean Dollimore, and Tim Kindberg, "Sun Network File System", Section 8.3, In Book of Distributed Systems: Concepts and Design, 4th Ed., Addison-Wesley, 2005, pages 337-349 (Our textbook. If you choose this topic, your review should be more than the textbook's scope.)
  3. Brian Pawlowski, Chet Juszczak, Peter Staubach, Carl Smith, Diane Lebel, and David Hitz, "NFS Version 3 Design and Implementation", USENIX Summer, 1994 (paper available from /home/NETID/css434/papers/)
C-2. AFS
Focus on session semantics, server-initiated invalidation, callback, and its implementation
  1. George Coulouris, Jean Dollimore, and Tim Kindberg, "The Andrew File System", Section 8.4, In Book of Distributed Systems: Concepts and Design, 4th Ed., Addison-Wesley, 2005, pages 349-358 (Our textbook. If you choose this topic, your review should be more than the textbook's scope.)
  2. John H Howard, "An Overview of the Andrew File System", in Winter 1988 USENIX Conference Proceedings, 1988 (paper available from /home/NETID/css434/papers/)
  3. M. L. Kazar, "Synchronization and Caching Issues in the Andrew File System", In Proceedings of the USENIX Winter Technical Conference, 1988. (available from /home/NETID/css434/papers/)
C-3. PVFS: Parallel Virtual File System
Focus on IO nodes, file striping, MPI/IO, and cooperative cache
  1. OrangeFS (PVFS follower)
  2. Philip H. Carns, Walter B. Ligon, III, Robert B. Ross and Rajeev Thakur, "PVFS: A Parallel File System for Linux Clusters," In Proc. of the 4th Annual Linux Showcase and Conference, October 2000, pages 317--327 (paper available from /home/NETID/css434/papers/)
  3. In-Chul Hwang, Hojoong Kim, Hanjo Jung, Dong-Hwan Kim, Hojin Ghim, Seung-Ryoul Maeng, and Jung-Wan Cho, "Design and Implementation of the Cooperative Cache for PVFS", Lecture Notes in Computer Science, Volume 3036/2004, pages 43 - 50 (available as /home/NETID/css434/papers/).

D. Replication and Fault Tolerance

D-1. Gossip
Focus on available copy protocol, gossip architecture, timestamp, queries, and updates
  1. George Coulouris, Jean Dollimore, and Tim Kindberg, "The Gossip Architecture", Section 18.4.1, In Book of Distributed Systems: Concepts and Design, 5th Ed., Addison-Wesley, 2012, pages 783-792 (Our textbook. If you choose this topic, your review should be more than the textbook's scope.)
  2. Randy Chow and Theodore Johnson, "Gossip Update Propagation", Section 6.4.4, In Book of Distributed Operating Systems & Algorithms, Addison-Wesley, 1998 pages 223-226 (available from /home/NETID/css434/papers/)
  3. Ladin, R., Liskov, B., Shrira, L., and Ghemawat, S., "Providing Availability Using Lazy Replication", ACM Transactions on Computer Systems, Vol.10, No.4, 1992, pages 360-391 (available from /home/NETID/css434/papers/)
D-2. Coda
Focus on difference from AFS, client caching (with hoarding, emulation, reintegration), and server replication (with VSG)
  1. Andrew S. Tanenbaum and Maarten van Steen, "The Coda File System", Section 10.2, In Book of Distributed Systems: Principles and Paradigms, Prentice Hall, 2002, pages 604-623 (available from /home/NETID/css434/papers/)
  2. George Coulouris, Jean Dollimore, and Tim Kindberg, "The Coda File System", Section 18.4.3, In Book of Distributed Systems: Concepts and Design, 5th Ed., Addison-Wesley, 2012, pages 795-802 (Our textbook. If you choose this topic, your review should be more than the textbook's scope.)
  3. James J. Kistler and M. Satyanarayanan, "Disconnected Operation in the Coda File System", In Milojicic, D., Douglis, F., and Wheeler, R., editors, Mobility: Processes, Computers, and Agents, ACM Press, 1999, pages 293-305 (available from /home/NETID/css434/papers/)
D-3. GFS: Google File System or Hadoop
Focus on (the primary and secondary) name node(s), data nodes (for replication management, available reads, and pipelined writes), their heartbeat communication, and the failover mechanism in MapReduce
  1. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, "The Google File System", SOSP'03 October 19-22, 2003, (available from /home/NETID/css434/papers/)
  2. Case Study GFS: Evolution on Fast-forward (available at ACM Queue http://queue.acm.org/detail.cfm?id=1594206)
  3. Hadoop
D-4. JGroups (which is based on ISIS)
Focus on message-ordering features in ISIS and their implementation in JGroups
  1. JGroups
  2. http://www.cs.cornell.edu/Info/Projects/ISIS/
  3. Birman, K.P., "The Process Group Approach to Reliable Distributed Computing", Communications of the ACM, Vol.36, No.12, 1993, pages 36-53 (available from /home/NETID/css434/papers/)
  4. Birman, K. and Joseph, T., "Exploiting Virtual Synchrony in Distributed Systems", In Proceedings of 11th Symposium on Operating System Principles, 1987 pages 123-138 (available from /home/NETID/css434/papers/)

E. Job Management in Grid/Cloud

E-1. Condor
Focus on class ad, components (such as schedd, matchmaker, startd, starter, and shadow) and their interaction, flocking, and master-worker
  1. http://www.cs.wisc.edu/condor
  2. Douglas Thain, Todd Tannenbaum, and Miron Livny, "Condor and the Grid", in Fran Berman, Anthony J.G. Hey, Geoffrey Fox, editors, Grid Computing: Making The Global Infrastructure a Reality, John Wiley, 2003. ISBN: 0-470-85319-0 (available from both the above link and /home/NETID/css434/papers/)
E-2. YARN
Focus on components (RMs, NMs, AMs, and containers) and their interaction, memory/CPU allocation, and job scheduling policy
  1. Apache Hadoop YARN

Decide on one research/commercial project your group is interested in, and review one or more readings related to the project. Some of the readings are research papers published through IEEE or ACM or posted on the project's website; the others are textbook sections. What matters is investigating the research project well enough to present your understanding in class. If you are interested in a well-known research project other than those listed above, you may investigate it provided you receive approval from the professor.

Sign up for the topic you want to survey by the end of the fourth week. The readings will be assigned on a first-come, first-served basis. Your presentation time slot will be scheduled depending on which paper(s) you want to read. Review the papers in a timely manner and prepare for your presentation.

3. Presentation

Two to four group presentations in the same research area will be scheduled on the same lecture day. Each group has 20 minutes for a presentation followed by a Q-and-A session.

Prepare your presentation using PowerPoint. Send your draft PPT to the professor at least 48 hours before your actual presentation, so that the professor can give you some feedback. Note that, if students fail to send their draft PPT file to the professor by the deadline, the professor will not only discount their survey work but may also disallow their in-class presentation.

The audience is expected to evaluate each group presentation using an evaluation sheet handed out by the professor. This sheet includes the following 10 criteria:

The depth of a speaker's understanding of the research project
Item 1 Did he/she understand the paper he/she reviewed well?
Item 2 Did he/she summarize the main idea of the papers well?
Item 3 Did he/she give clear answers to questions asked by the audience?

The depth of a speaker's critique of the paper(s)
Item 4 Did he/she properly point out the contribution of the papers?
Item 5 Did he/she mention any drawbacks of the ideas introduced in the papers?
Item 6 Did he/she express his/her own opinions on how to improve the quality of the papers, research, and projects he/she reviewed?

The quality of a reviewer's slides
Item 7 Did his/her slides help the audience understand the paper(s)?
Item 8 Were the number of slides, the amount of content on each slide, and the use of colors, different fonts, and animation appropriate?

The effectiveness of a reviewer's presentation
Item 9 Did you understand his/her speech? In other words, did he/she organize his/her presentation well and make every effort to help the audience understand it (e.g., through alternative or additional explanations)?
Item 10 Was his/her presentation interesting? In other words, did he/she try to keep the audience engaged in his/her presentation?

Each evaluation criterion will receive the following grade:
very good: 10
good: 9
fair: 8
poor: 7
very poor: 6

The audience will fill out all criteria and turn in an evaluation sheet to the professor upon the completion of each group presentation. Based on the audience evaluation, the professor will grade each group presentation. Note that the audience evaluation is not 100% reflected in the final grade of your presentation. To grade your survey work, the professor will take everything into account, including your office visit to discuss your draft PPT, the quality of your final slides, your answers in the question-and-answer session, etc.

4. Your Responsibility as Audience

You are responsible for filling out an evaluation sheet for each presentation except your own. Give useful feedback to your classmates. Your critique of the other students is also counted as a part of your grade. Your absence or a malicious evaluation will void your credit for the in-class discussion (1%) that follows the student presentations on the same day. If you must be absent from the class, you should watch the Panopto video of the presentations you missed, understand the speakers' paper reviews, and submit evaluation sheets to the professor by the day after the class you missed.