CSS 490: Distributed Computing Systems
(a.k.a. CSS 434: Parallel and Distributed Computing)
Winter 2004
TTh 5:45-7:50pm
Prof. Munehiro Fukuda
Professor:
Munehiro Fukuda
<mfukuda@u.washington.edu>,
room UW1-331, phone 352-3459,
office hours: TTh 4:10-5:40pm
Course Description:
This course introduces the concepts and design of distributed
computing systems. Topics covered include message passing, remote
procedure calls, process management, migration, mobile agents,
distributed coordination, distributed shared memory, distributed file
systems, fault tolerance, and grid computing.
The first five weeks focus on the basic mechanisms and the C++
programming techniques for message passing, process management, and
migration. We will use sockets, MPI (Message Passing Interface), SunRPC,
and M++, a C++-based mobile agent system designed by the instructor.
The last five weeks cover advanced topics: the instructor will give an
overview of each topic, and students will review a topic-related
research paper in class.
The final project requires teamwork: each team of two students picks a
parallelizable application, programs it in both MPI and M++, and
compares the programmability and performance of the two versions.
Prerequisites:
CSS 343.
Work Load and Grading:
Course Work | Percentage
Programming 1 | 5%
Programming 2 | 5%
Paper Review | 25% (Presentation: 20%, Critique: 5%)
Final Project | 25% (Preliminary: 5%, Final: 20%)
Midterm Exam | 20%
Final Exam | 20%
Achievements | Approximately Corresponding Numeric Grade
90s | 3.5 - 4.0
80s | 2.5 - 3.4
70s | 1.5 - 2.4
60s | 0.7 - 1.4
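As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each item is scored on a 0-100 scale and that the overall achievement is the plain weighted sum; the syllabus does not state the exact formula, and the function names are hypothetical. The mapping to a numeric grade is only the approximate correspondence given in the table.

```python
# Weights (percent of the course total) copied from the grading table.
WEIGHTS = {
    "Programming 1": 5,
    "Programming 2": 5,
    "Paper Review": 25,    # Presentation 20% + Critique 5%
    "Final Project": 25,   # Preliminary 5% + Final 20%
    "Midterm Exam": 20,
    "Final Exam": 20,
}

# Approximate correspondence from the table: (lower bound of band, grade range).
GRADE_BANDS = [
    (90, "3.5 - 4.0"),
    (80, "2.5 - 3.4"),
    (70, "1.5 - 2.4"),
    (60, "0.7 - 1.4"),
]


def weighted_total(scores):
    """Return the weighted course total for per-item scores on a 0-100 scale.

    Assumption: a simple weighted sum; the actual grading formula may differ.
    """
    return sum(WEIGHTS[item] * scores[item] for item in WEIGHTS) / 100.0


def approximate_grade_range(total):
    """Look up the approximate numeric grade range for a weighted total."""
    for lower_bound, grade_range in GRADE_BANDS:
        if total >= lower_bound:
            return grade_range
    return None  # the table lists no range below the 60s


# Hypothetical scores for one student.
example_scores = {
    "Programming 1": 100,
    "Programming 2": 100,
    "Paper Review": 80,
    "Final Project": 90,
    "Midterm Exam": 70,
    "Final Exam": 80,
}
total = weighted_total(example_scores)
print(total, approximate_grade_range(total))
```

The two exams and the two larger assignments dominate the total, so a perfect score on both programming assignments moves the result by at most 10 points.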
Textbooks:
75% of the lectures cover the following textbook, while the rest
focus on advanced topics such as MPI, mobile agents, and
research-oriented topics. To help your understanding, I recommend
that you buy this textbook.
Distributed Operating Systems: Concepts and Design,
Pradeep K. Sinha
Wiley-IEEE Press, 1997.
References:
- Distributed Systems: Concepts and Design, 3rd Edition,
George Coulouris, Jean Dollimore and Tim Kindberg,
Addison-Wesley Publishers, Wokingham UK, 2001.
- Distributed Systems: Principles and Paradigms,
Andrew S. Tanenbaum and Maarten van Steen,
Prentice Hall, 2002.
- Distributed Operating Systems and Algorithm Analysis,
Randy Chow and Theodore Johnson,
Addison-Wesley, 1997.
Some Programming Textbooks:
The following books and manuals are useful for system, network, and MPI programming.
- Advanced Programming in the UNIX Environment,
W. Richard Stevens,
Addison-Wesley, 1992.
- Unix Network Programming, Volume 1, 2nd Edition,
W. Richard Stevens,
Addison-Wesley, 1998.
- Unix Network Programming, Volume 2, 2nd Edition,
W. Richard Stevens,
Addison-Wesley, 1999.
- Parallel Programming with MPI,
Peter S. Pacheco,
Morgan Kaufmann, 1997.
Policies:
Programming assignments 1 and 2 and the paper review are to be done
independently. Any collaboration on this work will result in a severe
penalty. You may discuss the problem statement and any clarifications
with each other, but any actual work to be turned in must be done
without collaboration.
The final project may be done by a pair of students, in which case
both students must contribute an equal amount of work. For detailed
instructions, see the project assignment sheet.
All homework is due at the beginning of class on the due date. The
submission may be postponed only in emergencies such as accidents,
sickness, sudden business trips, and family emergencies, in which case
you may turn in your homework late with written proof. No make-up
exams will be given except under exceptional circumstances. Barring
emergencies, I must be informed before the exam.
To request academic accommodations due to a disability, please contact
Disabled Student Services (DSS) in Bothell Library Annex Building,
Room 106, (email: rlundborg@bothell.washington.edu,
TDD: 425-352-3132, and FAX: 425-352-5444). If you have a documented
disability on file with the DSS office, please have your DSS counselor
contact me and we can discuss accommodations.
Course Goals:
The overall goals of CSS 490, "Distributed Computing Systems," are:
- To learn fundamental concepts that are used in and applicable to a
variety of distributed computing applications,
- To improve your ability to review and grasp the key ideas of
research papers, and
- To evaluate the programmability and performance of a distributed
application you have chosen and coded.
To strengthen your understanding of fundamental concepts, you are
strongly encouraged to solve the problems given on the final page of
each set of lecture slides. To review research papers, you must visit
the library, search for them, and prepare to present your paper review
using PowerPoint. Finally, you need to work in the Linux laboratory
(UW1-320) to test and evaluate the performance of your distributed
program. Therefore, as with most technical courses, besides ability
and motivation, it takes time to learn and master the subject. Expect
to spend an additional 10 to 15 hours a week outside of class time on
average.
Assignments:
This is a research-flavored course. Each assignment specification
gives you only a topic and a guideline for working on the assignment.
The answers and the quality of your work depend on your enthusiasm for
the assignment. Therefore, there are no specific answer keys.
- Program 1: exercises TCP communication and has you write a simple
parallel program.
- Program 2: exercises RPC programming and has you code a program that
passes a pointer/STL-based data structure.
- Paper review: requires each student to review a notable research
project and to present his/her understanding in class.
- Final project: requires teamwork, where each team of two students
picks a parallelizable application, programs it in both MPI and M++,
and compares the programmability and performance of the two versions.
Please read assignment.html to
understand the environment you use for assignments and the
submission/grading procedures.
Topics covered and tentative schedule:
Note that this is an approximate ordering of topics. Chapters will
take about the allotted time and not all sections in all chapters are
covered.
Week | Date | Topics | Chapters | Reading | Assignment
1 | Jan 6 | School cancelled due to inclement weather | | |
 | Jan 8 | Fundamentals | 1 | pp1-45 |
2 | Jan 13 | Invited Talk (by Dr. Drew, Boeing); Lab Orientation (by Mr. McLean, IS) | | | Program 1 assigned
 | Jan 15 | Message Passing | 1 | pp46-139 |
3 | Jan 20 | Message Passing Interface | 3 | pp139-166 |
 | Jan 22 | Remote Procedure Call | 4 | pp167-230 |
4 | Jan 27 | Process Management | 8 | pp398-420 |
 | Jan 29 | Process Migration | 8 | pp381-398 | Program 1 due; Program 2 assigned
5 | Feb 3 | M++ | | M++ User's Manual |
 | Feb 5 | Paper Review | | D'Agent, IBM Aglets, Ara, Mole | Reviewers: Bafus, Chan, Watson, Pham
6 | Feb 10 | Midterm exam in class | 1 - 4, and 8 | pp1-230 and pp381-420 | Program 2 due; Final project assigned
 | Feb 12 | Distributed Shared Memory | 5 | pp231-281 |
7 | Feb 17 | Paper Review | 5 | Messengers, Ivy, Dash, Munin, Jini/JavaSpace | Reviewers: Dockter, Voorhees, Park, Suzuki, Bromage
 | Feb 19 | Synchronization | 6 | pp282-346 |
8 | Feb 24 | Paper Review | 6 | SPEEDES, Time Warp, Timing Management in HLA | Reviewers: Smith, Christensen, Minar
 | Feb 26 | Distributed File Systems | 9 | pp421-439 | Final project preliminary submission due
9 | Mar 2 | Paper Review | 9 | Sun NFS, AFS, XFS, Plan 9, LFS | Reviewers: Robinson, Margell, Brown, Shirzad, Salisbury
 | Mar 4 | Replication and Fault Tolerance | 9 | pp440-495 |
10 | Mar 9 | Paper Review | 9 | ISIS, Gossip, Bayou, Coda | Reviewers: Howe, Hendrich, Bekele, Owen
 | Mar 11 | Grid Computing (1-hour lecture); Paper Review | | Legion, Condor | Reviewers: Rai, Metke
11 | Mar 16 | Paper Review; Final Project Presentation and Wrap Up | | Globus | Reviewer: Grinburg; Final project final submission due
 | Mar 18 | Final exam in class | 5, 6, and 9 | pp231-346 and pp421-495 |