How to read a paper for Advanced Operating Systems
There are a lot of assigned readings in this class, and many students
consider the number unreasonable for a single-semester course. The
number is reasonable if the papers are read appropriately. It comes to
well under five papers per week over the semester, but the readings
are heavily front-loaded (there are more of them at the start of the
course than at the end). Many students spend too much time reading the
early papers, causing them to fall behind and leaving them with too
little time to read the later papers in the course.
These guidelines are intended to help students efficiently read papers
in Computer Science 555, the USC Graduate Computer Operating Systems
course. The guidelines are based in part on the summary from John
Brewer's 555 Survival Guide, and the brochure "Efficient reading of
papers in Science and Technology" by Michael J. Hanson, 1990, revised
2000 by Dylan McNamee. Read the brochure first. The guidelines in the
brochure are extended here in terms specific to USC's Computer Science
555.
Why read the papers
The papers are assigned so that students see approaches that were
taken to solving distributed systems problems in various systems. It
is hoped that students will recall systems that were similar when
related problems arise in a system they design for real, or on an
exam. We don't expect students to memorize the details of each paper,
but we do expect them to know what the important aspects of a system
were, and most importantly, why certain approaches were taken. Exams
for Fall semester are open book, meaning that as long as students
remember basically what was done and why, they can refer back to the
paper for the details if necessary.
The textbook
The assigned textbook chapters provide a coherent overview of the
material covered in the papers, and they add material that is not
available in the papers. In the textbook you will find discussions
covering the benefits and drawbacks of the different approaches
presented in what is usually a more balanced manner than you will find
in the individual papers, which tend to focus primarily on a single
system or approach.
First week's readings
The first week's reading includes both the introductory readings and
the readings for the topic covered in the second lecture (usually
communications models). Please be sure to read both sets of readings.
Reading the papers
In John Brewer's words, "Read for breadth first, then read it for
real." Read the intro, section headings, conclusion or summary, and
tables and graphs to get a feel for what the paper conveys. Take
notes, and if you are reading several papers on related topics, prepare
a table of the key points: often the high-level approaches and the
environment in which each system was deployed. [The table might make a
useful part of your reading report.] Once you know what the paper is
about, read the sections necessary to understand the approach and
the conclusions.
Be sure to read the sections describing the relationship to the work
of others. These sections are often among the most important because
they describe limitations of other systems that weren't mentioned in
other papers. Be careful here, since these "other" systems aren't
always portrayed correctly.
Read the papers critically
Keep in mind that the papers in the readings are written by the
proponents of particular systems or approaches. It is normal to find
papers with conflicting opinions. You need to read each paper
critically, asking yourself if the conclusions are supported by the
results. It is often the case that the contribution of a paper, while
significant, is not the contribution intended by the author.
In computer science experiments it is rare for two systems to be
compared in a controlled environment. Often the performance of the
authors' system, on current hardware, is compared against the published
results of an earlier system. Even when the "base" system is run on
the same hardware in the current environment, the "base" system might
not be tuned for best performance. Even when attempts are made to
optimize the base system, ask yourself whether there are particular
characteristics of the environment that place the "base" system at a
disadvantage. If so, consider whether the claimed benefit of the new
approach is limited to such environments.