Reading Technical Papers in Advanced Operating Systems

How to read a paper for Advanced Operating Systems

There are a lot of assigned readings in this class, and many students consider the number more than is reasonable for a single-semester course. The number is reasonable if the papers are read appropriately. It comes to well under five papers per week over the semester, but the readings are heavily front-loaded (there are more of them at the start of the course than at the end). Many students spend too much time reading the early papers, which causes them to fall behind and leaves them too little time for the later papers in the course.

These guidelines are intended to help students read papers efficiently in Computer Science 555, the USC Graduate Computer Operating Systems course. The guidelines are based in part on the summary from John Brewer's 555 Survival Guide, and on the brochure "Efficient reading of papers in Science and Technology" by Michael J. Hanson, 1990, revised in 2000 by Dylan McNamee. Read the brochure first. The guidelines in the brochure are extended here in terms specific to USC's Computer Science 555.

Why read the papers

The papers are assigned so that students see the approaches that various systems took to solving distributed systems problems. It is hoped that students will recall similar systems when related problems arise in a system they design for real, or on an exam. We don't expect students to memorize the details of each paper, but we do expect them to know what the important aspects of a system were and, most importantly, why certain approaches were taken. Exams for Fall semester are open book, meaning that as long as students remember basically what was done and why, they can refer back to the paper for the details if necessary.

The textbook

The assigned textbook chapters provide a coherent overview of the material covered in the papers, and they add material that is not available in the papers. The textbook discusses the benefits and drawbacks of the different approaches, usually in a more balanced manner than the individual papers, which tend to focus primarily on a single system or approach.

First week's readings

The first week's reading includes both the introductory readings and the readings for the topic covered in the second lecture (usually communications models). Please be sure to read both sets of readings.

Reading the papers

In John Brewer's words, "Read for breadth first, then read it for real". Read the introduction, section headings, conclusion or summary, and tables and graphs to get a feel for what the paper conveys. Take notes, and if you are reading several papers on related topics, prepare a table of the key points: often the high-level approaches and the environment in which each system was deployed. (The table might make a useful part of your reading report.) Once you know what the paper is about, dig in and read the sections necessary to understand the approach and the conclusions.

Be sure to read the sections describing the relationship to the work of others. These sections are often among the most important because they describe limitations of other systems that weren't mentioned in other papers. Be careful here, since these "other" systems aren't always portrayed correctly.

Read the papers critically

Keep in mind that the papers in the readings are written by the proponents of particular systems or approaches. It is normal to find papers with conflicting opinions. You need to read each paper critically, asking yourself if the conclusions are supported by the results. It is often the case that the contribution of a paper, while significant, is not the contribution intended by the author.

In computer science experiments it is rare for two systems to be compared in a controlled environment. Often the performance of the authors' system, on current hardware, is compared against the published results of an earlier system. Even when the "base" system is run on the same hardware in the current environment, it might not be tuned for best performance. Even when attempts are made to optimize the base system, ask yourself if there are particular characteristics of the environment that place the "base" system at a disadvantage. If so, consider whether the claimed benefits of the new approach are limited to such environments.