Monday, August 10, 2015

Paper Reviews vs. Code Reviews

Because I'm experiencing the NIPS submission process right now, the contrast with the ICLR submission process is salient. The NIPS process is the traditional one: first, (anonymous) submissions are sent to (anonymous) reviewers who provide feedback, and then authors have a single chance to respond to that feedback. The ICLR submission process is more fluid: non-anonymous submissions are sent to anonymous reviewers who provide feedback, and then authors and reviewers enter a cycle in which the author updates the arxiv submission and the reviewers provide further feedback. (ICLR also has public reviews, but I'm not going to talk about those). Note that in the traditional model the reviewers have to imagine what my (promised) changes will look like in the final version.

The traditional model is from an age when papers were actual physical objects sent (via snail mail!) to reviewers who marked them up with ink pens; hopefully advances in technology will allow us to develop a more effective process. I think we should look to software engineering for inspiration. Having worked both as a researcher and as a software engineer, I appreciate the difference in posture between the two fields. The posture of science lends itself to a ballistic concept of paper review, in which a completed unit of work is launched at the reviewers and then accepted or rejected; engineering is more about collaborative continuous improvement. Admittedly, most journals and some conferences already use a more fluid review process, with “conditional accepts” (changes need to be re-reviewed) and “shepherds” (reviewers who commit to guiding a paper through several rounds of review). These processes place more burden on the reviewers, who are providing the valuable service of helping someone improve their work without compensation or recognition. Naturally, conferences might be hesitant to demand this from their volunteer reviewer pool.

The solution is technology that eases the cognitive and logistical burdens on all parties. Code reviews have the same broad goal as paper reviews: improvement of quality via peer feedback. Here are some things we can learn from code reviews:
  1. Incremental review. Rarely does a programmer develop a complicated piece of software de novo. Most reviews are of relatively small changes to a large piece of software. To ease the cognitive burden on the reviewer, the change is reviewed rather than the entire new version, and diff-visualization tooling improves the productivity of change review (see the first sketch after this list).
    1. The initial submission of a paper is distinct from the typical code review in this respect, but the subsequent cycle of revisions aligns with this nicely.
  2. Modular updates. When a programmer makes several distinct changes to a piece of software, the changes are (to the extent possible) designed to be commutable and independently reviewable. Tooling facilitates the composition of (the accepted subset of) changes; the second sketch after this list shows why disjoint changes compose in any order.
    1. This aligns nicely with the review process, as each reviewer's feedback is analogous to a separately filed issue that can be addressed by its own change.
  3. Minimal clean changes. Smart programmers craft their changes to be as intelligible as possible under review. This means easy things like avoiding lexical changes that are semantically equivalent (the first sketch below suppresses exactly this kind of noise). It also means navigating the tension between cleaning up the control flow of a program and keeping the change small enough to review.
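
To make the incremental-review point concrete, here is a minimal sketch in Python using the standard difflib module (the revision texts and file labels are hypothetical, invented for illustration). The reviewer reads only the delta between revisions, and semantically equivalent lexical noise (here, runs of whitespace) is normalized away before the diff is computed:

    import difflib

    # Hypothetical revisions of a paper fragment, as lists of lines.
    revision_1 = [
        "We train the model using stochastic gradient descent.",
        "Results are reported on the standard test split.",
    ]
    revision_2 = [
        "We train the model using stochastic gradient descent",
        "with momentum and a decaying learning rate.",
        "Results are reported on the  standard test split.",  # only whitespace differs
    ]

    def normalize(lines):
        # Collapse runs of whitespace so that semantically equivalent
        # lexical changes never reach the reviewer as diff lines.
        return [" ".join(line.split()) for line in lines]

    # The reviewer reads this delta, not the entire new revision.
    diff = difflib.unified_diff(
        normalize(revision_1),
        normalize(revision_2),
        fromfile="paper-v1",
        tofile="paper-v2",
        lineterm="",
    )
    print("\n".join(diff))

Without the normalize step, the whitespace-only edit on the last line would show up as a change and waste reviewer attention; with it, only the substantive edit appears in the diff.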
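
And to illustrate the modular-updates point, here is a sketch with a deliberately simplified, hypothetical edit representation (a line-range replacement) of why changes touching disjoint regions commute, so any accepted subset of them can be composed in either order:

    # Each change replaces a range of lines. Real tools adjust offsets
    # when replacement lengths differ; equal-length replacements keep
    # this sketch simple.
    def apply_edit(lines, edit):
        start, end, replacement = edit
        return lines[:start] + replacement + lines[end:]

    document = ["abstract", "introduction", "methods", "results"]

    # Two independent changes addressing two different reviewers.
    revise_intro = (1, 2, ["introduction, revised per reviewer 1"])
    revise_results = (3, 4, ["results, revised per reviewer 2"])

    # Because the edits touch disjoint ranges, composition commutes:
    # the accepted subset can be merged in any order.
    assert (apply_edit(apply_edit(document, revise_intro), revise_results)
            == apply_edit(apply_edit(document, revise_results), revise_intro))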

Add to this the ability to preserve the anonymity of all parties involved, and you would have a pretty sweet paper review platform that would plausibly accelerate the pace of all scientific fields while improving quality.

This is precisely the kind of public good that the government should fund. Somebody in academia: write a grant!
