13 April 2025

Alternative Reviewing Rubric for Machine Learning Conferences

by Rylan Schaeffer

During the ICML 2025 review process, 3% of my scores changed during/after rebuttals. My labmates had similar experiences; for instance, one who was serving as an Area Chair (AC) saw 2-3 scores change across 48 reviews. He pointed me towards the ICML 2025 blog post which states:

Author rebuttals have been shown to have a relatively marginal effect on paper acceptance and to exacerbate existing biases, and the process often prioritizes authors who happen to be highly available precisely during the rebuttal and discussion week(s), to “wear down” reviewer objections.

This is largely consistent with my previous ML research experiences. While I applaud ICML for trying something new, the rebuttal phase continues to feel like it consumes a tremendous amount of time and energy without making any difference. What is a reasonable alternative?

I personally aim to approach reviewing in a very specific way (and have previously received recognition for being an outstanding reviewer), and while I haven’t thought through the implications of doing this at scale, I’d like to put forward an idealized version for consideration. After the reviewer has read the paper and formed their overall assessment:

For each item of consideration (writing, claims, methodology, figures, tables, theorems, results), the reviewer should state:

  1. The reviewer’s current assessment of that item’s quality.
  2. What exactly the reviewer suggests would improve that item and why.
  3. How much the reviewer would be willing to increase their score (conditional on good execution by the authors).

Then, for each item, the authors choose whether to:

  1. Contest the reviewer’s assessment, explaining why they disagree, or
  2. Put in the work to execute the reviewer’s suggested improvement.
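
To make this concrete, below is a minimal sketch in Python of how such a structured review and response could be represented. It is purely illustrative: the Item/ReviewItem/AuthorResponse names, their fields, and the max_possible_score helper are my own invention, not part of OpenReview or any conference’s actual review schema.

    from dataclasses import dataclass
    from enum import Enum

    # Items of consideration listed above; the exact enumeration is illustrative.
    class Item(Enum):
        WRITING = "writing"
        CLAIMS = "claims"
        METHODOLOGY = "methodology"
        FIGURES = "figures"
        TABLES = "tables"
        THEOREMS = "theorems"
        RESULTS = "results"

    @dataclass
    class ReviewItem:
        item: Item
        assessment: str              # the reviewer's current assessment of this item's quality
        suggested_improvement: str   # what exactly would improve it, and why
        conditional_increase: float  # score increase the reviewer commits to, given good execution

    @dataclass
    class AuthorResponse:
        item: Item
        contest: bool  # True: contest the assessment; False: execute the suggested improvement
        details: str   # the counterargument, or a summary of the revision made

    def max_possible_score(base_score: float, items: list[ReviewItem]) -> float:
        """Upper bound on the score if the authors execute every suggestion well."""
        return base_score + sum(i.conditional_increase for i in items)

For example, a reviewer who starts at a 4 and lists two items worth +0.5 and +1.0 preregisters a ceiling of 5.5, visible to both the authors and the AC.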

As a frequent author and reviewer, I appreciate both giving and receiving such clarity. In this manner, authors and reviewers are clear on the major items of consideration. Authors are welcome to challenge reviewers’ perspectives, or to put in the work to improve their manuscript if the reviewer has made good suggestions. By preregistering a commitment to improve the score (conditioned on good execution), authors have less to fear from putting in work only for the reviewer to respond with the dreaded “I’ll keep my score” or to ghost them altogether. ACs can also more clearly evaluate the interactions between authors and reviewers.

Reviewers are indeed stretched thin; reviewing is a time-consuming and thankless job that pays nothing. My hope is that a “contract” like this, encouraging reviewers to be more concrete with their feedback and rewarding authors for executing well, could improve the utility of the rebuttal phase and produce better papers overall.

FAQs

How would you enforce this structure?

Modify the OpenReview template. If you force reviewers to structure their reviews in this manner via OpenReview, I think they’ll quickly understand.

If this helps the rebuttal process produce better papers, how will conferences be able to reject 70% of submissions to maintain a perception of selectivity and prestige?

The field’s priority should not be selectivity. If every paper deserves a Nobel Prize, then every paper should be accepted. eLife and PLOS One have moved peer review in this direction.

Acknowledgements

I thank my labmates Olawale Salaudeen, Anka Reuel, Sang Truong, and Zach Robertson, and my advisor Sanmi Koyejo, for discussions on this topic.

tags: machine-learning - peer-review - neurips - iclr - icml