Rylan Schaeffer


Quantifying variance in evaluation benchmarks

Lovish Madaan, Aaditya K Singh, Rylan Schaeffer, Andrew Poulton, Sanmi Koyejo, Pontus Stenetorp, Sharan Narang, Dieuwke Hupkes

arXiv preprint (under review)

June 2024

Summary

Quantifying and understanding variance in LLM evaluation benchmarks.