Can We Trust AI Benchmarks? An Interdisciplinary Review of Current Issues in AI Evaluation

2025 article eriksson2025benchmarks ⚠ Needs review - 1 field differs

Fields with differences: venue. Compare local vs external BibTeX below.

Authors

Maria Eriksson, Erasmo Purificato, Arman Noroozian, Joao Vinagre, Guillaume Chaslot, Emilia Gomez, David Fernandez-Llorca

Venue

arXiv preprint arXiv:2502.06559

Citations

Cited in projects (1)

Data Leverage Blogs

BibTeX

Local Entry

@article{eriksson2025benchmarks,
  title = {Can We Trust AI Benchmarks? An Interdisciplinary Review of Current Issues in AI Evaluation},
  author = {Maria Eriksson and Erasmo Purificato and Arman Noroozian and Joao Vinagre and Guillaume Chaslot and Emilia Gomez and David Fernandez-Llorca},
  year = {2025},
  journal = {arXiv preprint arXiv:2502.06559},
  url = {https://arxiv.org/abs/2502.06559}
}

From AUTO:S2

@article{eriksson2025benchmarks,
  title = {Can We Trust AI Benchmarks? An Interdisciplinary Review of Current Issues in AI Evaluation},
  author = {Maria Eriksson and Erasmo Purificato and Arman Noroozian and João Vinagre and Guillaume Chaslot and Emilia Gómez and D. Fernández-Llorca},
  year = {2025},
  journal = {Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society},
  doi = {10.48550/arXiv.2502.06559}
}