Data Leverage References


Distributional Training Data Attribution: What do Influence Functions Sample?

2025 article mlodozeniec2025dtda Not yet verified
Authors
Bruno Mlodozeniec, Isaac Reid, Sam Power, David Krueger, Murat Erdogdu, Richard E. Turner, Roger Grosse
Venue
arXiv preprint
Abstract
Introduces distributional training data attribution (d-TDA), which predicts how the distribution of model outputs depends upon the dataset. Shows that influence functions are "secretly distributional": they emerge from this framework as the limit to unrolled differentiation, without requiring restrictive convexity assumptions.

BibTeX

Local Entry
@article{mlodozeniec2025dtda,
  title = {Distributional Training Data Attribution: What do Influence Functions Sample?},
  author = {Bruno Mlodozeniec and Isaac Reid and Sam Power and David Krueger and Murat Erdogdu and Richard E. Turner and Roger Grosse},
  year = {2025},
  journal = {arXiv preprint},
  url = {https://arxiv.org/abs/2506.12965},
  abstract = {Introduces distributional training data attribution (d-TDA), which predicts how the distribution of model outputs depends upon the dataset. Shows that influence functions are "secretly distributional": they emerge from this framework as the limit to unrolled differentiation, without requiring restrictive convexity assumptions.}
}
From AUTO:S2
@article{mlodozeniec2025dtda,
  title = {Distributional Training Data Attribution},
  author = {Bruno Mlodozeniec and Isaac Reid and Sam Power and David Krueger and Murat Erdogdu and Richard E. Turner and Roger B. Grosse},
  year = {2025},
  journal = {arXiv.org},
  doi = {10.48550/arXiv.2506.12965}
}