Data Leverage References

Tag: efficiency (2 references)

CHG Shapley: Efficient Data Valuation and Selection towards Trustworthy Machine Learning 2024 article

Huaiguang Cai

Proposes the CHG (compound of Hardness and Gradient) utility function to approximate the utility of each data subset, reducing the computational cost of data valuation to that of a single model retraining, a quadratic improvement over existing Data Shapley methods.

A Versatile Influence Function for Data Attribution with Non-Decomposable Loss 2024 article

Junwei Deng, Weijing Tang, Jiaqi W. Ma

Proposes the Versatile Influence Function (VIF), which is designed to fully leverage auto-differentiation and so eliminates case-specific derivations. Demonstrated on Cox regression for survival analysis, node embedding for network analysis, and listwise learning-to-rank, where its estimates closely resemble leave-one-out retraining while being up to 10^3 times faster.