Tag: unlearning (7 references)
Rethinking Machine Unlearning for Large Language Models
Comprehensive review of machine unlearning for LLMs, which aims to eliminate undesirable data influence (e.g., sensitive or illegal information) while preserving essential knowledge generation. Envisions LLM unlearning as a pivotal element of life-cycle management for building safe, secure, trustworthy, and resource-efficient generative AI.
LLM Unlearning via Loss Adjustment with Only Forget Data
FLAT is a loss adjustment approach that maximizes the f-divergence between the available template answer and the forget answer with respect to the forget data. Demonstrates superior unlearning performance compared to existing methods while minimizing impact on retained capabilities, evaluated on the Harry Potter dataset and the MUSE benchmark.
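For context, a minimal sketch of the machinery this rests on, assuming only the standard variational (Fenchel-dual) lower bound of an f-divergence; how FLAT chooses the test function and turns the bound into per-term loss weights is detailed in the paper, not here:

```latex
% For any test function g, with f^* the convex conjugate of f:
%   D_f(P \| Q) is lower-bounded by E_{x ~ P}[g(x)] - E_{x ~ Q}[f^*(g(x))].
% Taking P as the model's distribution over template answers and Q over forget answers
% (both on the forget prompts), maximizing the right-hand side pushes the model toward
% the template responses and away from the responses to be forgotten.
D_f(P \,\|\, Q) \;\ge\; \mathbb{E}_{x \sim P}\!\left[\, g(x) \,\right] \;-\; \mathbb{E}_{x \sim Q}\!\left[\, f^{*}\!\big(g(x)\big) \,\right]
```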
Machine Unlearning: A Survey
Comprehensive survey of machine unlearning covering definitions, scenarios, verification methods, and applications. Cited in the International AI Safety Report 2025, which points to machine unlearning as a pioneering paradigm for removing sensitive information.
LEACE: Perfect linear concept erasure in closed form
Datamodels: Predicting Predictions from Training Data
Proposes datamodels: simple (typically linear) surrogates that predict a model's output on a fixed target example as a function of which subset of the training data it was trained on, providing a framework for data attribution grounded in large-scale retraining experiments.
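A minimal sketch of the retraining-and-regression loop described above, assuming a sparse linear (lasso) surrogate as in the paper; `train_and_eval` is a hypothetical user-supplied routine that trains a model on the given training indices and returns a scalar output (e.g., the correct-class margin) on a fixed target example:

```python
import numpy as np
from sklearn.linear_model import Lasso

def fit_datamodel(train_and_eval, n_train, n_subsets=1000, frac=0.5, alpha=1e-3, seed=0):
    """Fit a linear datamodel for one target example by retraining on random subsets."""
    rng = np.random.default_rng(seed)
    X = np.zeros((n_subsets, n_train))   # row i: indicator vector of the i-th training subset
    y = np.zeros(n_subsets)              # output of the model trained on that subset
    for i in range(n_subsets):
        subset = rng.choice(n_train, size=int(frac * n_train), replace=False)
        X[i, subset] = 1.0
        y[i] = train_and_eval(subset)    # retrain on `subset`, evaluate the target example
    surrogate = Lasso(alpha=alpha).fit(X, y)  # sparse linear surrogate of output vs. subset
    return surrogate.coef_               # per-training-example contribution to the target output
```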
Machine Unlearning
Introduces SISA (Sharded, Isolated, Sliced, and Aggregated) training for efficient exact machine unlearning. Partitions the training data into isolated shards, each with its own constituent model whose predictions are aggregated, so a forget request only requires retraining the shard (and slice) that contained the affected data.
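A simplified sketch of the sharding-and-aggregation idea (shards only; the per-shard slicing and checkpointing are omitted), assuming `make_model` is a hypothetical factory returning a fresh classifier with a scikit-learn-style fit/predict interface and integer class labels:

```python
import numpy as np

class SISAEnsemble:
    """Shard the training set, train one constituent model per shard, aggregate by vote."""

    def __init__(self, make_model, n_shards=5, seed=0):
        self.make_model = make_model
        self.n_shards = n_shards
        self.rng = np.random.default_rng(seed)
        self.models = [None] * n_shards

    def fit(self, X, y):
        self.X, self.y = X, y
        self.assign = self.rng.integers(self.n_shards, size=len(X))  # shard id per example
        for s in range(self.n_shards):
            self._train_shard(s)
        return self

    def _train_shard(self, s):
        idx = np.where(self.assign == s)[0]
        self.models[s] = self.make_model().fit(self.X[idx], self.y[idx])

    def unlearn(self, i):
        """Forget training example i: exclude it and retrain only the shard that held it."""
        s = self.assign[i]
        self.assign[i] = -1        # mark as removed so retraining no longer sees it
        self._train_shard(s)

    def predict(self, X):
        votes = np.stack([m.predict(X) for m in self.models])  # (n_shards, n_samples)
        return np.apply_along_axis(
            lambda c: np.bincount(c.astype(int)).argmax(), 0, votes)  # majority vote
```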
Towards Making Systems Forget with Machine Unlearning
First formal definition of machine unlearning. Proposes converting learning algorithms into a summation form, so that a sample can be removed by subtracting its contribution from the summations and recomputing the model, without full retraining. Foundational work establishing the unlearning problem.
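A minimal sketch of the summation-form idea, using least-squares regression as an illustrative stand-in learner (not one of the paper's case studies): the model depends on the data only through two running sums, so forgetting a sample subtracts its contribution and re-solves.

```python
import numpy as np

class SummationFormRegressor:
    """Least-squares learner kept in summation form: the model is a function of two sums."""

    def __init__(self, n_features):
        self.xtx = np.zeros((n_features, n_features))  # running sum of x x^T over the data
        self.xty = np.zeros(n_features)                # running sum of y * x over the data

    def learn(self, x, y):
        self.xtx += np.outer(x, x)
        self.xty += y * x

    def unlearn(self, x, y):
        # Removing a sample only subtracts its contribution from the sums.
        self.xtx -= np.outer(x, x)
        self.xty -= y * x

    def weights(self, ridge=1e-6):
        # Re-solve from the summaries; the small ridge term keeps the system well-posed.
        n = self.xtx.shape[0]
        return np.linalg.solve(self.xtx + ridge * np.eye(n), self.xty)
```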