Data Leverage References

Tag: collection:data-leverage (20 references)

Algorithmic Collective Action with Two Collectives 2025 inproceedings

Aditya Karan, Nicholas Vincent, Karrie Karahalios, Hari Sundaram

The Economics of AI Training Data: A Research Agenda 2025 article

Hamidah Oderinwale, Anna Kazlauskas

Research agenda documenting AI training data deals from 2020 to 2025. Finds persistent market fragmentation and five distinct pricing mechanisms (from per-unit licensing to commissioning), and reports that most deals exclude original creators from compensation: only 7 of 24 major deals compensate them.

Collective Bargaining in the Information Economy Can Address AI-Driven Power Concentration 2025 inproceedings

Nicholas Vincent, Matthew Prewitt, Hanlin Li

Push and Pull: A Framework for Measuring Attentional Agency on Digital Platforms 2025 inproceedings

Zachary Wojtowicz, Shrey Jain, Nicholas Vincent

Poisoning Web-Scale Training Datasets is Practical 2024 misc

Nicholas Carlini, Matthew Jagielski, Christopher A. Choquette-Choo, Daniel Paleka, Will Pearce, Hyrum Anderson, Andreas Terzis, Kurt Thomas, Florian Tramèr

Large language models reduce public knowledge sharing on online Q&A platforms 2024 article

R. Maria del Rio-Chanona, Nadzeya Laurentsyeva, Johannes Wachs

Algorithmic Collective Action in Machine Learning 2023 inproceedings

Moritz Hardt, Eric Mazumdar, Celestine Mendler-Dünner, Tijana Zrnic

Provides a theoretical framework for algorithmic collective action, showing that small collectives can exert significant control over platform learning algorithms through coordinated data strategies.

The Dimensions of Data Labor: A Road Map for Researchers, Activists, and Policymakers to Empower Data Producers 2023 inproceedings

Hanlin Li, Nicholas Vincent, Stevie Chancellor, Brent Hecht

Behavioral Use Licensing for Responsible AI 2022 inproceedings

Danish Contractor, Daniel McDuff, Julia Katherine Haines, Jenny Lee, Christopher Hines, Brent Hecht, Nicholas Vincent, Hanlin Li

Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses 2022 article

Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein

Comprehensive survey systematically categorizing dataset vulnerabilities including poisoning and backdoor attacks, their threat models, and defense mechanisms.

Addressing Documentation Debt in Machine Learning Research: A Retrospective Datasheet for BookCorpus 2021 inproceedings

Jack Bandy, Nicholas Vincent

Machine Unlearning 2021 inproceedings

Lucas Bourtoule, Varun Chandrasekaran, Christopher A. Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, Nicolas Papernot

Introduces SISA (Sharded, Isolated, Sliced, Aggregated) training for efficient exact machine unlearning. Partitions data into shards with separate models, enabling targeted retraining when data must be forgotten.

Extracting Training Data from Large Language Models 2021 inproceedings

Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom B. Brown, Dawn Song, Úlfar Erlingsson, Alina Oprea, Nicolas Papernot

Can "Conscious Data Contribution" Help Users to Exert "Data Leverage" Against Technology Companies? 2021 article

Nicholas Vincent, Brent Hecht

Data Leverage: A Framework for Empowering the Public in its Relationship with Technology Companies 2021 inproceedings

Nicholas Vincent, Hanlin Li, Nicole Tilly, Stevie Chancellor, Brent Hecht

Data Shapley: Equitable Valuation of Data for Machine Learning 2019 inproceedings

Amirata Ghorbani, James Zou

BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain 2019 article

Tianyu Gu, Brendan Dolan-Gavitt, Siddharth Garg

First demonstration of backdoor attacks on deep neural networks. Shows that small trigger patterns in training data cause models to misclassify any input containing the trigger (e.g., stop signs with stickers classified as speed limits).

How Do People Change Their Technology Use in Protest?: Understanding Protest Users 2019 article

Hanlin Li, Nicholas Vincent, Janice Tsai, Jofish Kaye, Brent Hecht

"Data Strikes": Evaluating the Effectiveness of a New Form of Collective Action Against Technology Companies 2019 inproceedings

Nicholas Vincent, Brent Hecht, Shilad Sen

Simulates data strikes against recommender systems, showing that collective withholding of training data can create leverage for users against technology platforms.

Examining Wikipedia With a Broader Lens: Quantifying the Value of Wikipedia's Relationships with Other Large-Scale Online Communities 2018 inproceedings

Nicholas Vincent, Isaac Johnson, Brent Hecht