Tag: data-provenance (1 reference)
The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI
Large-scale audit of more than 1,800 text datasets used in AI, analyzing trends, usage permissions, and global representation. The audit found frequent miscategorization of licences on dataset hosting sites, with licence omission rates above 70% and error rates above 50%. The authors released the Data Provenance Explorer tool for practitioners.