Daunce: Data Attribution through Uncertainty Estimation
The paper introduces Daunce, a scalable and accurate data attribution method that estimates influence by measuring the covariance of losses across a set of perturbed models. Because it works from losses, it is suitable for large models such as LLMs and for proprietary systems such as GPT. Unlike gradient-based approaches, Daunce relies on uncertainty estimation through model perturbations, which improves attribution accuracy in applications such as data debugging and curation.
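The core idea of scoring influence by loss covariance can be illustrated with a minimal sketch. It is not the paper's implementation: the summary does not say how the perturbed models are produced or how scores are scaled, so the losses below are synthetic, and the function name `loss_covariance_attribution` and the array layout (one row per perturbed model, one column per example) are assumptions made for illustration.

```python
import numpy as np


def loss_covariance_attribution(train_losses, test_losses):
    """Covariance of test and training losses across perturbed models.

    train_losses: array of shape (K, n_train), loss of each training
        example under each of K perturbed models (assumed layout).
    test_losses: array of shape (K, n_test), loss of each test example
        under the same K models.
    Returns an (n_test, n_train) matrix of sample covariances.
    """
    train_losses = np.asarray(train_losses, dtype=float)
    test_losses = np.asarray(test_losses, dtype=float)
    if train_losses.shape[0] != test_losses.shape[0]:
        raise ValueError("both arrays need one row per perturbed model")
    k = train_losses.shape[0]
    if k < 2:
        raise ValueError("at least two perturbed models are required")

    # Center each example's losses across the K models.
    train_centered = train_losses - train_losses.mean(axis=0, keepdims=True)
    test_centered = test_losses - test_losses.mean(axis=0, keepdims=True)

    # Unbiased sample covariance between every test/train pair.
    return test_centered.T @ train_centered / (k - 1)


# Synthetic stand-in for real losses: the test example's loss is built
# to co-vary with training example 2, so that example should rank first.
rng = np.random.default_rng(0)
K, n_train = 200, 5
train_losses = rng.normal(size=(K, n_train))
test_losses = (0.8 * train_losses[:, 2] + 0.1 * rng.normal(size=K))[:, None]

scores = loss_covariance_attribution(train_losses, test_losses)
top = int(np.argmax(scores[0]))
print("most influential training index:", top)
```

Only loss values enter the computation, which is why an approach of this kind needs no gradients. In practice the rows would come from evaluating real perturbed models rather than from a random generator.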