Just as AI prediction performance degrades under data corruption, we found that explanation faithfulness does too. For example, blurring an image can protect privacy, but it causes heatmap explanations to become spurious and highlight the wrong objects for the prediction. We propose Debiased-CAM, a self-supervised training method that allows the AI to “see through the blur” and explain faithfully despite image degradation. Debiased-CAM significantly improves explanation faithfulness and prediction performance across a variety of image biases (blurring, color shift, day-night lighting) and prediction tasks (classification, captioning, multi-labeling). Through user studies, we identified when users find explanations untruthful, and we show that Debiased-CAM explanations are perceived as more truthful and helpful than distorted, biased heatmap explanations.
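To make the idea concrete, here is a minimal PyTorch sketch of the core training signal, not the authors' released code: the names `CamModel`, `grad_cam`, and `debias_step`, the teacher/student setup, and the single CAM-alignment loss are illustrative simplifications (the full method covers multiple bias types, bias levels, and prediction tasks). The model is trained on the biased (blurred) image while its Grad-CAM heatmap is pulled toward the heatmap computed on the clean image.

```python
import torch
import torch.nn.functional as F
from torchvision import models
from torchvision.transforms import GaussianBlur


class CamModel(torch.nn.Module):
    """ResNet-18 split so the last convolutional feature maps
    are exposed for computing Grad-CAM heatmaps."""

    def __init__(self, num_classes=10):
        super().__init__()
        resnet = models.resnet18()  # untrained backbone, for illustration
        self.backbone = torch.nn.Sequential(*list(resnet.children())[:-2])
        self.pool = torch.nn.AdaptiveAvgPool2d(1)
        self.fc = torch.nn.Linear(512, num_classes)

    def forward(self, x):
        feats = self.backbone(x)                       # (B, 512, H, W)
        logits = self.fc(self.pool(feats).flatten(1))
        return feats, logits


def grad_cam(feats, logits, class_idx):
    """Differentiable Grad-CAM: weight each feature map by the spatially
    pooled gradient of the class score, then ReLU and normalize."""
    score = logits.gather(1, class_idx.unsqueeze(1)).sum()
    grads = torch.autograd.grad(score, feats, create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)     # pooled gradients
    cam = F.relu((weights * feats).sum(dim=1))         # (B, H, W)
    return cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)


def debias_step(student, teacher, optimizer, images, labels,
                blur, cam_weight=1.0):
    """One step: the student sees only the blurred image but is trained
    to reproduce the teacher's CAM on the clean image."""
    biased = blur(images)

    # Self-supervised target: the frozen teacher's clean-image CAM.
    clean = images.clone().requires_grad_(True)        # grads needed for Grad-CAM
    t_feats, t_logits = teacher(clean)
    target_cam = grad_cam(t_feats, t_logits, labels).detach()

    s_feats, s_logits = student(biased)
    student_cam = grad_cam(s_feats, s_logits, labels)

    # Prediction-task loss + CAM-alignment (debiasing) loss.
    loss = (F.cross_entropy(s_logits, labels)
            + cam_weight * F.mse_loss(student_cam, target_cam))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    teacher, student = CamModel(), CamModel()
    teacher.load_state_dict(student.state_dict())      # stand-in for a pretrained model
    teacher.eval()
    optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
    images = torch.rand(4, 3, 224, 224)                # dummy batch
    labels = torch.randint(0, 10, (4,))
    blur = GaussianBlur(kernel_size=21, sigma=8.0)
    print(debias_step(student, teacher, optimizer, images, labels, blur))
```

Because the CAM target comes from the model's own explanation of the unbiased image rather than from human annotation, the retraining signal is self-supervised: no extra labels are needed beyond the original task labels.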
Congratulations to team member Wencan Zhang and collaborator Mariella Dimiccoli!
Wencan Zhang, Mariella Dimiccoli, and Brian Y. Lim. 2022. Debiased-CAM to mitigate image perturbations with faithful visual explanations of machine learning. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (CHI '22).