Pixel-level Certified Explanations via Randomized Smoothing

Alaa Anani, Tobias Lorenz, Mario Fritz, Bernt Schiele. In International Conference on Machine Learning (ICML), 2025.

Deep learning explainability is essential for understanding model predictions. Post-hoc attribution methods determine which image pixels most influence a classifier's prediction, but they are not robust to imperceptible input noise, which undermines their trustworthiness. Certified attribution methods should prove that pixel-level importance values are robust; however, prior work only provides image-level bounds, which are too coarse. We introduce a certification approach that guarantees the pixel-level robustness of any black-box attribution method against L2-bounded input noise via Randomized Smoothing. By sparsifying and then smoothing attributions, we reformulate the setup as a segmentation certification problem. We propose novel qualitative and quantitative metrics to assess the certified robustness of three families of attribution methods. Visualizing pixel-level certificates complements the inherently visual nature of attribution methods by providing a reliably certified output for downstream tasks. For quantitative evaluation, we introduce two metrics: (i) the percentage of certified pixels, which measures robustness, and (ii) certified localization. Our extensive experiments on ImageNet show high-quality certified outputs across attribution methods, comparing their certified visualizations, robustness, and localization performance.
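The sparsify-then-smooth idea can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the attribution function `attr_fn`, the top-k sparsification fraction, and the Clopper-Pearson certification step are all assumptions, written in the style of randomized-smoothing approaches that take a per-pixel majority vote over Gaussian-perturbed inputs and certify pixels whose vote is provably stable.

```python
import numpy as np
from scipy.stats import beta, norm


def sparsify(attribution, k_frac=0.2):
    """Binarize an attribution map: mark the top k_frac pixels as important (1)."""
    flat = attribution.ravel()
    k = max(1, int(k_frac * flat.size))
    thresh = np.partition(flat, -k)[-k]
    return (attribution >= thresh).astype(np.int64)


def certify_pixels(attr_fn, x, sigma=0.25, n=100, alpha=0.001, k_frac=0.2):
    """Per-pixel certification of a sparsified, smoothed attribution map.

    attr_fn: black-box attribution method (hypothetical; any map image -> scores).
    Returns a boolean map of certified pixels and a per-pixel L2 radius.
    """
    votes = np.zeros(x.shape, dtype=np.int64)
    for _ in range(n):
        noisy = x + sigma * np.random.randn(*x.shape)  # Gaussian input noise
        votes += sparsify(attr_fn(noisy), k_frac)      # vote: important or not

    # Count of the per-pixel majority label (important vs. unimportant).
    top = np.maximum(votes, n - votes)
    # Clopper-Pearson lower confidence bound on the majority probability.
    p_lo = beta.ppf(alpha, top, n - top + 1)
    certified = p_lo > 0.5
    # Standard randomized-smoothing radius sigma * Phi^{-1}(p_lo) where certified.
    radius = np.where(certified, sigma * norm.ppf(p_lo), 0.0)
    return certified, radius
```

A pixel abstains (is left uncertified) when its majority vote is not statistically significant at level `alpha`; certified pixels keep the same binary label under any perturbation of L2 norm below the returned radius.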

Citation

@inproceedings{anani2025pixel,
    title          = {Pixel-level Certified Explanations via Randomized Smoothing}, 
    author         = {Alaa Anani and Tobias Lorenz and Mario Fritz and Bernt Schiele},
    booktitle      = {International Conference on Machine Learning (ICML)},
    year           = {2025}
}