Abstract
Patch robustness certification is an emerging verification approach that defends deep learning systems against adversarial patch attacks with provable guarantees. Certified recovery techniques guarantee that the sole true label of a certified sample is predicted. However, existing techniques, where applicable to top-k predictions, commonly conduct pairwise comparisons of the votes between labels. They fail to precisely certify the sole true label within the top-k prediction labels because of the inflation in the number of votes controlled by the attacker (i.e., the attack budget), whereas enumerating all combinations of vote allocation suffers from combinatorial explosion. We propose CostCert, a novel, scalable, and precise voting-based certified recovery defender. CostCert verifies that the true label of a sample lies within the top-k predictions, without pairwise comparisons or combinatorial explosion, through a novel design: it checks whether the attack budget on the sample is insufficient to cover the smallest total number of additional votes, on top of the votes uncontrollable by the attacker, needed to exclude the true label from the top-k prediction labels. Experiments show that CostCert significantly outperforms the current state-of-the-art defender PatchGuard; for example, it retains up to 57.3% certified accuracy when the patch size is 96, whereas PatchGuard has already dropped to zero. Copyright © 2025 IEEE.
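The cost-based check described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `certify_top_k`, the input `uncontrolled_votes` (per-label votes the attacker cannot change), and the tie-breaking rule (a rival label must strictly exceed the true label's votes) are all assumptions made for illustration.

```python
def certify_top_k(uncontrolled_votes, true_label, k, budget):
    """Hypothetical sketch: True if the true label provably stays in the top k.

    uncontrolled_votes: dict mapping label -> votes the attacker cannot change
    budget: number of votes the attacker can assign freely (attack budget)
    Assumption: a rival outranks the true label only with strictly more votes.
    """
    true_votes = uncontrolled_votes[true_label]

    # Additional votes each rival needs to outrank the true label.
    costs = sorted(
        max(0, true_votes - votes + 1)
        for label, votes in uncontrolled_votes.items()
        if label != true_label
    )

    # With fewer than k rivals, the true label cannot be pushed out of the top k.
    if len(costs) < k:
        return True

    # Cheapest way to exclude the true label: promote the k cheapest rivals.
    min_total_cost = sum(costs[:k])

    # Certified when the budget cannot cover even this cheapest exclusion.
    return min_total_cost > budget


votes = {"cat": 10, "dog": 7, "fox": 3, "owl": 1}
print(certify_top_k(votes, "cat", k=2, budget=10))
```

In this toy input, "dog" needs 4 extra votes and "fox" needs 8, so each could overtake "cat" on its own with a budget of 10, but promoting both needs 12 votes. A check that compares each rival against the full budget separately would therefore reject the sample, while summing the k smallest costs certifies it with a single sort and no enumeration of vote allocations.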
| Field | Value |
|---|---|
| Original language | English |
| Title of host publication | Proceedings of 2025 25th International Conference on Software Quality, Reliability and Security, QRS 2025 |
| Place of Publication | USA |
| Publisher | IEEE |
| Pages | 199-210 |
| ISBN (Electronic) | 9781665477710 |
| DOIs | https://doi.org/10.1109/QRS65678.2025.00030 |
| Publication status | Published - 2025 |
Citation
Zhou, Q., Wang, H., Wei, Z., & Chan, W. K. (2025). Scalable and precise patch robustness certification for deep learning models with top-k predictions. In Proceedings of 2025 25th International Conference on Software Quality, Reliability and Security, QRS 2025 (pp. 199-210). IEEE. https://doi.org/10.1109/QRS65678.2025.00030
Keywords
- Top-k certification
- Patch attacks
- Robustness
- Worst-case analysis
- Verification
- Deep learning model