BackdoorBench: A Comprehensive Benchmark of Backdoor Learning

Contribution analysis of BackdoorBench.

Main contribution

GOAL: Alleviate a common dilemma: evaluations of new methods are often not thorough enough to verify their claims or to measure their performance accurately.

Open-sourced toolbox.
8000 comprehensive evaluations.
Thorough analysis and new findings.

[Figure: BackdoorBench architecture]

Comprehensive evaluations

The paper provides evaluations of all pairs of 8 attacks against 9 defense methods, with 5 poisoning ratios, on 4 datasets and 5 models, for up to 8,000 pairs of evaluations in total.
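The size of this evaluation grid can be checked by enumerating it. The sketch below uses placeholder names (`attack_0`, `defense_0`, and so on are hypothetical; only the counts come from the text above). The listed factors alone multiply to 7,200. The 8,000 figure is reached if each attack configuration is also evaluated once without any defense, which is an assumption made here rather than something stated above.

```python
from itertools import product

# Placeholder names: only the counts (8, 9, 5, 4, 5) come from the text.
attacks = [f"attack_{i}" for i in range(8)]
defenses = [f"defense_{i}" for i in range(9)]
poison_ratios = [f"ratio_{i}" for i in range(5)]
datasets = [f"dataset_{i}" for i in range(4)]
models = [f"model_{i}" for i in range(5)]

# Every attack paired with every defense, under every ratio/dataset/model.
attack_vs_defense = list(
    product(attacks, defenses, poison_ratios, datasets, models)
)

# Assumption: the undefended model counts as one extra "defense" setting.
defense_settings = defenses + ["no_defense"]
with_baseline = list(
    product(attacks, defense_settings, poison_ratios, datasets, models)
)

print(len(attack_vs_defense))  # 7200
print(len(with_baseline))      # 8000
```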

Compared with TrojanZoo

There are significant differences between TrojanZoo and BackdoorBench in two main aspects:

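Codebase: TrojanZoo is written in an object-oriented (OOP) style, while BackdoorBench is written in a procedure-oriented (POP) style.
Analysis and findings: the two benchmarks study different questions and report different findings.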

TrojanZoo provides abundant and diverse analyses of backdoor learning, mainly including:

a. attack effects of trigger size
b. attack effects of trigger transparency
c. data complexity
d. backdoor transferability to downstream tasks
e. defense tradeoff between robustness and utility
f. defense tradeoff between detection accuracy and recovery capability
g. impact of trigger definition

BackdoorBench provides several new analyses from different perspectives, mainly including:

a. effects of poisoning ratios and number of classes
b. quick learning of backdoor
c. trigger generalization
d. memorization and forgetting of poisoned samples
e. several analysis tools
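To make the notion of a poisoning ratio in item (a) concrete, here is a minimal sketch of poisoning a fixed fraction of a training set. It is a generic patch-trigger illustration, not BackdoorBench's implementation: the function `poison_dataset`, the trigger `stamp_corner`, the toy 4x4 images, and the 5% ratio are all hypothetical choices made for this example.

```python
import random


def poison_dataset(samples, labels, ratio, target_label, add_trigger, seed=0):
    """Poison a `ratio` fraction of the data; return new samples, labels, indices."""
    n_poison = round(ratio * len(samples))
    rng = random.Random(seed)
    poisoned_idx = set(rng.sample(range(len(samples)), n_poison))
    new_samples = [
        add_trigger(x) if i in poisoned_idx else x
        for i, x in enumerate(samples)
    ]
    new_labels = [
        target_label if i in poisoned_idx else y
        for i, y in enumerate(labels)
    ]
    return new_samples, new_labels, sorted(poisoned_idx)


def stamp_corner(image):
    """Hypothetical trigger: set the bottom-right pixel to 1.0 on a copy."""
    patched = [row[:] for row in image]
    patched[-1][-1] = 1.0
    return patched


# Toy data: 1000 all-zero 4x4 "images" with labels cycling over 10 classes.
images = [[[0.0] * 4 for _ in range(4)] for _ in range(1000)]
labels = [i % 10 for i in range(1000)]

p_images, p_labels, idx = poison_dataset(
    images, labels, ratio=0.05, target_label=0, add_trigger=stamp_corner
)
print(len(idx))  # 50
```

Sweeping `ratio` over several values and retraining would then show how attack strength depends on the fraction of poisoned samples.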

Analysis tools

t-SNE provides a global visualization of the feature representations of a set of samples in a model, which helps to show whether a backdoor has formed.
Gradient-weighted class activation mapping (Grad-CAM) and the Shapley value map are two per-sample analysis tools that visualize the contributions of different pixels of one image in a model; they can show whether the trigger activates the backdoor.
The frequency saliency map visualizes the contribution of each individual frequency spectrum to the prediction, providing a novel perspective on backdoors from the frequency space.
Neuron activation calculates the average activation of each neuron in a layer for a batch of samples. It can be used to analyse the activation paths of poisoned and clean samples, as well as how activations change as the model weights change under attack or defense, providing deeper insight into the backdoor.
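The neuron activation computation described above can be sketched in a few lines. This is a plain-Python illustration with made-up activation values, not the toolbox's code. In practice the activations would be captured from a layer of the model, and `mean_activation` is a hypothetical helper name.

```python
def mean_activation(batch):
    """Average each neuron's activation over a batch.

    `batch` is a list of rows, one per sample; each row holds one value
    per neuron in the layer.
    """
    n_samples = len(batch)
    return [sum(column) / n_samples for column in zip(*batch)]


# Hypothetical activations of a 3-neuron layer for two clean
# and two poisoned samples.
clean = [[0.2, 0.0, 1.0], [0.4, 0.0, 0.8]]
poisoned = [[0.1, 2.0, 0.9], [0.3, 4.0, 0.9]]

clean_mean = mean_activation(clean)
poisoned_mean = mean_activation(poisoned)

# Per-neuron gap between poisoned and clean averages; the neuron with the
# largest absolute gap is a candidate for carrying the backdoor signal.
gap = [p - c for p, c in zip(poisoned_mean, clean_mean)]
most_shifted = max(range(len(gap)), key=lambda j: abs(gap[j]))
print(most_shifted)  # 1
```

The same comparison can be repeated before and after a defense is applied to see which neurons' average activations move.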

t-SNE visualization of samples

[Figure: t-SNE visualization]

Conclusion

An open-source framework that integrates backdoor attacks and defenses.

Proposed a standard evaluation procedure with several visual analysis tools.