Silenced Bias
Biases silenced by safety-aligned refusals
Refusal Steering
Expose latent fairness via activation interventions to supress refusals
Customizable
Add your own groups, subjects, & models
SBB is designed as a research tool: you can define new demographic groups, specify subjects (topics or traits you want to test), and evaluate any open-source model for fairness—especially in regimes where safety-aligned refusals would otherwise hide meaningful differences.
demo.ipynb@article{himelstein2025silenced,
title={Silenced Biases: The Dark Side LLMs Learned to Refuse},
author={Himelstein, Rom and LeVi, Amit and Youngmann, Brit and Nemcovsky, Yaniv and Mendelson, Avi},
journal={arXiv preprint arXiv:2511.03369},
year={2025}
}