SOM Directions are Better than One: Multi-Directional Refusal Suppression in Language Models
Published in AAAI 2026, 2025
Recommended citation: Piras, G., Mura, R., Brau, F., Oneto, L., Roli, F., & Biggio, B. (2026). "SOM Directions are Better than One: Multi-Directional Refusal Suppression in Language Models." Proceedings of the AAAI Conference on Artificial Intelligence.
