SOM Directions are Better than One: Multi-Directional Refusal Suppression in Language Models

Published in AAAI 2026, 2025

Recommended citation: Piras, G., Mura, R., Brau, F., Oneto, L., Roli, F., & Biggio, B. (2026). "SOM Directions are Better than One: Multi-Directional Refusal Suppression in Language Models." Proceedings of the AAAI Conference on Artificial Intelligence.