Posts by Collection

news

portfolio

publications

Explaining Machine Learning DGA Detectors from DNS Traffic Data

Published in CEUR Workshop Proceedings, Vol. 3260, pp. 150-168, 2022

This paper discusses methods to explain machine learning-based Domain Generation Algorithm (DGA) detectors using DNS traffic data.

Recommended citation: Piras, G., Pintor, M., Demetrio, L., & Biggio, B. (2022). "Explaining Machine Learning DGA Detectors from DNS Traffic Data." CEUR Workshop Proceedings, 3260, 150-168.
Download Paper

Improving Fast Minimum-Norm Attacks with Hyperparameter Optimization

Published in ESANN-23, 2023

This paper presents enhancements to fast minimum-norm adversarial attacks through hyperparameter optimization techniques.

Recommended citation: Floris, G., Mura, R., Scionis, L., Piras, G., Pintor, M., Demontis, A., & Biggio, B. (2023). "Improving Fast Minimum-Norm Attacks with Hyperparameter Optimization." arXiv preprint arXiv:2310.08177.
Download Paper

Samples on Thin Ice: Re-Evaluating Adversarial Pruning of Neural Networks

Published in 2023 International Conference on Machine Learning and Cybernetics (ICMLC), 2023

This research re-evaluates the effectiveness of adversarial pruning techniques in neural networks.

Recommended citation: Piras, G., Pintor, M., Demontis, A., & Biggio, B. (2023). "Samples on Thin Ice: Re-Evaluating Adversarial Pruning of Neural Networks." 2023 International Conference on Machine Learning and Cybernetics (ICMLC).
Download Paper

Adversarial Attacks Against Uncertainty Quantification

Published in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

This study explores adversarial attacks targeting uncertainty quantification methods in machine learning models.

Recommended citation: Ledda, E., Angioni, D., Piras, G., Fumera, G., Biggio, B., & Roli, F. (2023). "Adversarial Attacks Against Uncertainty Quantification." Proceedings of the IEEE/CVF International Conference on Computer Vision.
Download Paper

On the Robustness of Adversarial Training Against Uncertainty Attacks

Published in arXiv preprint arXiv:2410.21952, 2024

This paper examines the robustness of adversarial training methods against uncertainty-based attacks.

Recommended citation: Ledda, E., Scodeller, G., Angioni, D., Piras, G., Cinà, A. E., Fumera, G., Biggio, B., & Roli, F. (2024). "On the Robustness of Adversarial Training Against Uncertainty Attacks." arXiv preprint arXiv:2410.21952.
Download Paper

HO-FMN: Hyperparameter Optimization for Fast Minimum-Norm Attacks

Published in Neurocomputing, Vol. 616, Article 128918, 2025

This study introduces HO-FMN, a method for hyperparameter optimization in fast minimum-norm adversarial attacks.

Recommended citation: Mura, R., Floris, G., Scionis, L., Piras, G., Pintor, M., Demontis, A., Giacinto, G., & Biggio, B. (2025). "HO-FMN: Hyperparameter Optimization for Fast Minimum-Norm Attacks." Neurocomputing, 616, 128918.
Download Paper

LatentBreak: Jailbreaking Large Language Models through Latent Space Feedback

Published in arXiv preprint arXiv:2510.08604, 2025

This paper introduces LatentBreak, a white-box jailbreak attack that creates natural, low-perplexity adversarial prompts through latent-space feedback.

Recommended citation: Mura, R., Piras, G., Lukosiute, K., Pintor, M., Karbasi, A., & Biggio, B. (2025). "LatentBreak: Jailbreaking Large Language Models through Latent Space Feedback." arXiv preprint arXiv:2510.08604.
Download Paper

SOM Directions are Better than One: Multi-Directional Refusal Suppression in Language Models

Published in AAAI 2026, 2025

This paper uses Self-Organizing Maps to extract multiple refusal directions in language models, showing that multi-directional ablation can suppress refusal more effectively than a single-direction baseline.

Recommended citation: Piras, G., Mura, R., Brau, F., Oneto, L., Roli, F., & Biggio, B. (2026). "SOM Directions are Better than One: Multi-Directional Refusal Suppression in Language Models." Proceedings of the AAAI Conference on Artificial Intelligence.
Download Paper

Latent-space Attacks for Refusal Evasion in Language Models

Published in arXiv preprint arXiv:2605.21706, 2026

This paper recasts refusal suppression as a latent-space evasion attack against linear probes, introducing Controlled Latent-space Evasion to push model representations into the compliant region.

Recommended citation: Piras, G., Mura, R., Brau, F., Pintor, M., Oneto, L., Roli, F., & Biggio, B. (2026). "Latent-space Attacks for Refusal Evasion in Language Models." arXiv preprint arXiv:2605.21706.
Download Paper

talks