Here is a list of thesis topics proposed by the members of the lab. These are starting ideas from which a more structured project can be developed once a topic of interest is selected. Basic knowledge of deep learning and computer vision is required to engage with these topics effectively.
Keywords: Deepfake detection, Mechanistic interpretability, Performance enhancement
Abstract: This thesis explores how to improve the performance of VLM-based deepfake detectors. The goal is to detect deepfakes produced by modern generative models such as DALL-E 3, Midjourney, and Stable Diffusion XL. To this end, mechanistic interpretability will be used to identify which model components are most relevant to act on.
Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Lorenzo Cirillo (cirillo@diag.uniroma1.it)
References:
Yingdong Shi, Changming Li, Yifan Wang, Yongxiang Zhao, Anqi Pang, Sibei Yang, Jingyi Yu, and Kan Ren. Dissecting and mitigating diffusion bias via mechanistic interpretability. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), pages 8192–8202, June 2025
Sohail Ahmed Khan and Duc-Tien Dang-Nguyen. 2024. CLIPping the Deception: Adapting Vision-Language Models for Universal Deepfake Detection. In Proceedings of the 2024 International Conference on Multimedia Retrieval (ICMR '24). Association for Computing Machinery, New York, NY, USA, 1006–1015
Keywords: Information theory, Entropy estimation, Efficient architectures
Abstract: This thesis investigates how optimizing different parts of transformer-based models affects the quality of learned representations. Focusing on encoder or decoder embeddings in monocular depth estimation, the study compares encoder, decoder, and full-model optimization. The insights can be extended to other computer vision tasks.
Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Claudio Schiavella (schiavella@diag.uniroma1.it), Lorenzo Cirillo (cirillo@diag.uniroma1.it)
References:
Kingma et al., Auto-Encoding Variational Bayes, 2013.
Tishby et al., Deep learning and the information bottleneck principle. 2015.
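As a rough illustration of the entropy-estimation keyword of the proposal above, the sketch below computes a plug-in (histogram) Shannon entropy estimate for a single embedding dimension. The function name `histogram_entropy`, the bin count, the value range, and the synthetic "spread" and "collapsed" features are all assumptions made for this example. They are not part of the proposal, and a real study would likely need a more careful estimator.

```python
import math
import random
from collections import Counter


def histogram_entropy(values, lo, hi, num_bins=16):
    """Plug-in Shannon entropy estimate (in bits) of values binned over [lo, hi]."""
    width = (hi - lo) / num_bins
    counts = Counter()
    for v in values:
        # Map the value to a bin index and clamp it into the valid range.
        idx = int((v - lo) / width)
        counts[max(0, min(idx, num_bins - 1))] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical data: one feature that uses its whole range and one that
# concentrates almost all of its mass around zero.
rng = random.Random(0)
spread = [rng.uniform(-1.0, 1.0) for _ in range(10_000)]
collapsed = [rng.gauss(0.0, 0.05) for _ in range(10_000)]

h_spread = histogram_entropy(spread, -1.0, 1.0)
h_collapsed = histogram_entropy(collapsed, -1.0, 1.0)
print(f"spread: {h_spread:.2f} bits, collapsed: {h_collapsed:.2f} bits")
```

With 16 bins the estimate is bounded by log2(16) = 4 bits, so a feature that spreads over the whole range scores close to that bound, while a collapsed feature scores much lower.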
Keywords: Deepfake detection, Social network compression, Forensic traces
Abstract: This thesis investigates deepfake detection under real-world social network compression, where platform-specific encoding and block artifacts significantly distort forensic traces. We will explore both model-side robustness strategies (e.g., artifact-suppression and attention guidance) and data-side emulation techniques to reproduce realistic compression pipelines, aiming to improve generalization from lab conditions to in-the-wild content.
Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Simone Teglia (teglia@diag.uniroma1.it)
References:
Montibeller, A., Shullani, D., Baracchi, D., Piva, A., & Boato, G. (2025, October). Bridging the Gap: A Framework for Real-World Video Deepfake Detection via Social Network Compression Emulation. Proceedings of the 1st on Deepfake Forensics Workshop: Detection, Attribution, Recognition, and Adversarial Challenges in the Era of AI-Generated Media, 29–36. doi:10.1145/3746265.3759670
Li, M., Tao, R., Liu, Y., Tan, C., Qin, H., Li, B., … Zhao, Y. (2025). Pay Less Attention to Deceptive Artifacts: Robust Detection of Compressed Deepfakes on Online Social Networks. arXiv [Cs.CV]. Retrieved from http://arxiv.org/abs/2506.20548
Keywords: Diffusion Models, Explainable AI, Medical Imaging, Dental Radiography
Abstract: This thesis aims to advance deep learning for periodontal analysis by tackling data scarcity and model opacity. We will develop a generative framework to augment the training dataset, improving model robustness and accuracy. Concurrently, we will integrate explainable AI methods to support the models' diagnostic reasoning, ensuring their outputs are transparent, clinically relevant and trustworthy.
Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Gianmarco Scarano (gianmarco.scarano@uniroma1.it), Lorenzo Cirillo (cirillo@diag.uniroma1.it)
References:
Marie, H.S., Elbaz, M., Soliman, R.s. et al. DentoMorph-LDMs: diffusion models based on novel adaptive 8-connected gum tissue and deciduous teeth loss for dental image augmentation. Sci Rep 15, 27268 (2025). https://doi.org/10.1038/s41598-025-11955-2
Glick, A., Clayton, M., Angelov, N., & Chang, J. (2022). Impact of explainable artificial intelligence assistance on clinical decision-making of novice dental clinicians. JAMIA Open, 5(2). Retrieved from https://www.scopus.com/pages/publications/85134949460
Keywords: Attention optimization, Efficient architectures, Model compression
Abstract: This thesis compares two key approaches to optimizing Vision Transformers: architectural changes (e.g., attention modifications) and model compression techniques (e.g., pruning, quantization, distillation). It evaluates their interaction, robustness to compression, and the efficiency-accuracy trade-off across different networks and tasks.
Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Claudio Schiavella (schiavella@diag.uniroma1.it), Lorenzo Cirillo (cirillo@diag.uniroma1.it)
References:
Papa et al., A Survey on Efficient Vision Transformers: Algorithms, Techniques, and Performance Benchmarking, 2023.
Schiavella et al., Optimize vision transformer architecture via efficient attention modules: a study on the monocular depth estimation task. 2024.
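Two of the compression techniques named in the abstract above, pruning and quantization, can be sketched on a flat list of weights. This is a minimal example under stated assumptions: the helper names (`magnitude_prune`, `quantize_int8`, `dequantize`) and the toy weights are hypothetical, and the specific variants shown (unstructured magnitude pruning, symmetric per-tensor 8-bit quantization) are only one common choice, not necessarily the ones the thesis would adopt.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    # Indices of the k smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the integer codes back to approximate floating-point weights."""
    return [v * scale for v in q]


# Toy weight vector (hypothetical values).
weights = [0.8, -0.05, 0.3, -1.27, 0.01, 0.6]
pruned = magnitude_prune(weights, sparsity=0.5)
codes, scale = quantize_int8(pruned)
restored = dequantize(codes, scale)
print(pruned)
print(codes)
```

Applying the two steps in sequence, as here, is one simple way to probe how the techniques interact: pruning changes the weight distribution that the quantizer then has to cover.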
Keywords: Dataset development, 3D simulation, Green AI
Abstract: This thesis proposes the development of a 3D dataset for agricultural applications, featuring both point clouds and 3D mesh models. The dataset will support use cases such as simulation, virtual and augmented reality, and machine learning in smart farming scenarios.
In collaboration with: Tecnoseta SRL (https://www.tecnoseta.com/)
Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Claudia Melis Tonti (melistonti@diag.uniroma1.it), Claudio Schiavella (schiavella@diag.uniroma1.it)
References:
Chang et al., Shapenet: An information-rich 3d model repository, 2015.
Keywords: Watermarking, Generative AI authenticity, Multimedia forensics
Abstract: This thesis investigates how semi-fragile watermarking affects the detection of AI-generated visual media. It evaluates the effectiveness of semi-fragile watermarks for authenticating generative content and their robustness under backdoor attacks and data poisoning.
Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Giuseppe Daidone (giuseppe.daidone@uniroma1.it)
References:
Deng, J., Lin, C., Zhao, Z., Liu, S., Peng, Z., Wang, Q., & Shen, C. (2025). A Survey of Defenses Against AI-Generated Visual Media: Detection, Disruption, and Authentication. ACM Computing Surveys. https://doi.org/10.1145/3770916
Yuan, Z., Zhang, X., Wang, Z., & Yin, Z. (2024). Semi-fragile neural network watermarking for content authentication and tampering localization. Expert Systems with Applications, 236, 121315–121315. https://doi.org/10.1016/j.eswa.2023.121315
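The semi-fragile behaviour discussed in the proposal above (the mark survives mild distortion but breaks under stronger modification) can be illustrated with classical quantization index modulation (QIM). This is a swapped-in textbook technique chosen for brevity, not the neural watermarking method of the cited references. The function names, the step size, and the sample values are assumptions made for this sketch.

```python
def qim_embed(samples, bits, step=8.0):
    """Move each sample to the nearest lattice point whose parity encodes its bit."""
    marked = []
    for x, b in zip(samples, bits):
        # Lattice for bit b: values (2k + b) * step for integer k.
        k = round((x / step - b) / 2)
        marked.append((2 * k + b) * step)
    return marked


def qim_extract(samples, step=8.0):
    """Read each bit back from the parity of the nearest multiple of `step`."""
    return [round(x / step) % 2 for x in samples]


# Hypothetical pixel intensities and watermark bits.
pixels = [100, 37, 212, 66]
bits = [1, 0, 1, 1]
marked = qim_embed(pixels, bits)

# Mild distortion (smaller than step / 2) leaves the extracted bits intact.
mild = [x + 3 for x in marked]
# A larger local change flips the bit at the modified position only.
tampered = list(marked)
tampered[2] += 10

flagged = [i for i, (a, b) in enumerate(zip(qim_extract(tampered), bits)) if a != b]
print(qim_extract(mild), flagged)
```

The step size sets the trade-off: a larger step tolerates stronger benign processing but distorts the content more and lets larger manipulations go unnoticed.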
Keywords: Geometric Algebra, Equivariant Transformers, 3D Face Reconstruction
Abstract: The objective of this thesis is to exploit Geometric Algebra (GA) to determine whether a 3D representation is accurate. GA is a mathematical framework for geometric computations. It represents data as multivectors and describes both geometric objects and their transformations in three-dimensional space.
Supervisors: Irene Amerini (amerini@diag.uniroma1.it), Lorenzo Cirillo (cirillo@diag.uniroma1.it), Claudia Melis Tonti (melistonti@diag.uniroma1.it), Claudio Schiavella (schiavella@diag.uniroma1.it)
References:
Johann Brehmer and Pim de Haan and Sönke Behrends and Taco Cohen (2023). Geometric Algebra Transformer. https://doi.org/10.48550/2305.18415
Evangelos Sariyanidi and Claudio Ferrari and Federico Nocentini and Stefano Berretti and Andrea Cavallaro and Birkan Tunc (2025). 3D Face Reconstruction Error Decomposed: A Modular Benchmark for Fair and Fast Method Evaluation. https://doi.org/10.48550/2505.18025
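To make the multivector idea in the Geometric Algebra proposal above concrete, the sketch below implements the geometric product for 3D Euclidean GA and uses a rotor to rotate a vector. The dictionary-of-bitmasks representation and all names (`reorder_sign`, `geometric_product`, `rotate_in_e12_plane`) are assumptions chosen for this example. They are not taken from the Geometric Algebra Transformer code, which works in a larger projective algebra.

```python
import math

# Basis blades as bitmasks: bit i set means basis vector e_(i+1) is present.
E1, E2, E3, E12 = 0b001, 0b010, 0b100, 0b011


def reorder_sign(a, b):
    """Sign picked up when sorting the basis vectors of blade a followed by blade b."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1 if swaps % 2 else 1


def geometric_product(x, y):
    """Geometric product of multivectors stored as {blade_bitmask: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            blade = a ^ b  # shared basis vectors square to +1 and cancel
            out[blade] = out.get(blade, 0.0) + reorder_sign(a, b) * ca * cb
    return out


def rotate_in_e12_plane(v, theta):
    """Rotate multivector v by angle theta in the e1-e2 plane via R v R~."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    rotor = {0: c, E12: -s}
    reverse = {0: c, E12: s}
    return geometric_product(geometric_product(rotor, v), reverse)


# Rotating e1 by 90 degrees in the e1-e2 plane should give e2.
rotated = rotate_in_e12_plane({E1: 1.0}, math.pi / 2)
print({blade: round(coeff, 6) for blade, coeff in rotated.items()})
```

The same product handles both the object (the vector) and the transformation (the rotor), which is the property the abstract refers to when it says GA describes geometric objects and their transformations in one framework.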