publications
2024
- Conditioned Prompt-Optimization for Continual Deepfake Detection. Francesco Laiti, Benedetta Liberatori, Thomas De Min, and 1 more author. 2024.
The rapid advancement of generative models has significantly enhanced the realism and customization of digital content creation. The increasing power of these tools, coupled with their ease of access, fuels the creation of photorealistic fake content, termed deepfakes, which raises substantial concerns about potential misuse. In response, there has been notable progress in developing detection mechanisms to identify content produced by these advanced systems. However, existing methods often struggle to adapt to the continuously evolving landscape of deepfake generation. This paper introduces Prompt2Guard, a novel solution for exemplar-free continual deepfake detection of images that leverages Vision-Language Models (VLMs) and domain-specific multimodal prompts. Compared to previous VLM-based approaches, which are either bounded by prompt selection accuracy or require multiple forward passes, we leverage a prediction ensembling technique with read-only prompts. Read-only prompts do not interact with the VLM's internal representations, removing the need for multiple forward passes and thus improving both efficiency and accuracy in detecting generated content. Additionally, our method exploits a text-prompt conditioning tailored to deepfake detection, which we demonstrate is beneficial in our setting. We evaluate Prompt2Guard on CDDB-Hard, a continual deepfake detection benchmark composed of five deepfake detection datasets spanning multiple domains and generators, achieving a new state-of-the-art. Our results underscore the effectiveness of the approach in addressing the challenges posed by continual deepfake detection, paving the way for more robust and adaptable solutions.
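The prediction-ensembling idea in the abstract can be illustrated with a minimal sketch: instead of selecting a single prompt, the per-prompt class probabilities are averaged. All names, shapes, and the class ordering below are hypothetical stand-ins, not the paper's actual implementation.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical inputs: similarity logits between one image embedding and
# the text embeddings of K domain-specific prompts, over 2 classes
# (real / fake). With read-only prompts, the VLM's internal
# representations are untouched, so one forward pass suffices.
rng = np.random.default_rng(0)
K, n_classes = 5, 2
logits = rng.normal(size=(K, n_classes))  # stand-in per-prompt logits

# Prediction ensembling: average the per-prompt class probabilities.
probs = softmax(logits, axis=-1)   # shape (K, 2)
ensembled = probs.mean(axis=0)     # shape (2,), still a distribution
pred = int(ensembled.argmax())     # 0 = real, 1 = fake (assumed order)
print(ensembled, pred)
```

Averaging probabilities rather than picking one prompt sidesteps the prompt-selection accuracy bottleneck the abstract mentions.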
2023
- Meta-Trainer: An Augmented Reality Trainer for Home Fitness with Real-Time Feedback. Lorenzo Orlandi, Giulia Martinelli, Francesco Laiti, and 3 more authors. In IEEE International Workshop on Sport, Technology and Research (STAR), 2023.
Meta-Trainer allows people to train on their own, supervised by a virtual trainer that provides exercise samples and real-time feedback. The system relies on a set of smart glasses for Augmented Reality (AR) to enable users to interact with the virtual trainer in a hybrid environment. Given a target exercise, the virtual trainer demonstrates the movements to be reproduced. Unlike video-based applications, the user can move around the virtual trainer to view the exercise from different angles, for a more immersive and realistic experience. The AR glasses also track head movements, which are used to give feedback and monitor training progress. This system offers an advancement over video-only trainers, since it does not require the user to look at a screen while performing the exercise.
- Identifying Synthetic Faces through GAN Inversion and Biometric Traits Analysis. Cecilia Pasquini, Francesco Laiti, Davide Lobba, and 3 more authors. MDPI, 2023.
In the field of image forensics, notable attention has recently been paid to the detection of synthetic content created through Generative Adversarial Networks (GANs), especially face images. This work explores a classification methodology inspired by the inner architecture of typical GANs, where vectors in a low-dimensional latent space are transformed by the generator into meaningful high-dimensional images. In particular, the proposed detector exploits the inversion of the GAN synthesis process: given a face image under investigation, we identify the point in the GAN latent space that most closely reconstructs it; we project the vector back into the image space, and we compare the resulting image with the actual one. Through experimental tests on widely known datasets (including FFHQ, CelebA, LFW, and Caltech), we demonstrate that real faces can be accurately discriminated from GAN-generated ones by properly capturing the facial traits through different feature representations. In particular, features based on facial landmarks fed to a Support Vector Machine consistently yield a global accuracy above 88% for each dataset. Furthermore, we experimentally show that the proposed detector is robust against routinely applied post-processing operations.
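The final classification step described in the abstract (landmark-based features fed to an SVM) can be sketched as follows. This is a minimal illustration only: the synthetic feature vectors stand in for the real landmark descriptors computed from the image and its GAN-inverted reconstruction, and the assumed separation between classes is invented for the demo.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n, d = 400, 68 * 2  # e.g. 68 facial landmarks as (x, y) pairs

# Hypothetical stand-in features: in the paper, these would be derived
# from facial landmarks of the image under test and of its GAN-inverted
# reconstruction. Here we simulate two separable populations.
real_feats = rng.normal(loc=1.0, scale=0.3, size=(n, d))
fake_feats = rng.normal(loc=0.2, scale=0.3, size=(n, d))

X = np.vstack([real_feats, fake_feats])
y = np.array([1] * n + [0] * n)  # 1 = real face, 0 = GAN-generated

# Train an SVM on the features and evaluate held-out accuracy.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"held-out accuracy: {acc:.2f}")
```

The key design point mirrored here is that detection reduces to a standard supervised classifier once discriminative reconstruction-based features are in hand; the feature extraction (GAN inversion and landmark comparison) is where the paper's contribution lies.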