A dual-encoder deepfake detector pairs a frozen specialist with a LoRA-tuned MLLM, trained first via binary alignment then via RL to reward explain-then-classify behavior, yielding improved cross-dataset performance and interpretability.
arXiv preprint arXiv:2105.14376 (2021)
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
Five universal physical descriptors including Laplacian variance, Sobel statistics, and residual noise variance, when integrated as text encodings with CLIP, achieve up to 99.8% accuracy detecting synthetic images across GAN and diffusion model datasets.
citing papers explorer
-
The Regularizing Power of Language-Training Deepfake Detectors
A dual-encoder deepfake detector pairs a frozen specialist with a LoRA-tuned MLLM, trained first via binary alignment then via RL to reward explain-then-classify behavior, yielding improved cross-dataset performance and interpretability.
-
Beyond Semantics: Uncovering the Physics of Fakes via Universal Physical Descriptors for Cross-Modal Synthetic Detection
Five universal physical descriptors including Laplacian variance, Sobel statistics, and residual noise variance, when integrated as text encodings with CLIP, achieve up to 99.8% accuracy detecting synthetic images across GAN and diffusion model datasets.