Analysis and comparison of deepfakes detection methods for cross-library generalisation

Authors

Wang, Changjin
Sharifzadeh, Hamid
Varastehpour, Soheil
Ardekani, Iman

Date

2023-08-21

Type

Conference Contribution - Paper in Published Proceedings

Keyword

deepfakes
detection
cross-library generalisation
transfer learning (machine learning)

Citation

Wang, C., Sharifzadeh, H., Varastehpour, S., & Ardekani, I. (2023, August 21-23). Analysis and comparison of deepfakes detection methods for cross-library generalisation [Paper presentation]. 20th Annual International Conference on Privacy, Security & Trust, Copenhagen, Denmark (PST2023) (pp. 1-6). https://hdl.handle.net/10652/6152

Abstract

The rise of generative artificial intelligence (GenAI) has made it increasingly possible to use Deepfakes technology to generate fake pictures and videos. While this technology has benefits, it also has downsides such as spreading misinformation and endangering public interests. To address this issue, researchers have proposed various deep forgery detection algorithms and have achieved remarkable results. However, a common problem with these detection methods is that while in-library detection can usually achieve high accuracy, performance degrades significantly in cross-library detection. This indicates a severe lack of generalisation ability. To better compare the performance differences between detection methods, this paper analyses the detection performance of six established models: Two-stream, MesoNet, HeadPose, FWA, VA, and Multi-task. To ensure consistency, we employ a uniform evaluation framework as a benchmark for comparison. We conduct extensive intra-library and cross-library tests to evaluate these methods’ generalisation ability, using accuracy and error rate as the key evaluation criteria. Additionally, we explore areas for improvement by analysing the impact of data augmentation, dataset partitioning, and threshold selection on the performance of these detection methods. Our comparative experiments are conducted on three existing fake face video datasets: FaceForensics++, DeepfakeTIMIT, and Celeb-DF. Our findings indicate that the database partitioning method has a direct impact on the detector’s performance and that, to enhance generalisation performance, the database should be partitioned manually on a per-person basis. The effectiveness of data augmentation techniques in improving cross-library performance is generally limited, and setting the threshold directly using source domain data often leads to a high error rate in the target domain.
The findings of this paper provide insights into the development of more effective detection methods to combat the harmful effects
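
Two of the points in the abstract, person-based partitioning and the risk of reusing a source-domain threshold on a target domain, can be illustrated with a minimal sketch. This is not the paper's evaluation framework: the function names (person_based_split, error_rate, best_threshold), the (person_id, item) sample layout, and the toy scores are all assumptions made for illustration, and the threshold rule (lowest error on the data it is given) is only one plausible choice.

```python
import random
from collections import defaultdict


def person_based_split(samples, test_fraction=0.2, seed=0):
    """Split (person_id, item) pairs so that no person is in both subsets.

    Hypothetical helper: the sample layout is an assumption, not the
    format of any of the datasets named in the abstract.
    """
    by_person = defaultdict(list)
    for person_id, item in samples:
        by_person[person_id].append(item)

    people = sorted(by_person)
    random.Random(seed).shuffle(people)
    n_test = max(1, round(len(people) * test_fraction))
    test_people = set(people[:n_test])

    train = [(p, x) for p in people if p not in test_people for x in by_person[p]]
    test = [(p, x) for p in people if p in test_people for x in by_person[p]]
    return train, test


def error_rate(scores, labels, threshold):
    """Fraction misclassified when score >= threshold is read as 'fake' (label 1)."""
    wrong = sum((s >= threshold) != bool(y) for s, y in zip(scores, labels))
    return wrong / len(labels)


def best_threshold(scores, labels):
    """Return the observed score that gives the lowest error rate on this data."""
    candidates = sorted(set(scores))
    return min(candidates, key=lambda t: error_rate(scores, labels, t))


# Toy data (invented): five people with three clips each.
samples = [(f"person_{i}", f"clip_{i}_{j}") for i in range(5) for j in range(3)]
train, test = person_based_split(samples)

# Toy detector scores (invented): the target domain is shifted relative to the source.
source_scores, source_labels = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9], [0, 0, 0, 1, 1, 1]
target_scores, target_labels = [0.5, 0.6, 0.75, 0.65, 0.8, 0.9], [0, 0, 0, 1, 1, 1]

threshold = best_threshold(source_scores, source_labels)
source_error = error_rate(source_scores, source_labels, threshold)
target_error = error_rate(target_scores, target_labels, threshold)
```

In this toy setup the threshold chosen on the source scores separates the source classes cleanly but misclassifies some of the shifted target scores, which mirrors the kind of cross-library degradation the abstract describes. Splitting by person rather than by clip keeps the same identity from appearing in both training and test data.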

Copyright holder

Authors

Copyright notice

All rights reserved
