A framework for analysis and comparison of deepfakes detection methods

Authors

Wang, Changjin

Degree

Master of Computing

Grantor

Unitec Institute of Technology

Date

2021

Supervisors

Sharifzadeh, Hamid
Fleming, Rachel
Ardekani, Iman

Type

Masters Thesis

Keyword

deepfakes
detection
face-changing software
image manipulation
autoencoders
fake videos

Citation

Wang, C. (2021). A framework for analysis and comparison of deepfakes detection methods. (Unpublished document submitted in partial fulfilment of the requirements for the degree of Master of Computing). Unitec Institute of Technology, New Zealand. Retrieved from https://hdl.handle.net/10652/5395

Abstract

With the rise of Artificial Intelligence (AI), Deepfakes technology is increasingly being used to generate fake images and videos. Like all technologies, it brings benefits but also downsides, such as spreading false information and endangering the public interest. To combat the harm of fake-face videos, researchers have proposed a variety of deep forgery detection algorithms and achieved remarkable results. However, a common problem with these detection methods is that, while in-library detection normally achieves high accuracy, performance degrades severely in cross-library detection; that is, their generalisation ability is insufficient. The current mainstream approach is to train a binary classification model on real and fake videos, using the classifier to distinguish genuine videos from tampered ones. In other words, Deepfakes detection methods are assessed with the evaluation criteria of binary classification models. However, each evaluation criterion emphasises different application scenarios and technical requirements, so no single criterion can effectively compare the performance of different detection methods. In the current literature, Deepfakes detection usually uses only the Area Under the Curve (AUC) of the binary classification model as the evaluation standard. Nevertheless, AUC reflects only the relative ordering of predicted probability values; it does not consider the decision threshold or the absolute size of the probabilities. Moreover, when the data is highly imbalanced, AUC may not properly assess the performance of a detection method.
To better compare the performance differences between various detection methods, this thesis analyses and compares six Deepfakes detection methods: Two-stream, MesoNet, HeadPose, FWA, VA and Multi-task. To assess the generalisation ability of these methods, I conduct intra-library and cross-library tests on three existing fake-face video datasets. For this purpose, accuracy (ACC) and error rate are used as evaluation criteria, and I focus on analysing the impact of three factors (dataset partitioning, data augmentation operations, and detection threshold selection) on the generalisation ability of the Deepfakes detection methods. The principal contributions of this thesis are as follows: 1) an analytical cross-library platform for comparison of Deepfakes detection methods; 2) new evaluation metrics based on data partitioning, augmentation, and threshold selection; 3) an overall performance indication of detection methods based on their generalisation performance on different data types.
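The abstract's point about AUC can be illustrated with a minimal sketch (the detector scores below are hypothetical, not taken from the thesis): a detector that ranks every fake video above every real one achieves a perfect AUC, yet at the conventional 0.5 decision threshold it can still miss every fake, because AUC ignores the threshold and the absolute size of the probabilities.

```python
def auc(pos_scores, neg_scores):
    """AUC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) score pairs ranked in the correct order."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

def accuracy(pos_scores, neg_scores, threshold=0.5):
    """Accuracy of the hard decisions made at a fixed threshold."""
    correct = (sum(p >= threshold for p in pos_scores)
               + sum(n < threshold for n in neg_scores))
    return correct / (len(pos_scores) + len(neg_scores))

# Hypothetical detector outputs: the fakes (minority, positive class)
# are all ranked above the reals, but every probability sits below 0.5.
fake = [0.40, 0.44, 0.48]
real = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35]

print(auc(fake, real))       # 1.0 -- perfect ranking
print(accuracy(fake, real))  # 0.7 -- yet every fake is missed at t=0.5
```

This is why the thesis pairs AUC with accuracy and error rate, and treats threshold selection as an explicit factor in the comparison.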

Copyright holder

Author

Copyright notice

All rights reserved
