SMS phishing detection using machine learning and deep learning techniques
Authors
Hasti, Pavan
Degree
Master of Applied Technologies (Computing)
Grantor
Unitec, Te Pūkenga – New Zealand Institute of Skills and Technology
Date
2025
Supervisors
Barmada, Bashar
Varastehpour, Soheil
Type
Masters Thesis
Keyword
New Zealand
SMS phishing
phishing
detection
scams
machine learning
deep learning
Citation
Hasti, P. (2025). SMS phishing detection using machine learning and deep learning techniques (Unpublished document submitted in partial fulfilment of the requirements for the degree of Master of Applied Technologies (Computing)). Unitec, Te Pūkenga - New Zealand Institute of Skills and Technology
https://hdl.handle.net/10652/6803
Abstract
RESEARCH QUESTIONS
How effective are ML and DL models in detecting SMS phishing in New Zealand’s mobile system?
• What impact does class imbalance have, and how can SMOTE address this in phishing detection?
• Which ML and DL models perform best for SMS phishing detection in New Zealand?
• How can ethical data handling be maintained in SMS phishing detection?
• What real-world challenges (model drift, adversarial attacks) could affect SMS phishing detection, and how can they be mitigated?
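The second research question refers to SMOTE (Synthetic Minority Over-sampling Technique), which balances a dataset by creating synthetic minority-class samples through interpolation between a minority sample and one of its nearest minority-class neighbours. The sketch below illustrates only that core idea in plain Python. It is not the thesis implementation: the function name `smote_sketch`, the toy two-dimensional points, and the assumption that messages have already been converted to numeric feature vectors are all hypothetical and chosen for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from minority-class feature vectors.

    Hypothetical helper for illustration: each synthetic point lies on the
    line segment between a randomly chosen minority sample and one of its
    k nearest minority-class neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        # Interpolate: new = base + gap * (target - base), with gap in [0, 1).
        gap = rng.random()
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


# Toy minority class (assumed numeric features, e.g. after text vectorization).
minority = [[1.0, 1.0], [1.5, 1.2], [2.0, 1.0]]
new_points = smote_sketch(minority, n_new=4)
```

Because every synthetic point is an interpolation between two real minority samples, the new points stay inside the region the minority class already occupies. In practice, oversampling of this kind is normally applied to the training split only, so that the test data remain free of synthetic samples.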
ABSTRACT
Short Message Service (SMS) remains a vital communication tool in daily life, even with the rapid development of Internet protocol-based messaging services. An increasingly sophisticated cyber threat known as SMS phishing (smishing) has emerged in tandem with the rise in mobile device use, and people are finding it hard to distinguish legitimate messages from malicious ones. Because attackers continually develop new methods, smishing is difficult to detect with typical phishing detection techniques, including heuristic, feature-based, rule-based, and blacklist approaches. The aim of this study is to construct a New Zealand domain-specific SMS phishing detection system that overcomes these difficulties and improves mobile system cybersecurity. The goal is to build an effective ML and DL-based model that accurately detects and categorizes smishing messages while addressing class imbalance and ensuring ethical data handling. The dataset combines the SMS Smishing Collection from Kaggle with smishing messages from the New Zealand Department of Internal Affairs (DIA) anti-scam archive, ensuring relevance to the local context. Pre-processing handled missing and duplicate values, checked label uniqueness, and applied text pre-processing and lemmatization, followed by label encoding. The dataset was then balanced with SMOTE. The classification models, selected for their strong performance in text analysis, included the machine learning models Random Forest and XGBoost and the deep learning models CNN, RNN, and LSTM. The models work well for detecting fraudulent SMS messages in mobile communication networks. Accuracy, precision, recall, and F1-score were among the measures used to assess the models' performance. The results showed that the XGBoost classifier achieved an accuracy of 97.05%, outperforming the other models.
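The four measures named in the abstract can all be derived from the counts of true and false positives and negatives. The following is a minimal sketch of those definitions in plain Python. The function name `classification_metrics`, the label convention (1 for smishing, 0 for legitimate), and the toy labels are assumptions for illustration and do not reproduce the thesis results.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1-score for a binary task.

    Hypothetical helper: `positive` is the label treated as the smishing class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (assumed): 1 = smishing, 0 = legitimate.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

On imbalanced data, accuracy alone can look high even when many smishing messages are missed, which is why precision, recall, and F1-score are usually reported alongside it.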
This study highlights the practical implications of smishing detection, particularly in real-world mobile communication systems, emphasizing the importance of integrating these models into mobile security applications. Additionally, the research discusses potential future work, including the integration of transformer-based models, the handling of model drift, and addressing adversarial concerns in dynamic environments.
Copyright holder
Author
Copyright notice
All rights reserved
