Comparative research on code vulnerability detection: Open-source vs. proprietary large language models and LSTM neural network
Supplementary material
Authors
Don, Ravihansa Geekiyanage Geekiyanage
Degree
Master of Applied Technologies (Computing)
Grantor
Unitec, Te Pūkenga – New Zealand Institute of Skills and Technology
Date
2024
Supervisors
Ardekani, Iman
Bell, Jamie
Type
Masters Thesis
Keyword
software development
vulnerability assessment
risk management framework
software security
computer security
Long Short-Term Memory (LSTM)
open source
neural networks
large language models
Citation
Don, R.G.G. (2024). Comparative research on code vulnerability detection: Open-source vs. proprietary large language models and LSTM neural network (Unpublished document submitted in partial fulfilment of the requirements for the degree of Master of Applied Technologies (Computing)). Unitec, Te Pūkenga – New Zealand Institute of Skills and Technology.
https://hdl.handle.net/10652/6749
Abstract
The reliance of industries such as banking, e-commerce, logistics, transportation, energy, and healthcare on computer systems has escalated the threat of cyberattacks. With the rapid growth of the online ecosystem and of sensitive data, cybercriminals have become increasingly capable of attacking software systems, and safeguarding software remains an urgent yet challenging task despite advancements in technology. This thesis contributes to the incorporation of security into the Software Development Lifecycle (SDLC) by focusing on Static Code Analysis as a proactive approach to discovering and mitigating security flaws during the development cycle.
This research focuses on enhancing vulnerability detection in source code through advanced machine learning techniques, including fine-tuned open-source and proprietary large language models. It compares models including CodeGen2, LLaMA 2, and GPT (developed by OpenAI), evaluating both quantitatively and qualitatively their suitability for detecting vulnerabilities, their accuracy, and their efficiency. Using a zero-shot classification-based approach, the study examines these models' capabilities in detecting security risks and compares them with Word2Vec+LSTM neural networks. The findings reveal that CodeGen2 emerges as the most reliable model for vulnerability detection, achieving near-perfect precision and balanced recall, leading to superior F1 and AUC scores. LLaMA2-7b delivers reasonable performance, particularly in precision, but falls short in recall. Conversely, GPT-4 Assistant excels in recall but suffers from a high false positive rate, limiting its effectiveness. Classical neural networks, such as Word2Vec+LSTM, demonstrate moderate capability but lag behind modern LLMs in both precision and recall.
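The zero-shot classification approach described above can be sketched as follows. This is a minimal illustration only: the prompt template, the `build_prompt`/`parse_verdict` helper names, and the label vocabulary are assumptions made for this sketch, not the thesis's actual prompts or model calls.

```python
# Sketch of zero-shot vulnerability classification with an LLM.
# The prompt wording and parsing below are illustrative assumptions;
# in the study, the prompt would be sent to CodeGen2, LLaMA 2, or GPT-4.

PROMPT_TEMPLATE = (
    "You are a security auditor. Classify the following code snippet as "
    "VULNERABLE or SAFE. Answer with a single word.\n\n"
    "Code:\n{code}\n\nAnswer:"
)

def build_prompt(code: str) -> str:
    """Fill the zero-shot template with the code snippet under review."""
    return PROMPT_TEMPLATE.format(code=code)

def parse_verdict(model_output: str) -> bool:
    """Map the model's free-text answer to a binary label (True = vulnerable)."""
    return "VULNERABLE" in model_output.strip().upper()

# Hypothetical snippet with a classic unbounded copy.
snippet = "strcpy(buf, user_input);"
prompt = build_prompt(snippet)

# A simulated model reply stands in for the actual API/inference call,
# so the parsing step can be shown end to end.
simulated_reply = "VULNERABLE"
print(parse_verdict(simulated_reply))  # True
```

Because the model's answer determines the binary label directly, metrics such as precision, recall, F1, and AUC can then be computed against a labelled test set exactly as for any binary classifier, which is how the models above are compared.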
Through this comparative analysis, the thesis underscores the importance of selecting tools aligned with organizational needs and constraints. The insights gained contribute to developing secure software by integrating machine learning into the SDLC, inspiring further progress in vulnerability detection and mitigating risks in an increasingly digital world.
Copyright holder
Author
Copyright notice
All rights reserved
