Utilizing artificial intelligence for website personality detection
Authors
Chishti, Shafquat Ali
Degree
Doctorate in Computing
Grantor
Unitec
Date
2026
Supervisors
Ardekani, Iman
Type
Doctoral Thesis
Keyword
website personalities
Website Personality Scale
machine learning
AI in website evaluation
websites
evaluation
library (computing)
automated evaluation
artificial intelligence
New Zealand
Citation
Chishti, S.A. (2026) Utilizing artificial intelligence for website personality detection. (Unpublished document submitted in partial fulfilment of the requirements for the degree of Doctor of Computing). Unitec.
https://hdl.handle.net/10652/7284
Abstract
MAIN RESEARCH QUESTION:
How can ML be used to automatically classify a website’s personality based on measurable quantitative attributes, without human subjective influence?
RQ1: How can the ‘Items’ defined in the WPS be systematically mapped to quantifiable website elements?
RQ2: How can quantitative website elements be accurately extracted for personality classification?
RQ3: How can ML modules be designed to classify website personality across multiple traits and dimensions?
RQ4: How can the developed modules be validated against human perception of website personality?
ABSTRACT
To evaluate the personality of a website, surveys are commonly employed among individuals who have engaged with the respective website. However, surveys inherently introduce human bias due to the subjective nature of human input. The determination of a website's personality should ideally be objective and devoid of human bias and preferences. To achieve the classification of a website's personality without human intervention, this thesis proposes a methodology grounded in automated quantitative analysis. This method involves assigning ratings to specific quantitative features of a website and then using these ratings to assess personality traits.
This research quantifies various elements of websites, using a database of 3000 websites for algorithm training and testing. Three data extraction tools, namely JSoup, Selenium WebDriver, and the IBM Tone Analyzer Service, are employed to extract data from the websites. Artificial intelligence (AI) techniques are used to gain insight from the collected data, reducing the reliance on human intervention in the data extraction process. The integration of AI, including its subsets machine learning (ML) and natural language processing (NLP), offers numerous enhancements to the data mining process. Four distinct ML algorithms are applied to the quantitative data acquired from the websites, each producing one module. The chosen algorithms are K-means, Expectation Maximization (EM), Hierarchical Agglomerative Clustering (HAC), and Density-Based Spatial Clustering of Applications with Noise (DBSCAN).
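As an illustration of what extracting quantitative website elements can look like, the following is a minimal sketch only. It uses Python's standard `html.parser` rather than the JSoup and Selenium WebDriver tools named above, and the feature names (`images`, `links`, `headings`, `words`) are hypothetical choices for the example, not the attributes actually used in the thesis.

```python
from collections import Counter
from html.parser import HTMLParser


class ElementCounter(HTMLParser):
    """Counts tags and visible words in an HTML document."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()
        self.word_count = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text that a visitor would actually see.
        if self._skip_depth == 0:
            self.word_count += len(data.split())


def extract_features(html):
    """Return a small dictionary of hypothetical quantitative features."""
    parser = ElementCounter()
    parser.feed(html)
    parser.close()
    heading_tags = ("h1", "h2", "h3", "h4", "h5", "h6")
    return {
        "images": parser.tags["img"],
        "links": parser.tags["a"],
        "headings": sum(parser.tags[h] for h in heading_tags),
        "words": parser.word_count,
    }
```

A feature dictionary of this kind, computed per website, is the sort of numeric input a clustering algorithm could consume.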
Each of these algorithms is examined in terms of its methodology, applicability, and performance in organizing data into meaningful groups. The K-means algorithm partitions data based on centroid averages and requires a predetermined number of clusters, while EM assigns data points to clusters probabilistically, accommodating various cluster shapes and sizes. HAC constructs a hierarchy of clusters through step-by-step merging, without prior knowledge of the number of clusters. DBSCAN identifies dense regions of data points separated by lower-density areas, allowing for flexible cluster shapes and handling outliers effectively. This comparative analysis provides insight into the strengths and limitations of each algorithm. The four algorithms belong to four different families of clustering methods, so they yield varied website identification outcomes, and agreement among them bolsters confidence in the obtained results.
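The centroid-averaging idea behind K-means can be sketched in a few lines. This is a generic, standard-library illustration and not the thesis's implementation: the deterministic initialization (the first `k` points) and the two-dimensional sample data are assumptions made to keep the example small and reproducible.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Cluster points into k groups; returns (labels, centroids)."""
    # Assumption: initialize centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

The number of clusters `k` must be supplied up front, which is the "predetermined number of clusters" requirement noted above. EM, HAC, and DBSCAN each relax a different part of this scheme.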
The thesis includes the development of a software tool designed to streamline various stages of the research process, including the creation of the website database, the extraction of quantitative data, the application of ML techniques for calculations and module development, survey processing, and the analysis of the acquired results.
The experiments are conducted individually for each module, using identical training and testing datasets. A survey is then administered, and the results obtained from the developed modules are validated against it. Analysis of the experiment outcomes verifies that the developed modules can accurately identify website personality with a success rate of up to 94% (with a Relative Error (Ratio) of ≤ 0.50), as confirmed on the validation dataset. Consequently, these modules enable the detection of a website's personality without relying on human input.
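To make the success-rate criterion concrete, the sketch below counts how many predictions fall within a relative error threshold of 0.50. The abstract does not define the Relative Error (Ratio), so the formula used here, |predicted − reference| / |reference|, is an assumption, and the function names and sample scores are hypothetical.

```python
def relative_error(predicted, reference):
    """Assumed definition: absolute difference divided by the reference value."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return abs(predicted - reference) / abs(reference)


def success_rate(predicted, reference, threshold=0.50):
    """Fraction of predictions whose relative error is within the threshold."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for p, r in zip(predicted, reference)
        if relative_error(p, r) <= threshold
    )
    return hits / len(predicted)


# Hypothetical module scores compared with hypothetical survey scores.
module_scores = [4.0, 2.0, 5.0, 1.0]
survey_scores = [4.0, 4.0, 4.0, 4.0]
rate = success_rate(module_scores, survey_scores)  # 3 of 4 within threshold
```

Under this assumed definition, a reported success rate of 94% would mean that 94% of the module outputs lie within the 0.50 threshold of the survey-derived values.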
This thesis discovers relationships between website attributes and website personality, with potential applications in various domains. For example, website developers can use the insights gained from this research to design websites that align with specific business requirements. By ensuring that the online portrayal of a brand matches its desired perception, developers can enhance customer perceptions and foster long-term loyalty, which is crucial in today's digital landscape where authentic and engaging online experiences are sought after. Another possible application, particularly relevant in today's culture, is the development of cultural websites aligned with New Zealand initiatives and movements, media, and the general interest of the public. This can serve as groundwork for individuals or organizations interested in creating diverse cultural platforms online. Drawing on insights into website attributes such as logos, imagery, content, and features, developers can craft culturally resonant websites that contribute to cross-cultural understanding in a globalized digital world. Beyond these general examples, the thesis may also serve as groundwork for further studies in other disciplines, with broad applicability to various aspects of contemporary life.
Copyright holder
Author
Copyright notice
All rights reserved
