Al-Hikmah University Central Journal
COMPARING THE EFFICIENCY OF LOGISTIC REGRESSION CLASSIFIER AND K-NEAREST NEIGHBOURS CLASSIFIERS FOR PREDICTING STUDENTS' PERFORMANCE IN COMPUTER SCIENCE PROGRAMME
Abstract
Accurately predicting eligibility for a Master's programme from performance in a Computer
Science Bachelor's degree is vital. Although K-nearest neighbours (KNN) is commonly used for
prediction, there is a gap in comparing it with the Logistic Regression Classifier (LRC). This study aimed
to address this gap by identifying the most suitable classifier between LRC and KNN for
accurately predicting students' performance in the computer science programme. Both
classifiers were modelled in WEKA and evaluated with 10-fold cross-validation, and a
comprehensive set of performance metrics was computed for each. The study used secondary data from Al-Hikmah
University, Ilorin, Nigeria (2009-2015) on the academic performance of computer science
students. The dataset contained 478 instances, each with 7 attributes, comprising three
categorical and four numeric features. The class label Yi (YES or NO) indicated whether a
student met the minimum admission requirements; the grade scale underlying the labels was
1.0-1.49 (pass), 1.50-2.3 (third class honours), 2.40-3.49 (second class honours, lower
division), 3.5-4.49 (second class honours, upper division) and 4.5-5.0 (first class
honours). LRC outperformed KNN (with tuning parameter k = 1 and Euclidean distance as the distance metric)
across multiple metrics, including accuracy (94.7699% vs. 89.9582%), precision (96.1% vs.
92.7%), recall (96.9% vs. 93.8%), F-measure (96.5% vs. 93.3%), ROC Area (97.5% vs.
85.6%), and error rate (5.2301% vs. 10.0418%). KNN, however, had a shorter processing
time than LRC (0.01 s vs. 0.07 s). The optimal KNN configuration for the
model was observed at k = 3. The study recommends LRC as the preferred
predictive model for students' performance in a computer science programme.
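
As a rough illustration of how the metrics quoted above relate to one another, the following Python sketch recovers the number of correctly classified instances implied by the reported accuracies on 478 instances, and computes accuracy, error rate, precision, recall and F-measure from confusion-matrix counts. It is not the WEKA procedure used in the study; the function names are hypothetical, and the confusion-matrix counts in the toy example are invented purely for illustration.

```python
from math import isclose


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure(precision, recall),
    }


# Figures taken from the abstract: 478 instances, reported accuracies.
N_INSTANCES = 478
lrc_correct = round(0.947699 * N_INSTANCES)  # 453 of 478
knn_correct = round(0.899582 * N_INSTANCES)  # 430 of 478

# Accuracy and error rate (in percent) implied by those counts.
lrc_accuracy_pct = 100 * lrc_correct / N_INSTANCES
knn_accuracy_pct = 100 * knn_correct / N_INSTANCES
lrc_error_pct = 100 - lrc_accuracy_pct
knn_error_pct = 100 - knn_accuracy_pct

# Toy confusion matrix (hypothetical counts, NOT from the study).
toy = metrics_from_counts(tp=8, fp=2, fn=1, tn=9)

if __name__ == "__main__":
    print(f"LRC: {lrc_correct}/{N_INSTANCES} correct, "
          f"accuracy {lrc_accuracy_pct:.4f}%, error {lrc_error_pct:.4f}%")
    print(f"KNN: {knn_correct}/{N_INSTANCES} correct, "
          f"accuracy {knn_accuracy_pct:.4f}%, error {knn_error_pct:.4f}%")
    print("Toy example:", toy)
```

Under these assumptions, the reported accuracy and error rate for each classifier sum to 100% and correspond to whole numbers of correctly classified instances out of 478.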