Algorithmic Showdown: Analysing the Performance of KNN, Random Forest, and SVM across Varied Dimensions

Authors

  •   S. Ramalakshmi, Assistant Professor, Department of Computer Science & Applications, Don Bosco College (Arts & Science), Karaikal - 609 601, Puducherry
  •   G. Asha, Assistant Professor & Head, Department of Computer Applications, Don Bosco College (Arts & Science), Karaikal - 609 601, Puducherry

DOI:

https://doi.org/10.17010/ijcs/2025/v10/i1/174923

Keywords:

Data Classification, Data Analysis, KNN, Machine Learning, Random Forest, Support Vector Machine (SVM)

Paper Submission Date: January 9, 2025
Paper Sent Back for Revision: January 18, 2025
Paper Acceptance Date: January 20, 2025
Paper Published Online: February 5, 2025

Abstract

Data classification plays an essential role in machine learning and data analysis by grouping data points into predetermined categories based on their attributes. The goal is to develop models that can reliably determine the category of new, previously unseen data. This process is integral to many fields such as healthcare, finance, and image recognition, as it supports automated decision-making. Achieving high accuracy in classification involves choosing the best algorithm for the specific dataset, because different algorithms perform differently depending on the data’s distribution and features.

The present study examines and compares the effectiveness of three popular classification algorithms: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Random Forest. Each algorithm employs a distinct classification strategy: KNN relies on distance-based learning, SVM constructs optimal decision boundaries, and Random Forest aggregates multiple decision trees to improve accuracy. To evaluate their performance thoroughly, we conduct experiments on two separate datasets, assessing each algorithm's ability to classify instances accurately under varying conditions.
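To illustrate the three strategies named above, a minimal scikit-learn sketch on a standard benchmark dataset might look like the following. The dataset (Iris), hyperparameters, and train/test split are illustrative assumptions, not the paper's actual experimental setup:

```python
# Illustrative sketch (assumed setup, not the study's): the three classifiers
# compared in the paper, each fit on the same data.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),   # distance-based learning
    "SVM": SVC(kernel="rbf"),                     # optimal decision boundary
    "Random Forest": RandomForestClassifier(      # ensemble of decision trees
        n_estimators=100, random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy {model.score(X_test, y_test):.3f}")
```

The shared fit/score interface is what makes this kind of head-to-head comparison straightforward: each model is swapped in without changing the surrounding evaluation code.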

Our assessment aims to determine the highest and lowest performing scenarios for each algorithm, considering factors like accuracy, computational efficiency, and resilience to data variations. By examining the results, we offer valuable insights into the strengths and weaknesses of these models, helping researchers and practitioners choose the most suitable classification algorithm for their specific needs. This study enhances the understanding of how various machine learning models perform across different datasets, ultimately supporting the creation of more dependable and efficient classification systems.
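A hedged sketch of the kind of comparison described here, reporting mean cross-validated accuracy alongside wall-clock evaluation time for each model, could be written as follows. The dataset (breast cancer) and default hyperparameters are assumptions for illustration only:

```python
# Illustrative evaluation sketch (assumed dataset and defaults): comparing
# accuracy and computational cost across the three classifiers.
import time

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

results = {}
for name, model in {
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "Random Forest": RandomForestClassifier(random_state=0),
}.items():
    start = time.perf_counter()
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    elapsed = time.perf_counter() - start
    results[name] = (scores.mean(), elapsed)
    print(f"{name}: mean accuracy {scores.mean():.3f}, time {elapsed:.2f}s")
```

Cross-validation averages out the luck of a single split, which matters when judging "resilience to data variations"; the timing column captures the efficiency trade-off the study weighs against accuracy.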


Published

2025-02-05

How to Cite

Ramalakshmi, S., & Asha, G. (2025). Algorithmic Showdown: Analysing the Performance of KNN, Random Forest, and SVM across Varied Dimensions. Indian Journal of Computer Science, 10(1), 34–45. https://doi.org/10.17010/ijcs/2025/v10/i1/174923
