Abstract

Diabetes mellitus is a global health concern, in part because most cases are identified only after complications arise. Early detection is therefore essential for limiting the health and financial consequences of the disease. This study compares two gradient boosting machine learning models, Extreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM). RandomizedSearchCV was used to tune the models' hyperparameters and optimize their performance. The models were evaluated using accuracy, precision, recall, F1 score, and ROC-AUC. Based on the results, LightGBM outperformed XGBoost. The LightGBM model achieved a classification accuracy of 77.3%, a precision of 71.1%, and a recall of 59.3%, which is the same value obtained by the XGBoost model. However, the LightGBM model had a higher F1 score of 64.6% and a ROC-AUC of 83.0%, indicating that it is more balanced and better at distinguishing between the two classes. The best-performing model was integrated into a web-based system built with the Streamlit framework to provide a responsive, interactive, and user-friendly interface. The system supports early detection of diabetes mellitus and allows non-experts to assess whether a patient is at risk of developing the disease through simple data input and real-time prediction. The results show that gradient boosting models can be used to support the diagnosis and early detection of diabetes mellitus.
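As an illustration of how the evaluation metrics named in the abstract relate to one another, the following is a minimal Python sketch using only the standard library. It is not the authors' code: the function names, the confusion-matrix counts, and the toy scores are hypothetical and chosen only for demonstration. ROC-AUC is computed here with the pairwise-ranking definition (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case), which is an assumption about presentation rather than a description of the study's tooling.

```python
from typing import Dict, Sequence


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = f1_from_precision_recall(precision, recall)
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    print(classification_metrics(tp=3, fp=1, fn=2, tn=4))
    # Precision and recall as reported in the abstract for LightGBM.
    print(round(f1_from_precision_recall(0.711, 0.593), 3))
    # Toy labels and predicted probabilities, not taken from the study.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Plugging the reported precision (71.1%) and recall (59.3%) into the harmonic-mean formula gives roughly 0.647, which agrees with the reported F1 score of 64.6% once rounding of the inputs is taken into account.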

Keywords

Diabetes, XGBoost, LightGBM, Machine Learning, Streamlit

Article Details

How to Cite
Wahyuningtyas, S., & Al Hafidz, M. (2026). A Comparative Evaluation of XGBoost and LightGBM for Diabetes Mellitus Risk Prediction Using a Public Dataset and Web-Based Dashboard. EKSAKTA: Journal of Sciences and Data Analysis, 7(1). https://doi.org/10.20885/EKSAKTA.vol7.iss1.art11
