COMPARISON OF ALGORITHMS USING A RANDOM UNDERSAMPLING APPROACH TO HANDLE CLASS IMBALANCE IN SOFTWARE DEFECT PREDICTION

  • Ginabila Ginabila Program Pascasarjana Magister Ilmu Komputer STMIK Nusa Mandiri
  • Ahamd Fauzi STMIK Nusa Mandiri
Keywords: Naive Bayes, J48, Random Forest, Random Undersampling, Software Defect Prediction.

Abstract

Testing is a standard step in producing quality software. In software defect prediction, prediction errors are costly: incorrect or unsuitable data sets yield inaccurate predictions, which in turn affect the software itself. This study addresses class imbalance in software defect prediction data sets with a data-level approach, Random Undersampling (RUS), and compares the accuracy of three algorithms, Naive Bayes (NB), J48, and Random Forest (RF), to identify which gives the best results. With Random Undersampling applied, Random Forest achieved the highest accuracy, 71.932%.
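
As an illustration of the data-level idea named in the abstract, the following is a minimal sketch of random undersampling: rows from the larger class are discarded at random until every class is as large as the smallest one. It is not the authors' implementation, and the abstract does not state which tooling they used. The function name `random_undersample`, the toy rows, and the labels "clean" and "defective" are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class has as
    many rows as the smallest class. Returns new (rows, labels) lists."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Target size per class is the size of the rarest class.
    target = min(len(indices) for indices in by_class.values())

    # Sample `target` indices from each class without replacement.
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))
    kept.sort()  # keep the surviving rows in their original order

    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy data (made up for illustration): 8 clean modules, 2 defective ones.
toy_rows = [[i, i * 2] for i in range(10)]
toy_labels = ["clean"] * 8 + ["defective"] * 2

balanced_rows, balanced_labels = random_undersample(toy_rows, toy_labels, seed=42)
class_counts = Counter(balanced_labels)
print(class_counts)
```

The balanced rows would then be passed to each classifier under comparison. Because undersampling discards majority-class data, results can vary with the random seed, so fixing the seed helps keep a comparison reproducible.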

Published
2019-03-07
How to Cite
Ginabila, G., & Fauzi, A. (2019). KOMPARASI ALGORITMA DENGAN PENDEKATAN RANDOM UNDERSAMPLING UNTUK MENANGANI KETIDAKSEIMBANGAN KELAS PADA PREDIKSI CACAT SOFTWARE. Jurnal Pilar Nusa Mandiri, 15(1), 27-34. https://doi.org/10.33480/pilar.v15i1.28