CLASSIFICATION OF CUSTOMERS’ REPEAT ORDER PROBABILITY USING DECISION TREE, NAÏVE BAYES AND RANDOM FOREST

Keywords: customer classification, decision tree, e-commerce, Naive Bayes, random forest

Abstract

Limited customer information in e-commerce sales data in Indonesia makes it difficult for companies to design targeted marketing strategies, especially strategies aimed at customers who are likely to make repeat purchases. E-commerce platforms now hide customers' names and cellphone numbers, so the only data available to sellers are the products purchased, the number of purchases, and customer addresses. Existing methods for identifying potential customers mostly rely on more complete feature sets, and research that uses such limited e-commerce data for this purpose is scarce. Several algorithms have been widely used to predict repeat purchases in e-commerce, but their performance has not yet been compared in the context of Indonesian e-commerce with limited data. This research compares the Decision Tree, Naive Bayes, and Random Forest methods for classifying potential customers, using Maschere brand sales data from two e-commerce sites, Tokopedia and Shopee. The results show that Decision Tree achieved an accuracy of 90.91%, Naive Bayes achieved 37.50%, and Random Forest achieved the highest accuracy, 93.94%. Among the three methods compared, Random Forest is therefore the most accurate for classifying customers' probability of repeat purchase on this dataset. In the future, these results can be developed further into a decision-making system for identifying potential customers.
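
As a rough illustration of classification from the three available features (product, number of purchases, customer address), the sketch below trains a small categorical Naive Bayes classifier with Laplace smoothing and reports its accuracy. It is not the authors' implementation. The records, the feature encoding (`quantity_band`, city names), the label names, and the function names are all hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy records: (product, quantity_band, city) -> repeat-order label.
# These rows are invented for illustration; they are NOT the Maschere sales data.
TRAIN = [
    (("mask_a", "many", "jakarta"), "repeat"),
    (("mask_a", "many", "bandung"), "repeat"),
    (("mask_b", "many", "jakarta"), "repeat"),
    (("mask_b", "few", "surabaya"), "no_repeat"),
    (("mask_a", "few", "surabaya"), "no_repeat"),
    (("mask_b", "few", "bandung"), "no_repeat"),
]
TEST = [
    (("mask_a", "many", "jakarta"), "repeat"),
    (("mask_b", "few", "surabaya"), "no_repeat"),
    (("mask_a", "few", "bandung"), "no_repeat"),
]


def fit_naive_bayes(rows):
    """Count class frequencies and per-feature value frequencies per class."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (feature_index, label) -> value counts
    vocab = defaultdict(set)             # feature_index -> distinct values seen
    for features, label in rows:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, features):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, value in enumerate(features):
            count = value_counts[(i, label)][value]
            # Add-one smoothing keeps unseen values from zeroing the posterior.
            score += math.log((count + 1) / (n + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(predict(model, features) == label for features, label in rows)
    return correct / len(rows)


model = fit_naive_bayes(TRAIN)
print(f"accuracy = {accuracy(model, TEST):.2%}")
```

Accuracy here is simply the share of test records classified correctly, which is the metric the abstract reports for each of the three methods. The same `accuracy` helper could score any other classifier that exposes a comparable `predict` function.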

Published
2024-03-29
How to Cite
Dewi, A., Hermawan, A., & Avianto, D. (2024). CLASSIFICATION OF CUSTOMERS’ REPEAT ORDER PROBABILITY USING DECISION TREE, NAÏVE BAYES AND RANDOM FOREST. Jurnal Pilar Nusa Mandiri, 20(1), 52-59. https://doi.org/10.33480/pilar.v20i1.5243