A COMPARATIVE EVALUATING NUMERICAL MEASURE VARIATIONS IN K-MEDOIDS CLUSTERING FOR EFFECTIVE DATA GROUPING

  • Relita Buaton Sekolah Tinggi Manajemen Informatika dan Komputer Kaputama
  • Solikhun Solikhun STIKOM Tunas Bangsa
Keywords: clustering, comparison, distance metrics, k-medoids, numerical measure

Abstract

K-Medoids clustering is widely used by researchers to group data. This study addresses how the choice of distance measure affects the quality of the clusters that K-Medoids produces. It applies K-Medoids with several numerical distance measures and compares the resulting Davies-Bouldin Index (DBI) values to identify the most effective measure. The measures examined are Manhattan distance, Jaccard similarity, Dynamic Time Warping distance, cosine similarity, Chebyshev distance, Canberra distance, and Euclidean distance. The dataset consists of sales data from Devi Cosmetics for January through April 2022, comprising 56 distinct sales items. The research evaluates each of these numerical measures within the K-Medoids clustering algorithm. The findings indicate that the best clustering is obtained with the Chebyshev distance, which yields 9 clusters with a DBI value of 166.632. The study's contribution is a basis for choosing a distance measure that produces better data groupings and thereby supports more accurate decision-making.
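The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy points, the fixed starting medoids, and the simple alternating update (used here in place of a full PAM swap search) are all assumptions made for illustration. The DBI variant shown treats each medoid as its cluster's center, which is also an assumption, and only four of the seven measures are covered. By the usual reading of the index, a lower DBI indicates more compact, better-separated clusters.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def canberra(a, b):
    # Coordinate pairs that are both zero are skipped to avoid 0/0.
    return sum(abs(x - y) / (abs(x) + abs(y))
               for x, y in zip(a, b) if abs(x) + abs(y) > 0)

def assign(points, medoids, dist):
    # Label each point with the position of its nearest medoid.
    return [min(range(len(medoids)), key=lambda c: dist(p, points[medoids[c]]))
            for p in points]

def k_medoids(points, k, dist, init=None, max_iter=100, seed=0):
    # Alternating assign/update heuristic (assumption: simpler than full PAM).
    # Assumes distinct points, so every medoid keeps at least itself as a member.
    if init is not None:
        medoids = list(init)
    else:
        medoids = random.Random(seed).sample(range(len(points)), k)
    for _ in range(max_iter):
        labels = assign(points, medoids, dist)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # New medoid: the member with the smallest total distance to the rest.
            new_medoids.append(min(
                members,
                key=lambda i: sum(dist(points[i], points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids, dist)

def davies_bouldin(points, medoids, labels, dist):
    # DBI with medoids as cluster centers (assumption; centroids are more usual).
    k = len(medoids)
    scatter = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        scatter.append(sum(dist(p, points[medoids[c]]) for p in members) / len(members))
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j]) / dist(points[medoids[i]], points[medoids[j]])
                     for j in range(k) if j != i)
    return total / k

# Invented toy data: two well-separated groups (not the Devi Cosmetics sales data).
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
measures = [("euclidean", euclidean), ("manhattan", manhattan),
            ("chebyshev", chebyshev), ("canberra", canberra)]
for name, dist in measures:
    medoids, labels = k_medoids(points, 2, dist, init=[0, 4])
    print(name, labels, round(davies_bouldin(points, medoids, labels, dist), 4))
```

Because each measure scales distances differently, DBI values computed under different measures are not on a common scale; the sketch only shows the mechanics of running one clustering per measure and scoring each result.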


Published
2024-11-19
How to Cite
[1]
R. Buaton and S. Solikhun, “A COMPARATIVE EVALUATING NUMERICAL MEASURE VARIATIONS IN K-MEDOIDS CLUSTERING FOR EFFECTIVE DATA GROUPING”, jitk, vol. 10, no. 2, pp. 394–403, Nov. 2024.