OPTIMIZING DECISION TREE PERFORMANCE WITH RECURSIVE FEATURE ELIMINATION FOR HIGH-DIMENSIONAL MUSHROOM CLASSIFICATION

Authors

  • Lili Tanti, Universitas Potensi Utama
  • Safrizal, Universitas Muhammadiyah Asahan
  • Yan Yang Thanri, Universitas Potensi Utama

DOI:

https://doi.org/10.33480/jitk.v11i2.6816

Keywords:

Decision Tree, Hyperparameter Tuning, Mushroom Classification, Recursive Feature Elimination (RFE)

Abstract

Classifying mushroom species is a significant challenge in biological data analysis because of the wide variety of species and their distinct attributes. This research investigates the effectiveness of the Decision Tree classifier for mushroom categorization by comparing two splitting criteria, the Gini Index and Entropy. The study also applies Recursive Feature Elimination (RFE) for dimensionality reduction to improve model efficiency and performance. The dataset was collected, cleaned, and examined through exploratory data analysis before features were selected with RFE. The Decision Tree model was then trained and evaluated using accuracy, precision, recall, and F1-score. The results showed that applying RFE improved computational efficiency without compromising model accuracy. The Gini criterion gave more stable results across all metrics, while Entropy achieved higher precision in certain cases. Hyperparameter tuning identified max_depth = 5, min_samples_leaf = 5, and min_samples_split = 10 as the best parameter combination. The study concludes that integrating RFE with the Decision Tree can significantly enhance classification performance on high-dimensional datasets. The findings are expected to serve as a reference for developing efficient and accurate biological data classification models.
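
The workflow summarized in the abstract (RFE for feature selection, a Decision Tree trained with either the Gini or the Entropy criterion, then hyperparameter tuning) might be sketched with scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic stand-ins generated with make_classification rather than the mushroom dataset, the number of retained features (N_SELECTED) and the parameter grid (PARAM_GRID) are hypothetical, and grid search is assumed as the tuning method because the abstract does not name one. The grid is chosen only so that it contains the combination reported in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic binary data stands in for the mushroom dataset.
X, y = make_classification(
    n_samples=600, n_features=30, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# ASSUMPTION: hypothetical feature count and search grid (not from the paper).
N_SELECTED = 10
PARAM_GRID = {
    "max_depth": [3, 5, 10],
    "min_samples_leaf": [1, 5],
    "min_samples_split": [2, 10],
}

results = {}
for criterion in ("gini", "entropy"):
    # RFE repeatedly fits the tree and drops the least important feature
    # (step=1) until N_SELECTED features remain.
    selector = RFE(
        DecisionTreeClassifier(criterion=criterion, random_state=0),
        n_features_to_select=N_SELECTED,
        step=1,
    )
    selector.fit(X_train, y_train)
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    # Tune the tree on the reduced feature set with 5-fold cross-validation.
    grid = GridSearchCV(
        DecisionTreeClassifier(criterion=criterion, random_state=0),
        param_grid=PARAM_GRID,
        cv=5,
        scoring="accuracy",
    )
    grid.fit(X_train_sel, y_train)
    y_pred = grid.predict(X_test_sel)

    # Record the four metrics named in the abstract on the held-out split.
    results[criterion] = {
        "n_features": X_train_sel.shape[1],
        "best_params": grid.best_params_,
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }

for criterion, summary in results.items():
    print(criterion, summary)
```

Scores from this sketch describe the synthetic data only and say nothing about the results reported in the paper. Comparing the two entries of results side by side is one simple way to see how the splitting criterion affects each metric after feature selection.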

Published

2026-01-12

How to Cite

[1]
L. Tanti, Safrizal, and Y. Y. Thanri, “OPTIMIZING DECISION TREE PERFORMANCE WITH RECURSIVE FEATURE ELIMINATION FOR HIGH-DIMENSIONAL MUSHROOM CLASSIFICATION”, jitk, vol. 11, no. 2, pp. 591–601, Jan. 2026.