
  • Eri Mardiani Universitas Nasional
  • Nur Rahmansyah Politeknik Negeri Media Kreatif
  • Andy Setiawan UPN Veteran Jakarta
  • Zakila Cahya Ronika UPN Veteran Jakarta
  • Dini Fatihatul Hidayah UPN Veteran Jakarta
  • Atira Syakira UPN Veteran Jakarta



algorithm method comparison, data mining, income classification, Orange


Using an income classification dataset, we applied data mining to extract useful information from the available data. Data processing can now be done with many tools; one of the tools we used is the Orange application. With this dataset, we examined welfare-related attributes such as marital status, schooling, and gender, together with income-related fields ranging from sales to daily life, to find out the income earned by employees or workers from several countries, including the United States, Cambodia, United Kingdom, Puerto-Rico, Canada, Germany, and Outer US (Guam-USVI-etc). The purpose of this analysis is to determine how the number of hours worked per week can affect the income classification. The classification uses several models, namely the K-Nearest Neighbor (KNN) algorithm, Naïve Bayes, Decision Tree, Ensemble Method, and Linear Regression. Based on the test results of these models, we conclude that the best model for measuring workers' income is the Naive Bayes Decision. The analysis indicates that the Hours-per-Week and Capital-Gain variables affect the income classification, which determines whether the income earned is more than 50 thousand (50K), and it yields a prediction of a person's income level.
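The kind of model comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Orange workflow used in the study: the rows are synthetic toy values invented for the example (not taken from the dataset), only two of the five models are shown, the function names are hypothetical, and leave-one-out evaluation is an assumed choice.

```python
import math
from collections import Counter

# Synthetic toy rows, NOT taken from the study's dataset:
# (hours_per_week, capital_gain, income_label)
ROWS = [
    (40, 0, "<=50K"), (35, 0, "<=50K"), (20, 0, "<=50K"),
    (45, 500, "<=50K"), (38, 0, "<=50K"), (30, 200, "<=50K"),
    (50, 7000, ">50K"), (60, 15000, ">50K"), (55, 5000, ">50K"),
    (45, 9000, ">50K"), (65, 12000, ">50K"), (48, 8000, ">50K"),
]


def knn_predict(train, x, k=3):
    """Majority vote among the k nearest training rows (Euclidean distance).

    Features are left unscaled here, so capital_gain dominates the distance.
    """
    nearest = sorted(train, key=lambda r: math.dist(r[:2], x))[:k]
    return Counter(r[2] for r in nearest).most_common(1)[0][0]


def gaussian_nb_predict(train, x):
    """Gaussian Naive Bayes: return the class with the highest log posterior."""
    best_label, best_score = None, -math.inf
    for label in sorted({r[2] for r in train}):
        rows = [r for r in train if r[2] == label]
        score = math.log(len(rows) / len(train))  # log prior
        for j, value in enumerate(x):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-6
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def leave_one_out_accuracy(predict, rows):
    """Hold out each row in turn, train on the rest, and count correct predictions."""
    hits = 0
    for i, row in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        hits += predict(train, row[:2]) == row[2]
    return hits / len(rows)


results = {
    "kNN": leave_one_out_accuracy(knn_predict, ROWS),
    "Naive Bayes": leave_one_out_accuracy(gaussian_nb_predict, ROWS),
}
for name, accuracy in results.items():
    print(f"{name}: leave-one-out accuracy = {accuracy:.2f}")
```

The accuracies printed for these toy rows say nothing about the real dataset; the sketch only shows how several classifiers can be scored on the same rows and ranked by a common metric.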


