Comparative Analysis of House Price Prediction with Random Forest, Gradient Boosting, and XGBoost
DOI:
https://doi.org/10.57255/intellect.v4i1.1385
Keywords:
House Prices, Regression, XGBoost, Random Forest, Gradient Boosting, Yogyakarta
Abstract
House price prediction is a significant challenge in the property sector, especially in the Yogyakarta region, where prices vary widely. This study compares the performance of three regression algorithms, namely Random Forest, Gradient Boosting, and XGBoost, in building predictive models from features such as land area, building area, number of bedrooms, number of bathrooms, and garage availability. The dataset consists of 1,642 entries, with house prices ranging from IDR 7 million to IDR 4.37 billion, a mean price of IDR 1.14 billion, and a mode of IDR 775 million. The models were evaluated using Mean Squared Error (MSE) and the coefficient of determination (R²). XGBoost achieved the best performance, with an MSE of 1.56 × 10¹⁴ IDR², an R² of 0.7746, and a Root Mean Squared Error (RMSE) of approximately IDR 12.5 million. These results indicate that XGBoost outperforms the other two models in handling complex tabular data and provides more accurate predictions. The predictive model could be used by property developers, real estate agents, and local governments as a decision-support tool for price estimation, market evaluation, and data-driven urban planning. The findings show that selecting an appropriate algorithm can significantly improve the quality of house price prediction.
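As a quick consistency check on the reported metrics, the sketch below recomputes the RMSE from the stated MSE and shows how MSE and R² are defined. It is a minimal illustration, not the study's code: the function names and the toy values are assumptions introduced here, and only the MSE of 1.56 × 10¹⁴ IDR² is taken from the abstract.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Mean of squared residuals; units are the target's units squared."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are equal")
    return 1.0 - ss_res / ss_tot


def rmse_from_mse(mse):
    """RMSE is the square root of MSE, so it is in the target's own units."""
    return math.sqrt(mse)


# Value taken from the abstract: MSE reported for XGBoost, in IDR squared.
reported_mse = 1.56e14
reported_rmse = rmse_from_mse(reported_mse)
print(f"RMSE implied by the reported MSE: IDR {reported_rmse / 1e6:.2f} million")

# Toy values (hypothetical, not from the dataset) to exercise the definitions.
toy_true = [1.0, 2.0, 3.0, 4.0]
toy_pred = [1.0, 2.0, 3.0, 2.0]
toy_mse = mean_squared_error(toy_true, toy_pred)
toy_r2 = r2_score(toy_true, toy_pred)
print(f"toy MSE = {toy_mse}, toy R^2 = {toy_r2:.2f}")
```

The square root of 1.56 × 10¹⁴ is about 1.249 × 10⁷, which agrees with the RMSE of roughly IDR 12.5 million stated above.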
Copyright (c) 2025 Bety Wulan, Donni Prabowo

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.















