STEMM Institute Press
Science, Technology, Engineering, Management and Medicine
Research on Asset Return Rate Prediction of Listed Companies Based on Random Forest and XGBoost Hybrid Artificial Intelligence Algorithm
DOI: https://doi.org/10.62517/jbdc.202601201
Author(s)
Yong Xiong1,2, Zhiming Wu3,*, Liu’an He1, Xiaoyan Zhang1, Zhu Zheng4, Fang Lan1, Hui Mi1, Yijia Liu1, Zongjun Lan1, Jinglin Huang1, Rui Li1, Meiyan Pang1
Affiliation(s)
1 Department of Accounting, Guangzhou College of Technology and Business, Guangzhou, Guangdong, China
2 Seokyeong University, Seoul, South Korea
3 Department of Accounting, Anhui Business and Technology College, Hefei, Anhui, China
4 Hainan Vocational University of Science and Technology, Haikou, Hainan, China
*Corresponding Author
Abstract
To improve the accuracy of return on assets (ROA) prediction for listed companies and to better support financial performance evaluation, early risk warning, and investment decision-making, this study proposes a hybrid artificial intelligence model that combines Random Forest and Extreme Gradient Boosting (XGBoost). The dataset comprises 76,567 panel observations of financial data for Chinese listed firms from 2001 to 2022, covering the real estate, manufacturing, transportation, and comprehensive industries. We select 44 financial indicators reflecting solvency, operating capacity, profitability, and growth ability as input features. After preprocessing that includes column-name standardization, interquartile range (IQR) outlier detection, imputation of missing values with industry medians, and construction of lagged features, 70,845 valid observations remain. The model has two stages: Random Forest produces a base prediction and performs feature selection, and XGBoost is then fitted to the residuals of that prediction to reduce the remaining error. Empirical results show that the hybrid model achieves a test R² of 0.9279, an MAE of 0.004960, and an RMSE of 0.011964, significantly outperforming traditional decision tree and linear regression models. With its strong nonlinear fitting and autonomous learning ability, the model provides reliable technical support for financial data analysis and intelligent decision-making in listed companies.
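The preprocessing steps named in the abstract (IQR outlier detection, industry-median imputation, lagged feature construction) can be illustrated with a minimal sketch. This is not the authors' code: the field names (`firm`, `year`, `industry`, `roa`, `debt_ratio`), the toy records, the Tukey multiplier of 1.5, and the choice to drop outlying rows rather than cap them are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict
from statistics import median, quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) Tukey fences for a list of numbers.

    k = 1.5 is the conventional multiplier; the paper's value is not stated.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def impute_industry_median(rows, field):
    """Replace missing (None) values of `field` with the median of the same industry."""
    observed = defaultdict(list)
    for r in rows:
        if r[field] is not None:
            observed[r["industry"]].append(r[field])
    medians = {ind: median(vals) for ind, vals in observed.items()}
    filled = []
    for r in rows:
        new = dict(r)
        if new[field] is None:
            # Stays None if the industry has no observed value at all.
            new[field] = medians.get(new["industry"])
        filled.append(new)
    return filled


def add_lag(rows, field, lag=1):
    """Add `<field>_lag<lag>`: the same firm's value `lag` years earlier, or None."""
    lookup = {(r["firm"], r["year"]): r[field] for r in rows}
    name = f"{field}_lag{lag}"
    return [dict(r, **{name: lookup.get((r["firm"], r["year"] - lag))}) for r in rows]


def preprocess(rows):
    """Apply the three steps in the order the abstract lists them."""
    low, high = iqr_bounds([r["roa"] for r in rows])
    kept = [r for r in rows if low <= r["roa"] <= high]  # assumption: drop outliers
    filled = impute_industry_median(kept, "debt_ratio")
    return add_lag(filled, "roa")


# Hypothetical toy panel; the values are made up for the example.
rows = [
    {"firm": "A", "year": 2020, "industry": "manufacturing", "roa": 0.04, "debt_ratio": 0.50},
    {"firm": "A", "year": 2021, "industry": "manufacturing", "roa": 0.05, "debt_ratio": None},
    {"firm": "B", "year": 2020, "industry": "manufacturing", "roa": 0.03, "debt_ratio": 0.40},
    {"firm": "B", "year": 2021, "industry": "manufacturing", "roa": 0.06, "debt_ratio": 0.60},
    {"firm": "C", "year": 2021, "industry": "real_estate", "roa": 0.90, "debt_ratio": 0.70},
]

clean = preprocess(rows)
for record in clean:
    print(record)
```

In this toy panel, firm C's ROA lies outside the fences and the row is dropped, firm A's missing 2021 debt ratio is filled from the manufacturing median, and each firm's first year has no lagged ROA. In a real pipeline the lagged columns would typically be built only from information available before the prediction year, so that the model does not see the target period.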
Keywords
Artificial Intelligence; Ensemble Learning; Random Forest; XGBoost; Return on Assets; Financial Forecasting