Research on Identification of Taxpayers' Fraudulent Invoicing Behavior Based on Feature Engineering
DOI: 10.23977/ferm.2024.070524
Author(s)
Wei Liu 1, Jiyuan Chen 1, Lingjun Xiao 1, Yin Si 1, Jun Tang 1
Affiliation(s)
1 School of Information and Intelligent Engineering, Guangzhou Xinhua University, Guangzhou, China
Corresponding Author
Jun Tang
ABSTRACT
Identifying fraudulent invoicing is a key part of tax risk work. The central task is to determine accurately, from massive volumes of tax data, whether taxpayers are issuing fraudulent invoices, so as to reduce the loss of tax revenue. Existing tax data is large in volume and its features are poorly defined, and traditional machine learning models have limited generalization ability, which leads to poor performance in identifying fraudulent invoicing. To address these problems, this paper builds a high-quality sample dataset through tax feature engineering and proposes a learning model based on the Stacking ensemble learning approach to identify taxpayers' fraudulent invoicing behavior. On the tax sample dataset, the classification performance of the proposed Stacking-based model is compared with that of single models, and the results show that the Stacking-based model outperforms them in AUC, accuracy, and F1 score. The experimental results support the effectiveness of the proposed model.
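The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data is synthetic (scikit-learn's make_classification stands in for the engineered tax features), and the base learners (k-nearest neighbors, random forest) and the logistic regression meta-learner are assumptions chosen only for illustration. It shows how a Stacking model and single models could be scored on the same held-out split using AUC, accuracy, and F1.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic, imbalanced stand-in for an engineered tax-feature table
# (assumption: the paper's real dataset is not available here).
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical base learners; the paper's actual choices may differ.
base_learners = [
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]

# Stacking: out-of-fold predictions of the base learners (cv=5)
# become the training inputs of the meta-learner.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)


def evaluate(model):
    """Fit on the training split; return AUC, accuracy and F1 on the test split."""
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    preds = model.predict(X_test)
    return {
        "auc": roc_auc_score(y_test, scores),
        "accuracy": accuracy_score(y_test, preds),
        "f1": f1_score(y_test, preds),
    }


# Score each single model, then the stacked model, on the same split.
results = {name: evaluate(model) for name, model in base_learners}
results["stacking"] = evaluate(stack)

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

On synthetic data the ranking of the models carries no meaning for real tax data; the sketch only shows the shape of the evaluation. Training the meta-learner on out-of-fold predictions (the cv argument) rather than on in-sample predictions is what keeps the stacked model from simply memorizing the base learners' training-set outputs.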
KEYWORDS
False invoicing; Feature engineering; Integrated learning; Stacking
CITE THIS PAPER
Wei Liu, Jiyuan Chen, Lingjun Xiao, Yin Si, Jun Tang, Research on Identification of Taxpayers' Fraudulent Invoicing Behavior Based on Feature Engineering. Financial Engineering and Risk Management (2024) Vol. 7: 186-193. DOI: http://dx.doi.org/10.23977/ferm.2024.070524.