ARTICLE

Volume 1, Issue 9

20 November 2025

An HHG-CB-CSR Collaborative Model for Credit Feature Selection under Unequal Misclassification Costs

Jingsai Wang (王静赛)1, Zhibin Xiong (熊志斌)2
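1 School of Finance, Henan Finance University (河南财政金融学院), China
2 School of Mathematical Sciences, South China Normal University (华南师范大学), China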
ASDS 2025, 1(9), 55–60; https://doi.org/10.61369/ASDS.2025090010
© 2025 by the Author(s). Licensee Art and Design, USA. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
Abstract

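Feature selection is a key step in credit assessment. To address the class imbalance, unequal misclassification costs, and abundance of categorical features found in credit data, this paper first proposes a cost-sensitive recall (CSR) metric based on cost-sensitive learning. It then combines the Heller-Heller-Gorfine (HHG) test with CatBoost to build the HHG-CB-CSR credit feature selection method, in which the HHG test guides a sequential backward search, CatBoost serves as the learner, and CSR serves as both the feature-subset evaluation metric and the stopping criterion. The method addresses the difficulty of numerically encoding high-cardinality categorical features, measures feature relevance accurately, and makes the selection process cost-sensitive. Experiments on four credit datasets show that HHG-CB-CSR performs well on both traditional metrics and the CSR metric, with notable robustness and practical applicability.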
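The selection loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the CSR formula, so `cost_sensitive_recall` uses a hypothetical cost-weighted form; `relevance` stands in for per-feature HHG test statistics; and `evaluate` stands in for training a CatBoost model on a feature subset and scoring it with CSR on held-out data. The names `cost_fn`, `cost_fp`, and `tol` are illustrative.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def cost_sensitive_recall(y_true: Sequence[int], y_pred: Sequence[int],
                          cost_fn: float = 5.0, cost_fp: float = 1.0) -> float:
    """Hypothetical cost-weighted recall (ASSUMED form, not the paper's formula).

    Label 1 is the costly class (e.g. default). Each correctly classified
    sample is weighted by the cost of misclassifying its class.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p != 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != 1 and p != 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != 1 and p == 1)
    denom = cost_fn * (tp + fn) + cost_fp * (tn + fp)
    return (cost_fn * tp + cost_fp * tn) / denom if denom else 0.0


def backward_select(features: Sequence[str],
                    relevance: Dict[str, float],
                    evaluate: Callable[[List[str]], float],
                    tol: float = 0.0) -> Tuple[List[str], float]:
    """Sequential backward search guided by a relevance ranking.

    Features are tried for removal from least to most relevant. The search
    stops at the first removal that lowers the score by more than `tol`.
    """
    selected = list(features)
    best = evaluate(selected)
    for feat in sorted(features, key=lambda f: relevance[f]):
        if len(selected) == 1:
            break  # always keep at least one feature
        candidate = [f for f in selected if f != feat]
        score = evaluate(candidate)
        if score + tol < best:
            break  # stopping criterion: the subset score got worse
        selected, best = candidate, score
    return selected, best
```

In a fuller version, `relevance` might be computed with an HHG implementation and `evaluate` might fit a CatBoost classifier with the categorical columns passed through unencoded; both are left as callables here so the sketch stays independent of any particular library.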

Keywords
Credit assessment
Feature selection
Cost-sensitive recall
HHG test
CatBoost
