Research on performance enhancement of integrated learning algorithm based on sample weight allocation mechanism and Its Application in Image Classification

By: Yuting Zhang 1, Jiao Bao 1, Jueyan Li 2
1Chengdu Technological University, Chengdu, Sichuan, 610000, China
2460media, Adelaide, SA5000, Australia

Abstract

Aiming at the challenges of insufficient model generalization and computational inefficiency in class-imbalanced multiclass classification problems, this paper proposes an ensemble (integrated) learning algorithm optimization framework based on a sample weight allocation mechanism. A Gaussian mapping-enhanced G-SMOTE oversampling method is designed to dynamically adjust the boundary distribution weights of minority-class samples. Combining the fast binary classification property of TWSVM with the weight adaptation mechanism of AdaBoost, an ensemble model based on the OVO (one-versus-one) strategy is constructed. On the 2 datasets with low imbalance ratios, the average AUC of the G-SMOTE method is 0.891, which is higher than that of the original dataset, single downsampling, and SMOTE oversampling by 0.181, 0.187, and 0.137, respectively. On the 2 datasets with high imbalance ratios, the mean AUC is reported as 0.891, again higher than that of the original dataset, single downsampling, and SMOTE oversampling, so G-SMOTE also performs best there. AdaBoost-TWSVM converges faster than Pa_Ada and considerably faster than SWA_Adaboost and IPAB. On the four datasets, the test error of AdaBoost-TWSVM is lower than that of the other six algorithms by an average of 9.22, 15.41, 6.08, and 9.38 percentage points, respectively. Compared with TWSVM, the acceleration ratios of AdaBoost-TWSVM improve to some extent in all cases, and the effect is most significant on the high-dimensional dataset Kddcupbuffer, where the acceleration ratio of node 3 reaches 2.37 ± 0.03. The algorithm demonstrates strong parallel computing capability and scalability when handling large-scale datasets, making it suitable for the classification and detection of painting images.
When applied to the classification of Zhang Daqian’s early and late landscape paintings, the algorithm achieved comparatively satisfactory image classification accuracy.
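
The weight adaptation mechanism of AdaBoost referred to in the abstract can be illustrated with the classic binary AdaBoost update rule. The following is a minimal sketch under stated assumptions, not the paper's AdaBoost-TWSVM implementation: the function name `adaboost_reweight` and the toy labels are hypothetical, and the weak learner (TWSVM in the paper) is stood in for by a fixed prediction vector.

```python
import math


def adaboost_reweight(weights, y_true, y_pred):
    """One round of the classic binary AdaBoost sample weight update.

    weights: current sample weights (non-negative, summing to 1)
    y_true, y_pred: true and predicted labels, each in {-1, +1}
    Returns (alpha, new_weights).
    """
    # Weighted error of the weak learner under the current distribution.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    # Clamp the error so the logarithm below is always defined.
    eps = 1e-12
    err = min(max(err, eps), 1.0 - eps)
    # Learner weight: larger when the weighted error is smaller.
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified samples (t * p == -1) gain weight, correct ones lose it.
    raw = [w * math.exp(-alpha * t * p)
           for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(raw)
    # Renormalize so the weights again form a distribution.
    return alpha, [r / total for r in raw]


# Toy example (assumed data): five samples, only the last is misclassified.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]
weights = [0.2] * 5
alpha, new_weights = adaboost_reweight(weights, y_true, y_pred)
print(alpha, new_weights)
```

In this toy run the single misclassified sample ends up carrying half of the total weight, which is what pushes the next weak learner to concentrate on it. In an OVO setting, an update of this kind would be applied separately within each pairwise binary classifier.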