Random Balance ensembles for multiclass imbalance learning

Research output: Contribution to journal › Article › peer-review

Standard

Random Balance ensembles for multiclass imbalance learning. / Rodriguez, Juan; Diez-Pastor, Jose-Francisco; Arnaiz-Gonzalez, Alvar et al.
In: Knowledge-Based Systems, Vol. 193, 105434, 06.04.2020.


Harvard

Rodriguez, J, Diez-Pastor, J-F, Arnaiz-Gonzalez, A & Kuncheva, L 2020, 'Random Balance ensembles for multiclass imbalance learning', Knowledge-Based Systems, vol. 193, 105434. https://doi.org/10.1016/j.knosys.2019.105434

APA

Rodriguez, J., Diez-Pastor, J.-F., Arnaiz-Gonzalez, A., & Kuncheva, L. (2020). Random Balance ensembles for multiclass imbalance learning. Knowledge-Based Systems, 193, Article 105434. https://doi.org/10.1016/j.knosys.2019.105434

CBE

Rodriguez J, Diez-Pastor J-F, Arnaiz-Gonzalez A, Kuncheva L. 2020. Random Balance ensembles for multiclass imbalance learning. Knowledge-Based Systems. 193:Article 105434. https://doi.org/10.1016/j.knosys.2019.105434


Vancouver

Rodriguez J, Diez-Pastor JF, Arnaiz-Gonzalez A, Kuncheva L. Random Balance ensembles for multiclass imbalance learning. Knowledge-Based Systems. 2020 Apr 6;193:105434. Epub 2019 Dec 27. doi: 10.1016/j.knosys.2019.105434

Author

Rodriguez, Juan ; Diez-Pastor, Jose-Francisco ; Arnaiz-Gonzalez, Alvar et al. / Random Balance ensembles for multiclass imbalance learning. In: Knowledge-Based Systems. 2020 ; Vol. 193.

RIS

TY - JOUR

T1 - Random Balance ensembles for multiclass imbalance learning

AU - Rodriguez, Juan

AU - Diez-Pastor, Jose-Francisco

AU - Arnaiz-Gonzalez, Alvar

AU - Kuncheva, Ludmila

PY - 2020/4/6

Y1 - 2020/4/6

N2 - The Random Balance strategy (RandBal) has recently been proposed for constructing classifier ensembles for imbalanced, two-class data sets. In RandBal, each base classifier is trained with a sample of the data with a random class prevalence, independent of the a priori distribution. Hence, for each sample, one of the classes will be undersampled while the other will be oversampled. RandBal can be applied on its own or can be combined with any other ensemble method. One particularly successful variant is RandBalBoost, which integrates Random Balance and boosting. Encouraged by the success of RandBal, this work proposes two approaches which extend RandBal to multiclass imbalance problems. Multiclass imbalance implies that at least two classes have substantially different proportions of instances. In the first approach proposed here, termed Multiple Random Balance (MultiRandBal), we deal with all classes simultaneously. The training data for each base classifier are sampled with random class proportions. The second approach we propose decomposes the multiclass problem into two-class problems using one-vs-one or one-vs-all, and builds an ensemble of RandBal ensembles. We call the two versions of the second approach OVO-RandBal and OVA-RandBal, respectively. These two approaches were chosen because they are the most straightforward extensions of RandBal for multiple classes. Our main objective is to evaluate both approaches for multiclass imbalanced problems. To this end, an experiment was carried out with 52 multiclass data sets. The results suggest that both MultiRandBal and OVO/OVA-RandBal are viable extensions of the original two-class RandBal. Collectively, they consistently outperform acclaimed state-of-the-art methods for multiclass imbalanced problems.

AB - The Random Balance strategy (RandBal) has recently been proposed for constructing classifier ensembles for imbalanced, two-class data sets. In RandBal, each base classifier is trained with a sample of the data with a random class prevalence, independent of the a priori distribution. Hence, for each sample, one of the classes will be undersampled while the other will be oversampled. RandBal can be applied on its own or can be combined with any other ensemble method. One particularly successful variant is RandBalBoost, which integrates Random Balance and boosting. Encouraged by the success of RandBal, this work proposes two approaches which extend RandBal to multiclass imbalance problems. Multiclass imbalance implies that at least two classes have substantially different proportions of instances. In the first approach proposed here, termed Multiple Random Balance (MultiRandBal), we deal with all classes simultaneously. The training data for each base classifier are sampled with random class proportions. The second approach we propose decomposes the multiclass problem into two-class problems using one-vs-one or one-vs-all, and builds an ensemble of RandBal ensembles. We call the two versions of the second approach OVO-RandBal and OVA-RandBal, respectively. These two approaches were chosen because they are the most straightforward extensions of RandBal for multiple classes. Our main objective is to evaluate both approaches for multiclass imbalanced problems. To this end, an experiment was carried out with 52 multiclass data sets. The results suggest that both MultiRandBal and OVO/OVA-RandBal are viable extensions of the original two-class RandBal. Collectively, they consistently outperform acclaimed state-of-the-art methods for multiclass imbalanced problems.

KW - classifier ensembles

KW - Imbalanced data

KW - Multiclass classification

U2 - 10.1016/j.knosys.2019.105434

DO - 10.1016/j.knosys.2019.105434

M3 - Article

VL - 193

JO - Knowledge-Based Systems

JF - Knowledge-Based Systems

SN - 0950-7051

M1 - 105434

ER -
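
For readers who want a concrete feel for the sampling step described in the abstract, below is a minimal Python sketch of the MultiRandBal idea: each base classifier is trained on a sample whose class proportions are drawn at random, independently of the original class distribution. This is an illustrative reconstruction from the abstract only, not the authors' implementation; the Dirichlet draw for the class proportions, plain resampling with replacement in place of the SMOTE-based oversampling used in the original two-class RandBal, and the names random_balance_sample / fit_multirandbal / predict_multirandbal are all assumptions made for the sketch.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample


def random_balance_sample(X, y, rng):
    """Resample (X, y) so that the class proportions are random.

    Classes whose random share exceeds their original share end up
    oversampled; the others end up undersampled.
    """
    classes = np.unique(y)
    n = len(y)
    # Random class proportions from a flat Dirichlet distribution
    # (one plausible choice; the paper may specify a different scheme).
    proportions = rng.dirichlet(np.ones(len(classes)))
    counts = np.maximum(1, np.round(proportions * n).astype(int))

    X_parts, y_parts = [], []
    for c, k in zip(classes, counts):
        Xc, yc = X[y == c], y[y == c]
        # Plain resampling with replacement stands in for the SMOTE-based
        # oversampling of the original RandBal.
        Xr, yr = resample(Xc, yc, replace=True, n_samples=int(k),
                          random_state=int(rng.integers(2**31 - 1)))
        X_parts.append(Xr)
        y_parts.append(yr)
    return np.vstack(X_parts), np.concatenate(y_parts)


def fit_multirandbal(X, y, n_estimators=10, seed=0):
    """Train one decision tree per randomly rebalanced sample."""
    rng = np.random.default_rng(seed)
    return [DecisionTreeClassifier(random_state=seed).fit(
                *random_balance_sample(X, y, rng))
            for _ in range(n_estimators)]


def predict_multirandbal(ensemble, X):
    """Combine the base classifiers by majority vote."""
    votes = np.stack([clf.predict(X) for clf in ensemble])
    preds = []
    for column in votes.T:  # one column of votes per test instance
        values, counts = np.unique(column, return_counts=True)
        preds.append(values[counts.argmax()])
    return np.array(preds)

Usage is the usual fit/predict pattern, e.g. ensemble = fit_multirandbal(X_train, y_train, n_estimators=25) followed by y_pred = predict_multirandbal(ensemble, X_test). The OVO-RandBal and OVA-RandBal variants mentioned in the abstract would instead wrap a two-class RandBal ensemble inside a one-vs-one or one-vs-all decomposition of the multiclass problem.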