Building semi-supervised decision trees with semi-cart algorithm

Aydin Abedinia; Vahid Seydi

doi:10.1007/s13042-024-02161-z

Research output: Contribution to journal › Article › peer-review

Standard

Building semi-supervised decision trees with semi-cart algorithm. / Abedinia, Aydin; Seydi, Vahid.
In: International Journal of Machine Learning and Cybernetics, Vol. 15, No. 10, 01.10.2024, p. 4493-4510.

Harvard

Abedinia, A & Seydi, V 2024, 'Building semi-supervised decision trees with semi-cart algorithm', International Journal of Machine Learning and Cybernetics, vol. 15, no. 10, pp. 4493-4510. https://doi.org/10.1007/s13042-024-02161-z

Vancouver

Abedinia A, Seydi V. Building semi-supervised decision trees with semi-cart algorithm. International Journal of Machine Learning and Cybernetics. 2024 Oct 1;15(10):4493-4510. Epub 2024 Apr 24. doi: 10.1007/s13042-024-02161-z

Author

Abedinia, Aydin ; Seydi, Vahid. / Building semi-supervised decision trees with semi-cart algorithm. In: International Journal of Machine Learning and Cybernetics. 2024 ; Vol. 15, No. 10. pp. 4493-4510.

RIS

TY - JOUR

T1 - Building semi-supervised decision trees with semi-cart algorithm

AU - Abedinia, Aydin

AU - Seydi, Vahid

PY - 2024/10/1

Y1 - 2024/10/1

N2 - Decision trees are a fundamental statistical learning tool for addressing classification and regression problems through a recursive partitioning approach that effectively accommodates numerical and categorical data [1, 2]. The classification and regression tree (CART) algorithm underlies modern boosting methodologies such as gradient boosting machine (GBM), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM). However, the standard CART algorithm cannot learn from unlabeled data, which leaves room for improvement. This study proposes several modifications to incorporate test data into the training phase. Specifically, we introduce a method based on graph-based semi-supervised learning called “Distance-based Weighting,” which calculates and removes irrelevant records from the training set to accelerate the training process and improve performance. We present semi-supervised classification and regression tree (Semi-Cart), a new implementation of CART that constructs a decision tree using weighted training data. We evaluated its performance on thirteen datasets from various domains. Our results demonstrate that Semi-Cart outperforms standard CART methods and contributes to statistical learning.

U2 - 10.1007/s13042-024-02161-z

DO - 10.1007/s13042-024-02161-z

M3 - Article

VL - 15

SP - 4493

EP - 4510

JO - International Journal of Machine Learning and Cybernetics

JF - International Journal of Machine Learning and Cybernetics

SN - 1868-8071

IS - 10

ER -
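The abstract describes weighting training records by their distance to the test (unlabeled) data, dropping irrelevant records, and then growing a CART tree on the weighted data, but it does not give the formulas. The sketch below is only an illustration of that general idea under stated assumptions, not the authors' Semi-Cart implementation: the Gaussian kernel, the `bandwidth` and `min_weight` parameters, and all function names are hypothetical, and only a single weighted-Gini split is searched rather than a full tree.

```python
import math

def distance_weights(train_X, unlabeled_X, bandwidth=1.0):
    """Weight each training record by closeness to its nearest unlabeled record.

    Assumption: a Gaussian kernel on Euclidean distance; the paper's actual
    weighting scheme is not given in the abstract.
    """
    weights = []
    for x in train_X:
        nearest = min(math.dist(x, u) for u in unlabeled_X)
        weights.append(math.exp(-((nearest / bandwidth) ** 2)))
    return weights

def weighted_gini(labels, weights):
    """Gini impurity where each record counts with its weight."""
    total = sum(weights)
    if total == 0:
        return 0.0
    mass = {}
    for y, w in zip(labels, weights):
        mass[y] = mass.get(y, 0.0) + w
    return 1.0 - sum((m / total) ** 2 for m in mass.values())

def best_split(X, y, w):
    """Return (feature, threshold, impurity) of the best single CART-style split."""
    best = None
    total = sum(w)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            wl = sum(w[i] for i in left)
            wr = sum(w[i] for i in right)
            score = (
                wl * weighted_gini([y[i] for i in left], [w[i] for i in left])
                + wr * weighted_gini([y[i] for i in right], [w[i] for i in right])
            ) / total
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

# Toy data (made up for illustration): the last labeled record lies far
# from every unlabeled record.
train_X = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0), (50.0, 0.0)]
train_y = [0, 0, 1, 1, 0]
unlabeled_X = [(0.5, 0.0), (4.5, 0.0)]

weights = distance_weights(train_X, unlabeled_X)

# Hypothetical cut-off: records with negligible weight are treated as irrelevant.
min_weight = 1e-3
kept = [i for i, w in enumerate(weights) if w >= min_weight]
kept_X = [train_X[i] for i in kept]
kept_y = [train_y[i] for i in kept]
kept_w = [weights[i] for i in kept]

split = best_split(kept_X, kept_y, kept_w)
print(len(kept), split)
```

In this toy run the distant record is filtered out before the split search, so the remaining four records separate cleanly on the first feature. A full tree would apply the same weighted split search recursively to each side.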
