Semi-supervised extensions of multi-task tree ensembles
|Title||Semi-supervised extensions of multi-task tree ensembles|
|Publication Type||Journal Article|
|Year of Publication||2022|
|Authors||Adıyeke, E., and M. Gokce Baydogan|
|Keywords||Ensemble learning, Multi-objective trees, Multi-task learning, Semi-supervised learning, Totally randomized trees|
Scale inconsistency is a widely encountered issue in multi-output learning problems. Specifically, target sets with multiple real-valued variables, or a mixture of categorical and real-valued variables, require the scale differences to be addressed in order to obtain predictive models with sufficiently good performance. Data transformation techniques are often employed to solve this problem. However, these operations have shortcomings, such as changing the statistical properties of the data and increasing the computational burden. Scale differences also pose a problem in semi-supervised learning (SSL) models, as these require processing unsupervised information, for which distance measures are commonly employed. Classical distance metrics can be criticized because they, too, lose efficiency when variables differ in type or scale. Moreover, in higher dimensions distance metrics lose discriminative power. This paper introduces alternative semi-supervised tree-based strategies that are robust to scale differences in both feature and target variables. We propose the use of a scale-invariant proximity measure obtained from tree-based ensembles to preserve the original characteristics of the data. We update the classical tree derivation procedure to a multi-criteria form to resolve scale inconsistencies. We define proximity-based clustering indicators and extend the supervised model with unsupervised criteria. Our experiments show that the proposed method significantly outperforms its benchmark learning model, predictive clustering trees.
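
The idea of a tree-based, scale-invariant proximity can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a generic ensemble of totally randomized trees (random feature, random cut point, no target used) and defines the proximity of two samples as the fraction of trees in which they fall into the same leaf. The names `tree_proximity`, `_assign_leaves` and `stop_size` are hypothetical and introduced only for this illustration.

```python
import random


def _assign_leaves(X, rows, rng, stop_size, leaf_of, next_leaf):
    """Split `rows` at random until nodes are small; record a leaf id per row."""
    # Only features that still take more than one value here can split the node.
    feats = [j for j in range(len(X[0])) if len({X[i][j] for i in rows}) > 1]
    if len(rows) <= stop_size or not feats:
        for i in rows:
            leaf_of[i] = next_leaf[0]
        next_leaf[0] += 1
        return
    j = rng.choice(feats)                    # random feature, target never used
    lo = min(X[i][j] for i in rows)
    hi = max(X[i][j] for i in rows)
    thr = lo + rng.random() * (hi - lo)      # random cut point between lo and hi
    if thr >= hi:                            # guard against float rounding
        thr = lo
    left = [i for i in rows if X[i][j] <= thr]
    right = [i for i in rows if X[i][j] > thr]
    _assign_leaves(X, left, rng, stop_size, leaf_of, next_leaf)
    _assign_leaves(X, right, rng, stop_size, leaf_of, next_leaf)


def tree_proximity(X, n_trees=100, stop_size=2, seed=0):
    """Return an n x n matrix: share of trees in which two rows share a leaf."""
    n = len(X)
    rng = random.Random(seed)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_trees):
        leaf_of = [0] * n
        _assign_leaves(X, list(range(n)), rng, stop_size, leaf_of, [0])
        for a in range(n):
            for b in range(n):
                if leaf_of[a] == leaf_of[b]:
                    counts[a][b] += 1
    return [[c / n_trees for c in row] for row in counts]


# Two features on very different scales; no standardization is applied.
X = [[0.10, 200.0], [0.20, 210.0], [0.90, 900.0],
     [1.00, 950.0], [0.15, 205.0], [0.95, 920.0]]
P = tree_proximity(X)
```

Because each cut point is drawn relative to the range of the chosen feature within the node, multiplying a feature by a positive constant leaves the induced partitions, and hence the proximities, unchanged up to floating-point rounding. A Euclidean distance on the same data would instead be dominated by the second feature.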