November 2025
Maia, C.M., de Amorim, L.B.V., Cavalcanti, G.D.C. et al. MetaML: a multi-label meta-learning approach for pipeline recommendation. Mach Learn 114, 278 (2025). https://doi.org/10.1007/s10994-025-06909-8

Abstract
In the machine learning (ML) literature, AutoML refers to the automated definition of the sequence of steps needed to accomplish an ML task, such as classification. Each step of the ML pipeline, including data preprocessing and algorithm selection, normally allows extensive variation, leading to large search spaces that make it hard to find optimal pipelines for a given problem. Most AutoML approaches presented so far rely on Bayesian optimization methods, which have proven successful, albeit at high computational cost. We therefore propose MetaML, a method that employs meta-learning (MtL) to recommend pipelines while taking into account the interdependence of their steps. MtL allows us to shift the computational complexity to an offline training phase. At the same time, we address the search space complexity by designing an algorithm that carefully curates the pipeline candidates based on past ML experiments, optimizing both the training and the effective performance of the pipeline recommendation model. An analysis using 152 datasets shows that MetaML achieves final classification performance equivalent or superior to that of state-of-the-art methods in much less computation time. The source code for the experiments is available at the project’s repository (https://github.com/cynthiamaia/MetaML).
Authors
Cynthia Moreira Maia, Centro de Informática, Universidade Federal de Pernambuco, Recife, Brazil
Lucas B. V. de Amorim, Centro de Informática, Universidade Federal de Pernambuco, Recife, Brazil
George D. C. Cavalcanti, Centro de Informática, Universidade Federal de Pernambuco, Recife, Brazil
Rafael M. O. Cruz, École de Technologie Supérieure, Montreal, Canada