Non-sequential Pipelines and Tuning

Publication: Contribution to book/anthology/report › Contribution to book/anthology › Research › peer-reviewed

Martin Binder
Florian Pfisterer
Marc Becker
Marvin Nils Ole Wright

Real-world applications often require complicated pipelines that do not progress sequentially. For example, many experiments have demonstrated that bagging is a powerful method to improve model performance. Bagging can be thought of as a non-sequential pipeline in which a learner is replicated, each replica is trained and makes predictions, and the results are combined. It is non-sequential because the data does not flow through the pipeline one step at a time; instead it is passed to all learners (which may then subsample the data) and the outputs are recombined, creating a pipeline in which operations have multiple inputs and outputs. Pipeline operations also have hyperparameters that can be set and tuned to improve model performance. Moreover, the choice of operations to include in a pipeline can itself be tuned, a problem known as combined algorithm selection and hyperparameter optimization (CASH). This chapter looks at more advanced uses of mlr3pipelines. It puts them into practice by demonstrating how to build bagging and stacking pipelines from scratch, as well as how to access common pipelines that are readily available in mlr3pipelines. The chapter then looks at tuning pipelines and CASH.

Original language	English
Title	Applied Machine Learning Using mlr3 in R
Editors	Bernd Bischl, Raphael Sonabend, Lars Kotthoff, Michel Lang
Number of pages	22
Publisher	CRC Press
Publication date	2024
Pages	174-195
Chapter	8
ISBN (Print)	978-1-032-51567-0, 978-1-032-50754-5
ISBN (Electronic)	978-1-003-40284-8
DOI	https://doi.org/10.1201/9781003402848-8
Status	Published - 2024

Bibliographical note

ID: 390194958