mlr3resampling: Resampling Algorithms for 'mlr3' Framework

A supervised learning algorithm takes a train set as input and outputs a prediction function, which can then be applied to a test set. If each data point belongs to a group (such as a geographic region or a year), how do we know whether it is possible to train on one group and predict accurately on another? Cross-validation can be used to determine the extent to which this is possible. First, fold IDs from 1 to K are assigned to all data (possibly using stratification, usually by group and label). Then we loop over test sets (group/fold combinations) and train sets (same group, other groups, all groups), and compute test accuracy for each combination. Comparing test accuracy between same and other shows how well training on one group carries over to another: it carries over perfectly if same and other have similar test accuracy for each group; other is usually somewhat less accurate than same; and other can be as inaccurate as a featureless baseline when the groups have different patterns. For more information, <https://tdhock.github.io/blog/2023/R-gen-new-subsets/> describes the method in depth. A second question is how many train samples are required to get accurate predictions on a test set. Cross-validation can also answer this question, using train sets of variable size.
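
The same/other/all loop described above can be sketched in a few lines. This is only an illustration in plain Python, not the package's R API: the function names `assign_folds` and `same_other_all_splits` are hypothetical, fold IDs are assigned round-robin within each group (stratifying by group only, not by label), and each train set is assumed to be drawn from the folds other than the test fold.

```python
from collections import defaultdict


def assign_folds(groups, n_folds):
    """Assign fold IDs 1..n_folds round-robin within each group.

    Hypothetical helper: a simple stand-in for stratified fold assignment.
    """
    seen = defaultdict(int)  # number of rows seen so far in each group
    folds = []
    for g in groups:
        folds.append(seen[g] % n_folds + 1)
        seen[g] += 1
    return folds


def same_other_all_splits(groups, folds):
    """Yield (test_group, test_fold, train_kind, train_idx, test_idx).

    Each test set is one group/fold combination. Train sets use rows
    outside the test fold (an assumption of this sketch), taken from the
    same group, the other groups, or all groups.
    """
    rows = range(len(groups))
    for test_group in sorted(set(groups)):
        for test_fold in sorted(set(folds)):
            test_idx = [
                i for i in rows
                if groups[i] == test_group and folds[i] == test_fold
            ]
            not_test_fold = [i for i in rows if folds[i] != test_fold]
            train_sets = {
                "same": [i for i in not_test_fold if groups[i] == test_group],
                "other": [i for i in not_test_fold if groups[i] != test_group],
                "all": not_test_fold,
            }
            for train_kind, train_idx in train_sets.items():
                yield test_group, test_fold, train_kind, train_idx, test_idx


if __name__ == "__main__":
    groups = ["A"] * 4 + ["B"] * 4
    folds = assign_folds(groups, n_folds=2)
    for split in same_other_all_splits(groups, folds):
        print(split)
```

In practice a learner would be fit on `train_idx` and scored on `test_idx` for every split, and the scores for same and other would then be compared within each test group. Truncating `train_idx` to several lengths before fitting gives a rough version of the variable train set size analysis.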

Version: 2024.4.14
Imports: data.table, R6, checkmate, paradox, mlr3, mlr3misc
Suggests: animint2, mlr3tuning, lgr, future, testthat, knitr, rmarkdown, nc, rpart
Published: 2024-04-16
Author: Toby Hocking [aut, cre], Michel Lang [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified), Bernd Bischl [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified), Jakob Richter [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified), Patrick Schratz [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified), Giuseppe Casalicchio [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified), Stefan Coors [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified), Quay Au [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified), Martin Binder [ctb], Florian Pfisterer [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified), Raphael Sonabend [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified), Lennart Schneider [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified), Marc Becker [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified), Sebastian Fischer [ctb] (Author of mlr3 when Resampling/ResamplingCV was copied/modified)
Maintainer: Toby Hocking <toby.hocking at r-project.org>
BugReports: https://github.com/tdhock/mlr3resampling/issues
License: GPL-3
URL: https://github.com/tdhock/mlr3resampling
NeedsCompilation: no
Materials: NEWS
CRAN checks: mlr3resampling results

Documentation:

Reference manual: mlr3resampling.pdf
Vignettes: Comparing training on same or other subsets, Comparing sizes when training on same or other groups, Comparing train set sizes

Downloads:

Package source: mlr3resampling_2024.4.14.tar.gz
Windows binaries: r-devel: mlr3resampling_2024.4.14.zip, r-release: mlr3resampling_2024.4.14.zip, r-oldrel: mlr3resampling_2024.4.14.zip
macOS binaries: r-release (arm64): mlr3resampling_2024.4.14.tgz, r-oldrel (arm64): mlr3resampling_2024.4.14.tgz, r-release (x86_64): mlr3resampling_2024.4.14.tgz, r-oldrel (x86_64): mlr3resampling_2024.4.14.tgz
Old sources: mlr3resampling archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=mlr3resampling to link to this page.