New training sets are generated from the original data by selecting samples at random. Samples can be drawn from the whole dataset or from within each level of a factor.

Usage

resample(
  number_of_iterations = 10,
  method = "split_data",
  factor_name,
  p_train = 0.8,
  collect = NULL,
  ...
)

Arguments

number_of_iterations

(numeric, integer) The number of training sets to generate. The default is 10.

method

(character) Resampling method. Allowed values are limited to the following:

  • "split_data": Samples for the training set are selected at random from the full dataset.

  • "stratified_split": Samples for the training set are randomly selected from each level of the chosen factor.

  • "equal_split": Samples for the training set are selected at random from each level of the chosen factor such that all group sizes are equal.

The default is "split_data".
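The three methods differ only in how the training indices are drawn. The following base-R sketch illustrates the idea; it is not the package's implementation. The helper split_indices() is hypothetical, and both the rounding (floor) and the choice to size "equal_split" groups from the smallest level are assumptions made for this example.

```r
# Illustrative base-R sketch; not the structToolbox implementation.
# split_indices() is a hypothetical helper written for this example.
split_indices <- function(groups, p_train = 0.8,
                          method = c("split_data", "stratified_split", "equal_split")) {
  method <- match.arg(method)
  n <- length(groups)
  by_level <- split(seq_len(n), groups)   # row indices for each factor level

  # Draw k indices from idx; indexing by position avoids sample()'s
  # special handling of a single number.
  draw <- function(idx, k) idx[sample.int(length(idx), k)]

  train <- switch(method,
    # Ignore the factor and draw from all rows
    split_data = draw(seq_len(n), floor(p_train * n)),
    # Draw the same proportion from every level
    stratified_split = unlist(lapply(by_level, function(idx) {
      draw(idx, floor(p_train * length(idx)))
    }), use.names = FALSE),
    # Draw the same count from every level (assumed: sized by the smallest level)
    equal_split = {
      k <- floor(p_train * min(lengths(by_level)))
      unlist(lapply(by_level, draw, k = k), use.names = FALSE)
    }
  )
  list(train = sort(train), test = setdiff(seq_len(n), train))
}

set.seed(1)
groups <- factor(rep(c("a", "b", "c"), times = c(50, 30, 20)))
s <- split_indices(groups, p_train = 0.8, method = "equal_split")
table(groups[s$train])   # 16 samples in each level
```

With unbalanced groups like these, "stratified_split" would keep the 50/30/20 proportions in the training set (40, 24 and 16 samples), whereas "equal_split" trades training-set size for balanced groups.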

factor_name

(character) The name of the sample-meta column to use as the grouping factor when sampling by level.

p_train

(numeric) The proportion of samples selected for the training set. The default is 0.8.
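As a rough guide to the sizes involved, the sketch below draws repeated random splits in base R. It is illustrative only and not the package code; the variable names are chosen for this example, and rounding the training-set size down is an assumption, since the rounding rule is not stated here.

```r
# Illustrative only: repeated random splits in base R, not the package code.
set.seed(42)
n_samples <- 150              # e.g. the number of rows in the iris data
p_train <- 0.8
number_of_iterations <- 10

n_train <- floor(p_train * n_samples)   # assumption: round down

# One column of training-row indices per iteration
train_sets <- replicate(number_of_iterations,
                        sort(sample.int(n_samples, n_train)))
dim(train_sets)   # 120 rows (training samples) by 10 columns (iterations)
```

Each column is an independent draw, so the same sample can appear in the training set of one iteration and in the test set of another.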

collect

(NULL, character) The name of a model output to collect over all resampling iterations, in addition to the input metric. The default is NULL.

...

Additional slots and values passed to struct_class.

Value

A resample object with the following output slots:

  • results.training (data.frame)

  • results.testing (data.frame)

  • metric (data.frame)

  • collected (list)

  • metric.train (numeric)

  • metric.test (numeric)

Inheritance

A resample object inherits the following struct classes:

[resample] >> [resampler] >> [iterator] >> [struct_class]

Examples

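# 100 random 75/25 splits of the full dataset
M = resample(
    number_of_iterations = 100,
    method = "split_data",
    factor_name = "V1",
    p_train = 0.75,
    collect = NULL)

# 10 random 80/20 splits; assumes the data has a "Species" sample-meta column
I = resample(
    number_of_iterations = 10,
    factor_name = "Species",
    method = "split_data",
    p_train = 0.8)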