Tidying methods for a randomForest model — augment.randomForest • broomstick

These methods tidy the variable importance of a random forest model, augment the original data with the fitted values/classifications and error information, and construct a glance of the model's summary statistics.

# S3 method for randomForest
augment(x, data = NULL, ...)

# S3 method for randomForest
glance(x, ...)

# S3 method for randomForest
tidy(x, ...)

Arguments

x	randomForest object
data	Model data for use by `augment.randomForest()`.
...	Additional arguments (ignored)
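As a minimal sketch of how the three methods are called, the example below fits a classification forest on the built-in iris data and passes it to each method. It assumes the randomForest and broomstick packages are installed and that broomstick makes the `tidy()`, `glance()`, and `augment()` generics available once attached. The object name `rf_fit` and the seed are illustrative only.

```r
# Assumption: both packages are installed, and attaching broomstick
# makes the tidy(), glance() and augment() generics available.
library(randomForest)
library(broomstick)

set.seed(2024)  # forests are random, so fix the seed for reproducibility

# importance = TRUE asks randomForest() to compute the permutation-based
# importance measures that several of the columns described below rely on.
rf_fit <- randomForest(Species ~ ., data = iris, importance = TRUE)

tidy(rf_fit)                  # variable importance, one row per term
glance(rf_fit)                # model-level summary statistics
augment(rf_fit, data = iris)  # original data plus the added columns
```

Passing `data` explicitly keeps the augmented output tied to a known data frame rather than relying on the default `data = NULL`.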

Value

augment.randomForest returns the original data with additional columns:

.oob_times

The number of trees for which the given case was "out of bag". See randomForest::randomForest() for more details.

.fitted

The fitted value or class.

augment.randomForest returns additional columns for classification and unsupervised trees:

.votes

For each case, the voting results, with one column per class.

.local_var_imp

The casewise variable importance, stored as data frames in a nested list-column, with one row per variable in the model. Only present if the model was created with importance = TRUE.
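The per-case quantities described above can also be inspected directly on the fitted object. The sketch below looks at the `oob.times`, `predicted`, and `votes` components of a randomForest classification model. Treating these as the likely sources of `.oob_times`, `.fitted`, and `.votes` is an assumption, since this page does not say how the columns are built.

```r
# Assumption: the randomForest package is installed.
library(randomForest)

set.seed(2024)
rf_fit <- randomForest(Species ~ ., data = iris)

# Number of trees for which each case was out of bag (one value per row).
oob_times <- rf_fit$oob.times

# Out-of-bag predicted class for each case.
fitted_class <- rf_fit$predicted

# Out-of-bag votes: one row per case, one column per class.
votes <- rf_fit$votes

head(data.frame(oob_times, fitted_class, votes))
```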

glance.randomForest returns a data.frame with the following columns for regression trees:

mse

The average mean squared error across all trees.

rsq

The average pseudo-R-squared across all trees. See randomForest::randomForest() for more information.

For classification trees, glance.randomForest returns one row per class, with the following columns:

precision

recall

accuracy

f_measure
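To make the four classwise columns concrete, here is a sketch that computes them from a confusion matrix using the usual one-vs-rest definitions. The counts are made up for illustration, and the page does not state how glance.randomForest derives these values, so the formulas (in particular the per-class accuracy) are an assumption rather than the package's implementation.

```r
# Hypothetical confusion matrix: rows are observed classes,
# columns are predicted classes. Counts are illustrative only.
conf <- matrix(
  c(50,  0,  0,
     0, 47,  3,
     0,  4, 46),
  nrow = 3, byrow = TRUE,
  dimnames = list(observed = c("a", "b", "c"),
                  predicted = c("a", "b", "c"))
)

total    <- sum(conf)
true_pos <- diag(conf)

# One-vs-rest true negatives: cases that neither belong to the class
# nor were predicted as the class.
true_neg <- total - rowSums(conf) - colSums(conf) + true_pos

precision <- true_pos / colSums(conf)   # TP / (TP + FP)
recall    <- true_pos / rowSums(conf)   # TP / (TP + FN)
accuracy  <- (true_pos + true_neg) / total
f_measure <- 2 * precision * recall / (precision + recall)

# One row per class, without row names.
metrics <- data.frame(class = rownames(conf), precision, recall,
                      accuracy, f_measure, row.names = NULL)
metrics
```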

All tidying methods return a data.frame without rownames; the structure depends on the method chosen.

tidy.randomForest returns one row for each model term, with the following columns:

term

The term in the randomForest model.

MeanDecreaseAccuracy

A measure of variable importance. See randomForest::randomForest() for more information. Only present if the model was created with importance = TRUE.

MeanDecreaseGini

A measure of variable importance. See randomForest::randomForest() for more information.

MeanDecreaseAccuracy_sd

Standard deviation of MeanDecreaseAccuracy. See randomForest::randomForest() for more information. Only present if the model was created with importance = TRUE.

classwise_importance

Classwise variable importance for each term, stored as data frames in a nested list-column, with one row per class. Only present if the model was created with importance = TRUE.
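The importance columns above share their names with the matrix returned by `randomForest::importance()`. The sketch below inspects that matrix and the `importanceSD` component on a classification forest, then ranks the terms by MeanDecreaseGini. That tidy.randomForest draws its columns from these objects is an assumption made for illustration; this page does not confirm it.

```r
# Assumption: the randomForest package is installed.
library(randomForest)

set.seed(2024)
rf_fit <- randomForest(Species ~ ., data = iris, importance = TRUE)

# One row per term; for a classification forest fitted with
# importance = TRUE the columns are the classwise importances plus
# MeanDecreaseAccuracy and MeanDecreaseGini.
imp <- importance(rf_fit)

# Spread of the permutation-based measure, per class and overall.
imp_sd <- rf_fit$importanceSD

# Rank terms from most to least important by MeanDecreaseGini.
imp_ranked <- imp[order(imp[, "MeanDecreaseGini"], decreasing = TRUE), ,
                  drop = FALSE]
imp_ranked
```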