🚀 What is vip?
vip provides a unified framework for constructing variable importance plots from virtually any machine learning model in R. Stop juggling different importance() functions across packages: vip gives you one consistent interface for interpretable ML.
✨ Key features
- Universal interface: Works with 40+ model types from different packages
- Multiple methods: Model-specific, permutation, SHAP, and variance-based importance
- Beautiful plots: Publication-ready visualizations with ggplot2
- Efficient algorithms: Optimized with parallel processing support
- Extensible design: Easy to add support for new model types
- Well-documented: Comprehensive guides and academic backing
🛠️ Quick start
Installation
# Install from CRAN (stable)
install.packages("vip")
# Install development version (latest features)
# install.packages("pak")
pak::pak("koalaverse/vip")
30-second example
library(vip)
library(randomForest)
# Fit a model
model <- randomForest(Species ~ ., data = iris)
# Get importance scores
vi_scores <- vi(model)
print(vi_scores)
#> # A tibble: 4 × 2
#> Variable Importance
#> <chr> <dbl>
#> 1 Petal.Length 32.4
#> 2 Petal.Width 31.3
#> 3 Sepal.Length 9.51
#> 4 Sepal.Width 6.75
# Create a beautiful plot
vip(model)
🎯 Supported methods
Method | Description | Use case | Function |
---|---|---|---|
Model-specific | Extract built-in importance | Fast, model-native | vi(model, method = "model") |
Permutation | Shuffle features, measure impact | Model-agnostic, robust | vi(model, method = "permute") |
Shapley values | Game theory attribution | Detailed explanations | vi(model, method = "shap") |
Variance-based | FIRM approach | Feature ranking | vi(model, method = "firm") |
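To make the "shuffle features, measure impact" idea concrete, here is a minimal base R sketch of permutation importance. It is an illustration only, not vip's implementation: the helper names `rmse()` and `permute_importance()` are hypothetical, and the model is a plain `lm()` fit on `mtcars`.

```r
# Minimal permutation-importance sketch in base R (illustrative; not vip's code).
set.seed(101)  # make the shuffles reproducible

fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Root-mean-squared error of a fitted model on a data frame
rmse <- function(model, data, target) {
  sqrt(mean((data[[target]] - predict(model, newdata = data))^2))
}

# Hypothetical helper: average increase in RMSE after shuffling each feature
permute_importance <- function(model, data, target, features, nsim = 20) {
  baseline <- rmse(model, data, target)
  scores <- vapply(features, function(feature) {
    increases <- replicate(nsim, {
      shuffled <- data
      # Shuffling breaks the link between this feature and the target
      shuffled[[feature]] <- sample(shuffled[[feature]])
      rmse(model, shuffled, target) - baseline
    })
    mean(increases)
  }, numeric(1))
  result <- data.frame(Variable = features, Importance = unname(scores),
                       stringsAsFactors = FALSE)
  result[order(-result$Importance), ]
}

imp <- permute_importance(fit, mtcars, "mpg", c("wt", "hp", "qsec"))
print(imp)
```

A feature whose shuffling barely changes the error gets a score near zero. Averaging over several shuffles (`nsim`) reduces the noise in each score.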
🔧 Supported models (40+)
Tree-based models - randomForest • ranger • xgboost • lightgbm • gbm • C50 • Cubist • rpart • party • partykit
Linear models - glmnet • earth (MARS) • Base R (lm, glm)
Neural networks - nnet • neuralnet • h2o • RSNNS
Meta-frameworks - caret • tidymodels • parsnip • workflows • mlr • mlr3 • sparklyr
Specialized models - pls • mixOmics (Bioconductor) • And many more…
🏃 Advanced examples
Permutation importance with custom metrics
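library(ranger)
library(ggplot2) # Provides labs() and theme_minimal()
# Fit model
rf_model <- ranger(mpg ~ ., data = mtcars, importance = "none")
# Prediction wrapper: must return a numeric vector of predictions
pfun <- function(object, newdata) {
  predict(object, data = newdata)$predictions
}
# Permutation importance with custom metric
vi_perm <- vi(
  rf_model,
  method = "permute",
  train = mtcars,
  target = "mpg",
  metric = "rmse",
  pred_wrapper = pfun, # Required for permutation importance
  nsim = 50, # 50 permutations for stability
  parallel = TRUE # Needs a registered parallel backend (e.g., doParallel)
)
# Create enhanced plot
vip(vi_perm, num_features = 10, geom = "point") +
  labs(title = "Permutation-based Variable Importance",
       subtitle = "RMSE metric, 50 permutations") +
  theme_minimal()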
SHAP values for detailed attribution
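library(xgboost)
library(ggplot2) # Provides labs() and theme_light()
# Prepare data
X <- data.matrix(subset(mtcars, select = -mpg))
y <- mtcars$mpg
# Fit XGBoost model
xgb_model <- xgboost(data = X, label = y, nrounds = 100, verbose = 0)
# Prediction wrapper: must return a numeric vector of predictions
pfun <- function(object, newdata) {
  predict(object, newdata = newdata)
}
# SHAP-based importance
vi_shap <- vi(
  xgb_model,
  method = "shap",
  train = X,
  pred_wrapper = pfun, # Required for Monte Carlo SHAP estimates
  nsim = 30
)
# Beautiful SHAP plot
vip(vi_shap, geom = "col", aesthetics = list(fill = "steelblue", alpha = 0.8)) +
  labs(title = "SHAP-based Variable Importance") +
  theme_light()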
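For intuition about what `nsim` controls, the sketch below approximates a single Shapley value by Monte Carlo sampling in base R. It is a simplified illustration under stated assumptions, not the code vip uses: `shapley_mc()` is a hypothetical helper, and the model is a linear `lm()` fit so the estimate can be compared against a known closed form.

```r
# Monte Carlo Shapley sketch in base R (illustrative; not vip's code).
set.seed(102)

fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
X <- mtcars[, c("wt", "hp", "qsec")]

# Hypothetical helper: approximate the Shapley value of one feature for one row
shapley_mc <- function(model, X, row, feature, nsim = 1000) {
  x <- X[row, , drop = FALSE]
  features <- names(X)
  contributions <- replicate(nsim, {
    z <- X[sample(nrow(X), 1), , drop = FALSE]  # random background observation
    ordering <- sample(features)                # random feature ordering
    position <- match(feature, ordering)
    from_x <- ordering[seq_len(position)]       # features up to and including `feature`
    with_feature <- z
    with_feature[from_x] <- x[from_x]           # take these values from x
    without_feature <- with_feature
    without_feature[[feature]] <- z[[feature]]  # same, but `feature` comes from z
    predict(model, newdata = with_feature) -
      predict(model, newdata = without_feature)
  })
  mean(contributions)
}

phi_wt <- shapley_mc(fit, X, row = 1, feature = "wt")

# For a linear model this estimator targets beta * (x - mean(x))
exact_wt <- unname(coef(fit)["wt"]) * (X$wt[1] - mean(X$wt))
print(c(estimate = phi_wt, exact = exact_wt))
```

Each iteration measures how much the prediction changes when the feature of interest switches from a background value to the observed one. More iterations give a more stable estimate. A SHAP-based importance score is commonly taken as the mean absolute Shapley value of a feature across observations.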
🤝 Contributing and development
We welcome contributions! Here’s how to get involved:
Testing framework
We use tinytest for lightweight, reliable testing:
# Run all tests
tinytest::test_package("vip")
# Test specific functionality
tinytest::run_test_file("inst/tinytest/test_vip.R")
Development workflow
- Check issues: Look for good first issues
- Create branch: git checkout -b feature/awesome-feature
- Write tests: Follow TDD principles (see CLAUDE.md)
- Run checks: R CMD check and tests
- Submit PR: With clear description
Adding model support
Adding support for new models is straightforward:
# Add S3 method to R/vi_model.R
vi_model.your_model_class <- function(object, ...) {
  # Extract importance from your model
  importance_scores <- your_model_importance_function(object)
  # Return as tibble
  tibble::tibble(
    Variable = names(importance_scores),
    Importance = as.numeric(importance_scores)
  )
}
See CLAUDE.md for detailed instructions!
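If S3 dispatch is unfamiliar, the following self-contained sketch shows the mechanism the template above relies on. The generic `extract_importance()` and the class `toy_model` are hypothetical names invented for illustration, and a base `data.frame` stands in for the tibble so the example has no dependencies.

```r
# Self-contained S3 dispatch sketch (hypothetical generic and class; not vip's source).
extract_importance <- function(object, ...) {
  UseMethod("extract_importance")
}

# Fallback: fail clearly for classes without a dedicated method
extract_importance.default <- function(object, ...) {
  stop("No importance method for class: ", class(object)[1], call. = FALSE)
}

# Method for a toy class that stores scores in a named numeric vector
extract_importance.toy_model <- function(object, ...) {
  scores <- object$scores
  out <- data.frame(
    Variable = names(scores),
    Importance = as.numeric(scores),
    stringsAsFactors = FALSE
  )
  out[order(-out$Importance), ]  # most important first
}

toy <- structure(
  list(scores = c(x1 = 0.2, x2 = 0.7, x3 = 0.1)),
  class = "toy_model"
)
res <- extract_importance(toy)
print(res)
```

Calling the generic on an object picks the method whose suffix matches the object's class. Supporting a new model type therefore means writing one function, without touching existing methods.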
📚 Learning resources
- Package website - Comprehensive documentation
- R Journal paper - Academic foundation
- IML book - Theory background
- Development guide - Contributing guidelines
✨ What’s new in v0.4.1
- ✅ ggplot2 S7 compatibility - Future-proof plotting
- lightgbm support - Popular gradient boosting
- Enhanced yardstick integration - Better metrics
- Improved documentation - Clearer examples
See NEWS.md for complete version history and migration notes.
🆘 Getting help
- Bug reports: GitHub Issues
- Feature requests: GitHub Discussions
- Questions: Stack Overflow (tag: vip)
📜 License
GPL (>= 2) © Brandon M. Greenwell, Brad Boehmke
⭐ Star us on GitHub if vip helps make your models more interpretable! ⭐
Built with ❤️ by the koalaverse team