predictive_model is a scikit-learn-style object for multivariate
predictive modelling. It holds the hyperparameters that specify a model
(algorithm, task, cross-validation scheme, scorer, …) and, after fitting,
the artifacts it produces (predictions, weight maps, bootstrap
statistics, error metrics, the underlying trained MATLAB model). It is the
canonical output type of xval_SVM, xval_SVR, fmri_data.predict
('newapi'), and friends; the constructor also accepts a plain struct
(such as the legacy wrapper outputs) and routes each field to its
categorised home, with translations for legacy names like
SVMModel/SVRModel, vox_weights, wZ/wP, etc.
Type methods(my_obj) in MATLAB for the live list on any instance.
You never have to build a predictive_model field by field — there are three
entry points, and they all return the same object type, so everything in
this reference applies regardless of how you made it.
-
Standalone, scikit-learn style — construct from hyperparameters and drive it with the method verbs (
fit→predict→crossval→bootstrap/permutation_test→weight_map_object/plot). Most control; operates on numeric(X, Y, groups).pm = predictive_model('algorithm','svm','task','classification'); pm = crossval(pm, X, Y, 'groups', id); pm = bootstrap(pm, X, Y, 'nboot', 1000, 'groups', id); pm = weight_map_object(pm, dat); % attach a brain map for montage(pm) summary(pm);
-
From an
fmri_dataobject —fmri_data.predict(..., 'newapi')takes a brain-data object directly, runs the cross-validated prediction, and returns apredictive_modelas its 4th output (already carrying a weight map):[cverr, stats, optout, pm] = predict(dat, 'algorithm_name','cv_svm', ... 'nfolds', 5, 'newapi');
-
Via an
xval_*wrapper — single-call functions that take numeric matrices and return a populatedpredictive_model:xval_SVM,xval_SVR,xval_discriminant_classifier,xval_lasso_brain,xval_cross_classify, …. These never see an image, so attach a weight map afterward withweight_map_object(pm, reference_image):pm = xval_SVM(X, Y, id); pm = weight_map_object(pm, dat); % then montage(pm) / surface(pm)
The constructor also accepts a plain struct (e.g. a legacy wrapper's STATS
output): predictive_model(stats_struct) maps each field to its consolidated
home.
Hands-on walkthroughs in the multivariate-decoding series — each has a
rendered Markdown page and a runnable plain-text Live Script (.m, opens in
the Live Editor on R2025a+):
| Part | Topic | Markdown | Live Script (.m) |
|---|---|---|---|
| 1 | Classification basics with SVM (ROC, confusion, effect sizes, weight maps, apply-to-test) | part 1 .md | .m |
| 2 | Classification and regression; one-line loaders; xval_* family; predicted R² |
part 2 .md | .m |
| 3 | The sklearn-style predictive_model API (fit/predict/crossval/bootstrap/permutation, tuning, calibration, stability selection) |
part 3 .md | .m |
| 4 | Cross-classification (Woo et al., 2014: does a pain pattern decode rejection?) | part 4 .md | .m |
| 5 | Algorithms, multiclass ECOC, tuning, and inference | part 5 .md | .m |
- Value semantics. Every mutating method returns a NEW object — write
pm = fit(pm, X, Y), notfit(pm, X, Y). - Hyperparameters vs fitted state are disjoint.
clone(pm)returns a fresh, unfitted copy with identical hyperparameters (used for per-fold refitting).is_fitted(pm)tests for any populated fitted state. - Numeric core, image adapters on top.
fit/predict/crossvaloperate on numeric(X, Y, groups). Neuroimaging awareness lives in the adapter methods (weight_map_object,montage,surface) that map the coefficient vector back into voxel space using a source image'svolInfo.
| Property | Description |
|---|---|
algorithm |
Registry name: svm, linear_svm, logistic, lda, ecoc, svr, lasso, ridge, gp, pcr (PCA+OLS), lassopcr (PCA+LASSO+relaxed-OLS), … |
task |
'classification' or 'regression' |
modeloptions |
Cell of fitter options, e.g. {'KernelFunction','linear'} |
cv |
A cv_splitter (kfold, stratified_group_kfold, …) |
scorer |
A cv_scorer (accuracy, roc_auc, r2, rmse, …) |
random_state, standardize, use_parallel, nboot, nperm, do_calibrate |
reproducibility / behaviour flags |
class_labels, Y_name, X_name |
labels for plotting |
| Property | Description |
|---|---|
Y |
Outcome actually fit (after bad-case removal) |
id |
Grouping/subject vector |
omitted_cases / omitted_features |
Logical masks (over the original cases / features) of what was dropped |
inputParameters |
Struct of stored fit-time parameters (standardization mean/std, …) |
fit_type |
'crossval' | 'insample' | 'test' — provenance of yfit AND weights |
| Property | Key fields |
|---|---|
fitted_values |
.yfit, .scores, .score_type, .dist_from_hyperplane_xval, within-person arrays, .calibrator |
weights |
.w, .intercept, .w_perfold, .boot_w*, .z, .p, .fdr_thr, .fdr_sig, .thresh_fdr, .weight_obj |
error_metrics |
(value, descrip) tuples: .crossval_accuracy, .r2, .rmse, .prediction_outcome_r, .d_within, … |
cv_partition |
.trIdx, .teIdx, .nfolds |
ml_model / fold_models |
trained MATLAB model(s) |
bootstrap_results / permutation_results |
full bootstrap / permutation output |
diagnostics |
.mult_obs_within_person, .grid_search, .feature_selection, .stability_selection, … |
cross_classify |
cross-classification output (from xval_cross_classify) |
note / history |
cell arrays of commentary / provenance lines |
error_metrics entries are (value, descrip) tuples; read e.g.
pm.error_metrics.crossval_accuracy.value.
| Method | One-liner |
|---|---|
predictive_model |
Constructor: name/value hyperparameters, or a legacy struct |
fit |
Train on the full sample (in-sample fit) |
predict |
Apply the fitted model: returns [yhat, scores] |
score |
Evaluate obj.scorer on (Y, predict(obj, X)) |
crossval |
Cross-validate: refit per fold, predict held-out, score; auto within-person stats when groups given |
clone / is_fitted / is_classifier / is_regressor / validate_object |
model bookkeeping |
| Method | One-liner |
|---|---|
bootstrap |
Bootstrap weights → SE, Z, empirical p, FDR threshold + significant mask |
permutation_test |
Null distribution by shuffling Y (free / between- / within-subjects schemes) |
grid_search |
Exhaustive hyperparameter search via CV |
stability_selection |
Per-bootstrap top-k selection frequency (high-dim feature inference) |
select_features |
Univariate / top-k-correlation feature pre-screen |
calibrate / predict_proba |
Platt / isotonic probability calibration and calibrated P(class) |
| Method | One-liner |
|---|---|
report_accuracy |
Model-type-relevant performance metrics as a struct + printed table. Classification: accuracy / balanced acc / AUC / sensitivity / specificity / PPV / NPV / d / forced-choice acc (from the cross-validated ROC). Regression: prediction–outcome r, predicted R², out-of-sample R², RMSE / MAE / MSE |
summary |
Human-readable overview: identity (algorithm, task, fit provenance), data (n obs / features / groups), CV scheme, which inference is available (bootstrap / permutation / calibration / cross-classification), and the report_accuracy performance block |
Regression R² variants (Wager & Lindquist, Principles of fMRI Ch. 39.4), computed in crossval from the pooled cross-validated predictions and exposed in error_metrics:
predicted_r2= 1 − PRESS / SST, where PRESS = Σ(yᵢ − ŷᵢ)² over the held-out (cross-validated) predictions and SST = Σ(yᵢ − ȳ)² uses the grand mean. This is §39.4's predicted R².out_of_sample_r2= 1 − PRESS / Σ(yᵢ − ȳ_train(fold))², using each fold's training mean as the baseline.- Both can be negative when the model predicts worse than the mean. Distinct from the per-fold-averaged
r2cv_scorer (each fold's own test mean, then averaged — akin to scikit-learn's default).
| Method | One-liner |
|---|---|
plot |
Task-dispatched core figure (predicted-vs-observed scatter for regression; scores-by-class + ROC for classification), plus a weight-map montage and permutation-null histogram when those artifacts are present ('noweights' / 'noperm' / 'surface' to control) |
plot_predicted_vs_observed |
Scatter (regression) or violin (binary) of predictions vs Y |
plot_permutation |
Histogram of the permutation null with observed-score and α-level critical-value lines (after permutation_test) |
rocplot |
ROC curve from cross-validated scores (wraps roc_plot) |
confusionchart / confusion_matrix |
Confusion chart / raw + normalized confusion matrices |
weight_map_object |
Map the weight vector into a statistic_image (with bootstrap .p / .sig) and cache it on pm.weights.weight_obj; second output is the image ([~, si] = ...) |
montage / surface |
Render the weight map on slices / cortical surface (thin delegates) |
These standalone classes back crossval and are shared with @pipeline:
| Class | One-liner |
|---|---|
cv_splitter |
Fold generators: kfold, stratified_kfold, group_kfold, stratified_group_kfold, logo, holdout, shuffle_split, repeated_kfold, custom_partition |
cv_scorer |
Metrics: accuracy, balanced_accuracy, f1, roc_auc, log_loss, mse, rmse, mae, r2, pearson_r |
@pipeline composes zero or more transform steps (center, zscore,
pca, or a custom fit_transform/transform struct) feeding a final
predictive_model estimator. Its crossval refits every step per fold
(leakage-free PCR, standardize-then-SVM, …), and weight_map_object(pipe, source) back-projects the estimator weights through the steps into voxel
space.
dat = load_image_set('DPSP_hotwarm'); % 118 imgs (59 subj x hot/warm), Y = +/-1
X = dat.dat'; Y = dat.Y; id = dat.metadata_table.subj_id;
pm = predictive_model('algorithm','svm','task','classification');
pm = crossval(pm, X, Y, 'groups', id, ...
'cv', cv_splitter.stratified_group_kfold(5));
pm.error_metrics.crossval_accuracy.value % cross-validated accuracy (%)
pm = bootstrap(pm, X, Y, 'nboot', 1000, 'groups', id);
plot(pm); % scores-by-class + ROC
montage(pm, dat, 'use', 'thresh_fdr', 'regions'); % FDR-thresholded clusters