| Title: | A Procedure for Multicollinearity Testing using Bootstrap |
|---|---|
| Description: | Functions for detecting multicollinearity. This test gives statistical support to two of the most widely used methods for detecting multicollinearity in applied work: Klein’s rule and the Variance Inflation Factor (VIF). See the URL for the papers associated with this package, for instance Morales-Oñate and Morales-Oñate (2023) <doi:10.33333/rp.vol51n2.05>. |
| Authors: | Víctor Morales-Oñate [aut, cre] (ORCID: <https://orcid.org/0000-0003-1922-6571>), Bolívar Morales-Oñate [aut] (ORCID: <https://orcid.org/0000-0003-4980-8759>) |
| Maintainer: | Víctor Morales-Oñate <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 1.0.4 |
| Built: | 2026-06-08 09:27:51 UTC |
| Source: | https://github.com/vmoprojs/mtest |
MTest implements a nonparametric (pairs) bootstrap to assess
multicollinearity by providing achieved significance levels (ASL)
for two widely used diagnostics: Klein's rule and the Variance Inflation
Factor (VIF). It returns bootstrap distributions of the global $R^2$
and the auxiliary $R_j^2$ (from regressions of each predictor on the remaining
predictors), along with p-values for both rules.
MTest(object, nboot = 100, nsam = NULL, trace = FALSE, seed = NULL, valor_vif = 0.9)
| Argument | Description |
|---|---|
| object | A fitted model, typically of class |
| nboot | Integer. Number of bootstrap iterations (rows resampled with replacement). |
| nsam | Integer. Bootstrap sample size per iteration. Defaults to the original number of rows. |
| trace | Logical. If |
| seed | Integer. Optional RNG seed for reproducibility. |
| valor_vif | Numeric in |
Model. Consider the linear regression model
$$y = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p + u,$$
and the auxiliary regressions obtained by regressing each predictor $X_j$
on the remaining predictors $X_{-j}$. Let $R_g^2$ be the global coefficient of
determination and $R_j^2$ the coefficient of determination of the $j$-th auxiliary regression.
Diagnostics and achieved significance levels (ASL).
Klein's rule: flag multicollinearity if $R_j^2 > R_g^2$.
We estimate the ASL as $\Pr(R_j^2 > R_g^2)$ using the bootstrap distribution.
VIF rule: flag multicollinearity if VIF exceeds a threshold.
Since $\mathrm{VIF}_j = 1/(1 - R_j^2)$, this is equivalent to testing
$R_j^2$ against valor_vif. We estimate the ASL as $\Pr(R_j^2 > \text{valor\_vif})$.
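As a rough illustration of the quantities defined above, the following base R sketch computes the global $R^2$, the auxiliary $R_j^2$ and the corresponding VIFs on simulated data, then applies both rules once (no bootstrap). The data-generating process and the helper `aux_r2()` are assumptions made for this example and are not part of the package.

```r
set.seed(42)
n  <- 200
X1 <- rnorm(n)
X2 <- 0.95 * X1 + rnorm(n, sd = 0.2)  # X2 is built to be nearly collinear with X1
X3 <- rnorm(n)
y  <- 1 + X1 + X2 + X3 + rnorm(n)
dat <- data.frame(y, X1, X2, X3)

# Global coefficient of determination from the full regression
r2_global <- summary(lm(y ~ ., data = dat))$r.squared

# aux_r2() is a hypothetical helper: R^2 from regressing predictor j
# on the remaining predictors
aux_r2 <- function(j, data) {
  preds <- setdiff(names(data), "y")
  f <- reformulate(setdiff(preds, j), response = j)
  summary(lm(f, data = data))$r.squared
}
r2_aux <- sapply(c("X1", "X2", "X3"), aux_r2, data = dat)

# VIF_j = 1 / (1 - R_j^2)
vif <- 1 / (1 - r2_aux)

# Klein's rule: auxiliary R_j^2 larger than the global R^2
klein_flag <- r2_aux > r2_global

# VIF rule stated on the R_j^2 scale: R_j^2 > 0.9 corresponds to VIF > 10
vif_flag <- r2_aux > 0.9
```

A threshold of 0.9 on $R_j^2$ corresponds to $\mathrm{VIF}_j > 1/(1 - 0.9) = 10$, which is why the default `valor_vif = 0.9` matches the common VIF cutoff of 10.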
Bootstrap scheme.
The function resamples rows of the model frame (pairs bootstrap) and, for each bootstrap
sample, computes the global $R^2$ and the auxiliary $R_j^2$ (hence VIF) using the same expanded design
matrix as the original fit. This makes the procedure robust to transformed terms on either
side of the formula (e.g., log(y), I(X1^2), interactions, factors,
poly(), etc.).
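The scheme described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the simulated data, the helper `r2()` and the object names `boot_r2`, `asl_klein` and `asl_vif` are all hypothetical. The sketch resamples rows of the response and of the expanded design matrix together, which is one way to realize a pairs bootstrap on the same design as the original fit.

```r
set.seed(1)
n  <- 150
X1 <- rnorm(n)
X2 <- X1 + rnorm(n, sd = 0.1)   # strongly collinear with X1 by construction
X3 <- rnorm(n)
y  <- 1 + X1 + X2 + X3 + rnorm(n)
fit <- lm(y ~ ., data = data.frame(y, X1, X2, X3))

# Expanded design matrix of the original fit (intercept column dropped)
X  <- model.matrix(fit)[, -1, drop = FALSE]
yv <- model.response(model.frame(fit))

# r2() is a hypothetical helper: R^2 of a least-squares fit with intercept
r2 <- function(resp, preds) {
  res <- lm.fit(cbind(1, preds), resp)$residuals
  1 - sum(res^2) / sum((resp - mean(resp))^2)
}

nboot <- 200
boot_r2 <- t(replicate(nboot, {
  idx <- sample(nrow(X), replace = TRUE)   # resample rows (pairs bootstrap)
  Xb  <- X[idx, , drop = FALSE]
  yb  <- yv[idx]
  aux <- sapply(seq_len(ncol(Xb)),
                function(j) r2(Xb[, j], Xb[, -j, drop = FALSE]))
  c(global = r2(yb, Xb), setNames(aux, colnames(Xb)))
}))

# ASL estimates: share of replicates in which each rule is triggered
asl_klein <- colMeans(boot_r2[, -1, drop = FALSE] > boot_r2[, "global"])
asl_vif   <- colMeans(boot_r2[, -1, drop = FALSE] > 0.9)
```

Here each row of `boot_r2` holds one bootstrap replicate, with the global $R^2$ in the first column and one auxiliary $R_j^2$ per design column after it.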
An object of class MTest, which is a list containing:
| Component | Description |
|---|---|
| pval_vif | Named numeric vector of ASL for the VIF rule, |
| pval_klein | Named numeric vector of ASL for Klein's rule, |
| Bvals | Numeric matrix of size |
| VIFvals | Numeric matrix |
| vif.tot | Observed VIF per predictor from the original design. |
| R.tot | Named numeric vector with observed |
| nsam | Bootstrap sample size actually used. |
| nboot | Number of bootstrap iterations actually performed. |
Larger pval_klein[j] indicates stronger evidence that predictor $j$
violates Klein's rule (the auxiliary $R_j^2$ often exceeds the global $R^2$).
Larger pval_vif[j] indicates that $R_j^2$ frequently exceeds
valor_vif (equivalently, VIF exceeds the implied threshold).
For factor predictors, the underlying design includes multiple columns;
VIFvals and VIF-related summaries are returned per design column.
In singular bootstrap samples some statistics may be NA.
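To make the reading of these p-values concrete, the sketch below recomputes a Klein-type ASL from a matrix shaped like `Bvals` while skipping replicates with missing statistics. The matrix `bvals_demo` is a simulated stand-in, and its layout (a "global" column followed by one column per predictor) is an assumption for illustration. Whether MTest itself drops `NA` replicates in this way is not stated here.

```r
set.seed(7)
nboot <- 100

# Simulated stand-in for a Bvals-like matrix (not produced by MTest)
bvals_demo <- cbind(
  global = runif(nboot, 0.70, 0.80),
  X1     = runif(nboot, 0.85, 0.95),  # auxiliary R^2 always above the global one
  X2     = runif(nboot, 0.10, 0.30)   # auxiliary R^2 always below the global one
)
bvals_demo[c(3, 10), "X1"] <- NA      # mimic replicates with unavailable statistics

# Replicate-by-replicate comparison of each auxiliary R^2 with the global R^2
exceeds <- bvals_demo[, -1, drop = FALSE] > bvals_demo[, "global"]

# Share of usable replicates in which Klein's rule is triggered
asl_klein_demo <- colMeans(exceeds, na.rm = TRUE)

# Number of replicates actually used per column
n_used <- colSums(!is.na(exceeds))
```

Reporting `n_used` alongside the ASL shows how many replicates each estimate rests on when some bootstrap samples were singular.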
Víctor Morales Oñate [email protected]
Bolívar Morales Oñate [email protected]
https://sites.google.com/site/moralesonatevictor/
https://www.linkedin.com/in/vmoralesonate/
Morales-Oñate, V., and Morales-Oñate, B. (2023). MTest: a Bootstrap Test for Multicollinearity. Revista Politécnica, 51(2), 53–62. doi:10.33333/rp.vol51n2.05
vif for classical VIF computation.
## Minimal example (small nboot for speed)
set.seed(1)
data(simDataMTest, package = "MTest")
m1 <- stats::lm(y ~ ., data = simDataMTest)
boot.sol <- MTest(m1, nboot = 50, trace = FALSE, seed = 123, valor_vif = 0.90)
boot.sol$pval_vif
boot.sol$pval_klein
head(boot.sol$Bvals)
print(boot.sol)
Computes pairwise Kolmogorov–Smirnov (KS) tests between all columns of a
numeric matrix or data frame, returning the matrix of p-values. Typical inputs
include the Bvals matrix from MTest. For one-sided alternatives
("greater" or "less"), the p-value matrix is directional:
rows correspond to x and columns to y.
pairwiseKStest(X, alternative = c("greater","less","two.sided"), use = c("asis","pairwise.complete.obs"), exact = NULL)
| Argument | Description |
|---|---|
| X | Numeric matrix or data frame. Columns are compared pairwise by KS tests. A common use is |
| alternative | Character string: |
| use | Character string: |
| exact | Logical or |
The function performs a KS test for each ordered pair of columns (i, j)
using ks.test(X[, i], X[, j], alternative = alternative, exact = exact).
For one-sided alternatives, the result is not symmetric, since rows play the
role of x and columns the role of y.
The returned Suggestion follows the same rule as the original function:
for alternative = "greater", it sorts the row sums of the p-value matrix
(descending); for "less", it sorts the column sums; for "two.sided",
no suggestion is returned.
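The directional behavior described above can be sketched with base R's `ks.test()`. The helper `pairwise_ks()` and the simulated matrix `Xdemo` are hypothetical names for this illustration. Leaving the diagonal as `NA` is an assumption, not necessarily what the package does.

```r
set.seed(3)
# Three columns with increasing location: a < b < c (stochastically)
Xdemo <- cbind(a = rnorm(80, 0), b = rnorm(80, 1), c = rnorm(80, 2))

# pairwise_ks() is a hypothetical helper: p-values for every ordered pair,
# with rows playing the role of x and columns the role of y
pairwise_ks <- function(X, alternative = "greater") {
  p <- ncol(X)
  out <- matrix(NA_real_, p, p, dimnames = list(colnames(X), colnames(X)))
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      if (i != j) {
        out[i, j] <- ks.test(X[, i], X[, j], alternative = alternative)$p.value
      }
    }
  }
  out
}

pmat <- pairwise_ks(Xdemo, "greater")

# Ordering in the spirit of the "greater" suggestion: row sums, descending
ranking <- sort(rowSums(pmat, na.rm = TRUE), decreasing = TRUE)
```

In `ks.test()`, `alternative = "greater"` means the CDF of `x` lies above that of `y`, so a small p-value in row `a`, column `c` says that `a` tends to take smaller values than `c`. The matrix is therefore not symmetric.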
A list of class pairwiseKStest with components:
| Component | Description |
|---|---|
| KSpwMatrix | Numeric matrix of p-values. Rows are |
| alternative | Character string describing the alternative hypothesis used. |
| Suggestion | For |
Víctor Morales Oñate [email protected]
Bolívar Morales Oñate [email protected]
https://sites.google.com/site/moralesonatevictor/
https://www.linkedin.com/in/vmoralesonate/
Morales-Oñate, V., and Morales-Oñate, B. (2023). MTest: a Bootstrap Test for Multicollinearity. Revista Politécnica, 51(2), 53–62. doi:10.33333/rp.vol51n2.05
## Typical workflow with MTest:
## (use small nboot for speed in examples)
set.seed(1)
data(simDataMTest, package = "MTest")
m1 <- stats::lm(y ~ ., data = simDataMTest)
boot.sol <- MTest(m1, nboot = 30, trace = FALSE, seed = 123)

## Compare only predictors (exclude "global"):
ks_res_greater <- pairwiseKStest(boot.sol$Bvals[, -1],
                                 alternative = "greater",
                                 use = "asis",   # same behavior as the original
                                 exact = NULL)   # let ks.test decide
ks_res_greater$KSpwMatrix
ks_res_greater$Suggestion

## Two-sided (no suggestion by design):
ks_res_twosided <- pairwiseKStest(boot.sol$Bvals[, -1], alternative = "two.sided")
ks_res_twosided$KSpwMatrix
Plot density or empirical cumulative distribution from Bvals in MTest output.
## S3 method for class 'MTest'
plot(x, type = 1, plotly = FALSE, ...)
| Argument | Description |
|---|---|
| x | an object of the class |
| type | Numeric; 1 if density, 2 if ecdf plot is returned |
| plotly | Logical; if |
| ... | other arguments to be passed to the function |
This function plots density or empirical cumulative distribution function from MTest bootstrap replications.
Produces a plot. No values are returned.
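For readers who want to see what such a display involves, here is a minimal base-graphics sketch that overlays densities (type 1) or empirical CDFs (type 2) for the columns of a matrix. The function `plot_bvals()` and the simulated matrix `bvals_demo` are hypothetical. This is not the package's plotting code, and it does not cover the `plotly` option.

```r
set.seed(11)
# Simulated stand-in for a matrix of bootstrap R^2 values
bvals_demo <- cbind(global = rbeta(200, 40, 10),
                    X1     = rbeta(200, 45, 5),
                    X2     = rbeta(200, 5, 45))

# plot_bvals() is a hypothetical helper: type 1 = densities, type 2 = ECDFs
plot_bvals <- function(B, type = 1) {
  cols <- seq_len(ncol(B))
  if (type == 1) {
    dens <- lapply(cols, function(k) density(B[, k], na.rm = TRUE))
    ymax <- max(sapply(dens, function(d) max(d$y)))
    plot(NULL, xlim = c(0, 1), ylim = c(0, ymax),
         xlab = "R-squared", ylab = "Density")
    for (k in cols) lines(dens[[k]], col = k)
  } else {
    plot(NULL, xlim = c(0, 1), ylim = c(0, 1),
         xlab = "R-squared", ylab = "ECDF")
    for (k in cols) plot(ecdf(B[, k]), add = TRUE, col = k, do.points = FALSE)
  }
  legend("top", legend = colnames(B), col = cols, lty = 1)
  invisible(NULL)  # called for its side effect only
}

plot_bvals(bvals_demo, type = 1)
plot_bvals(bvals_demo, type = 2)
```

Columns whose curves sit to the right of the "global" curve are the ones for which Klein's rule is triggered in most replicates.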
MTest for procedure and examples.
This data set helps test the functions in the MTest package; its generating process is documented in the reference.
simDataMTest
A data frame containing 10000 observations and four columns.
Morales-Oñate, V., and Morales-Oñate, B. (2023). MTest: a Bootstrap Test for Multicollinearity. Revista Politécnica, 51(2), 53–62. doi:10.33333/rp.vol51n2.05