Permute and bootstrap the data fed to PCA n times. The bootstrapped data is
used to estimate confidence bands for the variance explained by each PC and
for each loading. Squared loadings are multiplied by the squared eigenvalue of
the relevant PC, so that loadings from PCs which explain a lot of variance rank
higher than those from PCs which explain less. These weighted values are the
index loadings referred to below. This approach to PCA testing follows Camargo
(2022) and Vieira (2012). It differs from Camargo's PCAtest package by
separating data generation and plotting.
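The weighting of squared loadings by the squared eigenvalue can be sketched in base R. This is an illustration only, not the package's internal code: it assumes the built-in USArrests dataset as stand-in data, and the names eigenvalues and index_loadings are hypothetical.

```r
# Sketch only: base R on the built-in USArrests data, not pca_test() internals.
pca <- prcomp(USArrests, scale. = TRUE)

# The eigenvalue of each PC is its squared standard deviation.
eigenvalues <- pca$sdev^2

# Squared loadings multiplied by the squared eigenvalue of the relevant PC.
# sweep() multiplies column j of the squared rotation matrix by eigenvalues[j]^2.
index_loadings <- sweep(pca$rotation^2, 2, eigenvalues^2, FUN = "*")

round(index_loadings, 3)
```

Because each column of the rotation matrix has unit length, the values in each column sum to the squared eigenvalue of that PC, which is why loadings on high-variance PCs rank higher.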
Arguments
- pca_data
data fed to the prcomp function.
- n
the number of times to permute and bootstrap that data. Warning: high values will take a long time to compute.
- scale
whether the PCA variables should be scaled (default: TRUE).
- variance_confint
size of confidence intervals for variance explained (default: 0.95).
- loadings_confint
size of confidence intervals for index loadings (default: 0.9).
Value
An object of class pca_test_results, containing:
- $variance
a tibble containing the variance explained and confidence intervals for each PC.
- $loadings
a tibble containing the index loadings and confidence intervals for each variable and PC.
- $raw_data
a tibble containing the variance explained and loadings for each bootstrapped and permuted analysis.
- $variance_confint
confidence interval applied to variance explained.
- $loadings_confint
confidence interval applied to loadings.
- $n
the number of iterations of both permutation and bootstrapping.
Details
The default confidence band on variance explained is 0.95 (i.e. an alpha of 0.05). In line with Vieira (2012), the default confidence band on the index loadings is 0.9.
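To make the meaning of a 0.95 confidence band concrete, here is a minimal base R sketch of a bootstrapped interval for variance explained. It is not the package's implementation: it assumes a simple percentile interval and the built-in USArrests data, and the helper var_explained is hypothetical.

```r
# Sketch only: percentile bootstrap in base R, assumed for illustration.
set.seed(1)
n <- 100                      # iterations, playing the role of the n argument
variance_confint <- 0.95
alpha <- 1 - variance_confint

# Proportion of variance explained by each PC (hypothetical helper).
var_explained <- function(data) {
  sdev <- prcomp(data, scale. = TRUE)$sdev
  sdev^2 / sum(sdev^2)
}

# Bootstrap: resample rows with replacement and refit the PCA each time.
# The result has one row per PC and one column per iteration.
boot <- replicate(n, {
  rows <- sample(nrow(USArrests), replace = TRUE)
  var_explained(USArrests[rows, ])
})

# Lower and upper percentile bounds for each PC.
ci <- apply(boot, 1, quantile, probs = c(alpha / 2, 1 - alpha / 2))
ci
```

Setting variance_confint to 0.9 in this sketch narrows the band in the same way the default loadings_confint does for index loadings.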
See plot_loadings() and plot_variance_explained() for useful plotting functions.
References
Camargo, Arley (2022): PCAtest: testing the statistical significance of Principal Component Analysis in R. PeerJ 10. e12967. doi:10.7717/peerj.12967
Vieira, Vasco (2012): Permutation tests to estimate significances on Principal Components Analysis. Computational Ecology and Software 2. 103–123.
Examples
onze_pca <- pca_test(
  onze_intercepts |> dplyr::select(-speaker),
  n = 10,
  scale = TRUE
)
summary(onze_pca)
#> PCA Permutation and Bootstrapping Test
#>
#> Iterations: 10
#>
#> Significant PCs at 0.05 level: PC1, PC2, PC3, PC4, PC5.
#>
#> Significant loadings at 0.1 level:
#> PC1: F1_FLEECE
#> PC1: F1_GOOSE
#> PC1: F1_START
#> PC1: F1_STRUT
#> PC1: F1_THOUGHT
#> PC1: F1_TRAP
#> PC1: F2_FLEECE
#> PC1: F2_NURSE
#> PC1: F2_STRUT
#> PC1: F2_THOUGHT
#> PC2: F1_FLEECE
#> PC2: F1_NURSE
#> PC2: F2_DRESS
#> PC2: F2_KIT
#> PC2: F2_LOT
#> PC2: F2_STRUT
#> PC2: F2_THOUGHT
#> PC2: F2_TRAP
#> PC3: F2_FLEECE
#> PC3: F2_GOOSE
#> PC3: F2_LOT
#> PC3: F2_NURSE
#> PC4: F1_GOOSE
#> PC4: F1_KIT
#> PC4: F1_LOT
#> PC5: F1_START
#> PC5: F1_STRUT
#> PC6: F1_DRESS
#> PC6: F1_NURSE
#> PC6: F2_START
#> PC8: F1_KIT
#> PC10: F1_THOUGHT
#> PC10: F2_GOOSE
#> PC11: F2_DRESS
#> PC16: F2_KIT