Tests the topics of an LDA model against one or more variables (dimensions), e.g., two (pred_var_x and pred_var_y).
topicsTest(
  data,
  model = NULL,
  preds = NULL,
  ngrams = NULL,
  pred_var_x = NULL,
  pred_var_y = NULL,
  group_var = NULL,
  control_vars = c(),
  test_method = "linear_regression",
  p_alpha = 0.05,
  p_adjust_method = "fdr",
  seed = 42,
  load_dir = NULL,
  save_dir = "./results"
)
data: (tibble) The data to test on.
model: (list) The trained model.
preds: (tibble) The predictions.
ngrams: (list) Output of the ngram function.
pred_var_x: (string) Name of the x variable to be predicted and plotted (only needed for regression or correlation).
pred_var_y: (string) Name of the y variable to be predicted and plotted (only needed for regression or correlation).
group_var: (string) The variable to group by (only needed for t-test).
control_vars: (vector) The control variables (not supported yet).
test_method: (string) The test method to use: "correlation", "t-test", "linear_regression", "logistic_regression", or "ridge_regression".
p_alpha: (numeric) The p-value threshold, set by the user, for visualising significant topics.
p_adjust_method: (character) Method to adjust/correct p-values for multiple comparisons (default = "fdr"; see also "none", "holm", "hochberg", "hommel", "bonferroni", "BH", "BY").
seed: (integer) The seed to set for reproducibility.
load_dir: (string) The directory to load the test from; if NULL, the test is not loaded.
save_dir: (string) The directory to save the test in; if NULL, the test is not saved.
A list of the test results, test method, and prediction variable
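To make the roles of p_alpha and p_adjust_method concrete, here is a minimal base-R sketch of per-topic testing followed by multiple-comparison correction. It is an illustration under stated assumptions, not the internals of topicsTest(): the data are simulated, the names age, topic_props, and t_1 ... t_5 are hypothetical, and the direction of the regression (topic proportion regressed on the variable) is assumed.

```r
# Illustrative sketch only; this is NOT the implementation used by topicsTest().
set.seed(42)

n_docs   <- 200
n_topics <- 5

# Hypothetical stand-ins: one numeric variable and a matrix of topic proportions.
age <- rnorm(n_docs, mean = 40, sd = 10)
topic_props <- matrix(
  runif(n_docs * n_topics),
  nrow = n_docs,
  dimnames = list(NULL, paste0("t_", seq_len(n_topics)))
)
# Make topic 1 depend on age so that one effect is real by construction.
topic_props[, "t_1"] <- topic_props[, "t_1"] + 0.15 * as.numeric(scale(age))

# Fit one simple linear regression per topic and keep the slope's p-value.
p_raw <- apply(topic_props, 2, function(topic) {
  fit <- lm(topic ~ age)
  summary(fit)$coefficients["age", "Pr(>|t|)"]
})

# Adjust for multiple comparisons, then apply the significance threshold.
p_alpha <- 0.05
p_adj   <- p.adjust(p_raw, method = "fdr")  # "fdr" is an alias for "BH"
significant <- p_adj < p_alpha

data.frame(topic = names(p_raw), p_raw, p_adj, significant)
```

In stats::p.adjust(), "fdr" and "BH" name the same Benjamini-Hochberg procedure, so either value of p_adjust_method should give the same adjusted p-values if the function passes it through to p.adjust() (an assumption here).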
# \donttest{
# Test the topic-document distribution with respect to a variable
dtm <- topicsDtm(data = dep_wor_data$Depphrase)
#> [1] "The Dtm, data, and summary are saved in./results/seed_42/dtms.rds"
model <- topicsModel(dtm = dtm, # output of topicsDtm()
                     num_topics = 20,
                     num_top_words = 10,
                     num_iterations = 1000,
                     seed = 42,
                     save_dir = "./results")
#> [1] "The Model is saved in./results/seed_42/model.rds"
preds <- topicsPreds(model = model, # output of topicsModel()
                     data = dep_wor_data$Depphrase)
#> [1] "Predictions are saved in./results/seed_42/preds.rds"
test <- topicsTest(model = model, # output of topicsModel()
                   data = dep_wor_data,
                   preds = preds, # output of topicsPreds()
                   test_method = "linear_regression",
                   pred_var_x = "Age")
#> Directory already exists.
#> [1] "The test object of Age was saved in: ./results/seed_42/test_linear_regression_Age.rds"
#> [1] "The parameter pred_var_y is not set! Output 1 dimensional results."
# }
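The output above reports that the test object was written to an .rds file, so it can presumably be read back with base R's readRDS(). The sketch below shows that round trip with a stand-in object in a temporary directory; test_stub and its fields are hypothetical, and the real object returned by topicsTest() may be structured differently.

```r
# Stand-in for a test result; the real object's fields may differ.
test_stub <- list(test_method = "linear_regression", pred_var = "Age")

# Mirror the directory layout seen in the printed path, but under tempdir()
# so that nothing is written into the working directory.
out_dir <- file.path(tempdir(), "results", "seed_42")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
out_file <- file.path(out_dir, "test_linear_regression_Age.rds")

# Save the object, read it back, and check that nothing changed.
saveRDS(test_stub, out_file)
restored <- readRDS(out_file)
identical(restored, test_stub)
```

Whether load_dir uses this same mechanism internally is not stated in this page, so treat the snippet as a way to inspect a saved file by hand rather than as a description of load_dir.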