Tests the topics of an LDA model against one or more dimensions (e.g., two: pred_var_x and pred_var_y).

topicsTest(
  data,
  model = NULL,
  preds = NULL,
  ngrams = NULL,
  pred_var_x = NULL,
  pred_var_y = NULL,
  group_var = NULL,
  control_vars = c(),
  test_method = "linear_regression",
  p_alpha = 0.05,
  p_adjust_method = "fdr",
  seed = 42,
  load_dir = NULL,
  save_dir = "./results"
)

Arguments

data

(tibble) The data to test on

model

(list) The trained model

preds

(tibble) The predictions

ngrams

(list) output of the ngram function

pred_var_x

(string) Name of the x variable to be predicted and plotted (only needed for regression or correlation)

pred_var_y

(string) Name of the y variable to be predicted and plotted (only needed for regression or correlation)

group_var

(string) The variable to group by (only needed for t-test)

control_vars

(vector) The control variables (not supported yet)

test_method

(string) The test method to use: one of "correlation", "t-test", "linear_regression", "logistic_regression", or "ridge_regression"

p_alpha

(numeric) The p-value threshold used when visualising significant topics

p_adjust_method

(character) Method to adjust/correct p-values for multiple comparisons (default = "fdr"; see also "none", "holm", "hochberg", "hommel", "bonferroni", "BH", "BY").

seed

(integer) The seed to set for reproducibility

load_dir

(string) The directory to load the test from; if NULL, the test will not be loaded

save_dir

(string) The directory to save the test to; if NULL, the test will not be saved
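
The method names accepted by p_adjust_method match those of base R's stats::p.adjust(). The sketch below assumes that this is the kind of correction applied; it does not show the internals of topicsTest(). The p-values and topic names are hypothetical, chosen only to illustrate how the choice of method interacts with a threshold such as p_alpha.

```r
# Hypothetical raw p-values, one per topic (illustrative numbers only).
p_raw <- c(topic_1 = 0.001, topic_2 = 0.010, topic_3 = 0.030,
           topic_4 = 0.040, topic_5 = 0.200)

# In stats::p.adjust(), "fdr" is an alias for "BH" (Benjamini-Hochberg).
p_fdr  <- p.adjust(p_raw, method = "fdr")
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_none <- p.adjust(p_raw, method = "none")   # leaves the values unchanged

# Compare the adjusted values against a threshold playing the role of p_alpha.
# Whether the comparison should be strict (<) or not is an assumption here.
p_alpha <- 0.05
comparison <- data.frame(
  raw             = p_raw,
  fdr             = p_fdr,
  bonferroni      = p_bonf,
  significant_fdr = p_fdr < p_alpha
)
print(comparison)
```

Bonferroni is the more conservative of the two corrections, so fewer topics are expected to pass the threshold than with "fdr"; "none" reports the raw p-values.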

Value

A list containing the test results, the test method, and the prediction variable

Examples

# \donttest{
# Test the topic-document distribution with respect to a variable
dtm <- topicsDtm(data = dep_wor_data$Depphrase)
#> [1] "The Dtm, data, and summary are saved in./results/seed_42/dtms.rds"

model <- topicsModel(dtm = dtm, # output of topicsDtm()
                     num_topics = 20,
                     num_top_words = 10,
                     num_iterations = 1000,
                     seed = 42,
                     save_dir = "./results")
#> [1] "The Model is saved in./results/seed_42/model.rds"
                     
preds <- topicsPreds(model = model, # output of topicsModel()
                     data = dep_wor_data$Depphrase)
#> [1] "Predictions are saved in./results/seed_42/preds.rds"
                     
test <- topicsTest(model = model, # output of topicsModel()
                   data = dep_wor_data,
                   preds = preds, # output of topicsPreds()
                   test_method = "linear_regression",
                   pred_var_x = "Age")
#> Directory already exists.
#> [1] "The test object of Age was saved in: ./results/seed_42/test_linear_regression_Age.rds"
#> [1] "The parameter pred_var_y is not set! Output 1 dimensional results."
# }