MATLAB Answers

Does the Text Analytics Toolbox allow users to test out-of-sample perplexity with LDA?

I want to split my data into two samples: one for training and one for testing. I then want to fit an LDA model on the training sample and compute the perplexity of the test sample under the fitted model. Is this possible with the Text Analytics Toolbox?



Accepted Answer

Christopher Creutzig on 26 Nov 2018
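Yes. The second output of logp is the perplexity of the documents you pass in, so you can fit the model on the training documents and then call logp on the held-out ones: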
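% Load the sonnets and split the text into one string per sonnet
txt = extractFileText('sonnets.txt');
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);
td = tokenizedDocument(sonnets);
% Fit a 5-topic LDA model on the first 50 documents (training sample)
bow = bagOfWords(td(1:50));
mdl = fitlda(bow,5,'Verbose',0);
% Encode the held-out documents with the training vocabulary and
% take the second output of logp, which is the perplexity
[~,perpl] = logp(mdl, encode(bow,td(51:53)))
% perpl = 337.4999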
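
If a random split is preferred over taking the first 50 documents, a minimal sketch along the same lines follows. It assumes the Statistics and Machine Learning Toolbox is available for cvpartition. The 10% holdout fraction, the rng call, and the variable names (tdTrain, tdTest, bowTrain, perplTest) are illustrative choices, not part of the answer above.

```matlab
% Sketch: random holdout split, then out-of-sample perplexity.
% Assumes Text Analytics Toolbox plus Statistics and Machine Learning Toolbox.
txt = extractFileText('sonnets.txt');
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);
td = tokenizedDocument(sonnets);

rng('default')                                 % make the split reproducible
cvp = cvpartition(numel(td),'HoldOut',0.1);    % hold out 10% (arbitrary choice)
tdTrain = td(training(cvp));                   % training documents
tdTest  = td(test(cvp));                       % held-out documents

% Fit on the training sample only
bowTrain = bagOfWords(tdTrain);
mdl = fitlda(bowTrain,5,'Verbose',0);

% encode counts only words that are in the training vocabulary
[~,perplTest] = logp(mdl,encode(bowTrain,tdTest))
```

The resulting perplexity depends on the random split and on the fit, so it will generally differ from the value reported above.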
