Does the text analytics toolbox allow users to test out-of-sample perplexity with LDA?
Stephen Bruestle
on 15 Oct 2018
Commented: Stephen Bruestle
on 30 Nov 2018
I want to split my data into two samples: one for training and one for testing. I then want to fit an LDA model on the training sample and compute the perplexity of the test sample under the fitted model. Is this possible with the Text Analytics Toolbox?
Accepted Answer
Christopher Creutzig
on 26 Nov 2018
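Yes. The second output of logp gives you the perplexity of the documents you pass in under the fitted model, so passing held-out documents gives the out-of-sample value.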
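txt = extractFileText('sonnets.txt');
% Split on blank lines and keep every second block from the fifth onward
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);
td = tokenizedDocument(sonnets);
% Fit a 5-topic LDA model on the first 50 documents (the training sample)
bow = bagOfWords(td(1:50));
mdl = fitlda(bow,5,'Verbose',0);
% Perplexity of three held-out documents, encoded with the training vocabulary
[~,perpl] = logp(mdl, encode(bow,td(51:53)))
% perpl = 337.4999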
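The example above trains on the first 50 documents and tests on the next three. If a random split is preferred, something along the following lines might work. This is only a sketch: it assumes the Statistics and Machine Learning Toolbox is available for cvpartition, and the 10% hold-out fraction, the seed, and the variable names (cvp, tdTrain, tdTest, bowTrain, ppl) are arbitrary choices for illustration, not part of the original answer.

```matlab
% Sketch: random hold-out split, then out-of-sample perplexity.
% Assumes Text Analytics Toolbox plus Statistics and Machine Learning
% Toolbox (for cvpartition), and the sonnets.txt example file.
rng(0);  % fix the seed so the split and the fit are repeatable

txt = extractFileText('sonnets.txt');
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);
td = tokenizedDocument(sonnets);

% Hold out 10% of the documents for testing (arbitrary fraction).
cvp = cvpartition(numel(td),'HoldOut',0.1);
tdTrain = td(training(cvp));
tdTest  = td(test(cvp));

% Fit the model on the training documents only.
bowTrain = bagOfWords(tdTrain);
mdl = fitlda(bowTrain,5,'Verbose',0);

% Encode the test documents with the training vocabulary; the second
% output of logp is their perplexity under the fitted model.
[~,ppl] = logp(mdl, encode(bowTrain,tdTest));
```

Lower perplexity on the held-out documents suggests a better fit, so the same split can be reused to compare models with different numbers of topics.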