MATLAB Answers

Question on running fitlda

I want to run fitlda, with the following specification:
* use Griffiths and Steyvers (2004) Gibbs Sampling algorithm for LDA as they ran it,
* 12 topics (i.e. K=12),
* a symmetric alpha of 50/K (no updating),
* a symmetric beta of .01 (no updating), and
* exactly 2000 iterations (without early termination).
Would that be:
numTopics = 12;
mdl = fitlda(bag,numTopics,'Verbose',1,'InitialTopicConcentration',50,'FitTopicConcentration',false,'WordConcentration',.01,'LogLikelihoodTolerance',0,'IterationLimit',2000);


Accepted Answer

Christopher Creutzig on 10 Dec 2018
Gibbs sampling involves stochastic elements (i.e., a pseudorandom number generator), meaning reproducing exactly the results of the 2004 paper will require using their code and their rng settings. (Which is also why in degenerate cases, you do get substantially different answers for multiple fitlda calls.)
Without looking up the definition of β in the original paper, I'm not sure if you want to set 'WordConcentration',.01 or 'WordConcentration',.01*bag.NumWords.
Other than that, the call looks like it should do what you ask, yes.

  1 Comment

Stephen Bruestle on 10 Dec 2018
Thanks.
For future users trying to do the same thing: Chris's answer helped me figure out that I would need to set β to be:
'WordConcentration',.01*bag.NumWords
The reason to use this alpha and beta is to be consistent with the recommendations in Steyvers and Griffiths (2007).
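Putting the thread together, the full call might look like the sketch below. It rests on my reading of the fitlda documentation, which should be checked against your release: the function appears to divide 'InitialTopicConcentration' by the number of topics and 'WordConcentration' by the vocabulary size, so the values passed in are the per-topic and per-word values scaled back up. The toy corpus and the variable names alphaPerTopic and betaPerWord are made up for illustration; replace bag with your own bagOfWords. Seeding rng only makes repeated runs in MATLAB repeatable; it will not reproduce the numbers in the 2004 paper.

```matlab
% Toy corpus for illustration only (hypothetical data); use your own bag.
docs = tokenizedDocument([ ...
    "the cat sat on the mat"
    "the dog chased the cat"
    "stocks fell as markets closed"
    "markets rallied and stocks rose"
    "the dog sat on the rug"
    "investors sold stocks and bonds"]);
bag = bagOfWords(docs);

numTopics     = 12;
alphaPerTopic = 50/numTopics;   % symmetric alpha = 50/K
betaPerWord   = 0.01;           % symmetric beta

rng('default')                  % repeatable runs within MATLAB only

% Assumption: fitlda rescales both concentrations internally, so pass
% the sum over topics (= 50) and the sum over the vocabulary.
mdl = fitlda(bag, numTopics, ...
    'Solver', 'cgs', ...                                   % collapsed Gibbs sampling
    'InitialTopicConcentration', alphaPerTopic*numTopics, ...
    'FitTopicConcentration', false, ...                    % no updating of alpha
    'WordConcentration', betaPerWord*bag.NumWords, ...
    'LogLikelihoodTolerance', 0, ...                       % intended to prevent early stopping
    'IterationLimit', 2000, ...
    'Verbose', 1);
```

After fitting, mdl.TopicConcentration and mdl.WordConcentration can be inspected to confirm that the values were taken as intended.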

Release

R2018b