How was the exampleWordEmbedding example in the text analytics toolbox trained, in detail?

Question

William Smith on 19 Nov 2017

Answered: Christopher Creutzig on 9 Mar 2020

The documentation for readWordEmbedding gives a pre-trained embedding, saying only that it was "derived by analyzing text from Wikipedia".

How was it trained?

Should we consider it a 'high quality' word embedding, better than anything a user could generate without extensive work and CPU time? Or is it a quick and dirty starting point, and we are encouraged to train our own for better performance?

Answer 1

Christopher Creutzig on 9 Mar 2020

The embedding is rather low-dimensional (50 dimensions) and has a small vocabulary (with 9999 words). It is unlikely to be “high quality” unless your analysis just happens to need precisely this dataset.
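To check these numbers yourself, and to see whether the vocabulary covers the words you care about, something like the following sketch should work. It assumes Text Analytics Toolbox is installed; the word list is purely hypothetical and only there for illustration.

```matlab
% Load the example embedding that ships with Text Analytics Toolbox.
emb = readWordEmbedding("exampleWordEmbedding.vec");

% Report the size of the embedding (the answer above cites 50 dimensions
% and a vocabulary of 9999 words).
fprintf("Dimension: %d\n", emb.Dimension);
fprintf("Vocabulary size: %d\n", emb.NumWords);

% Check coverage for a few words of interest.
% NOTE: this word list is a made-up example, not taken from the documentation.
words = ["king" "queen" "embedding" "tokenizer"];
inVocab = isVocabularyWord(emb, words);
disp(table(words', inVocab', 'VariableNames', {'Word', 'InVocabulary'}));
```

If many of the words in your own data come back as out of vocabulary, that is a good sign the example embedding is too small for your task.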

For production use, it is much more likely you'll find fastTextWordEmbedding useful, which downloads data from https://www.mathworks.com/matlabcentral/fileexchange/66229-text-analytics-toolbox-model-for-fasttext-english-16-billion-token-word-embedding for you.
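A minimal sketch of loading that embedding and querying it is shown below. It assumes the fastText support package linked above is installed (MATLAB prompts with a download link if it is not); the query word "king" is just an illustrative choice.

```matlab
% Load the pretrained fastText English embedding.
% Requires the support package "Text Analytics Toolbox Model for fastText
% English 16 Billion Token Word Embedding".
emb = fastTextWordEmbedding;

% Compare its size with the example embedding.
fprintf("Dimension: %d, vocabulary size: %d\n", emb.Dimension, emb.NumWords);

% Map a word to its vector, then look up the 5 closest words in the embedding.
% NOTE: "king" is an arbitrary example query.
vec = word2vec(emb, "king");
nearest = vec2word(emb, vec, 5);
disp(nearest);
```

Running the same `word2vec`/`vec2word` queries against both embeddings is a quick, informal way to judge whether the larger model is worth the download for your application.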
