Least Frequent Words in document

Charmaine Tan on 28 Nov 2018
Answered: Snehal on 29 Jan 2025 at 10:29
If I use topkwords to find the most frequent words, what code can I use to show the 10 least frequent words?
  1 comment
KSSV on 28 Nov 2018
Read about strfind, strcmp.
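As a rough sketch of how this hint could be applied without the Text Analytics Toolbox, the snippet below counts each distinct word with strcmp and sorts the counts in ascending order. The sample sentence, the variable names, and the lowercasing and period-stripping steps are assumptions made for illustration; they are not part of the original question.

```matlab
% Hypothetical sample text (an assumption, not the asker's data)
textData = "This is a sample text. This text is for testing";

% Strip periods, lowercase, and split on whitespace into a column string array
words = split(lower(erase(textData, ".")));

% Distinct words in order of first appearance
uniqueWords = unique(words, 'stable');

% Count occurrences of each distinct word with strcmp
counts = zeros(numel(uniqueWords), 1);
for i = 1:numel(uniqueWords)
    counts(i) = sum(strcmp(words, uniqueWords(i)));
end

% Sort ascending so the least frequent words come first
[sortedCounts, idx] = sort(counts, 'ascend');
leastFrequent = table(uniqueWords(idx), sortedCounts, ...
    'VariableNames', {'Word', 'Count'});
disp(leastFrequent)
```

To keep only the 10 least frequent words on a longer text, index the first rows, for example leastFrequent(1:min(10, height(leastFrequent)), :).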


Answers (1)

Snehal on 29 Jan 2025 at 10:29
Hi,
I understand that you want to display the 10 least frequent words from a given set of words or sentences.
This can be achieved with the 'topkwords' function. Call 'topkwords' with k set to Inf so that it returns every word together with its count, sort the resulting table by 'Count' in ascending order, and take the first 10 rows.
Refer to the sample code below for a better understanding:
% Sample text data
textData = "This is a sample text. This text is for testing if our approach can display the least frequent words correctly or not";
% Before using the 'topkwords' function, convert the text into a bag-of-words model
documents = tokenizedDocument(textData);
bag = bagOfWords(documents);
% Return every word with its count (named tbl to avoid shadowing the built-in table function)
tbl = topkwords(bag, Inf);
% Sort by count in ascending order so the least frequent words come first
sortedTable = sortrows(tbl, 'Count');
% Select the 10 least frequent words
numLeastFrequent = 10;
leastFrequentWords = sortedTable.Word(1:numLeastFrequent);
leastFrequentCounts = sortedTable.Count(1:numLeastFrequent);
% Display the 10 least frequent words
disp(leastFrequentWords);
"a" "sample" "." "for" "testing" "if" "our" "approach" "can" "display"
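Note that the output above includes "." because punctuation marks are tokenized as words. A possible variation, assuming that punctuation should be ignored and that case should not matter, cleans the tokens with erasePunctuation and lower before counting. The variable names here are illustrative. Because many words tie at a count of 1, which of them land in the first 10 rows depends on the sort order rather than on any real difference in frequency.

```matlab
% Same sample text as in the answer above
textData = "This is a sample text. This text is for testing if our approach can display the least frequent words correctly or not";

documents = tokenizedDocument(textData);
documents = erasePunctuation(documents);  % drop "." and other punctuation tokens
documents = lower(documents);             % count "This" and "this" as one word

bag = bagOfWords(documents);
tbl = topkwords(bag, Inf);                % all words with their counts
sortedTbl = sortrows(tbl, 'Count');       % ascending: least frequent first

% Guard against texts with fewer than 10 distinct words
numLeastFrequent = min(10, height(sortedTbl));
leastFrequent = sortedTbl(1:numLeastFrequent, :);
disp(leastFrequent)
```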
Refer to the MATLAB documentation for the functions used above for more details.
Hope this helps.

Release

R2018b
