isVocabularyWord

Test if word is member of word embedding or encoding

Description

tf = isVocabularyWord(emb,words) tests whether the elements of words are members of the word embedding emb. The function returns a logical array containing 1 (true) where a word is a member of the word embedding and 0 (false) elsewhere. By default, the function is case sensitive.

tf = isVocabularyWord(enc,words) tests whether the elements of words are members of the word encoding enc. By default, the function is case sensitive.
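As a minimal sketch of the word encoding syntax, the following builds a small encoding from two made-up sentences (the text is illustrative only and assumes Text Analytics Toolbox is installed) and then tests three words against it.

```matlab
% Build a small word encoding from two illustrative documents.
str = ["an example of a short sentence"; "a second short sentence"];
documents = tokenizedDocument(str);
enc = wordEncoding(documents);

% "short" appears in the documents, "Short" differs only in case,
% and "fastText" does not appear at all.
words = ["short" "Short" "fastText"];
tf = isVocabularyWord(enc,words)
```

Because the default comparison is case sensitive, only the first element of tf should be true here.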

tf = isVocabularyWord(___,'IgnoreCase',true) tests whether the specified words are in the vocabulary, ignoring case. You can use this option with any of the previous syntaxes.
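To illustrate the effect of the 'IgnoreCase' option, the sketch below runs the same query twice against a small encoding built from a made-up sentence. The variable names tfSensitive and tfIgnore are chosen for illustration only.

```matlab
% Build a small word encoding whose vocabulary is all lowercase.
documents = tokenizedDocument("an example of a short sentence");
enc = wordEncoding(documents);

% Query words that differ from the vocabulary only in letter case.
words = ["Short" "SENTENCE"];

% Default behavior: case-sensitive comparison.
tfSensitive = isVocabularyWord(enc,words)

% Same query with case ignored.
tfIgnore = isVocabularyWord(enc,words,'IgnoreCase',true)
```

With the default settings neither word should match, whereas with 'IgnoreCase' set to true both should.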

Examples

Test to determine if words are members of a word embedding.

Load a pretrained word embedding using the fastTextWordEmbedding function. This function requires Text Analytics Toolbox™ Model for fastText English 16 Billion Token Word Embedding support package. If this support package is not installed, then the function provides a download link.

emb = fastTextWordEmbedding
emb = 
  wordEmbedding with properties:

     Dimension: 300
    Vocabulary: [1×999994 string]

Test if the words "I", "love", and "fastTextWordEmbedding" are in the word embedding.

words = ["I" "love" "fastTextWordEmbedding"];
tf = isVocabularyWord(emb,words)
tf = 1×3 logical array

   1   1   0
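One possible use of the result, sketched here rather than taken from this page, is to keep only the in-vocabulary words before mapping them to vectors with word2vec. The variable names knownWords and vectors are illustrative, and the sketch assumes the fastText support package described above is installed.

```matlab
% Load the pretrained embedding (requires the fastText support package).
emb = fastTextWordEmbedding;

% Test which candidate words are in the embedding vocabulary.
words = ["I" "love" "fastTextWordEmbedding"];
tf = isVocabularyWord(emb,words);

% Keep only the in-vocabulary words using logical indexing.
knownWords = words(tf);

% Map the remaining words to embedding vectors, one row per word.
vectors = word2vec(emb,knownWords);
```

Filtering first means that every row of vectors corresponds to a word that the embedding actually contains.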

Input Arguments

emb: Input word embedding, specified as a wordEmbedding object.

enc: Input word encoding, specified as a wordEncoding object.

words: Input words, specified as a string vector, character vector, or cell array of character vectors. If you specify words as a character vector, then the function treats the argument as a single word.

Data Types: string | char | cell
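The three accepted forms of words can be compared in a short sketch. The encoding and the variable names tfString, tfChar, and tfCell are illustrative; the point is that a character vector is treated as one word, even if it contains a space.

```matlab
% Build a small word encoding from an illustrative sentence.
documents = tokenizedDocument("an example of a short sentence");
enc = wordEncoding(documents);

% String vector: each element is tested separately.
tfString = isVocabularyWord(enc,["short" "sentence"])

% Character vector: the whole vector is treated as the single word
% 'short sentence', which is not a vocabulary entry.
tfChar = isVocabularyWord(enc,'short sentence')

% Cell array of character vectors: each cell is tested separately.
tfCell = isVocabularyWord(enc,{'short','sentence'})
```

To test several words held in one character vector, split them into a string vector or cell array first.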

Version History

Introduced in R2018b