doclength

Length of documents in document array

Description

N = doclength(documents) returns the number of tokens in each document in documents.

Examples

Find the number of words in an array of tokenized documents. Erase the punctuation characters so they do not get counted as words.

str = [ ...
    "An example of a short sentence." 
    "A second short sentence."];
documents = tokenizedDocument(str)
documents = 
  2x1 tokenizedDocument:

    7 tokens: An example of a short sentence .
    5 tokens: A second short sentence .

documents = erasePunctuation(documents)
documents = 
  2x1 tokenizedDocument:

    6 tokens: An example of a short sentence
    4 tokens: A second short sentence

N = doclength(documents)
N = 2×1

     6
     4

Input Arguments

documents: Input documents, specified as a tokenizedDocument array.

Output Arguments

N: Document lengths, returned as a vector of nonnegative integers. The size of N is the same as the size of documents.
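
The size relationship can be checked with a short sketch. The row-oriented input and the variable names rowDocuments and rowN are assumptions made for illustration, chosen to contrast with the column array in the example above.

```matlab
% A 1-by-2 string array gives a 1-by-2 tokenizedDocument array
% (illustrative input, not from the original example).
rowDocuments = tokenizedDocument(["a b c" "d e"]);

% The lengths should come back with the same orientation as the input.
rowN = doclength(rowDocuments);

% True when the output size matches the input size.
sameSize = isequal(size(rowN), size(rowDocuments));
```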

Version History

Introduced in R2017b