Create Word Cloud from String Arrays
This example shows how to create a word cloud from plain text by reading it into a string array, preprocessing it, and passing it to the wordcloud function. If you have Text Analytics Toolbox™ installed, then you can create word clouds directly from string arrays. For more information, see wordcloud (Text Analytics Toolbox).
Read the text from Shakespeare's Sonnets with the fileread function.
sonnets = fileread('sonnets.txt');
sonnets(1:135)
ans = 'THE SONNETS by William Shakespeare I From fairest creatures we desire increase, That thereby beauty's rose might never die,'
Convert the text to a string using the string function. Then, split it on newline characters using the splitlines function.
sonnets = string(sonnets);
sonnets = splitlines(sonnets);
sonnets(10:14)
ans = 5x1 string
" From fairest creatures we desire increase,"
" That thereby beauty's rose might never die,"
" But as the riper should by time decease,"
" His tender heir might bear his memory:"
" But thou, contracted to thine own bright eyes,"
Replace some punctuation characters with spaces.
p = ["." "?" "!" "," ";" ":"];
sonnets = replace(sonnets,p," ");
sonnets(10:14)
ans = 5x1 string
" From fairest creatures we desire increase "
" That thereby beauty's rose might never die "
" But as the riper should by time decease "
" His tender heir might bear his memory "
" But thou contracted to thine own bright eyes "
Split sonnets into a string array whose elements contain individual words. To do this, join all the string elements into a 1-by-1 string and then split on the space characters.
sonnets = join(sonnets);
sonnets = split(sonnets);
sonnets(7:12)
ans = 6x1 string
"From"
"fairest"
"creatures"
"we"
"desire"
"increase"
Remove words with fewer than five characters.
sonnets(strlength(sonnets)<5) = [];
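The preprocessing steps so far can be tried on a small input without sonnets.txt. The following is a minimal sketch that uses a hypothetical two-line sample (the variable names sample and words are illustrative and not part of the original example). It keeps the long words with logical indexing, which should give the same result as assigning [] to the short ones.

```matlab
% Hypothetical two-line sample standing in for the sonnet text.
sample = ["  From fairest creatures we desire increase,"; ...
          "  That thereby beauty's rose might never die,"];

% Replace the same punctuation characters with spaces.
p = ["." "?" "!" "," ";" ":"];
sample = replace(sample, p, " ");

% Join the lines into a 1-by-1 string, then split on whitespace.
words = split(join(sample));

% Keep only elements with at least five characters. Any empty strings
% left over from leading or repeated spaces are dropped here too.
words = words(strlength(words) >= 5);
disp(words)
```

Apostrophes are not in the punctuation list, so a word such as "beauty's" passes through unchanged.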
Convert sonnets to a categorical array and then plot using wordcloud. The function plots the unique elements of C with sizes corresponding to their frequency counts.
C = categorical(sonnets);
figure
wordcloud(C);
title("Sonnets Word Cloud")
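To see the frequency counts that drive the word sizes, one option is to tabulate the categories of the categorical array. The following is a minimal sketch using a small hypothetical word list (the variables demoWords, demoC, and T are illustrative). The same calls to categories and countcats could be applied to C.

```matlab
% Hypothetical word list; in the example above, use C instead of demoC.
demoWords = ["summer"; "beauty"; "summer"; "heart"; "summer"; "beauty"];
demoC = categorical(demoWords);

names  = categories(demoC);   % unique words, as a cell array
counts = countcats(demoC);    % number of occurrences of each word

% Sort so that the most frequent words come first.
T = table(names, counts);
T = sortrows(T, "counts", "descend");
disp(T)
```

Words near the top of the table are the ones that would be drawn largest in the word cloud.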