Erase HTML and XML tags from text



newStr = eraseTags(str) erases HTML and XML comments and tags from the elements of str.

The function erases comments and tags with the tag names a, abbr, acronym, b, bdi, bdo, big, code, del, dfn, em, font, i, ins, kbd, mark, rp, rt, ruby, s, small, span, strike, strong, sub, sup, tt, u, var, and wbr, and replaces all other tags with a space.

The function does not remove HTML and XML elements (the tags as well as anything between the start and end tags). For example, eraseTags("x<a>y</a>") returns the string "xy". It removes only the tags <a> and </a>, and does not remove the element <a>y</a>.
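
The two behaviors described above can be compared side by side. This is a minimal sketch that assumes Text Analytics Toolbox is installed; the variable names are illustrative, and the inputs are the ones used on this page.

```matlab
% Minimal sketch (assumes Text Analytics Toolbox is on the path).
% Each row exercises one rule from the description above.
str = [
    "x<a>y</a>"         % <a> is in the erased list: tags removed, content kept
    "one.<br>two"       % <br> is not in the list: tag replaced with a space
    "H<sub>2</sub>O"];  % <sub> is in the list: tags removed, no space added

newStr = eraseTags(str);

% Show each input next to its result.
disp([str newStr])
```

Because eraseTags operates on each element of str, a single call handles all three strings.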


Erase the tags from some HTML code. The function replaces the <br> tag with a space.

htmlCode = "one.<br>two";
newStr = eraseTags(htmlCode)
newStr = 
"one. two"

Erase the tags from some XML code. The function removes the <sub> tags and does not replace them with a space.

xmlCode = "H<sub>2</sub>O";
newStr = eraseTags(xmlCode)
newStr = 
"H2O"

Input Arguments

Input text, specified as a string array, character vector, or cell array of character vectors.

Example: ["An example of a short sentence."; "A second short sentence."]

Data Types: string | char | cell

Output Arguments

Output text, returned as a string array, character vector, or cell array of character vectors. str and newStr have the same data type.
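
As a quick check of the statement that str and newStr have the same data type, the following sketch passes one input of each supported type and prints the class of the input and the output. It assumes Text Analytics Toolbox is installed; the variable names inputs and outClass are illustrative only.

```matlab
% Minimal sketch (assumes Text Analytics Toolbox is on the path).
% One input per supported type: string, character vector, cell array.
inputs = {"a<b>b</b>", 'a<b>b</b>', {'a<b>b</b>'}};
outClass = strings(1, numel(inputs));

for k = 1:numel(inputs)
    str = inputs{k};
    newStr = eraseTags(str);
    outClass(k) = class(newStr);
    % Print the input class next to the output class.
    fprintf("%s -> %s\n", class(str), outClass(k));
end
```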

Introduced in R2017b