eraseTags

Erase HTML and XML tags from text

Description

newStr = eraseTags(str) erases HTML and XML comments and tags from the elements of str.

The function erases comments, as well as tags with any of the following tag names: a, abbr, acronym, b, bdi, bdo, big, code, del, dfn, em, font, i, ins, kbd, mark, rp, rt, ruby, s, small, span, strike, strong, sub, sup, tt, u, var, and wbr. It replaces all other tags with a space.
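
The following is a minimal sketch of the behaviors described above, assuming Text Analytics Toolbox is installed. The input strings and variable names are illustrative and do not come from the original page. According to the description, the em tags and the comment should be erased outright, while the hr tag should be replaced with a space.

```matlab
% Illustrative inputs (assumptions, not from the original documentation).
inlineStr  = "a<em>b</em>c";       % em is in the list of erased tag names
otherStr   = "a<hr>b";             % hr is not in the list
commentStr = "a<!-- note -->b";    % an HTML/XML comment

% Omit the semicolons to display each result for inspection.
inlineResult  = eraseTags(inlineStr)
otherResult   = eraseTags(otherStr)
commentResult = eraseTags(commentStr)
```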

Tip

The eraseTags function erases HTML and XML tags only. It does not erase HTML and XML elements. That is, the function removes tags of the form <X>, where X denotes the tag name and any attributes, but it does not remove the content that appears between opening and closing tags. For example, eraseTags("x<a>y</a>") returns the string "xy": the function removes the tags <a> and </a>, not the element <a>y</a>.
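
To make the distinction concrete, the sketch below contrasts erasing tags with erasing a whole element. The regexprep call is an assumption added for illustration and is not part of eraseTags. It handles only this simple case: no attributes and no nested <a> elements.

```matlab
str = "x<a>y</a>";

% eraseTags removes only the tags and keeps the content between them.
tagsErased = eraseTags(str)        % "xy", as stated in the tip

% Assumption: a lazy regular expression is one simple way to drop the
% whole <a> element, including its content.
elementErased = regexprep(str, "<a>.*?</a>", "")
```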

Examples

Erase the tags from some HTML code. The function replaces the <br> tag with a space.

htmlCode = "one.<br>two";
newStr = eraseTags(htmlCode)
newStr = 
"one. two"

Erase the tags from some XML code. The function removes the <sub> tags and does not replace them with a space.

xmlCode = "H<sub>2</sub>O";
newStr = eraseTags(xmlCode)
newStr = 
"H2O"

Input Arguments

str is the input text, specified as a string array, character vector, or cell array of character vectors.

Example: ["An example of a short sentence."; "A second short sentence."]

Data Types: string | char | cell

Output Arguments

newStr is the output text, returned as a string array, character vector, or cell array of character vectors. str and newStr have the same data type.
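
As a brief sketch of the supported input types, the code below passes the same kind of text in each of the three forms and checks that the output type matches the input type. The variable names are illustrative, and the input text is reused from the examples above.

```matlab
strIn  = ["one.<br>two"; "H<sub>2</sub>O"];    % string array
charIn = 'one.<br>two';                         % character vector
cellIn = {'one.<br>two'; 'H<sub>2</sub>O'};     % cell array of character vectors

strOut  = eraseTags(strIn);
charOut = eraseTags(charIn);
cellOut = eraseTags(cellIn);

% The input and output have the same data type, so each check should
% return true.
isstring(strOut)
ischar(charOut)
iscellstr(cellOut)
```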

Version History

Introduced in R2017b