Convert HTML Content to a DOM Object

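You can convert HTML content to a DOM object that you can add to a report. The HTML content can be in a string or a file. To convert HTML content that is in a string, use one of these approaches:

  • Create an mlreportgen.dom.HTML object and append it to the report.

  • Use the addHTML method of the document to convert the HTML and add it to the document.

See Convert HTML Content in a String.

To convert HTML content that is in a file, use one of these approaches:

  • Create an mlreportgen.dom.HTMLFile object and append it to the report.

  • Use the addHTMLFile method of the document to convert the file content and add it to the document.

See Convert HTML File Content.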

Prepare HTML Before Conversion

The MATLAB® Report Generator™ mlreportgen.dom.HTML and mlreportgen.dom.HTMLFile objects typically cannot accept the raw HTML that third-party applications, such as Microsoft® Word, produce when they export native documents as HTML markup. In these cases, your report generation program can use the mlreportgen.utils.html2dom.prepHTMLString and mlreportgen.utils.html2dom.prepHTMLFile functions to prepare the raw HTML for use with the mlreportgen.dom.HTML or mlreportgen.dom.HTMLFile objects. These functions:

  • Correct invalid markup by calling mlreportgen.utils.tidy with the settings for HTML output.

  • Use the MATLAB web browser to convert the tidied markup to an HTML DOM document. See https://www.w3.org/TR/WD-DOM/introduction.html.

    The MATLAB web browser computes the CSS properties of the elements in the HTML input based on internal and external style sheets specified by the input HTML, and on the style attribute of an element. The CSS property computation supports all valid CSS style sheet selectors, including selectors not directly supported by mlreportgen.dom.HTML and mlreportgen.dom.HTMLFile objects.

  • Convert the HTML DOM document to HTML markup that is supported by mlreportgen.dom.HTML and mlreportgen.dom.HTMLFile objects. The style attribute for each element specifies the element CSS properties that the MATLAB web browser computed.

Typically, your program must further process the prepared HTML to remove valid but undesirable content, such as line feeds that were in the raw content.

For an example that prepares HTML content from a file, see Prepare HTML for Conversion to a DOM Object.
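For HTML content that is in a string, the preparation and cleanup steps described above can be sketched as follows. This is a minimal sketch, not a definitive workflow: the sample markup, the variable names, and the output file name are illustrative, and the sketch assumes that prepHTMLString takes the raw HTML as its only input and returns the prepared HTML. The regexprep cleanup is one possible way to remove line feeds; your content might need different processing.

```matlab
import mlreportgen.dom.*

% Illustrative raw HTML with an internal style sheet and a line feed
% inside the paragraph text.
rawHTML = ['<html><head><style>p.note {color:blue}</style></head>' ...
    '<body><p class="note">Hello' newline 'World</p></body></html>'];

% Assumed call form: raw HTML in, prepared HTML out.
preppedHTML = mlreportgen.utils.html2dom.prepHTMLString(rawHTML);

% Further processing: replace each run of line feeds (and the white space
% around it) with a single space.
preppedHTML = regexprep(preppedHTML,'\s*[\r\n]+\s*',' ');

% Convert the prepared HTML to a DOM object and add it to a report.
d = Document('MyPreppedDoc','docx');
append(d,HTML(preppedHTML));
close(d);
```

To view the result, call rptview(d) after closing the document.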

Convert HTML Content in a String

To convert HTML content in a string to a DOM object, create an mlreportgen.dom.HTML object and add the object to the report.

import mlreportgen.dom.*;
d = Document('MyDoc','docx');
htmlObj = HTML('<p><b>Hello </b> <i style="color:green">World</i></p>');
append(d,htmlObj);
close(d);
rptview(d);

Alternatively, convert the HTML and add it to the document by using the addHTML method. The method returns an HTML object.

import mlreportgen.dom.*;
d = Document('MyDoc','docx');
addHTML(d, '<p><b>Hello </b> <i style="color:green">World</i></p>');
close(d);
rptview(d);

Once you create an HTML object, you can append more HTML content to the object. For example:

import mlreportgen.dom.*;
d = Document('MyDoc','docx');
htmlObj = HTML('<p><b>Hello </b> <i style="color:green">World</i></p>');

append(htmlObj,'<p>This is <u>me</u> speaking</p>');
append(d,htmlObj);

close(d);
rptview(d);

To append the content of an HTML object more than once in a report, use the clone method with the HTML object. Then, append the clone to the report.
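For example, this minimal sketch adds the same HTML content to a report twice. The variable name htmlCopy and the paragraph text between the two occurrences are illustrative.

```matlab
import mlreportgen.dom.*
d = Document('MyDoc','docx');
htmlObj = HTML('<p><b>Hello </b> <i style="color:green">World</i></p>');

% Clone the HTML object so that the content can appear a second time.
htmlCopy = clone(htmlObj);

% Append the original, some intervening text, and then the clone.
append(d,htmlObj);
append(d,Paragraph('Text between the two occurrences.'));
append(d,htmlCopy);

close(d);
```

To view the result, call rptview(d) after closing the document.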

Convert HTML File Content

To convert HTML file content to a DOM object, create an mlreportgen.dom.HTMLFile object and add the object to the report.

Create a file, MyHTML.html, that contains this HTML:

<html><p style="color:green;font-family:arial">Hello World</p></html>
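If you prefer to create the file from MATLAB instead of a text editor, one possible approach uses low-level file I/O. This is a minimal sketch that assumes the current folder is writable. The variable names are illustrative.

```matlab
% HTML markup to write to the file.
htmlText = '<html><p style="color:green;font-family:arial">Hello World</p></html>';

% Open MyHTML.html for writing in the current folder.
fid = fopen('MyHTML.html','w');
if fid == -1
    error('Could not open MyHTML.html for writing.');
end

% Write the markup as-is and close the file.
fprintf(fid,'%s',htmlText);
fclose(fid);
```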

Generate a PDF report based on the contents of the HTML file.

import mlreportgen.dom.*;
d = Document('MyPDF','pdf');
htmlObj = HTMLFile('MyHTML.html');
append(d,htmlObj);
close(d);
rptview(d);

Alternatively, convert the HTML and add it to the document by using the addHTMLFile method.

import mlreportgen.dom.*;
d = Document('MyPDF','pdf');
addHTMLFile(d,'MyHTML.html');
close(d);
rptview(d);
