I have a complex XML document that I want to transform into HTML. Some tags need to be replaced with HTML tags.
The XML is this:
<root>
  <div>
    <p> <em>bol text</em>, some normale text </p>
  </div>
  <list>
    <listitem> normal text inside list <em>bold inside list</em> </listitem>
    <listitem> another text in list... </listitem>
  </list>
  <p> A sample paragraph </p>
</root>
The text inside the elements varies, which means that other XML documents I parse can have completely different content.
The output I want is this (for this scenario):
<root>
  <div>
    <p> <strong>bol text</strong>, some normale text </p>
  </div>
  <ul>
    <li> normal text inside list <strong>bold inside list</strong> </li>
    <li> another text in list... </li>
  </ul>
  <p> A sample paragraph </p>
</root>
I wrote a recursive function that visits every node of the XML and replaces it with the matching HTML tag, but it doesn’t work:
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->load('section.xml');
echo $doc->saveHTML();

function printHtml(DOMNode $node)
{
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            printHtml($child);
        }
    }
    if ($node->nodeName == 'em') {
        $newNode = $node->ownerDocument->createElement('strong', $node->nodeValue);
        $node->parentNode->replaceChild($newNode, $node);
    }
    if ($node->nodeName == 'listitem') {
        $newNode = $node->ownerDocument->createElement('li', $node->nodeValue);
        $node->parentNode->replaceChild($newNode, $node);
    }
}
Can anyone help me?
This is an example of a complete XML document:
<root>
  <div>
    <p> <em>bol text</em>, some normale text </p>
  </div>
  <list>
    <listitem> normal text inside list <em>bold inside list</em> </listitem>
    <listitem> another text in list... </listitem>
  </list>
  <media>
    <info isVisible="false">
      <title>
        <p>Image title <em>in bold</em> not in bold</p>
      </title>
    </info>
    <file isVisible="true">
      <href> "path/to/file.jpg" </href>
    </file>
  </media>
  <p> A sample paragraph </p>
</root>
Which has to be transformed into:
<root>
  <div>
    <p> <strong>bol text</strong>, some normale text </p>
  </div>
  <ul>
    <li> normal text inside list <strong>bold inside list</strong> </li>
    <li> another text in list... </li>
  </ul>
  <!-- the media tag can be presented in two modes: with the title visible, and with the title hidden -->
  <!-- this is the case when the title is hidden -->
  <img src="path/to/file.jpg" />
  <!-- this is the case when the title is visible -->
  <!-- the info tag (inside the media tag) has an attribute isVisible="false", which means it must not be shown. -->
  <!-- if the info tag has isVisible="true", the media tag must be translated into
  <div>
    <img src="path/to/file.jpg" />
    <p>Image title <strong>in bold</strong> not in bold</p>
  </div>
  -->
  <p> A sample paragraph </p>
</root>
Answer
There’s a language specially designed for this task: it’s called XSLT, and you can easily express your desired transformation in XSLT and invoke it from your PHP program. There’s a learning curve, of course, but it’s a much better solution than writing low-level DOM code.
In XSLT you write a set of template rules saying how individual elements should be handled. Many elements in your example are copied through unchanged, so you can start with a default rule that does this:
<xsl:template match="*">
  <xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>
The “match” part says what part of the input you are matching; the body of the rule says what output to produce. The xsl:apply-templates does a recursive descent to process the children of the current element.
Some of your elements are simply renamed, for example:
<xsl:template match="listitem">
  <li><xsl:apply-templates/></li>
</xsl:template>
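To try rules like these from PHP, one option is the XSLTProcessor class from PHP's xsl extension. The following is a minimal sketch rather than a complete solution: it assumes the extension is installed, embeds a cut-down version of the sample input as a string, and adds `em` and `list` rules that follow the same renaming pattern as the `listitem` rule. The variable names and the choice of XML output without a declaration are illustrative assumptions.

```php
<?php
// Sketch: run the default copy rule plus three rename rules from PHP.
// Assumes the PHP "xsl" extension (class XSLTProcessor) is available.

$xslt = <<<'STYLESHEET'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- assumption: serialize as XML, without the <?xml ...?> declaration -->
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- default rule: copy the element, then process its children -->
  <xsl:template match="*">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <!-- rename rules: em -> strong, list -> ul, listitem -> li -->
  <xsl:template match="em">
    <strong><xsl:apply-templates/></strong>
  </xsl:template>
  <xsl:template match="list">
    <ul><xsl:apply-templates/></ul>
  </xsl:template>
  <xsl:template match="listitem">
    <li><xsl:apply-templates/></li>
  </xsl:template>
</xsl:stylesheet>
STYLESHEET;

// Cut-down sample input (hypothetical; the real data would come from a file).
$xml = '<root><list><listitem>normal text inside list '
     . '<em>bold inside list</em></listitem></list></root>';

$source = new DOMDocument();
$source->loadXML($xml);

$stylesheet = new DOMDocument();
$stylesheet->loadXML($xslt);

$processor = new XSLTProcessor();
$processor->importStylesheet($stylesheet);

// transformToXml() returns the serialized result as a string.
$html = (string) $processor->transformToXml($source);
echo $html;
```

Unlike the recursive DOM function in the question, the renamed elements keep their nested markup here, because each rule calls xsl:apply-templates on its children instead of copying only the text value.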
Some of the rules are a little bit more complex, but still easily expressed:
<xsl:template match="media/file[@isVisible='true']">
  <img src="{href}"/>
</xsl:template>
I hope you agree that this declarative rule-based approach is much clearer than your procedural code; it’s also much easier for someone else to change the rules in six months’ time.
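For the media case, the question also asks for a wrapping div and the title when the info element is visible, and the sample href value carries surrounding spaces and literal quotes. The sketch below is one possible reading of those requirements, not the stylesheet the answer has in mind: the `media` rule, the `transform` helper, and the use of `normalize-space` and `translate` to clean up the href are all assumptions added for illustration, and the code again assumes PHP's xsl extension.

```php
<?php
// Sketch only: one possible set of rules for <media>, under the assumptions
// stated above. Requires the PHP "xsl" extension.

$xslt = <<<'STYLESHEET'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- default rule: copy the element, then process its children -->
  <xsl:template match="*">
    <xsl:copy><xsl:apply-templates/></xsl:copy>
  </xsl:template>

  <xsl:template match="em">
    <strong><xsl:apply-templates/></strong>
  </xsl:template>

  <!-- media: wrap image and title in a div only when info is visible -->
  <xsl:template match="media">
    <xsl:choose>
      <xsl:when test="info/@isVisible = 'true'">
        <div>
          <xsl:apply-templates select="file"/>
          <xsl:apply-templates select="info/title/*"/>
        </div>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="file"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- trim whitespace and remove the literal quotes around the path -->
  <xsl:template match="media/file[@isVisible='true']">
    <img src="{translate(normalize-space(href), '&quot;', '')}"/>
  </xsl:template>
</xsl:stylesheet>
STYLESHEET;

// Hypothetical helper: apply a stylesheet string to an XML string.
function transform(string $xml, string $xslt): string
{
    $source = new DOMDocument();
    $source->loadXML($xml);

    $stylesheet = new DOMDocument();
    $stylesheet->loadXML($xslt);

    $processor = new XSLTProcessor();
    $processor->importStylesheet($stylesheet);

    return (string) $processor->transformToXml($source);
}

// The media fragment from the question, once with the title hidden...
$hidden = '<root><media><info isVisible="false"><title>'
        . '<p>Image title <em>in bold</em> not in bold</p></title></info>'
        . '<file isVisible="true"><href> "path/to/file.jpg" </href></file>'
        . '</media></root>';
// ...and once with the title visible.
$visible = str_replace('<info isVisible="false">', '<info isVisible="true">', $hidden);

$hiddenHtml = transform($hidden, $xslt);
$visibleHtml = transform($visible, $xslt);
echo $hiddenHtml, $visibleHtml;
```

A file element with isVisible="false" is not covered by this sketch; it would fall through to the default copy rule, so a real stylesheet would probably need one more rule for that case.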