I’ve a complex XML that I want to transform in HTML. Some tags need to be replaced in html tags.
The XML is this:
<root>
<div>
<p>
<em>bol text</em>, some normale text
</p>
</div>
<list>
<listitem>
normal text inside list <em>bold inside list</em>
</listitem>
<listitem>
another text in list
</listitem>
</list>
<p>
A sample paragraph
</p>
The text inside the element is variable, which means that the other xml that I parse can completely change.
The output I want is this (for this scenario):
<root>
<div>
<p>
<strong>bol text</strong>, some normale text
</p>
</div>
<ul>
<li>
normal text inside list <strong>bold inside list</strong>
</li>
<li>
another text in list
</li>
</ul>
<p>
A sample paragraph
</p>
</root>
I make a recursive function for parse any single node of xml and replace it in HTML tag (but doesn’t work):
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->load('section.xml');
echo $doc->saveHTML();
function printHtml(DOMNode $node)
{
if ($node->hasChildNodes())
{
foreach ($node->childNodes as $child)
{
printHtml($child);
}
}
if ($node->nodeName == 'em')
{
$newNode = $node->ownerDocument->createElement('strong', $node->nodeValue);
$node->parentNode->replaceChild($newNode, $node);
}
if ($node->nodeName == 'listitem')
{
$newNode = $node->ownerDocument->createElement('li', $node->nodeValue);
$node->parentNode->replaceChild($newNode, $node);
}
}
Can anyone help me?
This is an example of a complete xml:
<root>
<div>
<p>
<em>bol text</em>, some normale text
</p>
</div>
<list>
<listitem>
normal text inside list <em>bold inside list</em>
</listitem>
<listitem>
another text in list
</listitem>
</list>
<media>
<info isVisible="false">
<title>
<p>Image title <em>in bold</em> not in bold</p>
</title>
</info>
<file isVisible="true">
<href>
"path/to/file.jpg"
</href>
</file>
</media>
<p>
A sample paragraph
</p>
</root>
Which has to be transformed into:
<root>
<div>
<p>
<strong>bol text</strong>, some normale text
</p>
</div>
<ul>
<li>
normal text inside list <em>bold inside list</em>
</li>
<li>
another text in list
</li>
</ul>
<!-- the media tag can be presented in two mode: with title visible, and title hidden -->
<!-- this is the case when the title is hidden -->
<img src="path/to/file.jpg" />
<!-- this is the case when the title is visible -->
<!-- the info tag (inside media tag) has an attribute isVisible="false" which means it doesn't have to be shown. -->
<!-- if the info tag has visible=true, the media tag must be translated into
<div>
<img src="path/to/file.jpg" />
<p>Image title <strong>in bold</strong> not in bold</p>
<div>
-->
<p>
A sample paragraph
</p>
</root>
Advertisement
Answer
There’s a language specially designed for this task: it’s called XSLT, and you can easily express your desired transformation in XSLT and invoke it from your PHP program. There’s a learning curve, of course, but it’s a much better solution than writing low-level DOM code.
In XSLT you write a set of template rules saying how individual elements should be handled. Many elements in your example are copied through unchanged, so you can start with a default rule that does this:
<xsl:template match="*">
<xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>
The “match” part says what part of the input you are matching; the body of the rule says what output to produce. The xsl:apply-templates does a recursive descent to process the children of the current element.
Some of your elements are simply renamed, for example
<xsl:template match="listitem">
<li><xsl:apply-templates/></li>
</xsl:template>
Some of the rules are a little bit more complex, but still easily expressed:
<xsl:tempate match="media/file[@isVisible='true']">
<img src="{href}"/>
</xsl:template>
I hope you agree that this declarative rule-based approach is much clearer than your procedural code; it’s also much easier for someone else to change the rules in six months’ time.