
PHP SimpleXMLElement xpath returns unexpected results

I have executed the following code on the sample XML at the bottom of the question and I am getting unexpected results.

$xml = simplexml_load_string($xml_string);
$addresses = $xml->response->addressinformation;
var_dump($addresses->xpath('//record'));

I would expect this to return only the two record nodes that sit under the current $addresses node. Instead, it returns all 5 record nodes of the original $xml element. Everything I have read says that the // notation is relative to the current node. I realize there are other ways to get just the two records I've referenced in the question; $addresses->xpath('records/record'); is one example. But this behavior is part of a larger problem I'm having, and I need to understand why it happens. Can anyone help me understand?

Sample XML

$xml_string = '<?xml version="1.0" encoding="utf-8"?>
<root>
    <response>
        <addressinformation>
            <records>
                <record id="1">
                    <fullname>JOHN E DOE</fullname>
                    <firstname>JOHN</firstname>
                    <middlename>E</middlename>
                    <lastname>DOE</lastname>
                    <fulldob>01/01/1970</fulldob>
                </record>
                <record id="2">
                    <fullname>JOHN E DOE</fullname>
                    <firstname>JOHN</firstname>
                </record>
            </records>
        </addressinformation>
        <otherinformation>
            <records>
                <record id="3">
                    <fullname>JOHN DOE</fullname>
                    <firstname>JOHN</firstname>
                    <lastname>DOE</lastname>
                    <fulldob>01/01/1970</fulldob>
                </record>
                <record id="4">
                    <fullname>JOHN EDWARD DOE</fullname>
                    <firstname>JOHN</firstname>
                    <middlename>EDWARD</middlename>
                    <lastname>DOE</lastname>
                    <fulldob>19700000</fulldob>
                </record>
                <record id="5">
                    <fullname>JOHN EDWARD DOE</fullname>
                    <firstname>JOHN</firstname>
                    <middlename>EDWARD</middlename>
                    <lastname>DOE</lastname>
                    <fulldob>19830000</fulldob>
                </record>
            </records>
        </otherinformation>
    </response>
</root>
';


Answer

According to https://www.w3.org/TR/1999/REC-xpath-19991116/:

//para selects all the para descendants of the document root and thus selects all para elements in the same document as the context node

and

.//para selects the para element descendants of the context node

Note the leading dot in the second expression. It works in your case too:

var_dump($addresses->xpath('.//record'));

This returns only the two nodes you are expecting.
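
To see the difference side by side, here is a minimal sketch. It assumes a trimmed-down copy of the sample XML (same structure, record bodies left out), and record_ids() is a hypothetical helper written for this illustration, not part of SimpleXML:

```php
<?php
// Trimmed-down version of the sample XML: same structure, record bodies omitted.
$xml_string = '<root><response>'
    . '<addressinformation><records>'
    . '<record id="1"/><record id="2"/>'
    . '</records></addressinformation>'
    . '<otherinformation><records>'
    . '<record id="3"/><record id="4"/><record id="5"/>'
    . '</records></otherinformation>'
    . '</response></root>';

$xml = simplexml_load_string($xml_string);
$addresses = $xml->response->addressinformation;

// Hypothetical helper (not part of SimpleXML): collect the id attribute of each match.
function record_ids(array $nodes): array
{
    return array_map(fn($node) => (string) $node['id'], $nodes);
}

// Absolute path: starts at the document root, whatever the context node is.
$absolute = record_ids($addresses->xpath('//record'));
// Relative paths: start at the context node ($addresses).
$relative = record_ids($addresses->xpath('.//record'));
$explicit = record_ids($addresses->xpath('records/record'));

echo implode(',', $absolute), "\n"; // 1,2,3,4,5
echo implode(',', $relative), "\n"; // 1,2
echo implode(',', $explicit), "\n"; // 1,2
```

The arrow function needs PHP 7.4 or later; on older versions a regular closure does the same job.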

The reason is that every SimpleXMLElement you reach by navigating, such as $addresses, is still a node inside the document that simplexml_load_string() parsed. xpath() uses that node as the context node, but a path starting with / or // is absolute: it begins at the root of that document, not at the context node. Once you think of the object as a node within the whole document rather than as a detached fragment, this all makes sense.
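
You can check the document-root point directly. The following is only a sketch on a cut-down, single-record version of the sample XML: an absolute path evaluated from the sub-element still reaches the top-level element, while a relative path stays at the sub-element.

```php
<?php
// Cut-down version of the sample XML with a single record.
$xml = simplexml_load_string(
    '<root><response><addressinformation><records>'
    . '<record id="1"/>'
    . '</records></addressinformation></response></root>'
);
$addresses = $xml->response->addressinformation;

// "/*" is absolute: it selects the top-level element of the document
// that $addresses belongs to.
$top = $addresses->xpath('/*');
echo $top[0]->getName(), "\n"; // root

// "." is relative: it selects the context node itself.
$self = $addresses->xpath('.');
echo $self[0]->getName(), "\n"; // addressinformation
```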

However, I agree that this behaviour does not appear to be spelled out in the PHP docs, so I recommend you suggest an edit there.

User contributions licensed under: CC BY-SA