Getting the text value of the last children with PHP DOM-XML takes too long

This post is related to an earlier one: Increase performance of PHP DOM-XML. Currently takes too long time. It may help to read that post first before delving into this one.

I have an array which contains 7000+ values.

This foreach loop gets the text value inside the <mrk> tags in a given XML file.

It takes approximately 45 seconds. I can't wait any longer than 5 seconds.

What is the fastest way to achieve this?

Segment of the XML file:

Anyway, I’m doing this on an M1 Mac

Answer

There are a couple of things you can do here to speed up your processing. First, you are currently running an XPath query against the entire document for each ID you are looking for, so the total cost grows with both the size of the document and the number of IDs. With 7000+ IDs, that means 7000+ full scans of the document. It would be more efficient to loop through the document once and test the person-name attribute of each unit element to see if it is in your list of IDs to extract data for. That change alone will give you a decent speedup.
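The single-pass idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample XML, the `$ids` list, and the nesting of `<mrk>` inside `<unit>` are made up for the example, since the original file and loop are not shown here. The ID list is flipped into array keys so that each membership test is a constant-time `isset()` rather than a scan with `in_array()`.

```php
<?php
// Hypothetical sample document; the real file's structure may differ.
$xml = <<<'XML'
<file>
  <unit person-name="a1"><segment><target><mrk>Hello</mrk></target></segment></unit>
  <unit person-name="b2"><segment><target><mrk>Skipped</mrk></target></segment></unit>
  <unit person-name="c3"><segment><target><mrk>World</mrk></target></segment></unit>
</file>
XML;

// Hypothetical list of IDs to extract (7000+ in the question).
$ids = ['a1', 'c3'];

// Flip values into keys so lookups use isset() instead of in_array().
$wanted = array_flip($ids);

$doc = new DOMDocument();
$doc->loadXML($xml);

$result = [];
// Walk every <unit> exactly once instead of querying once per ID.
foreach ($doc->getElementsByTagName('unit') as $unit) {
    $id = $unit->getAttribute('person-name');
    if (!isset($wanted[$id])) {
        continue; // not one of the IDs we care about
    }
    // Collect the text of each <mrk> inside this unit.
    foreach ($unit->getElementsByTagName('mrk') as $mrk) {
        $result[$id][] = $mrk->textContent;
    }
}

print_r($result);
```

With the sample data above, `$result` holds the `<mrk>` text for `a1` and `c3` only. The document is still loaded fully into memory, so this addresses the repeated-query cost but not memory use.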

However, at that point XPath is not really doing much for you, so you might as well use XMLReader to parse the document efficiently without having to load the whole thing into memory. The code is more complex, which makes it more error-prone and harder to understand, but if you need to process large XML documents efficiently, a streaming parser is the right tool.
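A minimal XMLReader sketch of the same extraction might look like the following. As before, the sample XML and ID list are hypothetical, and the code assumes that `<mrk>` elements are not nested inside one another. For a real file you would call `$reader->open($path)` instead of `$reader->XML($xml)`.

```php
<?php
// Hypothetical sample document; the real file's structure may differ.
$xml = <<<'XML'
<file>
  <unit person-name="a1"><segment><target><mrk>Hello</mrk></target></segment></unit>
  <unit person-name="b2"><segment><target><mrk>Skipped</mrk></target></segment></unit>
  <unit person-name="c3"><segment><target><mrk>World</mrk></target></segment></unit>
</file>
XML;

// Hypothetical list of IDs, flipped into keys for fast isset() lookups.
$wanted = array_flip(['a1', 'c3']);

$reader = new XMLReader();
$reader->XML($xml); // for a file on disk: $reader->open($path)

$result = [];
$currentId = null; // person-name of the wanted <unit> we are inside, else null

// Stream through the document node by node; nothing is kept but the matches.
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT) {
        if ($reader->localName === 'unit') {
            $id = $reader->getAttribute('person-name');
            // Track the unit only if it has content and its ID is wanted.
            $currentId = (!$reader->isEmptyElement && $id !== null && isset($wanted[$id]))
                ? $id
                : null;
        } elseif ($reader->localName === 'mrk' && $currentId !== null) {
            // readString() returns the element's text without moving the cursor.
            $result[$currentId][] = $reader->readString();
        }
    } elseif ($reader->nodeType === XMLReader::END_ELEMENT
        && $reader->localName === 'unit') {
        $currentId = null; // left the current <unit>
    }
}
$reader->close();

print_r($result);
```

The state variable `$currentId` is what replaces the XPath context: it records whether the cursor is currently inside a unit of interest, which is the extra bookkeeping a streaming parser requires.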

The speed difference between looping mechanisms in PHP is insignificant compared to the difference you could see between your current XPath approach and using a streaming parser.

User contributions licensed under: CC BY-SA