Creating large XML files with PHP

When creating large XML files with PHP, scalability needs some thought. Several libraries, such as DOMDocument, work well for XML files of small to intermediate size, but for very large files (e.g. over 500MB, or several million elements) they stop being useful, because the size of the file they can create is bound by available memory.

For example, DOMDocument stores the XML tree in memory while it is being built - you then flush it out to file after all elements have been created:

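$dom = new DOMDocument('1.0');
// A document can only have one root, so wrap the messages in a container element
$root = $dom->createElement('messages');
$dom->appendChild($root);
for ($i = 0; $i <= 10000; ++$i) {
    $message = $dom->createElement('message');
    $content = $dom->createElement('content');
    $content->appendChild($dom->createTextNode('Example content'));
    $message->appendChild($content);
    $root->appendChild($message);
}
// Flush XML from memory to file in one go
file_put_contents('example.xml', $dom->saveXML());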

However, this doesn’t scale once your feed size exceeds the available memory (tweaking memory settings in php.ini is only a short-term fix). A good solution is to use the XMLWriter library, which lets you periodically flush the XML held in memory out to file. Each flush reclaims that memory, so you can keep writing elements without hitting the memory limit.
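To make the flushing behaviour concrete, here is a minimal sketch of what flush(true) does on an in-memory writer. The variable names are only illustrative; the point is that the call hands back whatever is buffered and empties the buffer, so a second call with nothing new written returns an empty string.

```php
<?php
$writer = new XMLWriter();
$writer->openMemory();                               // buffer output in memory
$writer->writeElement('content', 'Example content');

$first  = $writer->flush(true);  // returns the buffered XML and empties the buffer
$second = $writer->flush(true);  // nothing written since the last flush, so empty

echo strlen($first), ' bytes flushed, then ', strlen($second), " bytes\n";
```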

$xmlWriter = new XMLWriter();
$xmlWriter->openMemory();
$xmlWriter->startDocument('1.0', 'UTF-8');
// Wrap everything in a single root element so the output is well-formed
$xmlWriter->startElement('messages');
for ($i = 0; $i <= 10000000; ++$i) {
    $xmlWriter->writeElement('content', 'Example content');
    // Flush XML in memory to file every 1000 iterations
    if (0 == $i % 1000) {
        file_put_contents('example.xml', $xmlWriter->flush(true), FILE_APPEND);
    }
}
$xmlWriter->endElement();
$xmlWriter->endDocument();
// Final flush to make sure we haven't missed anything
file_put_contents('example.xml', $xmlWriter->flush(true), FILE_APPEND);

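Here we flush the XML in memory to file every 1000 iterations. This caps memory usage at roughly one batch of elements and makes it possible to create very large XML files. Note that because FILE_APPEND is used, any existing example.xml should be removed before running the script.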
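A possible variation, sketched here rather than taken from the example above, is to let XMLWriter write to the file itself via openUri(), so that flush() pushes the buffer straight to disk and no file_put_contents() call is needed. The function name writeLargeXml, its parameters and the root element name are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical helper: stream $count <content> elements to the file at $path,
// flushing the writer's buffer to disk every $batchSize elements.
function writeLargeXml(string $path, int $count, int $batchSize = 1000): void
{
    $writer = new XMLWriter();
    if (!$writer->openUri($path)) {
        throw new RuntimeException("Cannot open $path for writing");
    }
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('messages'); // assumed root element name
    for ($i = 1; $i <= $count; ++$i) {
        $writer->writeElement('content', 'Example content');
        if (0 === $i % $batchSize) {
            $writer->flush(); // writes the buffered XML to $path and empties the buffer
        }
    }
    $writer->endElement();
    $writer->endDocument();
    $writer->flush(); // final flush for the closing tag and any partial batch
}

writeLargeXml(sys_get_temp_dir() . '/example.xml', 5000);
```

Because the writer owns the file here, the output is truncated on each run instead of appended to, which sidesteps the stale-file issue that FILE_APPEND can cause.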



Copyright © 2005-2024 David Winterbottom
Content licensed under CC BY-NC-SA 4.0.