<?xml version="1.0" encoding="utf-8" ?>

<rss version="2.0" 
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:admin="http://webns.net/mvcb/"
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
   xmlns:wfw="http://wellformedweb.org/CommentAPI/"
   xmlns:content="http://purl.org/rss/1.0/modules/content/"
   >
<channel>
    <title>Code in the hole (Entries tagged as xml)</title>
    <link>http://codeinthehole.com/</link>
    <description>David Winterbottom</description>
    <dc:language>en</dc:language>
    <generator>Serendipity 1.3.1 - http://www.s9y.org/</generator>
    
    

<item>
    <title>Creating (very) large XML files with PHP</title>
    <link>http://codeinthehole.com/archives/3-Creating-very-large-XML-files-with-PHP.html</link>
            <category>Tidbits</category>
    
    <comments>http://codeinthehole.com/archives/3-Creating-very-large-XML-files-with-PHP.html#comments</comments>
    <wfw:comment>http://codeinthehole.com/wfwcomment.php?cid=3</wfw:comment>

    <slash:comments>3</slash:comments>
    <wfw:commentRss>http://codeinthehole.com/rss.php?version=2.0&amp;type=comments&amp;cid=3</wfw:commentRss>
    

    <author>nospam@example.com (David Winterbottom)</author>
    <content:encoded>
    &lt;p&gt;When creating large XML files with PHP, there are some important scalability considerations to bear in mind.  There are several libraries available for writing XML files of small to intermediate size (such as &lt;a href=&quot;http://uk2.php.net/book.dom&quot; title=&quot;DOMDocument&quot;&gt;DOMDocument&lt;/a&gt;), but when dealing with very large files (e.g. &gt; 500MB, or several million elements), these libraries are no longer useful, as the size of the file they can create is memory-bound.&lt;/p&gt;
&lt;p&gt;
For example, DOMDocument stores the XML tree in memory while it is being built - you then flush it out to file after all elements have been created: &lt;/p&gt;
&lt;div class=&quot;php&quot; style=&quot;text-align: left&quot;&gt;&lt;span style=&quot;color: #0000ff;&quot;&gt;$dom&lt;/span&gt; = &lt;span style=&quot;color: #000000; font-weight: bold;&quot;&gt;new&lt;/span&gt; DOMDocument&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #ff0000;&quot;&gt;&#039;1.0&#039;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&lt;span style=&quot;color: #b1b100;&quot;&gt;for&lt;/span&gt; &lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #0000ff;&quot;&gt;$i&lt;/span&gt;=&lt;span style=&quot;color: #cc66cc;&quot;&gt;0&lt;/span&gt;; &lt;span style=&quot;color: #0000ff;&quot;&gt;$i&lt;/span&gt;&amp;lt;=&lt;span style=&quot;color: #cc66cc;&quot;&gt;10000&lt;/span&gt;; ++&lt;span style=&quot;color: #0000ff;&quot;&gt;$i&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt; &lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#123;&lt;/span&gt;&lt;br /&gt;&amp;#160; &amp;#160; &lt;span style=&quot;color: #0000ff;&quot;&gt;$root&lt;/span&gt; = &lt;span style=&quot;color: #0000ff;&quot;&gt;$dom&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;createElement&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #ff0000;&quot;&gt;&#039;message&#039;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&amp;#160; &amp;#160; &lt;span style=&quot;color: #0000ff;&quot;&gt;$dom&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;appendChild&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #0000ff;&quot;&gt;$root&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&amp;#160; &amp;#160; &lt;span style=&quot;color: #0000ff;&quot;&gt;$content&lt;/span&gt; = &lt;span 
style=&quot;color: #0000ff;&quot;&gt;$dom&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;createElement&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #ff0000;&quot;&gt;&#039;content&#039;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&amp;#160; &amp;#160; &lt;span style=&quot;color: #0000ff;&quot;&gt;$root&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;appendChild&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #0000ff;&quot;&gt;$content&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&amp;#160; &amp;#160; &lt;span style=&quot;color: #0000ff;&quot;&gt;$content&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;appendChild&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #0000ff;&quot;&gt;$dom&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;createTextNode&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #ff0000;&quot;&gt;&#039;Example content&#039;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#125;&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;color: #808080; font-style: italic;&quot;&gt;// Flush XML from memory to file in one go&lt;/span&gt;&lt;br /&gt;file_put_contents&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #ff0000;&quot;&gt;&#039;example.xml&#039;&lt;/span&gt;, &lt;span style=&quot;color: #0000ff;&quot;&gt;$dom&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;saveXML&lt;/span&gt;&lt;span style=&quot;color: 
#66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;/div&gt;
&lt;p&gt;
However, this doesn&#039;t scale once your feed size starts exceeding the available memory (tweaking memory settings in php.ini is only a short-term fix).  A good solution is to use the &lt;a href=&quot;http://www.php.net/xmlwriter&quot; title=&quot;XMLWriter&quot;&gt;XMLWriter&lt;/a&gt; library, as it lets you periodically flush the XML held in memory out to file.  By doing so, you reclaim the memory and can keep writing XML without exceeding memory limits. &lt;/p&gt;
&lt;div class=&quot;php&quot; style=&quot;text-align: left&quot;&gt;&lt;span style=&quot;color: #0000ff;&quot;&gt;$xmlWriter&lt;/span&gt; = &lt;span style=&quot;color: #000000; font-weight: bold;&quot;&gt;new&lt;/span&gt; XMLWriter&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&lt;span style=&quot;color: #0000ff;&quot;&gt;$xmlWriter&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;openMemory&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&lt;span style=&quot;color: #0000ff;&quot;&gt;$xmlWriter&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;startDocument&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #ff0000;&quot;&gt;&#039;1.0&#039;&lt;/span&gt;, &lt;span style=&quot;color: #ff0000;&quot;&gt;&#039;UTF-8&#039;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&lt;span style=&quot;color: #b1b100;&quot;&gt;for&lt;/span&gt; &lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #0000ff;&quot;&gt;$i&lt;/span&gt;=&lt;span style=&quot;color: #cc66cc;&quot;&gt;0&lt;/span&gt;; &lt;span style=&quot;color: #0000ff;&quot;&gt;$i&lt;/span&gt;&amp;lt;=&lt;span style=&quot;color: #cc66cc;&quot;&gt;10000000&lt;/span&gt;; ++&lt;span style=&quot;color: #0000ff;&quot;&gt;$i&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt; &lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#123;&lt;/span&gt;&lt;br /&gt;&amp;#160; &amp;#160; &lt;span style=&quot;color: #0000ff;&quot;&gt;$xmlWriter&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;startElement&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span 
style=&quot;color: #ff0000;&quot;&gt;&#039;message&#039;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&amp;#160; &amp;#160; &lt;span style=&quot;color: #0000ff;&quot;&gt;$xmlWriter&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;writeElement&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #ff0000;&quot;&gt;&#039;content&#039;&lt;/span&gt;, &lt;span style=&quot;color: #ff0000;&quot;&gt;&#039;Example content&#039;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&amp;#160; &amp;#160; &lt;span style=&quot;color: #0000ff;&quot;&gt;$xmlWriter&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;endElement&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&amp;#160; &amp;#160; &lt;span style=&quot;color: #808080; font-style: italic;&quot;&gt;// Flush XML in memory to file every 1000 iterations&lt;/span&gt;&lt;br /&gt;&amp;#160; &amp;#160; &lt;span style=&quot;color: #b1b100;&quot;&gt;if&lt;/span&gt; &lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #cc66cc;&quot;&gt;0&lt;/span&gt; == &lt;span style=&quot;color: #0000ff;&quot;&gt;$i&lt;/span&gt;%&lt;span style=&quot;color: #cc66cc;&quot;&gt;1000&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt; &lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#123;&lt;/span&gt;&lt;br /&gt;&amp;#160; &amp;#160; &amp;#160; &amp;#160; file_put_contents&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #ff0000;&quot;&gt;&#039;example.xml&#039;&lt;/span&gt;, &lt;span style=&quot;color: #0000ff;&quot;&gt;$xmlWriter&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;flush&lt;/span&gt;&lt;span style=&quot;color: 
#66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #000000; font-weight: bold;&quot;&gt;true&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;, FILE_APPEND&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;br /&gt;&amp;#160; &amp;#160; &lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#125;&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#125;&lt;/span&gt;&lt;br /&gt;&lt;span style=&quot;color: #808080; font-style: italic;&quot;&gt;// Final flush to make sure we haven&#039;t missed anything&lt;/span&gt;&lt;br /&gt;file_put_contents&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #ff0000;&quot;&gt;&#039;example.xml&#039;&lt;/span&gt;, &lt;span style=&quot;color: #0000ff;&quot;&gt;$xmlWriter&lt;/span&gt;-&amp;gt;&lt;span style=&quot;color: #006600;&quot;&gt;flush&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#40;&lt;/span&gt;&lt;span style=&quot;color: #000000; font-weight: bold;&quot;&gt;true&lt;/span&gt;&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;, FILE_APPEND&lt;span style=&quot;color: #66cc66;&quot;&gt;&amp;#41;&lt;/span&gt;;&lt;/div&gt;
&lt;p&gt;
Here we flush the XML in memory to file every 1000 iterations.  This keeps memory usage capped and opens up the possibility of creating very large XML files.&lt;/p&gt; 
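&lt;p&gt;
As a variation on the example above, here is a minimal sketch that opens the output file directly with XMLWriter::openUri(), so that each flush() call writes the buffer to disk without going through file_put_contents().  The writeMessages() helper, its parameters and the messages root element are assumptions made for illustration; the root element is only there so that the resulting file has a single top-level element.&lt;/p&gt;

```php
// Hypothetical helper (not from the original post): streams $count
// message elements to $path, flushing every $flushEvery elements.
function writeMessages(string $path, int $count, int $flushEvery = 1000): void
{
    $xmlWriter = new XMLWriter();
    // With openUri(), flush() writes the buffered XML straight to the file.
    if (!$xmlWriter->openUri($path)) {
        throw new RuntimeException("Cannot open $path for writing");
    }
    $xmlWriter->startDocument('1.0', 'UTF-8');
    // Assumed root element name, so the document has one top-level element.
    $xmlWriter->startElement('messages');
    for ($remaining = $count; $remaining > 0; --$remaining) {
        $xmlWriter->startElement('message');
        $xmlWriter->writeElement('content', 'Example content');
        $xmlWriter->endElement();
        // Empty the buffer to disk every $flushEvery elements.
        if (0 === $remaining % $flushEvery) {
            $xmlWriter->flush();
        }
    }
    $xmlWriter->endElement();
    $xmlWriter->endDocument();
    // Final flush to write whatever is still buffered.
    $xmlWriter->flush();
}

writeMessages('example.xml', 10000);
```

&lt;p&gt;
A smaller flush interval should lower peak memory at the cost of more write calls; the default of 1000 simply mirrors the example above.&lt;/p&gt;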
    </content:encoded>

    <pubDate>Wed, 29 Oct 2008 22:37:51 +0000</pubDate>
    <guid isPermaLink="false">http://codeinthehole.com/archives/3-guid.html</guid>
    <category>php</category>
<category>xml</category>

</item>

</channel>
</rss>