Now let’s do some serious XML output in Python using cElementTree.
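# this is included in Python 2.5+
# (cElementTree was removed in Python 3.9; there, import from xml.etree.ElementTree instead)
from xml.etree.cElementTree import ElementTree, Element, dump
# let's create the root element
root = Element("teabag")
# give it a child with an attribute
child1 = Element("spam")
child1.attrib["name"] = "value"
root.append(child1)
# and a child with text content
child2 = Element("eggs")
child2.text = "spam and eggs"
root.append(child2)
# print the whole thing to stdout
dump(root)
# or to a file (the file name is just an example)
ElementTree(root).write("teabag.xml")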
See the author’s website for cElementTree downloads and usage information.
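For what it's worth, here is a slightly more compact sketch of the same teabag tree. It relies on `SubElement`, which creates a child and attaches it to its parent in one call, and on `tostring` to get the result as a string. The `build_teabag` helper is a name made up for this example, and the import uses `xml.etree.ElementTree` on the assumption that you run a recent Python 3, where cElementTree no longer exists.

```python
# Assumption: Python 3, where xml.etree.ElementTree replaces cElementTree
# and picks up the C accelerator automatically when it is available.
from xml.etree.ElementTree import Element, SubElement, tostring


def build_teabag():
    """Build the <teabag> tree from the example above (hypothetical helper)."""
    root = Element("teabag")
    # SubElement creates the child and appends it to the parent in one call;
    # keyword arguments become attributes
    SubElement(root, "spam", name="value")
    eggs = SubElement(root, "eggs")
    eggs.text = "spam and eggs"
    return root


if __name__ == "__main__":
    # tostring() returns bytes by default; encoding="unicode" returns a str
    print(tostring(build_teabag(), encoding="unicode"))
```

Whether you prefer `Element` plus `append` or `SubElement` is mostly a matter of taste; the resulting tree is the same.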
Using this API, I was able to create a 47 MB XML file in a few minutes, burning roughly 300 MB of heap space. This XML file represents a graph of graphs, namely the control flow graph of each function of IDA Pro. Here are some screenshots, using yEd for the visualization part:
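To give an idea of what a "graph of graphs" can look like with this API, here is a minimal sketch. The post does not say which XML layout was used, so everything here is an assumption: the element and attribute names loosely follow GraphML's nested-graph convention (a `graph` inside a `node`), namespaces are left out, and `cfgs_to_xml` and the sample data are made up for illustration.

```python
import io
from xml.etree.ElementTree import Element, SubElement, ElementTree


def cfgs_to_xml(functions):
    """Hypothetical helper.

    functions: dict mapping a function name to a list of (src, dst)
    basic-block edges of that function's control flow graph.
    """
    root = Element("graphml")
    top = SubElement(root, "graph", id="program", edgedefault="directed")
    for fname, edges in sorted(functions.items()):
        # one outer node per function, holding its CFG as a nested graph
        fnode = SubElement(top, "node", id=fname)
        cfg = SubElement(fnode, "graph", id=fname + ":", edgedefault="directed")
        # collect every basic block mentioned by at least one edge
        blocks = sorted({block for edge in edges for block in edge})
        for block in blocks:
            SubElement(cfg, "node", id="%s::%s" % (fname, block))
        for src, dst in edges:
            SubElement(cfg, "edge",
                       source="%s::%s" % (fname, src),
                       target="%s::%s" % (fname, dst))
    return ElementTree(root)


if __name__ == "__main__":
    # made-up sample data, not taken from IDA Pro
    sample = {"main": [("entry", "loop"), ("loop", "loop"), ("loop", "exit")]}
    buf = io.BytesIO()
    cfgs_to_xml(sample).write(buf)
    print(buf.getvalue().decode("ascii"))
```

A file meant for yEd would also need the GraphML namespace and yEd's own layout data, which this sketch skips.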
[UPDATED] In the end, xml.dom.minidom sucks balls: it can burn hundreds of megabytes, even gigabytes, of sweet memory when working with “large” XML files (10 MB or more). See this post for a really lightweight implementation.
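If you want to check the memory claim on your own machine, here is a rough sketch using the standard `tracemalloc` module (Python 3 only). The helper names and the tree shape are assumptions made up for this example, and the numbers will vary with the Python version and the document being built, so treat it as a quick sanity check rather than a benchmark.

```python
import tracemalloc
import xml.dom.minidom
from xml.etree.ElementTree import Element, SubElement


def build_minidom(n):
    """Build <teabag> with n <spam name="i"/> children using minidom."""
    doc = xml.dom.minidom.Document()
    root = doc.createElement("teabag")
    doc.appendChild(root)
    for i in range(n):
        child = doc.createElement("spam")
        child.setAttribute("name", str(i))
        root.appendChild(child)
    return doc


def build_etree(n):
    """Build the same tree using ElementTree."""
    root = Element("teabag")
    for i in range(n):
        SubElement(root, "spam", name=str(i))
    return root


def peak_bytes(builder, n):
    """Peak traced memory, in bytes, while building a tree of n children."""
    tracemalloc.start()
    tree = builder(n)  # keep a reference so the tree stays alive while measuring
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


if __name__ == "__main__":
    for name, builder in (("minidom", build_minidom), ("ElementTree", build_etree)):
        print(name, peak_bytes(builder, 50000))
```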
Here is a quick reference on how to create an XML document and output it in Python, using xml.dom.minidom.
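import xml.dom.minidom
# create the document
doc = xml.dom.minidom.Document()
# populate it with an element
root = doc.createElement("teabag")
doc.appendChild(root)
# time to give some children to the root element, one with an attribute for instance
child1 = doc.createElement("spam")
child1.setAttribute("name", "value")
root.appendChild(child1)
# and another one with some text
child2 = doc.createElement("eggs")
text = doc.createTextNode("spam and eggs!")
child2.appendChild(text)
root.appendChild(child2)
# let's get the output, as a string (the two-space indent is an arbitrary choice)
print(doc.toprettyxml(indent="  "))
# you're supposed to get the following output
# (older Python versions put the text of <eggs> on its own line):
#<?xml version="1.0" ?>
#<teabag>
#  <spam name="value"/>
#  <eggs>spam and eggs!</eggs>
#</teabag>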
How nice is that? Yep, a lot.