Python XML for Real Men Cheat Sheet

Howdy again,

Now let’s do some serious XML output in Python using cElementTree.

# this is included in Python 2.5+
from xml.etree.cElementTree import ElementTree, Element, dump

# let's create the root element
root = Element("teabag")

# give it a child with an attribute
child1 = Element("spam")
child1.attrib["name"] = "value"
root.append(child1)

# and a child with text content
child2 = Element("eggs")
child2.text = "spam and eggs"
root.append(child2)

# print the whole thing to stdout
dump(root)

# or to a file
ElementTree(root).write("teabag.xml")

See the author’s website for downloads and usage information of cElementTree.

Using this API, I was able to create a 47 Mb XML file in a few minutes, burning roughly 300 Mb of heap space. This XML file represents a graph of graphs, namely the control flow graph of each function of IDA Pro. Here are some screenshots, using yEd for the visualization part:

Advertisements

Python XML Cheat Sheet

[UPDATED] Finally xml.dom.minidom sucks balls, it can burn up to hundreds/gigs of megabytes of sweet memory when working with “large” xml files (10 Mb or more). See this post for a really lightweight implementation.

Howdy,

Here is a quick reference of how to create an XML document and output it in Python.

import xml.dom.minidom

# create the document
doc = xml.dom.minidom.Document()

# populate it with an element
root = doc.createElement("teabag")
doc.appendChild(root)

# time to give some children to the root element, one with an attribute for instance
child1 = doc.createElement("spam")
child1.setAttribute("name", "value")
root.appendChild(child1)

# and another one with some text
child2 = doc.createElement("eggs")
text = doc.createTextNode("spam and eggs!")
child2.appendChild(text)
root.appendChild(child2)

# let's get the output, as a string
print doc.toprettyxml()

# you're supposed to get the following output:
#<?xml version="1.0" ?>
#<teabag>
#    <spam name="value"/>
#    <eggs>
#        spam and eggs!
#    </eggs>
#</teabag>

How nice is that ? Yep, a lot.