DOM XML: An Alternative to Expat

By Matt Dunford
on December 27, 2000

Overview: An alternative to expat.

There are many XML tutorials for PHP on the web, but few show how to
parse XML using DOM. I would like to take this opportunity to show
that there is an alternative to the widespread SAX implementation for
PHP programmers.
DOM (Document Object Model) and SAX (Simple API for XML) have
different philosophies on how to parse XML. The SAX engine is
entirely event-driven. When it comes across a tag, it calls an
appropriate function to handle it. This makes SAX very fast and
efficient. However, writing code for it feels like being trapped
inside an eternal loop. Each handler sees only one event at a time,
so you find yourself using many global variables and conditional
statements to keep track of where you are in the document.
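To make the callback style concrete, here is a minimal sketch using PHP's expat-based xml_* functions. The $state array is a hypothetical stand-in for the global variables mentioned above, and the shortened XML string is an assumption made for illustration, not code from the original article.

```php
<?php
// SAX-style sketch: handlers fire as the parser meets each tag.
$xml = '<book type="paperback"><title>Red Nails</title><price>$12.99</price></book>';

// Hypothetical state holder; plays the role of the global variables.
$state = ['current' => null, 'title' => ''];

$parser = xml_parser_create();
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$state) {
        $state['current'] = $name;   // remember which tag we are inside
    },
    function ($parser, $name) use (&$state) {
        $state['current'] = null;    // we have left the tag
    }
);

xml_set_character_data_handler(
    $parser,
    function ($parser, $text) use (&$state) {
        // Character data arrives without context, so consult the saved state.
        if ($state['current'] === 'title') {
            $state['title'] .= $text;
        }
    }
);

xml_parse($parser, $xml, true);

echo $state['title'], "\n";
```

Run against this sample, the script should print the book title. Note that the handler for character data cannot tell which element it belongs to without the saved state, which is the bookkeeping burden described above.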
On the other hand, the DOM method is somewhat memory intensive. It
loads an entire XML document into memory as a hierarchy. The upside
is that all of the data is available to the programmer, organized
much like a family tree. This approach is more intuitive, easier to
use, and produces more readable code.
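By way of contrast, the sketch below reads several values from a tree that is already in memory. It assumes the DOMDocument class from the DOM extension in PHP 5 and later, which is not the set of DOM functions this article was written against; the shortened XML string and the $summary variable are also illustrative assumptions.

```php
<?php
// DOM-style sketch using DOMDocument (an assumption; see the note above).
$xml = '<book type="paperback"><title>Red Nails</title>'
     . '<author><name first="Robert" middle="E" last="Howard"/></author></book>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// The whole tree is in memory, so any part can be read in any order.
$book  = $doc->documentElement;
$name  = $doc->getElementsByTagName('name')->item(0);
$title = $doc->getElementsByTagName('title')->item(0);

$summary = sprintf(
    '%s by %s %s (%s)',
    $title->textContent,
    $name->getAttribute('first'),
    $name->getAttribute('last'),
    $book->getAttribute('type')
);

echo $summary, "\n";
```

No state tracking is needed here: the author's name is read after the title even though the lookup order in the code differs from the order of the elements in the document.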
To use the DOM functions, you must configure PHP with the
'--with-dom' argument, because they are not part of the standard
configuration. Here is a sample compilation.
%> ./configure --with-dom --with-apache=../apache_1.3.12
%> make
%> make install

How DOM structures XML

Because DOM loads an entire XML string or file into memory as a tree,
we can manipulate the data as a whole. To show what XML looks like
as a tree, take this XML document as an example.
<?xml version="1.0"?>

<book type="paperback">
	<title>Red Nails</title>
	<price>$12.99</price>
	<author>
		<name first="Robert" middle="E" last="Howard"/>
		<birthdate>9/21/1977</birthdate>
	</author>
</book>
The data would be structured like this.
DomNode book
	|
	|-->DomNode title
	|		|
	|		|-->DomNode text
	|
	|-->DomNode price
	|		|
	|		|-->DomNode text
	|
	|-->DomNode author
			|
			|-->DomNode name
			|
			|-->DomNode birthdate
					|
					|-->DomNode text
Any text enclosed within tags is really a node in itself. For instance,
“Red Nails” is a child node of title, and “$12.99” is a child node of
price. The name element encloses no text, so it has no child text node.
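The tree above can be reproduced with a short recursive walk. This is only a sketch: it assumes the DOMDocument class from PHP 5 and later, not the DOM functions available when this article was written, and the describe() helper is a hypothetical name introduced for illustration.

```php
<?php
// Tree-walk sketch using DOMDocument (an assumption; see the note above).
$xml = <<<'XML'
<?xml version="1.0"?>
<book type="paperback">
    <title>Red Nails</title>
    <price>$12.99</price>
    <author>
        <name first="Robert" middle="E" last="Howard"/>
        <birthdate>9/21/1977</birthdate>
    </author>
</book>
XML;

// Hypothetical helper: returns one line per node, indented by depth.
function describe(DOMNode $node, int $depth = 0): array
{
    $indent = str_repeat('    ', $depth);
    if ($node->nodeType === XML_TEXT_NODE) {
        return [$indent . 'text: ' . $node->nodeValue];
    }
    $lines = [$indent . $node->nodeName];
    foreach ($node->childNodes as $child) {
        // Skip the whitespace-only text nodes created by indentation.
        if ($child->nodeType === XML_TEXT_NODE && trim($child->nodeValue) === '') {
            continue;
        }
        $lines = array_merge($lines, describe($child, $depth + 1));
    }
    return $lines;
}

$doc = new DOMDocument();
$doc->loadXML($xml);

$lines = describe($doc->documentElement);
echo implode("\n", $lines), "\n";
```

The walk treats text exactly like any other child: it is reached through childNodes and identified by its node type. Whitespace between tags also becomes text nodes, which is why the helper filters out the whitespace-only ones to match the diagram.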
