XML How-To Page 3

By Joe Stump
on September 7, 2000

<?php

function return_page() {
    global $temp;

    echo 'o <A HREF="'.$temp['url'].'">'.$temp['title'].'</A><BR>';
}

// what are we parsing?
$xml_file = 'slashdot.xml';

// declare the character set - UTF-8 is the default
$type = 'UTF-8';

// create our parser
$xml_parser = xml_parser_create($type);

// set some parser options
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, true);
xml_parser_set_option($xml_parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');

// this tells PHP what functions to call when it finds an element
// these functions also handle the element's attributes
xml_set_element_handler($xml_parser, 'startElement', 'endElement');

// this tells PHP what function to use on the character data
xml_set_character_data_handler($xml_parser, 'characterData');

if (!($fp = fopen($xml_file, 'r'))) {
    die("Could not open $xml_file for parsing!\n");
}

// loop through the file and parse baby!
while ($data = fread($fp, 4096)) {
    if (!($data = utf8_encode($data))) {
        echo 'ERROR'."\n";
    }

    // the third argument tells the parser whether this is the last chunk
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d\n\n",
            xml_error_string(xml_get_error_code($xml_parser)),
            xml_get_current_line_number($xml_parser)));
    }
}

xml_parser_free($xml_parser);

?>



Now this is what happens: PHP parses along until it finds an opening tag such as
<ELEMENT ATTRIBUTE='bold'>. It then passes ELEMENT and its attributes to the startElement
function. Slashdot’s file doesn’t have any attributes, so we don’t worry about them, but that
is where they will be if a file does have them. Next, PHP passes the data between the opening
and closing tags to characterData, and finally it passes the name of the closing element to
the endElement function (the end handler receives only the name, not the attributes). The
endElement function is what calls the return_page() function, but only when it sees that we
have hit the end of a story. Up until that point our variable $temp holds the data we have
been collecting in startElement and characterData.
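
To make that flow concrete, here is a minimal sketch of what the three handlers might look like. It is not the how-to's own handler code: the element names STORY, TITLE and URL, the sample XML string, and the $current and $output variables are all assumptions made for illustration. It parses a short in-memory string instead of slashdot.xml, and return_page() collects its lines in $output rather than echoing them directly.

```php
<?php
// Hypothetical handlers for illustration only. The element names
// (STORY, TITLE, URL) and the sample XML below are assumptions.

$temp    = array();  // data collected for the story being parsed
$current = '';       // name of the element we are currently inside
$output  = array();  // lines produced by return_page()

function return_page() {
    global $temp, $output;
    $output[] = 'o <A HREF="'.$temp['url'].'">'.$temp['title'].'</A><BR>';
}

// called for every opening tag; $attrs holds the tag's attributes
function startElement($parser, $name, $attrs) {
    global $temp, $current;
    $current = $name;
    if ($name == 'STORY') {
        // a new story begins, so start with an empty record
        $temp = array('title' => '', 'url' => '');
    }
}

// called for the text between tags; it may arrive in several pieces,
// so we append instead of assigning
function characterData($parser, $data) {
    global $temp, $current;
    if ($current == 'TITLE') {
        $temp['title'] .= $data;
    } elseif ($current == 'URL') {
        $temp['url'] .= $data;
    }
}

// called for every closing tag; only the element name is passed
function endElement($parser, $name) {
    global $current;
    $current = '';
    if ($name == 'STORY') {
        return_page();
    }
}

$xml = '<backslash><story><title>Example</title>'
     . '<url>http://example.com/</url></story></backslash>';

$xml_parser = xml_parser_create('UTF-8');
// case folding upper-cases element names before the handlers see them
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, true);
xml_set_element_handler($xml_parser, 'startElement', 'endElement');
xml_set_character_data_handler($xml_parser, 'characterData');

// the whole document is passed at once, so this is the final chunk
xml_parse($xml_parser, $xml, true);

echo implode("\n", $output), "\n";
?>
```

With the sample string above, this prints a single link line for the one story. Because case folding is switched on, the handlers compare against upper-case names even though the tags in the sample are lower-case.
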
Now all that is left is to put a wget in your cron so that slashdot.xml stays fresh!
— Joe