Introduction
Well, my loyal readers (and you must be loyal if you’ve made
it this far), here we are at the final installment of the
‘ABC’s of PHP’, where we are going to put some of what we’ve
learned over the past couple of months into practice.
We’re going to go step by step through a small script that
reads the latest headlines from Slashdot.org. For those of
you who may be a little young or have never heard of
Slashdot, it’s basically a news aggregation site, but with a
difference: all the news available there is aimed squarely
at geeks and nerds the world over. There are articles on the
latest tech, on what’s going on in the heady world of
corporate I.T., and on the downright bizarre things that
people do.
All we’re going to do is read the XML feed file from
http://slashdot.org/slashdot.xml and then parse the XML data
using regular expressions. If you are going to use this feed,
please take a few minutes to read
http://slashdot.org/faq/code.shtml and learn the rules and
regulations for using the feed. Slashdot is very open about
what you can do with the data, but they do ask that you
respect their wishes to keep server loads to a minimum.
Please note also that there are better ways to work with XML
in PHP: a number of built-in functions detailed in the PHP
manual make this process much easier.
As the feed is very simple, however, I decided to just use
‘preg_xxx’ calls to elaborate on the material from part 9 on
using reg-ex calls. If you were reading anything more
complex, you would almost certainly want to use the proper
XML functions.
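To give a rough idea of what the “proper XML functions” route could look like, here is a minimal sketch using PHP’s SimpleXML extension. It is not the approach used in the rest of this article. The sample XML is made up for illustration, and the element names in it are assumptions rather than a copy of the live Slashdot feed.

```php
<?php
// Hypothetical sample data: the element names are assumed for
// illustration and are not copied from the real feed.
$sample_xml = <<<XML
<backslash>
  <story>
    <title>First headline</title>
    <url>http://example.com/1</url>
  </story>
  <story>
    <title>Second headline</title>
    <url>http://example.com/2</url>
  </story>
</backslash>
XML;

// simplexml_load_string() returns false if the XML cannot be parsed.
$feed = simplexml_load_string($sample_xml);

$titles = array();
if ($feed !== false) {
    foreach ($feed->story as $story) {
        // Child elements are SimpleXMLElement objects, so cast to string.
        $titles[] = (string) $story->title;
    }
}

print_r($titles);
```

For a real feed you would pass in the downloaded XML text instead of the sample string.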
On with the script
The first thing we’re going to want to do is actually load
the XML. This can be achieved by using the PHP ‘file’
function. The file function will load any file from any
supported file or stream type that PHP can handle, and will
then store it in a single array containing one entry for
each line in the file.
NOTE: I’ve deliberately NOT included any error handling
here, so if for some reason the script is not able to
retrieve the feed, you will get an error displayed at this
point.
$file_contents = file("http://slashdot.org/slashdot.xml");
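If you did want some basic error handling, one possible approach is sketched below. It is only a sketch under a few assumptions: it reads a small temporary file (made-up contents) instead of the real feed so that it runs without a network connection, and the fallback of using an empty array is my own choice for the example rather than anything the script requires.

```php
<?php
// Write a small stand-in file so the sketch runs without network access.
// In the article's script the argument to file() is the feed URL instead.
$path = tempnam(sys_get_temp_dir(), 'feed');
file_put_contents($path, "<story>\n<title>Example</title>\n</story>\n");

// The @ suppresses PHP's own warning so the failure can be handled by hand.
$file_contents = @file($path);

if ($file_contents === false) {
    // file() returns false when the source cannot be read.
    $file_contents = array();
    echo "Could not load the feed\n";
}

echo count($file_contents) . " lines loaded\n";

// Tidy up the temporary file.
unlink($path);
```

Note that each array entry still ends with its newline character unless you pass the FILE_IGNORE_NEW_LINES flag as the second argument to file().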
We then set a couple of default values for variables we’ll be using soon:
$in_story = 0;
$storys = array();
The ‘in_story’ variable is used as a flag to let the main
reading loop know when it is inside a pair of
<story> </story> tags in the XML file. The ‘storys’ array,
on the other hand, will hold an array of smaller arrays,
each containing one story described in the XML.
Next we loop over each line in the loaded text array using
‘foreach’. If you remember our discussion on arrays, this
takes each element in the array one at a time, in sequence,
and presents it to the inside of the loop as a single
variable, which in this case will be a single string.
foreach($file_contents as $line)
{
    // the per-line processing described below goes here
}
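As a quick illustration of how ‘foreach’ hands over one string per pass, here is a small sketch. The three sample lines are made up and simply stand in for the array that file() would return; the ‘line_lengths’ array is a hypothetical helper used only for this demonstration.

```php
<?php
// Stand-in for the array that file() would return (made-up sample lines).
$file_contents = array(
    "<story>\n",
    "<title>Example</title>\n",
    "</story>\n",
);

$line_lengths = array();
foreach ($file_contents as $line) {
    // $line is a plain string here; trim() drops the trailing newline.
    $line_lengths[] = strlen(trim($line));
}

print_r($line_lengths);
```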
Inside this loop is where we perform the necessary actions
to extract the information from each story in the XML and
load it into an in-memory array.
The first thing we need to do in this loop is to decide when we are, and when we are not, inside a story. (I’m not going to retype the XML here, but if you load the URL mentioned above into your browser, you’ll clearly see the structure.) This is handled by the “if then” decisions that look like this:
if(preg_match("/<story>/",$line))
{
$in_story = true;
$current_story = array();
}
and
if(preg_match("/<\/story>/",$line))
{
$in_story = false;
$storys[] = $current_story;
}
All they do is this: when a line with “<story>” appears,
we know that we are now in a story block, so we set up a new
empty array to hold the details and set ‘in_story’ to true.
The second one, triggered by “</story>”, resets ‘in_story’
to false and adds the single story array to the bigger array
holding the full list of stories.
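To see the two checks working together, here is a minimal, self-contained sketch of the loop. The sample lines are made up and stand in for the array returned by file(), and the title extraction in the middle is a hypothetical addition of mine so that each story array has something in it. Treat it as an illustration under those assumptions rather than the finished script.

```php
<?php
// Made-up sample lines standing in for the array returned by file().
$file_contents = array(
    "<backslash>\n",
    "<story>\n",
    "<title>First headline</title>\n",
    "</story>\n",
    "<story>\n",
    "<title>Second headline</title>\n",
    "</story>\n",
    "</backslash>\n",
);

$in_story = false;
$storys = array();
$current_story = array();

foreach ($file_contents as $line) {
    if (preg_match("/<story>/", $line)) {
        // Opening tag: start collecting a fresh story.
        $in_story = true;
        $current_story = array();
    }

    // Hypothetical field extraction, only attempted inside a story block.
    if ($in_story && preg_match("/<title>(.*)<\/title>/", $line, $matches)) {
        $current_story['title'] = $matches[1];
    }

    if (preg_match("/<\/story>/", $line)) {
        // Closing tag: store the finished story and leave the block.
        $in_story = false;
        $storys[] = $current_story;
    }
}

print_r($storys);
```

Because the flag is only true between the opening and closing tags, anything outside a story block is ignored by the extraction step.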