Creating an RSS Aggregator with the PHP SimplePie Library

By W. Jason Gilmore
on August 17, 2010

RSS aggregators such as Google Reader provide a great way to quickly peruse the latest updates and other news from websites that you follow regularly. All you need to do is provide the aggregator with each site’s RSS feed location, and the aggregator will retrieve and parse the feed, converting it into a format (HTML in the case of Google Reader) that you can easily peruse.
But what if you want to integrate feeds into your website, or create your own version of an aggregator? Writing custom code that efficiently retrieves and parses the XML that makes up a feed can be difficult and tedious, and the task has grown more complex as feeds have added support for multimedia content such as podcasts. Thankfully, a number of open source libraries can handle RSS retrieval and parsing for you. Many of them also offer advanced features such as feed caching, which reduces bandwidth consumption.
PHP developers are particularly lucky as a fantastic library named SimplePie not only offers the aforementioned features but also supports both RSS and Atom formats, multiple character encodings, and an architecture that makes integration with your favorite content management and blogging platforms a breeze. In this tutorial I’ll introduce you to SimplePie, showing you how easy it is to create a rudimentary custom RSS aggregator using this powerful library.

Installing SimplePie

SimplePie requires PHP 4.3 or newer, in addition to PHP’s PCRE and XML extensions, both of which are enabled by default. Presuming you meet these minimal requirements, browse to SimplePie’s GitHub site and download the latest stable version. Unzip the download and place the resulting directory somewhere within PHP’s include path.
To begin using SimplePie, all you need to do is include the simplepie.inc file within your PHP script, a task that is typically done using PHP’s require_once statement:

require_once('simplepie.inc');

Provided that you have added the SimplePie directory to PHP’s include path, you won’t need to reference the path within the require statement.
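If you cannot edit php.ini, the include path can also be extended at runtime. The following is a minimal sketch; the directory /usr/local/lib/php/simplepie is a hypothetical location chosen for illustration, so substitute wherever you unzipped the library.

```php
<?php
// Hypothetical location of the unzipped SimplePie directory; adjust to your setup.
$simplepieDir = '/usr/local/lib/php/simplepie';

// Append the directory to the include path for the current request only.
set_include_path(get_include_path() . PATH_SEPARATOR . $simplepieDir);

// With the path in place, the bare filename is enough:
// require_once 'simplepie.inc';
```

Because set_include_path() only affects the running script, this call would need to appear before the require_once statement in every script that loads SimplePie.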
Finally, create a directory named cache somewhere within your project directory, and change the directory owner to the server daemon owner and the permissions to 755, which will allow the server to write to it. SimplePie will use this directory to cache the RSS feeds.
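As an alternative to creating the directory by hand, a script can check for it at startup. This is only a sketch: ensure_cache_dir() is a hypothetical helper written for this tutorial, not part of SimplePie, and it assumes the PHP process is allowed to create directories in the target location.

```php
<?php
/**
 * Create the cache directory if it does not exist yet, then report
 * whether the current PHP process can write to it.
 * (Hypothetical helper; not part of SimplePie.)
 */
function ensure_cache_dir($dir)
{
    // 0755 matches the permissions suggested above; the third argument
    // lets mkdir() create any missing parent directories as well.
    if (!is_dir($dir) && !mkdir($dir, 0755, true)) {
        return false;
    }

    return is_writable($dir);
}
```

Calling ensure_cache_dir() with your cache path before initializing the feed gives you a chance to report a permissions problem up front. When the web server itself creates the directory, the server daemon becomes its owner, so no separate ownership change should be needed.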

Retrieving a Feed

To demonstrate SimplePie’s capabilities, let’s retrieve and publish the WJGilmore.com RSS feed in HTML format. Believe it or not, you can retrieve and parse the feed using five simple commands:

01 $feed = new SimplePie();
02 $feed->set_cache_location('/var/www/dev.spiesindc.com/library/cache/');
03 $feed->set_feed_url('http://feeds.feedburner.com/wjgilmorecom');
04 $feed->init();
05 $feed->handle_content_type();

Line 01 instantiates the SimplePie class, exposing the methods we’ll subsequently use to retrieve, parse and render the feed. Line 02 defines the location of the cache directory we created earlier in the tutorial. Line 03 defines the RSS feed we’d like to retrieve. Line 04 retrieves and parses the feed, whether via the cache or by reaching out to the feed’s online location. Finally, line 05 sends the appropriate Content-Type header, including the character encoding, so the browser renders the output correctly.
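In practice you will probably want to control how long cached copies are reused and to notice when a feed fails to load. The sketch below wraps the same calls in a hypothetical helper named init_feed(); it assumes the SimplePie 1.x behavior in which init() returns false on failure and error() returns a message string, and it uses SimplePie’s set_cache_duration() method, which takes a number of seconds.

```php
<?php
/**
 * Configure and initialize a SimplePie-style feed object.
 * Returns null on success or an error message string on failure.
 * (Hypothetical helper; $feed is expected to be a SimplePie instance.)
 */
function init_feed($feed, $url, $cacheDir, $cacheSeconds = 3600)
{
    $feed->set_feed_url($url);
    $feed->set_cache_location($cacheDir);
    // Reuse a cached copy for this many seconds before refetching.
    $feed->set_cache_duration($cacheSeconds);

    // Assumption: init() returns false when the feed cannot be
    // retrieved or parsed, and error() describes what went wrong.
    if (!$feed->init()) {
        return 'Unable to load feed: ' . $feed->error();
    }

    return null;
}
```

With SimplePie loaded, a call such as init_feed(new SimplePie(), 'http://feeds.feedburner.com/wjgilmorecom', '/var/www/dev.spiesindc.com/library/cache/') would return null when the feed is ready to use, or a message you can log or display otherwise.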
When the feed has been retrieved and parsed, you can use a number of methods to access the feed data, including the feed’s title and description and each item’s title and publication date. For example, the following loop prints every item’s title as a link to the original post:

foreach ($feed->get_items() as $item) {
  // Link each item's title to its permalink, one item per line
  $permalink = $item->get_permalink();
  $title = $item->get_title();
  echo "<a href=\"{$permalink}\">{$title}</a><br />";
}

Executing this example produces the output presented in Figure 1.

Figure 1. Rendering a Feed’s Items to a Web Page
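The other accessors mentioned earlier (feed title, description, and item publication date) can be combined in much the same way. The sketch below uses SimplePie’s get_title(), get_description(), get_items(), get_permalink(), and get_date() methods; the render_feed() and escape_html() functions themselves are hypothetical helpers, and the date format and item limit are arbitrary choices made for illustration.

```php
<?php
/**
 * Escape text for HTML output. The final argument (double_encode = false)
 * leaves entities that may already be present in the feed data untouched,
 * so they are not escaped a second time.
 */
function escape_html($text)
{
    return htmlspecialchars((string) $text, ENT_QUOTES, 'UTF-8', false);
}

/**
 * Build an HTML fragment containing the feed title, its description,
 * and up to $limit items, each shown as a linked title with its date.
 * (Hypothetical helper; $feed is expected to be an initialized SimplePie object.)
 */
function render_feed($feed, $limit = 5)
{
    $html  = '<h2>' . escape_html($feed->get_title()) . "</h2>\n";
    $html .= '<p>' . escape_html($feed->get_description()) . "</p>\n";
    $html .= "<ul>\n";

    // get_items(0, $limit) returns at most $limit items, starting with the first.
    foreach ($feed->get_items(0, $limit) as $item) {
        $html .= '<li><a href="' . escape_html($item->get_permalink()) . '">'
               . escape_html($item->get_title()) . '</a> ('
               . escape_html($item->get_date('F j, Y')) . ")</li>\n";
    }

    $html .= "</ul>\n";

    return $html;
}
```

Once the feed has been initialized, echo render_feed($feed, 10); would print a heading, a short description, and a bulleted list of the ten most recent items. Returning a string rather than echoing directly makes the fragment easy to drop into an existing page template.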