In October and November, I wrote a two-part tutorial on reading RSS feeds in PHP. I wrote it partly to introduce RSS to PHP coders who hadn’t yet worked with it, and partly out of frustration at not finding an aggregator that did what I wanted. This month, I introduce an RSS/Atom aggregator that, so far, does just what I need. It’s called Gregarius, which I originally thought was a misspelling of gregarious, which means seeking and enjoying the company of others. I later found that it is actually from the Latin root word meaning belonging to a herd or flock. It’s based upon the MagpieRSS parser. My main need was to create a number of web pages that I could visit, each containing a combined list of items from the sites and blogs that I’m interested in, ordered from newest to oldest. Gregarius achieves this easily, and comes with a neat interface, great documentation and a good (and growing) set of additional features and plugins. It will appeal both to novice users and to more experienced developers who are looking for something to run on their own server and possibly extend, rather than the usual web-based aggregators.
Downloading and Installing Gregarius
To run Gregarius, you need PHP (a version above 4.3 is recommended), an Apache server, and either MySQL or SQLite as a database. It’s available for download from SourceForge. Once you’ve downloaded the archive and placed it in your web tree, you need to unpack or unzip it.
tar xvfz rss-0.5.2a.tar.gz
or
unzip rss-0.5.2a.zip
This creates a directory called rss, which can be renamed if desired.
Next, you need to configure it. The configuration file, dbinit.php, is extremely simple and contains only database access information. Stripped of comments, it appears as follows.
<?php
define ('DBTYPE','mysql');
//define ('DBTYPE',"sqlite");
define ('DBNAME','database_name');
define ('DBUNAME','database_username');
define ('DBPASS', 'database_password');
define ('DBSERVER', 'localhost');
//define ('DB_TABLE_PREFIX','');
?>
It’s easiest to copy the sample file and then edit the copy.
cp dbinit.php.sample dbinit.php
Simply choose between MySQL and SQLite, and then enter the database name, username, password and hostname. Next, ensure that the database has been created and the correct permissions granted (Gregarius needs to be able to SELECT, INSERT, UPDATE, ALTER and CREATE tables). Finally, point your browser to the rss directory (or to whatever you renamed it). If all goes well, the tables should be created and populated, and Gregarius will be ready for use.
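If the database and its permissions still need to be set up, the step can be sketched as follows. This is only an illustration, assuming MySQL, an account that already exists, and trusted placeholder names; the helper buildSetupSql is hypothetical and is not part of Gregarius. It merely assembles the statements and prints them, so you can review them before running them in your MySQL client.

```php
<?php
// Hypothetical helper (not part of Gregarius): builds the SQL an administrator
// might run before the first visit. It only assembles strings and never
// connects to a database. The names are assumed to be trusted, not user input.
function buildSetupSql(string $dbName, string $dbUser, string $dbHost): array
{
    // The privileges the article lists as required by Gregarius.
    $privileges = ['SELECT', 'INSERT', 'UPDATE', 'ALTER', 'CREATE'];

    return [
        sprintf('CREATE DATABASE IF NOT EXISTS `%s`;', $dbName),
        sprintf(
            "GRANT %s ON `%s`.* TO '%s'@'%s';",
            implode(', ', $privileges),
            $dbName,
            $dbUser,
            $dbHost
        ),
    ];
}

// Placeholder values matching the sample configuration file.
foreach (buildSetupSql('database_name', 'database_username', 'localhost') as $statement) {
    echo $statement, "\n";
}
```

The values passed in should match DBNAME, DBUNAME and DBSERVER in dbinit.php.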
Now’s the time to add the feeds you want to view. If you’re migrating from an existing aggregator, you can simply import its OPML file into Gregarius (it’s under Admin/OPML, see screenshot). If you’re starting from scratch, you’ll first need to add any folders you want (Admin/Folders, see screenshot). Folders allow you to group feeds in a way that makes sense to you. Each feed is assigned to one folder. To add a feed, simply click on Admin and then Feeds, and enter the URL as well as the folder to which you’re assigning the feed (screenshot). You don’t even have to know the specific location of the feed. If you just enter the website URL, Gregarius attempts to find the location of the feed. If it finds more than one, it offers you the choice.
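To give an idea of how a feed can be found from a plain website URL, here is a minimal sketch of the common autodiscovery convention, in which a page advertises its feeds in link tags in the HTML head. I’m assuming this convention for illustration only; I haven’t checked how Gregarius itself does the lookup, and the function findFeedLinks and the example.com URLs are hypothetical. The sketch uses current PHP with the DOM extension.

```php
<?php
// Hypothetical helper: returns the feed URLs a page advertises in <link> tags.
function findFeedLinks(string $html): array
{
    $feedTypes = ['application/rss+xml', 'application/atom+xml'];

    // Collect parser warnings internally so sloppy real-world HTML stays quiet.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $feeds = [];
    foreach ($doc->getElementsByTagName('link') as $link) {
        $rel = strtolower($link->getAttribute('rel'));
        $type = strtolower($link->getAttribute('type'));
        // Simplification: only an exact rel="alternate" is matched.
        if ($rel === 'alternate' && in_array($type, $feedTypes, true)) {
            $feeds[] = $link->getAttribute('href');
        }
    }
    return $feeds;
}

// A made-up page that advertises two feeds and one stylesheet.
$page = '<html><head>'
    . '<link rel="alternate" type="application/rss+xml" href="http://example.com/rss.xml">'
    . '<link rel="alternate" type="application/atom+xml" href="http://example.com/atom.xml">'
    . '<link rel="stylesheet" type="text/css" href="style.css">'
    . '</head><body></body></html>';

print_r(findFeedLinks($page));
```

A page like this one would yield two candidates, which is the situation in which an aggregator has to offer you a choice.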
After you’ve created some feeds and you attempt to view them (below), you may find that you are getting 404 errors. The most likely reason for this is that your server is not running the Apache mod_rewrite module, so the short URLs Gregarius uses by default are not being accepted. Simply go to Admin/Config and change the rss.output.usemodrewrite setting (second from the bottom) to false (screenshot).
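If you’d rather check for mod_rewrite than guess, a small script along these lines may help. It is a sketch only: the helper modRewriteLoaded is hypothetical, and PHP’s apache_get_modules() is only available when PHP runs as an Apache module, so under CGI or on the command line the script can’t tell either way.

```php
<?php
// Hypothetical helper: returns true or false when the loaded Apache modules
// can be listed, and null when they cannot (for example under CGI or the CLI).
function modRewriteLoaded(): ?bool
{
    if (!function_exists('apache_get_modules')) {
        return null;
    }
    return in_array('mod_rewrite', apache_get_modules(), true);
}

$status = modRewriteLoaded();
if ($status === null) {
    echo "Cannot tell: PHP is not running as an Apache module.\n";
} elseif ($status) {
    echo "mod_rewrite is loaded; short URLs may work.\n";
} else {
    echo "mod_rewrite is missing; set rss.output.usemodrewrite to false.\n";
}
```

Even with the module loaded, the server configuration still has to allow rewrite rules for the rss directory, so treat a positive answer as necessary rather than sufficient.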
Clicking on Home, or going to the root directory of the installation, you’ll see two main content areas. On the left is a list of all feeds, ordered by folder. On the right is a list of the individual items in the feeds (screenshot). Within a folder, you can change the order of the feeds by going to Admin/Feeds and clicking the up or down arrows. Similarly, you change the folder order by doing the same in Admin/Folders. You can view content by individual feed (clicking on the feed), by folder (clicking on the folder name), or as one long list of all the most recent content (the default, which you can change). An alternative way of viewing (available from version 0.5.2) is by using categories. You can assign each feed to multiple categories by editing the feed and adding the categories, separated by a space, in the clearly marked input box (see screenshot). This allows more flexibility in ordering than is possible with folders.
All feeds will be updated when you click Refresh (screenshot). Feeds will also be updated automatically if the rss.config.refreshafter option (under Admin/Config) is set to a time period in minutes. By default it’s set to 45 minutes. If you have many feeds, a refresh may be a bit time-consuming.
Administration and Configuration
You perform all administration from the admin section. Note that by default, this part of the site is available without any password protection. If your installation is publicly available, you’re likely to want to protect this section of the site. The easiest way to do this is by adding a file called rss_extra.php in the root directory of your installation (rss by default), containing the code (with of course a username and password of your own choice):
<?php
define ('ADMIN_USERNAME', 'username' );
define ('ADMIN_PASSWORD', 'password' );
?>
If this doesn’t work (in some setups it may bar access altogether), you’ll need to handle the authentication yourself. Authentication is beyond the scope of this article, but you can read the Apache authentication documentation for more information.
The following are the configuration variables present in Gregarius 0.5.2, their defaults, and brief descriptions of what they each do. Click on Admin/Config and edit the variable you want to change (screenshot).
|Variable||Default||Description|
|rss.config.absoluteordering||true||Allows you to order folders and channels (by clicking on the up or down arrows). If set to false, channels are organized by title.|
|rss.config.autologout||false||Logs you out when you close the browser window, by removing admin cookie.|
|rss.config.datedesc.read||true||Displays newer read items first (older items are shown first if this is set to false).|
|rss.config.datedesc.unread||true||Displays newer unread items first (older items are shown first if this is set to false).|
|rss.config.dateformat||F jS, Y, g:ia T (January 1st, 2006, 1:01am SAST)||The date format, using the standard PHP date format (read more in the PHP documentation). You’ll want to keep the “F” (month) and “jS” (day) elements so as to maintain the day and month archives, which depend upon these.|
|rss.config.feedgrouping||false||Groups unread items by feed, and the feeds by the rss.config.absoluteordering setting. When false, items are sorted by date instead.|
|rss.config.markreadonupdate||false||When updating, all old unread items are marked as read if new unread items are found.|
|rss.config.plugins||Url filter v1.4, Rounded Corners v0.1||Displays the currently active plugins. When editing this setting, a list of all available plugins is displayed. Check those you want active. You can download more from the Plugin Repository.|
|rss.config.publictagging||false||Allows all visitors to your site to tag items. When set to false, only the Administrator can do so.|
|rss.config.rating||true||Meant to enable item ratings, but this feature was removed shortly before the release of 0.5.2, leaving only the configuration option. This feature will most probably appear in a future version.|
|rss.config.refreshafter||45||When your browser window is open to a Gregarius page, it automatically updates feeds after this many minutes of inactivity. Set to 0 to turn this option off. Don’t reduce this too far below the default, or you risk getting banned by feed providers.|
|rss.config.robotsmeta||index,follow||Instructions for search engine spiders when they crawl the site, given through the robots meta tag.|
|rss.config.serverpush||true||Uses server push when updating. Only supported by Mozilla and Opera browsers, which will be autodetected.|
|rss.config.showdevloglink||false||Only really useful on the main Gregarius site, this shows a link to the Gregarius devlog.|
|rss.config.tzoffset||0||Timezone offset, in hours, between the server and your local time. Can range from -12 to 12.|
|rss.input.allowed||<a href="…" title="…"> <b> <blockquote> <br> <code> <del> <em> <i> <img src="…" alt="…"> <ins> <li> <ol> <p> <pre> <sup> <table> <td> <th> <tr> <tt> <ul>||Any tags not explicitly listed here are filtered out when importing new items.|
|rss.input.allowupdates||true||When refreshing, looks in existing items for updates.|
|rss.meta.debug||false||Enables a more verbose set of error reporting and debug info when in debug mode.|
|rss.output.cachecontrol||false||Gregarius will check whether to get a fresh document or not.|
|rss.output.cachedir||/tmp/magpierss||Where to store temporary files. Apache needs to be able to write to this directory.|
|rss.output.channelcollapse||true||Allows channel collapsing on the main page.|
|rss.output.compression||true||Turns output compression on, allowing most browsers to download smaller pages.|
|rss.output.encoding||UTF-8||The output encoding for the PHP XML parser.|
|rss.output.itemsinchannelview||10||Number of read items shown for a single channel.|
|rss.output.lang||English||Which language pack to use. Currently English, German, Danish, Spanish, French, Italian, Portuguese, Russian and Swedish are supported, but, as with most open source projects, this could expand quickly.|
|rss.output.noreaditems||false||Displays only unread items on the front page. This option no longer appears in the development version 0.5.4.|
|rss.output.numitemsonpage||100||The maximum number of items displayed on the main page. You can set this to 0 to have no limit. Removed in development version 0.5.4.|
|rss.output.showfavicons||true||Displays favicons if the feed has one.|
|rss.output.showfeedmeta||false||Displays each feed’s meta-information in the feed side-column.|
|rss.output.theme||default||Which theme to use. You can download more from the Gregarius Themes Repository. Removed in development version 0.5.4.|
|rss.output.titleunreadcnt||false||Displays the unread count in the document title.|
|rss.output.usemodrewrite||true||Uses short URLs. Requires Apache’s mod_rewrite module. Set this to false if you’re getting 404s when clicking on feeds.|
|rss.output.usepermalinks||true||Allows direct linking to an item (and displays a permalink icon).|
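To see what the default rss.config.dateformat produces, and what an hour-based offset such as rss.config.tzoffset amounts to, here is a short sketch in current PHP. The helper applyTzOffset is hypothetical: it shows one plausible way of applying a whole-hour offset, and I’m not claiming it is how Gregarius does it internally. The dates are made-up sample values.

```php
<?php
// The default rss.config.dateformat value from the table above.
$format = 'F jS, Y, g:ia T';

// Hypothetical helper: shifts a Unix timestamp by a whole number of hours,
// one plausible way to apply an offset such as rss.config.tzoffset.
function applyTzOffset(int $timestamp, int $offsetHours): int
{
    return $timestamp + $offsetHours * 3600;
}

// A sample server time in UTC, shifted two hours ahead.
$serverTime = new DateTimeImmutable('2005-12-31 23:01:00', new DateTimeZone('UTC'));
$localTimestamp = applyTzOffset($serverTime->getTimestamp(), 2);

// gmdate() formats without a further time zone shift, so this shows the
// effect of the offset alone.
echo gmdate('F jS, Y, g:ia', $localTimestamp), "\n";

// The T element prints the abbreviation of the time zone being formatted.
$johannesburg = new DateTimeImmutable(
    '2006-01-01 01:01:00',
    new DateTimeZone('Africa/Johannesburg')
);
echo $johannesburg->format($format), "\n";
```

Note how the “F” and “jS” elements produce the month name and the day with its suffix, which are the parts the day and month archives depend upon.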
Using Gregarius has been a pleasant experience all around. Documentation is good, the code is easy to extend, and it comes with most of the features I want, as well as a well-designed front end. If you’re looking to run your own PHP-based feed aggregator, give this one a try.