Downloading and Parsing Gmail Messages in PHP

By Rose Kelleher
on August 31, 2010

Some friends of mine publish a literary journal that accepts submissions via email. At their request I wrote a script to download messages from the journal’s Gmail account and do some simple parsing tasks. Most of the submissions are made using an HTML form and a corresponding mailer script on their website, so I knew the precise format of the incoming messages (see Figure 1). What I didn’t know was how to access Gmail in PHP.




Figure 1. Format of Incoming Submission Messages

After wasting some time fussing with libgmailer, an unofficial Gmail API, I had a “D’oh!” moment: I realized that all I really needed were the IMAP functions built into PHP 4 and up. In this article I will demonstrate how to use the PHP IMAP functions to download and parse Gmail messages.

The Setup

For the IMAP functions to work, the IMAP extension needs to be installed on your server. You can check the installation status with a call to phpinfo(), which prints information about your server’s PHP setup; if the extension is present, the output includes a section named imap.
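If you would rather test for the extension from a script than read through the phpinfo() output, a minimal sketch along these lines should work. The function name imapIsAvailable() is made up for this illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: reports whether the IMAP extension is loaded
// in the PHP runtime that executes this script.
function imapIsAvailable(): bool
{
    return extension_loaded('imap') && function_exists('imap_open');
}

if (imapIsAvailable()) {
    echo "The IMAP extension is available.\n";
} else {
    echo "The IMAP extension is not loaded; install or enable it first.\n";
}
```

Keep in mind that the command-line and web server builds of PHP can load different extensions, so run the check in the environment where the download script will run.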
To access Gmail messages, you must also enable IMAP for your Gmail account:
  1. Log into Gmail and select Settings.
  2. Select the Forwarding and POP/IMAP tab.
  3. Select Enable IMAP and save your changes.

The PHP IMAP Functions

Now for the code. The following IMAP functions are used in this example:
  • imap_open()
  • imap_num_msg()
  • imap_headerinfo()
  • imap_body()
  • imap_close()
The following sections discuss each of these functions.
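As a rough preview of how the five calls fit together, here is a sketch that opens a mailbox, loops over its messages, and closes the connection. The function name summarizeMailbox() and its output format are assumptions made for this illustration, not part of the original script.

```php
<?php
// Hypothetical helper (not from the original script): prints one summary
// line per message and returns how many messages the mailbox holds.
function summarizeMailbox(string $mailbox, string $username, string $password): int
{
    $mbox = imap_open($mailbox, $username, $password);
    if ($mbox === false) {
        throw new RuntimeException('Could not open mailbox: ' . imap_last_error());
    }

    $count = (int) imap_num_msg($mbox);     // number of messages in the mailbox
    for ($i = 1; $i <= $count; $i++) {      // message numbers start at 1
        $header  = imap_headerinfo($mbox, $i);
        $body    = imap_body($mbox, $i);
        $subject = ($header !== false && isset($header->subject)) ? $header->subject : '(no subject)';
        $length  = ($body === false) ? 0 : strlen($body);
        echo $i . ': ' . $subject . ' (' . $length . " bytes)\n";
    }

    imap_close($mbox);
    return $count;
}
```

Nothing runs until you call summarizeMailbox() with a mailbox string, a username, and a password, so the definition can sit safely in an include file.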

imap_open()

This function opens an IMAP stream to a mailbox.

   $mbox = imap_open($mailbox, $username, $password);

The $username and $password parameters are the username and password for the account. For a Gmail account, the username is the user’s full email address, not just the part before the @ sign.
What’s not immediately obvious is what to use for the $mailbox parameter. It’s a string parameter with a special format. I also had a special requirement. In Gmail, you use labels to group related emails rather than filing them in folders. I wanted to access only those messages with the label “Current Batch” (see Figure 2).




Figure 2. Gmail Labels for Grouping Related Emails

The value of $mailbox is set as follows:

   $mailbox = '{imap.gmail.com:993/ssl/novalidate-cert}Current Batch';

The first part of the string consists of the server information inside curly braces. (To get the server name and port number, I clicked on the Configuration instructions link on the same Gmail screen where I enabled IMAP, and then clicked on Other.)
Gmail requires an SSL connection for IMAP access, so you also need to include the /ssl flag after the port number. The /novalidate-cert flag tells the function to skip validation of the server’s SSL certificate, which is mainly needed for servers that use self-signed certificates. There are several other flags you can specify, such as /readonly for a read-only connection. (See the imap_open() page of the PHP manual on php.net for the full list.)
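To make the layout of the server part concrete, the sketch below assembles it from a host, a port, and a list of flags. The helper name buildServerSpec() is hypothetical; only the resulting string format comes from the text above.

```php
<?php
// Hypothetical helper: builds the "{host:port/flag/flag}" part of the
// mailbox string that imap_open() expects.
function buildServerSpec(string $host, int $port, array $flags): string
{
    $spec = $host . ':' . $port;
    foreach ($flags as $flag) {
        $spec .= '/' . ltrim($flag, '/');   // accept "ssl" as well as "/ssl"
    }
    return '{' . $spec . '}';
}

// The server part used in this article, followed by the mailbox name.
echo buildServerSpec('imap.gmail.com', 993, ['ssl', 'novalidate-cert']) . 'Current Batch', "\n";
// {imap.gmail.com:993/ssl/novalidate-cert}Current Batch

// A read-only variant.
echo buildServerSpec('imap.gmail.com', 993, ['ssl', 'readonly']) . 'INBOX', "\n";
// {imap.gmail.com:993/ssl/readonly}INBOX
```

Building the string this way is optional; writing the literal by hand works just as well when the flags never change.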
After the curly braces, you can optionally specify the mailbox name. The default is INBOX. Gmail exposes each label as an IMAP mailbox, so you can also use a label name such as “Current Batch” to open only the messages that carry that label.
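If you are not sure which mailbox names your account exposes, PHP’s imap_list() function can enumerate them on an open connection. The original script did not need this step, so treat the following as an optional sketch; the helper names are hypothetical, and it assumes the names come back prefixed with the server part that was passed in.

```php
<?php
// Hypothetical helper: removes the "{server}" prefix from each name.
function stripServerPrefix(array $fullNames, string $serverSpec): array
{
    $names = [];
    foreach ($fullNames as $fullName) {
        if (strpos($fullName, $serverSpec) === 0) {
            $fullName = substr($fullName, strlen($serverSpec));
        }
        $names[] = $fullName;
    }
    return $names;
}

// Hypothetical helper: returns the mailbox names visible on an open
// connection, or an empty array if the listing fails.
function listMailboxNames($mbox, string $serverSpec): array
{
    $fullNames = imap_list($mbox, $serverSpec, '*');
    return ($fullNames === false) ? [] : stripServerPrefix($fullNames, $serverSpec);
}

// Offline demonstration with made-up input.
print_r(stripServerPrefix(
    ['{imap.gmail.com:993/ssl}INBOX', '{imap.gmail.com:993/ssl}Current Batch'],
    '{imap.gmail.com:993/ssl}'
));
```

With a live connection you would call listMailboxNames($mbox, '{imap.gmail.com:993/ssl}') and look for the label you want in the result.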