
Advanced String Processing – How Regular Are Your Expressions Page 3

By PHP Builder Staff
on May 19, 2009

One more example:


$text = "the letter a is a vowel";
$regex = "/the\sletter\s[aeiou]\sis\sa\svowel/i";

This reads:


Search for "the letter " followed by exactly one of the letters a, e, i, o, u and no other,
followed by " is a vowel". The trailing i makes the whole match case-insensitive.

On a positive match, $matches[0] will hold "the letter a is a vowel". There will be no other entries in $matches, as there are no bracketed sections in the pattern.

In case you're wondering, \s is a special sequence called a metacharacter, and it means anything classed as white space.
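
As a minimal sketch of how the pattern above might be run with preg_match (the variable names $regex and $found are illustrative choices, not something the article prescribes):

```php
<?php
// Illustrative sketch: $regex and $found are assumed names.
$text  = "the letter a is a vowel";
// Single quotes keep the backslashes exactly as typed.
$regex = '/the\sletter\s[aeiou]\sis\sa\svowel/i';

$matches = [];
$found = preg_match($regex, $text, $matches);

echo $found, "\n";          // 1 on a match, 0 otherwise
echo $matches[0], "\n";     // the full text that matched
echo count($matches), "\n"; // 1, because the pattern has no bracketed groups
```

Changing the "a" in $text to a consonant such as "b" should make preg_match return 0 and leave $matches empty.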
The * symbol is also a metacharacter and means match "0 or more occurrences" of whatever comes immediately before it, e.g.:


A.*

Here the . matches any single character, so the above example will match an 'A' followed by the rest of the line. The ^ and $ metacharacters mean the start and end of the text, so:


^A.*

will match all the text in a phrase, as long as it starts with an 'A' right at the beginning. This is different from the previous example, which will match on the first 'A' it encounters anywhere in the text and then match the rest of the line. That brings us to my next point.
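
The difference between the anchored and unanchored forms can be seen in a short sketch. It assumes the patterns /A.*/ and /^A.*/ (with . standing for any character) and a made-up sample sentence:

```php
<?php
// Hypothetical sample text; the capital 'A' is not the first character.
$text = "Say Alan went home";

// Unanchored: the match starts at the first 'A' found anywhere in the text.
preg_match('/A.*/', $text, $loose);
echo $loose[0], "\n";

// Anchored with ^: the text itself must begin with 'A', so this fails here.
$anchored = preg_match('/^A.*/', $text);
echo $anchored, "\n";

// The same anchored pattern succeeds once 'A' is the very first character.
$anchoredOk = preg_match('/^A.*/', "Alan went home");
echo $anchoredOk, "\n";
```

The first call picks up "Alan went home" from the middle of the text, the second returns 0, and the third returns 1.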
Regular expressions are greedy. They will try to match the largest amount of text possible, which is why you should only use * when it's really necessary. If you can, always try to narrow your search as much as possible, e.g.:


"Alan went to meet marsha"

To get the word ‘Alan’ use an expression of:


/^A.*\swent/
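
To make the greedy behaviour visible, here is a rough sketch. The sample sentence is extended with a second "went" purely for illustration, and the lazy .*? form and the bracketed capture group are additions that the article does not cover at this point:

```php
<?php
// Hypothetical text with two occurrences of " went".
$text = "Alan went to meet marsha, then went home";

// Greedy: .* grabs as much as it can, so the match runs to the LAST " went".
preg_match('/^A.*\swent/', $text, $greedy);
echo $greedy[0], "\n"; // Alan went to meet marsha, then went

// Lazy: .*? takes as little as it can, so the match stops at the FIRST " went".
preg_match('/^A.*?\swent/', $text, $lazy);
echo $lazy[0], "\n";   // Alan went

// A bracketed group pulls out only the word; \S* means "non-space characters".
preg_match('/^(A\S*)\swent/', $text, $word);
echo $word[1], "\n";   // Alan
```

Note that $matches[0] always holds everything the pattern matched, so a bracketed section is what isolates "Alan" on its own.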

Or use the counting metacharacters:


/^A.{3}\s.+/

What this expression says is: look for a 4-character word beginning with 'A' right at the beginning of the line, followed by a white space character and at least 1 more character. The {3} means 3 characters of any description, and only 3, which together with the leading 'A' makes 4. It's also possible to specify ranges. Take a look at this example:


/^A.{1,4}/

This example would specify an 'A' followed by between 1 and 4 characters: no fewer than 1 and no more than 4. Note that, with nothing after it, the pattern will still match the start of a longer line. And this snippet:


/^A.{4,}/

This code would mean an ‘A’ at the
beginning followed by at least 4 characters, possibly more.
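
The three count forms can be compared side by side in a small sketch. The word list is made up, and a trailing $ has been added to each pattern (an assumption, not part of the article's patterns) so that the upper limits actually show up in the results:

```php
<?php
// Hypothetical sample words of different lengths.
$words = ["A", "Al", "Alan", "Alexander"];

// Single-quoted so that $ and the braces reach the regex engine untouched.
$patterns = [
    '/^A.{3}$/',   // 'A' plus exactly 3 more characters
    '/^A.{1,4}$/', // 'A' plus between 1 and 4 more characters
    '/^A.{4,}$/',  // 'A' plus at least 4 more characters
];

$results = [];
foreach ($patterns as $pattern) {
    // preg_grep keeps the entries that match; array_values renumbers them.
    $results[$pattern] = array_values(preg_grep($pattern, $words));
    echo $pattern, ' => ', implode(', ', $results[$pattern]), "\n";
}
```

With this list, the first pattern keeps only "Alan", the second keeps "Al" and "Alan", and the third keeps only "Alexander".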
You can also combine these counts with other rules. You are not limited to '.', '*' or '+'; you can use a character class like this:


/^A[aeiou]{4,}/

This would match on a line
beginning with ‘A‘ and at least 4 of any of the
characters in the square brackets in any order, but only the
characters in the square brackets.
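
A quick way to check this is to run the pattern over a few lines. The sample lines below are invented for illustration:

```php
<?php
$pattern = '/^A[aeiou]{4,}/';

// Hypothetical sample lines.
$lines = ["Aeiou", "Aaaaah!", "Aeixou", "Alan"];

$matched = [];
foreach ($lines as $line) {
    // preg_match returns 1 when the pattern is found, 0 when it is not.
    $matched[$line] = (preg_match($pattern, $line) === 1);
    echo $line, ': ', ($matched[$line] ? 'match' : 'no match'), "\n";
}
```

"Aeiou" and "Aaaaah!" match because at least 4 lowercase vowels follow the 'A'. "Aeixou" fails because the 'x' interrupts the run after only 2, and "Alan" fails at the 'l'.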

Summary

We’ve really only just scratched the surface of regular expressions; it’s a huge subject on which many books have been written. I urge you to read more about them, and you can always look to the PHP manual. The expressions section is at https://phpbuilder.com/manual/en/language.expressions.php

Next time will be the final part in our series, in which we wrap up and look at some practical examples of what we’ve learned so far.

It’s also your chance to tell me what you’d like to cover. If there is a particular thing you’ve been trying to do, or a technique you’re not sure how to make work, then please leave a comment using the form at the bottom of this page.

Between now and the final article, I’ll be checking these comments, and I’ll use them as a basis for what I put in the last article. Please note, however, that I’m not going to complete your project or your homework assignment for you, so please don’t post things like “please show me how to make a project that does xxxx”. All I’m looking for are real-world ideas based on common scenarios that you are currently learning.

Until next time
May your expressions remain regular
Shawty