So What Else Can Regular Expressions Do?
The best way for me to describe
that is to show you a few examples:
Let’s say we have the string:
"Long live PHP Builder in 2009"
We can find and extract the 2009 using:
^.*(\d\d\d\d)$
If we use this in PHP with the preg_match function:
$found = preg_match("/^.*(\d\d\d\d)$/", "Long live PHP Builder in 2009", $matches);
$found will be 1 (which PHP treats as true) if the text provided has 4 digits at the end of the string.
The / at either end of the pattern is the delimiter that tells the regular expression engine where the
search pattern starts and finishes (more on that in just a moment). If a match is found,
the array $matches will contain the following:
$matches[0] = "Long live PHP Builder in 2009"
$matches[1] = "2009"
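As a rough sketch of how the call above might look in a complete script (the variable names $text and $pattern are my own, not part of the original example), the return value can be checked before reading $matches:

```php
<?php
// The sample string and pattern from the example above.
// Single quotes keep the backslashes in \d exactly as written.
$text = "Long live PHP Builder in 2009";
$pattern = '/^.*(\d\d\d\d)$/';

// preg_match() returns 1 on a match, 0 on no match, and false on error.
$found = preg_match($pattern, $text, $matches);

if ($found === 1) {
    echo "Whole match: {$matches[0]}\n"; // the entire matched string
    echo "Year: {$matches[1]}\n";        // the 4 digits captured by ()
} else {
    echo "No 4-digit number at the end of the string.\n";
}
```

Checking for exactly 1 is a deliberate choice here: 0 means "no match" and false means the pattern itself failed, and a loose truthiness test treats both the same way.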
Here’s how the reg-ex pattern reads:
^ = at the start of the line
. = Read any character
* = for as many as you can, until
\d\d\d\d = you encounter 4 digits in a row
$ = at the end of the string
() = keep the part of the text matched by the rule between the parentheses as a separate result, in this case
the 4 digits.
Or, in English: look for 4
consecutive digits that occur at the end of the string, and
retrieve them.
Here’s another one:
$text = "Peter Shaw";
$regex = "/(Peter)\s(Sh(aw|ore))/";
I won’t repeat the preg_match line this time.
The rule here says: return the
word “Peter” before the space and, after the space, match the next word if
it’s “Shaw” or a common misspelling, “Shore”. The pattern
reads:
\s = look for the first whitespace character (here, the space) you encounter with
Peter = on the left side of it and
Sh = on the right side, followed by either
(aw|ore) = 'aw' OR 'ore'
In all cases keep the 2 found words.
The result in $matches will be:
$matches[0] = "Peter Shaw" (or "Peter Shore")
$matches[1] = "Peter"
$matches[2] = "Shaw" (or "Shore")
$matches[3] = "aw" (or "ore")
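Since the preg_match line was skipped for this example, here is one way it could be run. This is only a sketch: the $names list (including the non-matching "Peter Smith") and the $results array are hypothetical additions used to show both spellings and a failure case.

```php
<?php
// Pattern from the example above; \s matches the space between the words.
$regex = '/(Peter)\s(Sh(aw|ore))/';

// Hypothetical test inputs: two that should match and one that should not.
$names = ["Peter Shaw", "Peter Shore", "Peter Smith"];
$results = [];

foreach ($names as $name) {
    if (preg_match($regex, $name, $matches) === 1) {
        // Keep a copy of the captured groups for this input.
        $results[$name] = $matches;
        echo "$name => ", implode(", ", $matches), "\n";
    } else {
        echo "$name => no match\n";
    }
}
```

Each matching name yields four entries: the whole match, "Peter", the surname, and the ending picked by the OR.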
Pay attention above to the (aw|ore) bit. This has to be in () to group
the 2 parts either side of the OR decision, so even if you
don’t intend to look for that part, it still uses up a slot
in the results.
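If that extra slot is unwanted, PCRE (the engine behind preg_match) also supports a non-capturing group, written (?:...), which groups without storing anything. The following is a minimal sketch of that variation, not part of the original pattern:

```php
<?php
// (?:aw|ore) groups the two alternatives for the OR
// without adding an entry to $matches.
$regex = '/(Peter)\s(Sh(?:aw|ore))/';

preg_match($regex, "Peter Shore", $matches);

// Only three entries remain: the whole match and the two words.
print_r($matches);
```

With this version $matches[3] no longer exists, so any code that read it would need adjusting.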