Validating E-mail Addresses
OK, let’s take on e-mail addresses. There are three parts in an e-mail address: the
POP3 user name (everything to the left of the '@'), the '@', and the server name (the
rest). The user name may contain upper- or lowercase letters, digits, periods ('.'), minus
signs ('-'), and underscore signs ('_'). That’s also the case for the server name, except
for underscore signs, which may not occur.
Now, you can’t start or end a user name with a period; it doesn’t seem reasonable. The
same goes for the domain name. And you can’t have two consecutive periods: there should be
at least one other character between them. Let’s see how we would write an expression to
validate the user name part:
^[_a-zA-Z0-9-]+$
That doesn’t allow a period yet. Let’s change it:
^[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*$
That says: “at least one valid character followed by zero or more sets consisting
of a period and one or more valid characters.”
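To see that rule at work, here is a minimal sketch in Python, whose re module stands in for ereg() (an assumption of this example, not something the text uses). The period is written as \. so that it matches a literal period rather than any character, and the helper name is_valid_user_name is made up for illustration.

```python
import re

# User-name expression from the text; the period is escaped so that it
# matches a literal '.' instead of "any character".
USER_NAME = re.compile(r"^[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*$")


def is_valid_user_name(name: str) -> bool:
    """Return True if name is made of period-separated runs of valid characters."""
    # fullmatch() is used because Python's '$' would otherwise also accept
    # a single trailing newline.
    return USER_NAME.fullmatch(name) is not None


for candidate in ["john.doe", "john_doe-99", ".john", "john.", "john..doe"]:
    print(candidate, is_valid_user_name(candidate))
```

The first two candidates are accepted. The last three are rejected because they start with a period, end with one, or contain two in a row.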
To simplify things a bit, we can use the expression above with eregi() instead of
ereg(). Because eregi() is not sensitive to case, we don’t have to specify both ranges
“a-z” and “A-Z”; one of them is enough:
^[_a-z0-9-]+(\.[_a-z0-9-]+)*$
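The effect of case-insensitive matching can be sketched in Python, assuming the re.IGNORECASE flag as a rough counterpart of eregi(). The names CASE_SENSITIVE, CASE_INSENSITIVE, and accepts are hypothetical, and the period is escaped as \. to match a literal period.

```python
import re

# Lowercase-only expression, with the period escaped as a literal '.'.
PATTERN = r"^[_a-z0-9-]+(\.[_a-z0-9-]+)*$"

# Without a flag, only the lowercase range is accepted.
CASE_SENSITIVE = re.compile(PATTERN)

# re.IGNORECASE plays the role of eregi() here; re.ASCII keeps the match
# limited to ASCII letters instead of Unicode case-folding equivalents.
CASE_INSENSITIVE = re.compile(PATTERN, re.IGNORECASE | re.ASCII)


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole text matches the compiled pattern."""
    return pattern.fullmatch(text) is not None


print(accepts(CASE_SENSITIVE, "John.Doe"))    # False
print(accepts(CASE_INSENSITIVE, "John.Doe"))  # True
```

With the flag set, the single range a-z covers both cases, so the expression stays shorter without rejecting uppercase letters.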
For the server name it’s the same, but without the underscores:
^[a-z0-9-]+(\.[a-z0-9-]+)*$
Done. Now, joining both expressions around the ‘at’ sign, we get:
^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*$
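As a rough check of the joined expression, the sketch below tries it on a few addresses. It is written in Python rather than with eregi(), so the re module and the re.IGNORECASE flag are substitutions. The function name is_valid_email and the sample addresses are made up for illustration, and the periods are escaped as \. so they match literal periods.

```python
import re

# Joined expression: user name, '@', server name.
EMAIL = re.compile(
    r"^[_a-z0-9-]+(\.[_a-z0-9-]+)*"   # user name: underscores allowed
    r"@"
    r"[a-z0-9-]+(\.[a-z0-9-]+)*$",    # server name: no underscores
    re.IGNORECASE | re.ASCII,          # case-insensitive, ASCII letters only
)


def is_valid_email(address: str) -> bool:
    """Return True if the whole address matches the joined expression."""
    return EMAIL.fullmatch(address) is not None


samples = [
    "john.doe@example.com",    # accepted
    "John_Doe@Example.COM",    # accepted: matching ignores case
    "john..doe@example.com",   # rejected: two consecutive periods
    "john@exa_mple.com",       # rejected: underscore in the server name
    "john@.example.com",       # rejected: server name starts with a period
]
for address in samples:
    print(address, is_valid_email(address))
```

Note that the expression does not require a period in the server name, so an address such as john@localhost also passes.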