
PHP-Based Chat Room Page 4

By Mike Hall
on July 30, 2000

Now we’re getting somewhere in terms of design. Another
feature the regulars at my chat room enjoy is the
ability to display email and URL link icons in their messages.
Two more form inputs were added and the links are processed
thus:



// URL link icon
$link_html .= " <a href=\"$url\" target=\"_new\">".
              "<font face=\"wingdings\">2</font></a>";

// Email link icon
$link_html .= " <a href=\"$mail\" target=\"_new\">".
              "<font face=\"wingdings\">*</font></a>";

// Assemble the finished line: name, link icons, timestamp, message
$new_message = "<font color=\"$color\"><b><i>$name</i></b>".
               " $link_html <font size=\"1\">($time)</font> : $message</font><br>\n";


Again, we could just leave things at
that, but there are certain security issues.
What is to stop someone entering nasty HTML
into the message box? A little JavaScript?
A little VBScript? Even something as simple as a
5,000k JPEG image can do harm: refreshing every
eight seconds on the screens of heaven knows how
many people across the globe, it could be murder
on your bandwidth, which is not something we want.

We could remove all HTML and PHP elements using
the strip_tags() function, but I
want the chatters to be able to use basic HTML
in their posts. Basic elements like <i>,
<b> and <font> that can be
used to spruce up a message.
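
To see why strip_tags() is too blunt for this job, here is a minimal sketch. The sample message is made up purely for illustration.

```php
<?php
// Hypothetical chat message mixing harmless and harmful markup.
$message = '<b>Hello</b> <script>alert("hi")</script>';

// strip_tags() removes every tag but keeps the text between them.
$stripped = strip_tags($message);

echo $stripped, "\n"; // prints: Hello alert("hi")
```

Every tag goes, including the harmless <b>, while the text that was inside the script element stays behind as plain text.
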

For almost two years I used a complicated series
of regex statements to screen out the nasty HTML.
However, I found that I was more or less constantly
adding to this filter, until it was taking up most
of my code! Frustrated by the inefficiency, I was
again rescued by a friend who suggested approaching
the problem from the other direction: instead of
telling the script what HTML it can’t use, tell it
what it can.

htmlspecialchars() is a much
under-used PHP function. It replaces certain
characters with their HTML entities.
So " becomes &quot;,
& becomes &amp;,
< becomes &lt;
and > becomes &gt;.

By running the $message variable
through htmlspecialchars() I turn …
<iframe src="http://www.microsoft.com">
… into …
&lt;iframe src=&quot;http://www.microsoft.com&quot;&gt;
… rendering it useless. Then comes the clever part.
We use a series of str_replace() calls to undo some of
what htmlspecialchars() did, re-enabling only the
tags we want to allow.


// Escape everything first
$message = htmlspecialchars($message);

// Then restore the closing bracket and the tags we allow
$message = str_replace("&gt;", ">", $message);
$message = str_replace("&lt;b>", "<b>", $message);
$message = str_replace("&lt;/b>", "</b>", $message);
$message = str_replace("&lt;i>", "<i>", $message);
$message = str_replace("&lt;/i>", "</i>", $message);
$message = str_replace("&lt;font ", "<font ", $message);
$message = str_replace("&lt;/font>", "</font>", $message);


And so on. There are cleverer ways of doing this using
eregi_replace() but I don’t want to
complicate matters.
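
For the curious, here is a rough sketch of what a pattern-based version might look like. Note that it swaps in preg_replace(), because eregi_replace() was removed in PHP 7. The sample message and the pattern are my own assumptions for illustration, and it only handles <b> and <i>.

```php
<?php
// Hypothetical input, for illustration only.
$message = 'Some <B>bold</B> and <i>italic</i> text <script>x</script>';

// Escape everything first, as in the str_replace() version.
$message = htmlspecialchars($message);

// Re-enable <b>, </b>, <i> and </i> in one case-insensitive pass.
// The pattern matches the escaped form, e.g. "&lt;/b&gt;", and restores "</b>".
$message = preg_replace('/&lt;(\/?)(b|i)&gt;/i', '<$1$2>', $message);

echo $message, "\n";
// prints: Some <B>bold</B> and <i>italic</i> text &lt;script&gt;x&lt;/script&gt;
```

Unlike the str_replace() version, this sketch does not turn every &gt; back into >, so anything not on the allowed list stays fully escaped.
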

We have to make sure we run $name,
$color, $url
and $mail through this filter too; otherwise
malicious users can enter code that way. Save yourself work and bundle
the filter into a function.


$name = filterHTML($name);
$message = filterHTML($message);
$color = filterHTML($color);
$url = filterHTML($url);
$mail = filterHTML($mail);
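
The function itself is not shown above, so here is one way filterHTML() might be written. This is a sketch, not necessarily the original: the body simply bundles the htmlspecialchars() and str_replace() steps from earlier, and the parameter name $text is my own choice.

```php
<?php
// Sketch of filterHTML(): escape everything, then re-enable a few tags.
// The body is an assumption based on the str_replace() steps shown earlier.
function filterHTML($text)
{
    $text = htmlspecialchars($text);

    // str_replace() works through the arrays in order, so "&gt;" is
    // restored first and the later patterns can end in a plain ">".
    $search  = array("&gt;", "&lt;b>", "&lt;/b>", "&lt;i>", "&lt;/i>",
                     "&lt;font ", "&lt;/font>");
    $replace = array(">", "<b>", "</b>", "<i>", "</i>",
                     "<font ", "</font>");

    return str_replace($search, $replace, $text);
}

echo filterHTML('<b>Hi</b> <iframe src="http://www.microsoft.com">'), "\n";
// prints: <b>Hi</b> &lt;iframe src=&quot;http://www.microsoft.com&quot;>
```

The bold tags survive, while the iframe is left with its opening bracket escaped and so is displayed as text rather than rendered.
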