Old 11-04-2009, 04:03 PM   #1
habbardone
Senior Member
 
Join Date: Dec 2004
Location: London Uk and Turkey
Posts: 219
How do I clean up the output of this?

Hi,

I would like to display just the real text (words) from an imported page.

I have used $text = strip_tags($data); to clean up the text, but when I print out the contents, I still get lots of garbage.

Is there a function that will take the non-word blocks out of the string?

E.g. to discard "dg354srt" and "%uhr(66", but keep "goats".

Or do I have to use a regex for that?

Thanks
__________________
Developers Choice Revealed: www.devchoice.info
Which host has won, and why ?
Old 11-04-2009, 04:24 PM   #2
big.nerd
i like computers
 
Join Date: Jul 2006
Location: Canada
Posts: 466
There are functions to remove non-alphabetic characters, but not really any to remove non-words.

dg354srt is made up of valid characters, and any of those letters could potentially be used in a sentence.

What I would do is take a look at your string and see if there is any kind of pattern that separates what you want to keep from what you want to discard.

You can do it if you want to remove any word that contains anything other than a specified set of characters, e.g. A-Z, a-z, ', ", etc. In other words, if the string was "happy dg354srt goats can eat %uhr(66 grass", you could remove "dg354srt" and "%uhr(66", since the other words only contain a-z.

If you can clarify what kind of patterns you need to remove or keep, post them here.
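The whole-word approach described above could be sketched roughly as follows. This is only an illustration: the function name keep_alpha_words and the decision to split on whitespace are assumptions made for the example, not something taken from the thread.

```php
<?php
// Hypothetical helper: keep only the tokens made up entirely of A-Z / a-z.
function keep_alpha_words(string $text): string
{
    // Split on runs of whitespace, dropping empty pieces.
    $tokens = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // Keep a token only if every character in it is a letter.
    $words = array_filter($tokens, function ($token) {
        return preg_match('/^[A-Za-z]+$/', $token) === 1;
    });

    // Join the surviving words back together with single spaces.
    return implode(' ', $words);
}

echo keep_alpha_words("happy dg354srt goats can eat %uhr(66 grass");
// happy goats can eat grass
```

Note that this drops a token completely as soon as it contains one non-letter, so a word followed by a comma or full stop would be discarded too unless punctuation is stripped first.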
__________________
big.nerd

Most Code Provided is UNTESTED (unless otherwise specified).
... nerds are real people too!
Old 11-04-2009, 04:41 PM   #3
habbardone
Well,
I don't need anything with numbers or other non-alphabetical characters.
All I want is the words.

Is there any function for that?
Old 11-04-2009, 05:57 PM   #4
big.nerd
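Is this what you're looking for?

PHP Code:
// Sample input mixing digits, punctuation and letters
$string = "23320 98slkdj1239 0sd giant fKL flow88er hj32kuhSDFSDK Jh2380 u*Q@(&!(*&kajsdhfqk3jh4rqihcas sadkjhq23 8uwf 32iugh2f3";
// Match every run of characters that is not a letter or a space
$regex = "/[^A-Za-z ]+/";
// Replace each match with an empty string, i.e. delete it
echo preg_replace($regex, "", $string);

Outputs (with a leading space left where "23320" was removed):

Code:
 slkdj sd giant fKL flower hjkuhSDFSDK Jh uQkajsdhfqkjhrqihcas sadkjhq uwf iughf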
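Deleting characters this way can leave a leading space, and doubled spaces wherever a whole token disappears. One possible tidy-up, assuming a single space between words is what is wanted, is to collapse the spacing and trim afterwards. The helper name strip_non_letters and the sample string are made up for this sketch.

```php
<?php
// Hypothetical helper: strip non-letters, then normalise the spacing.
function strip_non_letters(string $text): string
{
    // Delete every run of characters that is not a letter or a space.
    $lettersOnly = preg_replace('/[^A-Za-z ]+/', '', $text);

    // Collapse runs of spaces into one and trim both ends.
    return trim(preg_replace('/ +/', ' ', $lettersOnly));
}

echo strip_non_letters("23320 98slkdj1239 0sd giant 8uwf 4711 flow88er");
// slkdj sd giant uwf flower
```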
Old 11-05-2009, 05:26 AM   #5
habbardone
Yep,

I guess that is it

Thanks

Edit:

If preg_replace is replacing the letters with "", i.e. nothing,
surely it should be doing the opposite of what I want?
Or is there a negative in that expression somewhere?
Last edited by habbardone; 11-05-2009 at 05:34 AM.
Old 11-05-2009, 09:27 AM   #6
big.nerd
habbardone,

The preg_replace is replacing anything that is NOT A-Z, a-z, or a space " " with nothing, effectively removing it.

That ^ (shift 6 on my keyboard) means "NOT" when it is the first character inside the square brackets of a regex.

Note: To those who may correct me, if it doesn't mean not, sorry but it does have that effect.
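For what it's worth, ^ only has that "not" meaning when it is the first character inside square brackets. Outside a character class it anchors the match to the start of the string instead. A small sketch of the difference, using made-up sample strings:

```php
<?php
// Inside brackets: [^a-z] matches any one character that is NOT a-z.
echo preg_replace('/[^a-z]/', '', 'go4ts!'), "\n";   // gots

// Outside brackets: ^ anchors to the start, so only the leading "go" matches.
echo preg_replace('/^go/', '', 'gogo'), "\n";        // go

// Both uses together: the whole string must consist of a-z only.
var_dump(preg_match('/^[a-z]+$/', 'goats'));         // int(1)
var_dump(preg_match('/^[a-z]+$/', 'go4ts'));         // int(0)
```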