Old 05-16-2006, 11:07 AM   #1
sahammondsr
Member
 
Join Date: Oct 2001
Posts: 35
Special Character Handling

I have an XML feed which appears to have a bullet in it.

When I parse it in PHP, it turns into a question mark.

I'm completely ignorant about character sets and the handling of special characters. I searched around but couldn't find any good resources. Any ideas???

Code:
<specification name="Image Quality">
  <spec-value>
    <name>Camera Resolution</name>
    <value>6.2 Megapixel •</value>
  </spec-value>
  <spec-value>
    <name>Image Resolutions</name>
    <value>640 x 480 • 2816 x 2112 • 2272 x 1704 • 1600 x 1200 •</value>
  </spec-value>
</specification>
After I parse it, I'm ending up with:

640 x 480 ? 2816 x 2112 ? 2272 x 1704 ? 1600 x 1200 ?

I can do a preg_replace on "/\?/" and it properly replaces it but I'm worried about taking out a valid ?...

Last edited by sahammondsr; 05-16-2006 at 11:10 AM.
Old 05-17-2006, 03:50 PM   #2
essexboyracer
Member
 
Join Date: Jun 2001
Location: UK
Posts: 51
Try searching for the character code of the bullet and work from that. I'm sure there is a tutorial out there that covers this, perhaps even on php.net.
Old 05-17-2006, 06:27 PM   #3
Bobulous
Member
 
Join Date: Feb 2006
Location: London
Posts: 68
What are you using to parse the XML?
Old 05-18-2006, 12:20 AM   #4
sahammondsr
Member
 
Join Date: Oct 2001
Posts: 35
I am parsing it myself in an earlier version of PHP 4 with a fairly typical set of handler functions: startElement, characterData and endElement. Within each, I'm using a switch statement to determine how to handle the various elements of the data in the XML feed.

Where I'm getting the ? essentially looks like this:

Code:
function characterData($xmlParser, $data)
{
    // Append this chunk of character data to the entry for the current tag.
    global $tagFlag, $specsData;
    $specsData[$tagFlag] .= $data;
}
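A likely cause, assuming the parser was created with xml_parser_create() and left at its defaults: the PHP manual notes that characters which cannot be represented in the parser's target encoding are replaced by a question mark, and ISO-8859-1 has no bullet. Below is a minimal sketch of the difference, written for current PHP rather than PHP 4 (the closure and type hints are conveniences; parseText and the sample string are made up for illustration).

```php
<?php
// Sample fragment; each bullet is written as its raw UTF-8 bytes (E2 80 A2).
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
     . "<value>640 x 480 \xE2\x80\xA2 2816 x 2112 \xE2\x80\xA2</value>";

// Hypothetical helper: parse $xml and return all character data,
// delivered to the handler in the requested target encoding.
function parseText(string $xml, string $targetEncoding): string
{
    $text = '';
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, $targetEncoding);
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$text) {
            $text .= $data; // data may arrive in several chunks
        }
    );
    xml_parse($parser, $xml, true);
    return $text;
}

$asLatin1 = parseText($xml, 'ISO-8859-1'); // bullets cannot be represented
$asUtf8   = parseText($xml, 'UTF-8');      // bullets survive as UTF-8 bytes

echo $asLatin1, "\n";
echo $asUtf8, "\n";
```

On PHP 4 the same xml_parser_set_option() call should be applicable to the existing parser before xml_parse() runs. The page that displays the result then also has to be served as UTF-8, otherwise the bullet will show up as stray bytes instead.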
Old 05-18-2006, 12:34 AM   #5
sahammondsr
Member
 
Join Date: Oct 2001
Posts: 35
I searched on the big G and didn't find anything that helped. Maybe I'm trying the wrong search terms or not realizing that I have the info. I'm very experienced at PHP and MySQL and rarely need to seek assistance, but because I've never dealt with special character handling before and don't have anything on it in the books and sites I visit, I'm completely stumped.
Old 05-18-2006, 05:08 AM   #6
MarkR
Senior Member
 
Join Date: Jul 2004
Location: Oxford, England
Posts: 1,983
Which XML parser are you using? If you're using DOM (which is really the best plan), then you will find that everything is in UTF-8 when it comes out, regardless of the encoding it was in when it went in. (Remember that XML files are unambiguous in encoding: every XML file must either declare encoding= or be in UTF-8 or UTF-16.)
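As a rough illustration of that point, here is a sketch using the PHP 5+ DOM extension (not available in this form on PHP 4). The sample document is invented: it declares ISO-8859-1 and contains a single Latin-1 byte, yet the text read back from the DOM is UTF-8.

```php
<?php
// Hypothetical input: an ISO-8859-1 document where "é" is the single byte E9.
$latin1Xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
           . "<value>caf\xE9</value>";

$doc = new DOMDocument();
$doc->loadXML($latin1Xml);

// DOM hands strings back as UTF-8, so "é" is now the two bytes C3 A9.
$text = $doc->documentElement->textContent;

echo bin2hex($text), "\n";
```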

You can easily preg_match with the "u" modifier against that specific character.
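A minimal sketch of that idea, assuming the value has already reached your code as UTF-8. It splits on the bullet (U+2022) only, so a genuine question mark in the data is left alone. The function name splitOnBullet is made up for this example.

```php
<?php
// Hypothetical helper: split a UTF-8 string on bullets (U+2022),
// trimming surrounding whitespace and dropping empty pieces.
function splitOnBullet(string $value): array
{
    // The "u" modifier makes PCRE treat pattern and subject as UTF-8,
    // so \x{2022} matches the bullet character rather than single bytes.
    return preg_split('/\s*\x{2022}\s*/u', $value, -1, PREG_SPLIT_NO_EMPTY);
}

$value = "640 x 480 \xE2\x80\xA2 2816 x 2112 \xE2\x80\xA2 "
       . "2272 x 1704 \xE2\x80\xA2 1600 x 1200 \xE2\x80\xA2";

$resolutions = splitOnBullet($value);
print_r($resolutions);
```

The same pattern works with preg_replace if you would rather swap the bullet for a comma or strip it entirely.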

But there is a more important issue here. Why is the supplier of data using such a strange method to specify machine-readable data? Why not simply have one element containing each supported resolution?

If the data aren't actually in machine readable format and you're trying to parse them as such when they just happen to have been put in like that by a human, then you're in trouble, because it WILL change.

Mark
Old 05-20-2006, 01:33 PM   #7
sahammondsr
Member
 
Join Date: Oct 2001
Posts: 35
Mark:

I agree wholeheartedly. We've had nothing but trouble with this particular XML feed, but they won't budge an inch because a lot of other developers aren't complaining. And it's a major outfit (an eBay company)...

I'll try using the u modifier and see what I can do. I don't think I will ever flat-bid a project again. I've been coding for years and never struggled this way with an XML feed. Their API lists about 10 different possible feeds you can get. That's fine when querying for categories or product listings or reviews or comparing prices or features of a product, because you always know what sort of elements you will get back in your XML.

But their search is the most ridiculous thing I've ever seen. You send their API a term and it decides if it should be a specific product or broad product search or category of products or otherwise. There are nearly a dozen possible sets you get back. Each set has DIFFERENT elements and hierarchies. So when parsing, I have to take the attribute of the first tag (the result type) and then decide which of a dozen ways I am going to parse and handle that particular call.
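One way to keep that kind of branching manageable is a lookup table of handlers keyed by result type. The sketch below is purely illustrative: the element name, the type attribute, the type values and dispatchResult are all invented, since the real feed's names aren't given here, and it uses SimpleXML from PHP 5+ rather than the PHP 4 event parser.

```php
<?php
// Hypothetical handlers, one per (invented) result type.
$handlers = [
    'product'  => function (SimpleXMLElement $root) {
        return 'product: ' . (string) $root->name;
    },
    'category' => function (SimpleXMLElement $root) {
        return 'category: ' . (string) $root->name;
    },
];

// Read the type attribute on the root element and pass the document
// to the matching handler; unknown types are reported, not guessed at.
function dispatchResult(string $xml, array $handlers): string
{
    $root = simplexml_load_string($xml);
    if ($root === false) {
        return 'unparseable response';
    }
    $type = (string) $root['type'];
    if (!isset($handlers[$type])) {
        return "unhandled result type: $type";
    }
    return $handlers[$type]($root);
}

$summary = dispatchResult(
    '<result type="product"><name>Example Camera</name></result>',
    $handlers
);
echo $summary, "\n";
```

Logging the unhandled types as they turn up at least gives you a running list of the undocumented ones.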

And half of their result types are completely undocumented. They don't consistently provide the items they say they will provide in the API documentation. I'm so disappointed. I figured it would be nothing for me to pull and display categories, products within categories, product details, user reviews, price comparisons and a search feed. What should have been a week or two of work has turned into a couple of months.

And that doesn't account for all the queries that come back completely malformed. They weren't putting content with special characters in CDATA, etc., and it took me over a month to convince them that it wasn't optional: they had to, because it broke all of the standard parsers on *nix-based systems and in typical web languages.

Their developer base prior to me seems to be almost entirely Windows, and somehow those guys are handling malformed XML like it was nothing. I am so frustrated that I've almost considered writing a custom parser just so I can do my own malformed-XML handling instead of fighting with these guys to get their feeds right...

UGH!!!

It's no wonder there aren't that many sites running this particular company's XML feed.

Last edited by sahammondsr; 05-20-2006 at 01:40 PM.