Web crawlers truncating URLs with single quotes. Bad sitemap.xml maybe?



I'm getting quite a few failed requests on my server and they're mostly from web crawlers that encounter URLs with single quotes in them.


Example: http://ift.tt/1pLZEGF's-event


The crawler truncates the URL at the single quote and ends up requesting


http://ift.tt/1pLZEGF


My sitemap.xml's URL entry DOES contain the raw single quote (not entity-escaped). However, all of the online sitemap generators produce the same thing: they don't entity-escape the single quote either. I've also submitted my sitemap.xml to online validators, and it validates every time.
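
For reference, the sitemaps.org protocol asks for `<loc>` values to be both URL-escaped and entity-escaped, and the apostrophe is on its list (`&apos;`). Below is a minimal sketch of the two kinds of escaping in Python. The example.com URL and the helper names `percent_encode_path` and `entity_escape` are made up for illustration; they are not from my site or from any sitemap library.

```python
from urllib.parse import quote, urlsplit, urlunsplit
from xml.sax.saxutils import escape


def percent_encode_path(url: str) -> str:
    """Percent-encode unsafe characters in the path, e.g. ' becomes %27."""
    parts = urlsplit(url)
    # Keep "/" as the separator and "%" so existing escapes are not doubled.
    safe_path = quote(parts.path, safe="/%")
    return urlunsplit(parts._replace(path=safe_path))


def entity_escape(url: str) -> str:
    """XML entity-escape a URL for use inside <loc>...</loc>."""
    # escape() handles &, < and > itself; quotes are passed as extras.
    return escape(url, {"'": "&apos;", '"': "&quot;"})


if __name__ == "__main__":
    raw = "http://example.com/bob's-event?a=1&b=2"  # hypothetical URL
    print(percent_encode_path(raw))
    print(entity_escape(raw))
```

A crawler that splits on the quote character would never see one in the percent-encoded form, and a server should decode `%27` back to the apostrophe, so the two URLs refer to the same resource.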


One thing I've noticed is that these online generators emit this opening tag:


<urlset xmlns="http://ift.tt/xwbjRF" xmlns:xsi="http://ift.tt/ra1lAU" xsi:schemaLocation="http://ift.tt/xwbjRF http://ift.tt/Av6BHN">


whereas my sitemap.xml only contains:


<urlset xmlns="http://ift.tt/xwbjRF" xmlns:image="http://ift.tt/Sqn21o">


Could that have something to do with it?
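
One way to check this is to parse both variants and compare the `<loc>` values that come out. The sketch below does that with Python's standard XML parser. It assumes the shortened links above stand for the standard sitemaps.org and XML Schema instance namespaces, and it uses a made-up example.com URL; it only shows what a conforming XML parser sees, not what any particular crawler does.

```python
import xml.etree.ElementTree as ET

# Assumption: the standard sitemap namespace, not confirmed from the short links.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

WITH_SCHEMA = f'''<urlset xmlns="{SITEMAP_NS}"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="{SITEMAP_NS} {SITEMAP_NS}/sitemap.xsd">
  <url><loc>http://example.com/bob's-event</loc></url>
</urlset>'''

WITHOUT_SCHEMA = f'''<urlset xmlns="{SITEMAP_NS}">
  <url><loc>http://example.com/bob's-event</loc></url>
</urlset>'''


def extract_locs(xml_text: str) -> list:
    """Return the text of every <loc> element in the sitemap namespace."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(f"{{{SITEMAP_NS}}}loc")]


if __name__ == "__main__":
    # Both variants are printed so the extracted URLs can be compared by eye.
    print(extract_locs(WITH_SCHEMA))
    print(extract_locs(WITHOUT_SCHEMA))
```

If both lists come out identical, with the quote intact, then the missing `xsi:schemaLocation` is probably not what truncates the URL, at least for crawlers that use a real XML parser.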

