I've been trying to figure out how Twitter blocks access to their sitemaps and haven't been able to replicate it on my website. I was hoping that I could get a bit of help here from people more knowledgeable than I ;)
Here's the location of their sitemap index file: http://ift.tt/1yqruli
If you view that file directly, it shows up blank. However, if you search for it in Google, you can view the cached version of it: http://ift.tt/1yqrulk
Obviously Google is able to see the sitemap. I doubt Twitter knows every single Google IP, so I assumed they were filtering on the User-Agent header. However, when I install User Agent Switcher and change the user agent to Googlebot, I'm still not able to view the sitemap.
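For anyone who wants to reproduce the test without a browser extension, here's a rough Python sketch of what I'm doing. It only shows the User-Agent spoofing part. The helper names `build_request` and `fetch` are my own, and the User-Agent string is the one Google publishes for Googlebot.

```python
import urllib.request

# Googlebot's published User-Agent string.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url, user_agent=GOOGLEBOT_UA):
    """Build a GET request that presents the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=GOOGLEBOT_UA, timeout=10):
    """Fetch the URL with that User-Agent and return (status, body length).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a
    blocked request may surface as an exception rather than an empty body.
    """
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.status, len(resp.read())


if __name__ == "__main__":
    # Sitemap index URL from this post.
    print(fetch("http://ift.tt/1yqruli"))
```

If this comes back empty too, then the User-Agent header alone isn't what they're checking. One thing I haven't ruled out is that they verify the crawler some other way, for example with the reverse DNS lookup that Google documents for confirming Googlebot, but that's only a guess on my part.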
Does anyone know how exactly this is being done?