AdsBot-Google does not respect robots.txt

I noticed something strange today: AdsBot-Google does not respect robots.txt the way you would expect. It ignores the general rules set under the wildcard user agent. For example, if your robots.txt file looks something like this:

User-agent: *
Disallow: /javascript/

You would expect the bot not to crawl any pages in the /javascript/ folder. However, AdsBot does crawl those pages. If you specifically name AdsBot in the robots.txt file, it does respect the rules:

User-agent: adsbot-google
Disallow: /javascript/

User-agent: *
Disallow: /javascript/
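
To make the difference concrete, here is a minimal Python sketch that models the behaviour described above: only groups that explicitly name adsbot-google are applied, and the wildcard group is ignored. The function names `explicit_adsbot_rules` and `adsbot_may_crawl` are made up for this illustration, and the parsing is deliberately simplified (prefix matching on Disallow lines only). It is a model of what I observed, not Google's actual implementation.

```python
def explicit_adsbot_rules(robots_txt, token="adsbot-google"):
    """Collect Disallow prefixes from groups that explicitly name `token`.

    Simplified model: groups for "*" are ignored on purpose, and Allow
    lines are not evaluated.
    """
    rules = []
    agents = []       # user agents of the group currently being read
    in_rules = False  # True once the current group has reached its rule lines
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A User-agent line after rule lines starts a new group.
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value and token in agents:
                rules.append(value)
    return rules


def adsbot_may_crawl(robots_txt, path):
    """Return True if the modelled AdsBot would crawl `path`."""
    return not any(path.startswith(prefix)
                   for prefix in explicit_adsbot_rules(robots_txt))


general_only = "User-agent: *\nDisallow: /javascript/\n"
with_adsbot = (
    "User-agent: adsbot-google\nDisallow: /javascript/\n\n"
    "User-agent: *\nDisallow: /javascript/\n"
)

print(adsbot_may_crawl(general_only, "/javascript/app.js"))
print(adsbot_may_crawl(with_adsbot, "/javascript/app.js"))
```

With the first file the wildcard rule is skipped, so the path counts as crawlable; with the second file the explicit group blocks it.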

Another thing I noticed is that AdsBot identifies itself in two ways. It sends either

AdsBot-Google (+http://www.google.com/adsbot.html)

or

Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)
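
If you want to spot these requests in your own logs, a small Python sketch like the following can tell the two variants apart. The function name `classify_adsbot` and the labels "desktop" and "mobile" are my own choices for this example, not anything Google defines. Keep in mind that a User-Agent header is easy to fake, so a match only tells you what the client claims to be.

```python
import re

# Matches both tokens seen above: "AdsBot-Google" and "AdsBot-Google-Mobile".
ADSBOT_RE = re.compile(r"AdsBot-Google(-Mobile)?", re.IGNORECASE)


def classify_adsbot(user_agent):
    """Return "mobile", "desktop", or None if the string is not AdsBot."""
    match = ADSBOT_RE.search(user_agent)
    if match is None:
        return None
    return "mobile" if match.group(1) else "desktop"


plain_ua = "AdsBot-Google (+http://www.google.com/adsbot.html)"
mobile_ua = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) "
    "AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 "
    "Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; "
    "+http://www.google.com/mobile/adsbot.html)"
)

print(classify_adsbot(plain_ua))
print(classify_adsbot(mobile_ua))
```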
