Adsense and robots.txt

My Adsense crawler errors report shows various files supposedly restricted by my robots.txt.

They take the form of:

http://webcache.googleusercontent.com/search?q=cache:0X2JPl27DlUJ:aplawrence.com/Basics/taskset.html+linux+taskset+%22cpu+affinity%22&cd=5&hl=en&ct=clnk&gl=us&source=www.google.com

with the details varying.

I don't see why my robots.txt would match that (or any of the others):

User-Agent: *
Disallow: /UNIXIART
Disallow: /pub
Disallow: /Anatests
Disallow: /Personal
Disallow: /MissDow
Disallow: /var/ftp/pub/psst-sample.pdf
Disallow: /cgi-bin/deepindexget.pl
Disallow: /cgi-bin/printer.pl
Disallow: /cgi-bin/newcomm.pl
Disallow: /cgi-bin/auth.pl
Disallow: /cgi-bin/countad.pl
Disallow: /cgi-bin/fatal.pl
Disallow: /cgi-bin/forumpost.pl
Disallow: /cgi-bin/freprint.pl
Disallow: /cgi-bin/showrelated.pl?
Disallow: /cgi-bin/getauthart.pl
Disallow: /cgi-bin/mkltest.pl
Disallow: /cgi-bin/mkpost.pl
Disallow: /cgi-bin/nav.pl
Disallow: /cgi-bin/randad.pl?
Disallow: /cgi-bin/snav.pl
Disallow: /cgi-bin/search3.pl
Disallow: /cgi-bin/supersearch.pl
Disallow: /cgi-bin/ta.pl
Disallow: /cgi-bin/tester.pl
Disallow: /Bela
Disallow: /Maps
Disallow: /errors
Disallow: /itech
Disallow: /Consultants/
Disallow: /prepub
Disallow: /buytests.html
Disallow: /download.html
Disallow: /contrib.html
Disallow: /Unix/sample
Disallow: /visitreport.html
Disallow: /SCOFAQ/upgrade.txt
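One way to sanity-check this is to run a reported URL through a robots.txt parser. The sketch below uses Python's standard `urllib.robotparser` with a subset of the rules above. The helper name `is_allowed`, the shortened URL, and the `/pub/somefile` path are made up for illustration, and the standard-library parser only approximates how Google's crawlers read the file. Note that the reported URL sits on a different host (webcache.googleusercontent.com), so the robots.txt that governs it would be that host's own file rather than this one.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# A subset of the rules quoted above (assumption: the real file has no
# blank lines inside the record, since blank lines separate records).
RULES = """\
User-Agent: *
Disallow: /UNIXIART
Disallow: /pub
Disallow: /cgi-bin/printer.pl
Disallow: /Consultants/
Disallow: /download.html
"""

# Shortened form of the URL from the crawler errors report.
REPORTED = ("http://webcache.googleusercontent.com/search"
            "?q=cache:0X2JPl27DlUJ:aplawrence.com/Basics/taskset.html")


def is_allowed(url, agent="Mediapartners-Google", rules=RULES):
    """Return True if `agent` may fetch `url` under `rules`.

    RobotFileParser compares only the path part of the URL against
    the Disallow prefixes; it ignores the host name.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # The reported URL is not even on the site that serves these rules.
    print(urlsplit(REPORTED).netloc)   # webcache.googleusercontent.com
    # Its path starts with /search, which no Disallow prefix matches.
    print(is_allowed(REPORTED))        # True
    # A hypothetical path under /pub is blocked, as expected.
    print(is_allowed("http://aplawrence.com/pub/somefile"))  # False
```

If this holds for the full rule set too, the "restricted" status is probably not coming from these rules.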

In Googling around, I found a suggestion to add these lines:

User-agent: Mediapartners-Google
Allow: /

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Image
Allow: /

User-agent: Googlebot-Mobile
Allow: /

I don't know why I'd need to, but I'll try it - are the last two really Adsense related?
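For what it's worth, here is a rough sketch of what those extra groups do, again using Python's standard `urllib.robotparser`. The helper `can_crawl`, the agent name `ExampleBot`, and the path `/pub/example.txt` are hypothetical. A crawler that finds a group naming it uses that group instead of the `*` group, so `Allow: /` lifts the generic restrictions for that crawler only.

```python
from urllib.robotparser import RobotFileParser

# One generic rule plus one of the suggested crawler-specific groups.
RULES = """\
User-agent: *
Disallow: /pub

User-agent: Mediapartners-Google
Allow: /
"""


def can_crawl(agent, path, rules=RULES):
    """Return True if `agent` may fetch `path` under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # The named crawler matches its own group, so /pub is open to it.
    print(can_crawl("Mediapartners-Google", "/pub/example.txt"))  # True
    # Any other crawler falls back to the * group and stays blocked.
    print(can_crawl("ExampleBot", "/pub/example.txt"))            # False
```

So the added lines should only change anything for paths that the `*` group disallows.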

1 Reply

Mine has those, fwiw (a WordPress site), and Google AdSense says it can index OK without error.

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content
Disallow: /tag
Disallow: /author
Disallow: /wget/
Disallow: /httpd/
Disallow: /i/
Disallow: /f/
Disallow: /t/
Disallow: /c/
Disallow: /j/

User-agent: Mediapartners-Google
Allow: /

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Image
Allow: /

User-agent: Googlebot-Mobile
Allow: /
