
What is monodoc.ashx and why does googlebot request it?

https://www.devze.com 2023-04-09 19:46 Source: Web

I am getting TONS of request. They all start with

/1.1/handlers/monodoc.ashx?link=

then follows what look like .NET classes. What are these and why is googlebot requesting them?

I need to turn it off so my access and error logs aren't polluted.


Googlebot will request any URL that it knows of, which includes URLs that you may not have generated yourself.

For instance, if there's a forum out there that links to your site with that URI, Googlebot will attempt to crawl it to see if there's any information worth indexing.

Based on the IP provided, I verified that it was indeed Googlebot: the reverse DNS lookup resolves to 'crawl-66-249-68-184.googlebot.com', and the forward DNS lookup for 'crawl-66-249-68-184.googlebot.com' resolves back to the IP address provided.
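That reverse-then-forward check is easy to automate. Below is a minimal sketch using only the Python standard library; the resolver functions are injectable parameters (an assumption for testability, not part of any official API), and the accepted suffixes follow Google's documented Googlebot hostnames.

```python
import socket

def verify_googlebot(ip,
                     ptr_lookup=socket.gethostbyaddr,
                     a_lookup=socket.gethostbyname):
    """Return True if `ip` passes the reverse-then-forward DNS check
    for identifying genuine Googlebot traffic."""
    try:
        hostname = ptr_lookup(ip)[0]          # reverse (PTR) lookup
    except OSError:
        return False
    # The PTR hostname must belong to Google's crawler domains.
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        # Forward (A) lookup of that hostname must match the original IP,
        # otherwise the PTR record could simply be spoofed.
        return a_lookup(hostname) == ip
    except OSError:
        return False
```

In production you would call it as `verify_googlebot("66.249.68.184")` and let it hit real DNS; the extra parameters just let you test the logic offline.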

The best thing you can do is respond with a 404 or 410 if that page shouldn't exist. If you know what content used to live there, 301 redirect it to a relevant page on your site in case other people have linked to those pages: you not only retain the link credit for those links, it's also a better experience for users who follow them. If there isn't a relevant place to 301 redirect to, you can redirect to your homepage, but know that from an SEO perspective the link value will decay, since the relevancy of the links probably won't match the content of your homepage.
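How that looks in practice depends on your server, but the decision itself is simple. Here is a minimal WSGI sketch (a hypothetical app, not the asker's actual stack) that sends 410 Gone for the dead monodoc.ashx handler path and a placeholder 200 for everything else:

```python
def app(environ, start_response):
    """Minimal WSGI app: 410 for the dead handler path, 200 otherwise."""
    path = environ.get("PATH_INFO", "")
    if path.startswith("/1.1/handlers/monodoc.ashx"):
        # 410 says the resource is permanently gone, which tends to get
        # the URL dropped from the index faster than a plain 404.
        start_response("410 Gone", [("Content-Type", "text/plain")])
        return [b"Gone"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"OK"]
```

To 301 instead, you would swap in `start_response("301 Moved Permanently", [("Location", "/some-relevant-page")])` with a `Location` header pointing at the relevant replacement page.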

Definitely make sure that you're not responding with a 500 or 503 response code. If you return a large number of 5xx responses, Googlebot will think it's hitting your site too hard and will throttle back its crawl.

Lastly, even if you send a 301, 404, or 410 response, expect Googlebot to hit these URLs for some time (even years from now). I've got sites that receive a burst of Googlebot traffic for long-dead legacy URIs every few weeks. There are some old, crusty URLs out there, and Googlebot will run across them from time to time and attempt to recrawl them. Google even keeps a historic list, which it will attempt to crawl when it feels it has additional bandwidth to allocate to your site.

TL;DR: Don't sweat it. Googlebot will hit these URLs for no good reason. Just send the response that makes for the best user experience, and you'll be fine.

