Googlebot Crawl Rate causing site slowdown

SuperMikeLewis

I am hearing from my IT department that Googlebot is causing a massive slowdown/crash on our site. We get 3.5 to 4 million pageviews a month and add 70-100 new articles to the website each day. We provide daily stock research and market analysis, so it's all high-quality, relevant content. Here are the crawl stats from WMT:

http://imgur.com/dyIbf

I have not worked with a lot of high-volume, high-traffic sites before, but these crawl stats do not seem to be out of line. My team is getting pressure from the sysadmins to slow down the crawl rate, or block some or all of the site from Googlebot.

Do these crawl stats seem in line with comparable sites? Would slowing down the crawl rate have a big effect on rankings?

Thanks

Rich_A

Similar to Michael, my IT team is saying Googlebot is causing performance issues - specifically during peak hours.

It was suggested that we consider using Apache rewrite rules to serve Googlebot a 503 during our peak hours to limit the impact. I found the Stack Overflow thread (link below) in which John Mueller seems to suggest this approach, but has anyone tried this?

http://stackoverflow.com/questions/4730376/how-to-set-robots-txt-or-apache-to-allow-crawlers-only-at-certain-hours
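To make the idea concrete, here is a minimal sketch of the same logic at the application layer rather than in Apache rewrite rules. Everything in it is an assumption for illustration: the peak window, the one-hour `Retry-After` value, and the `CrawlerThrottle` name are hypothetical, and matching on the user-agent string alone will also catch anything that merely claims to be Googlebot.

```python
from datetime import datetime, time

# Hypothetical peak window in server local time; adjust to your own traffic.
PEAK_START = time(9, 0)
PEAK_END = time(16, 0)
RETRY_AFTER_SECONDS = "3600"  # assumed hint for when the crawler may return


def is_peak(now):
    """Return True when `now` falls inside the assumed peak window."""
    return PEAK_START <= now.time() < PEAK_END


class CrawlerThrottle:
    """WSGI middleware that answers Googlebot with a 503 during peak hours."""

    def __init__(self, app, clock=datetime.now):
        self.app = app
        self.clock = clock  # injectable so the behaviour can be tested

    def __call__(self, environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "")
        if "googlebot" in agent.lower() and is_peak(self.clock()):
            body = b"Service temporarily unavailable\n"
            start_response("503 Service Unavailable", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
                ("Retry-After", RETRY_AFTER_SECONDS),
            ])
            return [body]
        # Everyone else, and Googlebot outside the window, gets the real site.
        return self.app(environ, start_response)
```

Wrapping an existing WSGI application would look like `application = CrawlerThrottle(application)`. The 503 tells the crawler the outage is temporary, which is the point of using it instead of a 404 or a robots.txt block.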

Cyrus-Shepard

Blocking Googlebot is a quick and easy way to disappear from the index. Not an option if you want Google to rank your site.

For smaller sites or ones with limited technologies, I sometimes recommend using a crawl-delay directive in robots.txt:

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=48620
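For reference, this is roughly what a crawl-delay rule looks like and how a well-behaved crawler would read it. The robots.txt content, the 10-second value, and the `ExampleBot` name are made up for illustration. One caveat worth knowing: Google's own documentation says Googlebot does not act on `Crawl-delay`, so for Googlebot the crawl rate setting in Webmaster Tools is the relevant control, and the directive only affects crawlers that honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to wait 10 seconds between requests.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# What a crawler that honors the directive would see for its own user agent.
delay = parser.crawl_delay("ExampleBot")
allowed = parser.can_fetch("ExampleBot", "/articles/today")
print(delay, allowed)
```

The empty `Disallow:` line means nothing is blocked, so the rule only slows compliant crawlers down without hiding any pages from them.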

But I agree with both Shane and Zachary: this doesn't seem like the long-term answer to your problems. Your crawl stats don't seem out of line for a site of your size, and perhaps a better hardware configuration could help things out.

With 70 new articles each day, I'd want Google crawling my site as much as they pleased.

Jinx14678

Whatever Google's default is in GWT. It sets it for you.

You can change it, but that is not recommended unless you have a specific reason (such as Michael Lewis's scenario). Even then, I am not completely sold that Googlebot is what is causing the "dealbreaking" overhead.
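One way to check whether Googlebot really is the source of the overhead is to measure its share of requests per hour from the access logs. The sketch below assumes the Apache/Nginx "combined" log format and identifies Googlebot by user-agent string only; the function name and regular expression are illustrative, not taken from any particular tool.

```python
import re
from collections import Counter

# Assumes the "combined" access log format; other formats need a different pattern.
LINE_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<day>[^:]+):(?P<hour>\d{2}):[^\]]+\] '
    r'"[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_share_by_hour(lines):
    """Return {hour: fraction of requests whose user agent mentions Googlebot}."""
    total, bot = Counter(), Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        hour = match.group("hour")
        total[hour] += 1
        if "googlebot" in match.group("agent").lower():
            bot[hour] += 1
    return {hour: bot[hour] / total[hour] for hour in sorted(total)}
```

Calling it as `googlebot_share_by_hour(open("access.log"))` (the path is a placeholder) gives a per-hour fraction that can be compared against the times the site actually slows down.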

ClaireH-184886

What is the ideal setting for the crawler? I have been wondering about this for some time.

Jinx14678

Hi,

Your admins saying that is like someone saying "we need to shut the site down, we are getting too much traffic!" It's a common sysadmin response (fix it somewhere else).

4GB a day downloaded is a lot of bot traffic, but it appears you are a "real time" site that is probably helped by, and maybe even reliant on, your high crawl rate.

I would upgrade hardware, or even look into some kind of off-site cloud redundancy for failover (hybrid).

I highly doubt that 4GB a day is a "dealbreaker", but of course that is just based on the one image, and your admins probably have resource monitors. Maybe Varnish is an answer for static content to help lighten the load? Or a CDN for file hosting to lighten the bandwidth load?
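To put the 4GB figure in perspective, here is the back-of-the-envelope arithmetic, assuming the download is spread evenly over the day and reading "GB" as GiB. Real crawling comes in bursts, and the cost of generating each page can matter more than the bytes sent, so treat this as a rough average only.

```python
BYTES_PER_DAY = 4 * 1024**3      # "4GB a day" from the crawl stats, read as GiB
SECONDS_PER_DAY = 24 * 60 * 60

# Average throughput if the crawl were spread evenly across the day.
avg_bytes_per_sec = BYTES_PER_DAY / SECONDS_PER_DAY
avg_mbit_per_sec = avg_bytes_per_sec * 8 / 1_000_000

print(f"{avg_bytes_per_sec / 1024:.1f} KiB/s, {avg_mbit_per_sec:.2f} Mbit/s")
```

That works out to roughly 0.4 Mbit/s on average, which is small as raw bandwidth goes. If Googlebot is hurting the site, the likelier culprit is the per-request work on uncached pages, which is where Varnish or a CDN would help.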

Shane

SuperMikeLewis

We are hosting the site on our own hardware at a big colo. I know that we are upgrading servers but they will not be online until the end of July.

Thanks!

deltasystems

I wouldn't slow the crawl rate. A high crawl rate is good so that Google can keep their index of your website current.

The better solution is to reconsider your hardware and networking setup. Do you know how you are being hosted? From my own experience with a website of that size, a load balancer on two decent dedicated servers should handle the load without problems. Google crawling your pages shouldn't create noticeable overhead on the right setup.
