Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Should I use meta noindex and robots.txt disallow?
-
Hi, we have an alternate "list view" version of every one of our search results pages
The list view has its own URL, indicated by a URL parameter
I'm concerned about wasting our crawl budget on all these list view pages, which effectively doubles the amount of pages that need crawling
When they were first launched, I had the noindex meta tag be placed on all list view pages, but I'm concerned that they are still being crawled
Should I therefore go ahead and also apply a robots.txt disallow on that parameter to ensure that no crawling occurs? Or, will Googlebot/Bingbot also stop crawling that page over time? I assume that noindex still means "crawl"...
Thanks

-
Hi,
Thanks, I will do some testing to confirm that this behaves how I would like it to
-
if all pages are 100#5 not indexed then I would block it in robots.txt, Google's John Muller confirmed to me that Googlebot will continue to crawl every link to check to see if a nofollow or noindex has changed status.
So as a result we blocked our pages with robots.txt and saw a great increases in index/crawl rates on pages we want Google to pay attention to. It also reduces waste in server resources.
However if there are any pages that are index, if you block them in robots.txt then Googlebot will never be able to crawl the link to determine that it should be noindex. This means it could stay in a permanent stage of indexed.
I hope that answers all your questions?
-
When you say:
nofollow will tell the crawlers to not crawl the page
I believe you mean to say that this will tell the crawlers not to crawl the links on the page, the page itself is itself still "crawled" is it not?
But yes, you are right to say, that once robots.txt disallow is in place, the meta tag will not be seen and thus be moot (at which point I may as well take it off).
It would be nice to be able to say "don't crawl this and don't put it in the index"... but is there a way?
-
noindex only tells the search crawlers to not include the page in the index but still allows for them to crawl the page. nofollow will tell the crawlers to not crawl the page.
robots.txt will accomplish this as well but both I think would be overkill.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Should you 'noindex' Checkout Pages?
Today I was reviewing my Moz analytics and suddenly noticed 1,000 issues with pages without a meta description. I reviewed the list and learned it is 1,000 checkout pages. That's because my website has thousands of agency pages from which you can buy a product, and it reflects that difference on each version of the checkout. So, I was thinking about no-indexing (but continuing to 'follow') these checkout pages, but wondering if it has any knock-on effects I may be unaware of? Any assistance is much appreciated. Luke
Intermediate & Advanced SEO | | Luke_Proctor0 -
How many images should I use in structured data for a product?
We have a basic printing website that offers business cards. Each type of business card has a few product images. Should we use structured data for all the images, or just the main image? What is your opinion about this? Thanks in advance.
Intermediate & Advanced SEO | | Choice0 -
Robots.txt - Do I block Bots from crawling the non-www version if I use www.site.com ?
my site uses is set up at http://www.site.com I have my site redirected from non- www to the www in htacess file. My question is... what should my robots.txt file look like for the non-www site? Do you block robots from crawling the site like this? Or do you leave it blank? User-agent: * Disallow: / Sitemap: http://www.morganlindsayphotography.com/sitemap.xml Sitemap: http://www.morganlindsayphotography.com/video-sitemap.xml
Intermediate & Advanced SEO | | morg454540 -
Description vs meta description
I have an e-commerce website and am trying to create product category pages. I am under the impression that Description is the text that would appear under the title on a google search and I believe the meta description is just what google reads? Is having BOTH important or just description? Is it ok to duplicate the description for the meta description? I know its not good to duplicate descriptions on other products and pages.
Intermediate & Advanced SEO | | nchachula0 -
Canonical tag + HREFLANG vs NOINDEX: Redundant?
Hi, We launched our new site back in Sept 2013 and to control indexation and traffic, etc we only allowed the search engines to index single dimension pages such as just category, brand or collection but never both like category + brand, brand + collection or collection + catergory We are now opening indexing to double faceted page like category + brand and the new tag structure would be: For any other facet we're including a "noindex, follow" meta tag. 1. My question is if we're including a "noindex, follow" tag to select pages do we need to include a canonical or hreflang tag afterall? Should we include it either way for when we want to remove the "noindex"? 2. Is the x-default redundant? Thanks for any input. Cheers WMCA
Intermediate & Advanced SEO | | WMCA0 -
Should comments and feeds be disallowed in robots.txt?
Hi My robots file is currently set up as listed below. From an SEO point of view is it good to disallow feeds, rss and comments? I feel allowing comments would be a good thing because it's new content that may rank in the search engines as the comments left on my blog often refer to questions or companies folks are searching for more information on. And the comments are added regularly. What's your take? I'm also concerned about the /page being blocked. Not sure how that benefits my blog from an SEO point of view as well. Look forward to your feedback. Thanks. Eddy User-agent: Googlebot Crawl-delay: 10 Allow: /* User-agent: * Crawl-delay: 10 Disallow: /wp- Disallow: /feed/ Disallow: /trackback/ Disallow: /rss/ Disallow: /comments/feed/ Disallow: /page/ Disallow: /date/ Disallow: /comments/ # Allow Everything Allow: /*
Intermediate & Advanced SEO | | workathomecareers0 -
How does the use of Dynamic meta tags effect SEO?
I'm evaluating a new client site which was built buy another design firm. My question is they are dynamically creating meta tags and I'm concerned that it is hurting their SEO. When I view the page source this is what I see. <meta name="<a class="attribute-value">keywords</a>" id="<a class="attribute-value">keywordsGoHere</a>" content="" /> <meta name="<a class="attribute-value">description</a>" id="<a class="attribute-value">descriptionGoesHere</a>" content="" /> <title id="<a class="attribute-value">titleGoesHere</a>">title> To me it looks like the tags are not being added to the page, however the title is showing when you view it in a browser and if use a spider view tool, it sees the title. I'm guess it is being called from a DB. So I'm a little concerned though that the search engines are not really seeing the title and description. I'm not worried about the keywords tag. Can anyone shed some light on how this might work? Why it might not being showing the text for the description in the page code and if that will hurt SEO? Thanks for the help!
Intermediate & Advanced SEO | | BbeS0 -
How many time should a keyword be used in the body of text?
We employee an outside agency to write content for our website as we do not have the ability in house to write unique and good quality content. They have just sent an article which is around 300 words. I told them the keyword phrases to use. When I got the document there is only 1 instance of the keyword phrase(s) in it. Now there seems to be a conflict here amongst posts I have read and general SEO advise as to how many times it should be present (SEOmoz indicates 4 times for instance), our outside agency says it doesn't matter. Now if I have a page optimised for 2 keywords this starts making things tricky and probably looks keyword stuffed to the reader. Assuming the keywords are present once in meta tags, H1, meta descriptions and alt text, what do people think is best practice taking into account recent panda updates? Thoughts appreciated. Thanks Craig
Intermediate & Advanced SEO | | Towelsrus0