Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Is 404'ing a page enough to remove it from Google's index?
-
We set some pages to 404 status about 7 months ago, but they are still showing in Google's index (as 404's). Is there anything else I need to do to remove these?
-
Nice information John. I hadn't thought of adding a temporary page with a noindex tag but that sounds like a way to go for faster results.
I know Google has automatically removed 404 pages in the past. I noticed the issue Michelle is sharing and the information you shared offers great details on the process.
-
Setting pages to 404 should be enough to remove them after Google indexes your page enough times. Google has to be careful about this, because when many sites crash or have site maintenance, they return 404 instead of 503, so Google wouldn't want to remove pages from their index until they're sure the page is gone.
Google talks about removing pages from there index here. The Google Webmaster Tools URL removal tool is only intended for pages that urgently need to be removed, so I wouldn't recommend that. Google recommends:
- If the page no longer exists, make sure that the server returns a 404 (Not Found) or 410 (Gone) HTTP status code. This will tell Google that the page is gone and that it should no longer appear in search results.
- If the page still exists but you don't want it to appear in search results, use robots.txt to prevent Google from crawling it. Note that in general, even if a URL is disallowed by robots.txt we may still index the page if we find its URL on another site. However, Google won't index the page if it's blocked in robots.txt and there's an active removal request for the page.
- Alternatively, you can use a noindex meta tag. When we see this tag on a page, Google will completely drop the page from our search results, even if other pages link to it. This is a good solution if you don't have direct access to the site server. (You will need to be able to edit the HTML source of the page).
Is there a reason you are 404'ing these pages rather than redirecting them? If these pages have new pages with similar content, you should do a 301 redirect to keep the link juice flowing and to take advantage of these pages being linked to. If you do continue returning 404 for these pages (or even if you don't...), make sure your 404 page is a useful one, that helps users find the page they're looking for (Google help article).
Also, Ryan, I'd be interested in hearing the results of using the 410 status code. I would imagine that status code would do the trick! I'm surprised I haven't read about this more, or why it's not mentioned in the help file linked to above.
-
I have experienced this same issue with Google.
I just began a test by making a change on my site to one of the URLs. I am bookmarking this Q&A and will try to remember to update it if I see a change. It can take Google some time to check any individual link so it could take weeks.
In case you are curious, I have added a 410 status code for one of the pages involved. 410 means the resource is gone, while 404 is simply not found. Perhaps the 410 header code will send the right message to Google.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Indexing Of Pages As HTTPS vs HTTP
We recently updated our site to be mobile optimized. As part of the update, we had also planned on adding SSL security to the site. However, we use an iframe on a lot of our site pages from a third party vendor for real estate listings and that iframe was not SSL friendly and the vendor does not have that solution yet. So, those iframes weren't displaying the content. As a result, we had to shift gears and go back to just being http and not the new https that we were hoping for. However, google seems to have indexed a lot of our pages as https and gives a security error to any visitors. The new site was launched about a week ago and there was code in the htaccess file that was pushing to www and https. I have fixed the htaccess file to no longer have https. My questions is will google "reindex" the site once it recognizes the new htaccess commands in the next couple weeks?
Intermediate & Advanced SEO | | vikasnwu1 -
Magento: Should we disable old URL's or delete the page altogether
Our developer tells us that we have a lot of 404 pages that are being included in our sitemap and the reason for this is because we have put 301 redirects on the old pages to new pages. We're using Magento and our current process is to simply disable, which then makes it a a 404. We then redirect this page using a 301 redirect to a new relevant page. The reason for redirecting these pages is because the old pages are still being indexed in Google. I understand 404 pages will eventually drop out of Google's index, but was wondering if we were somehow preventing them dropping out of the index by redirecting the URL's, causing the 404 pages to be added to the sitemap. My questions are: 1. Could we simply delete the entire unwanted page, so that it returns a 404 and drops out of Google's index altogether? 2. Because the 404 pages are in the sitemap, does this mean they will continue to be indexed by Google?
Intermediate & Advanced SEO | | andyheath0 -
My site shows 503 error to Google bot, but can see the site fine. Not indexing in Google. Help
Hi, This site is not indexed on Google at all. http://www.thethreehorseshoespub.co.uk Looking into it, it seems to be giving a 503 error to the google bot. I can see the site I have checked source code Checked robots Did have a sitemap param. but removed it for testing GWMT is showing 'unreachable' if I submit a site map or fetch Any ideas on how to remove this error? Many thanks in advance
Intermediate & Advanced SEO | | SolveWebMedia0 -
After reading of Google's so called "over-optimization" penalty, is there a penalty for changing title tags too frequently?
In other words, does title tag change frequency hurt SEO ? After changing my title tags, I have noticed a steep decline in impressions, but an increase in CTR and rankings. I'd like to once again change the title tags to try and regain impressions. Is there any penalty for changing title tags too often? From SEO forums online, there seems to be a bit of confusion on this subject...
Intermediate & Advanced SEO | | Felix_LLC0 -
What's the best way to redirect categories & paginated pages on a blog?
I'm currently re-doing my blog and have a few categories that I'm getting rid of for housecleaning purposes and crawl efficiency. Each of these categories has many pages (some have hundreds). The new blog will also not have new relevant categories to redirect them to (1 or 2 may work). So what is the best place to properly redirect these pages to? And how do I handle the paginated URLs? The only logical place I can think of would be to redirect them to the homepage of the blog, but since there are so many pages, I don't know if that's the best idea. Does anybody have any thoughts?
Intermediate & Advanced SEO | | kking41200 -
How to find all indexed pages in Google?
Hi, We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools. How can I get a list of all pages indexed of our domain? trying to locate the duplicate content. Doing a "site:www.mydomain.com" only returns up to 676 results... Any ideas? Thanks, Ben
Intermediate & Advanced SEO | | bjs20100 -
Soft 404's from pages blocked by robots.txt -- cause for concern?
We're seeing soft 404 errors appear in our google webmaster tools section on pages that are blocked by robots.txt (our search result pages). Should we be concerned? Is there anything we can do about this?
Intermediate & Advanced SEO | | nicole.healthline4 -
Tool to calculate the number of pages in Google's index?
When working with a very large site, are there any tools that will help you calculate the number of links in the Google index? I know you can use site:www.domain.com to see all the links indexed for a particular url. But what if you want to see the number of pages indexed for 100 different subdirectories (i.e. www.domain.com/a, www.domain.com/b)? is there a tool to help automate the process of finding the number of pages from each subdirectory in Google's index?
Intermediate & Advanced SEO | | nicole.healthline0