Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Blocking Pages Via Robots, Can Images On Those Pages Be Included In Image Search
-
Hi!
I have pages within my forum where visitors can upload photos. When they upload photos, they provide a simple statement about the photo but no real information about the image, definitely not enough for the page to be deemed worthy of being indexed. However, our industry leans heavily on images, and having the images in Google Image Search is important to us.
The URL structure is like this: domain.com/community/photos/~username~/picture111111.aspx
I wish to block the whole folder from Googlebot to prevent these low-quality pages from being added to Google's main search results. This would be something like this:
User-agent: googlebot
Disallow: /community/photos/
Can I disallow Googlebot specifically, rather than using User-agent: *, so that Googlebot-Image can still pick up the photos? I plan on configuring a way to add meaningful alt attributes and image names to assist in visibility, but as for the actual act of blocking the pages while still getting the images picked up: is this possible?
Thanks!
Leona
-
Are you seeing the images getting indexed, though? Even if GWT recognizes the robots.txt directives, blocking the pages may essentially keep the images from having any ranking value. Like Matt, I'm not sure this will work in practice.
Another option would be to create an alternate path to just the images, like an HTML sitemap with just links to those images and decent anchor text. The ranking power still wouldn't be great (you'd have a lot of links on this page, most likely), but it would at least kick the crawlers a bit.
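The HTML sitemap idea above could be generated with a short script. This is only a rough Python sketch: `build_image_sitemap`, the page title, and the example image URLs are all hypothetical, since the thread gives the photo page URLs but not the image file URLs.

```python
from html import escape


def build_image_sitemap(images, title="Community photos"):
    """Return an HTML page that links to each image with descriptive anchor text.

    images: list of (url, description) pairs.
    """
    # One list item per image; escape both the URL and the anchor text.
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">{escape(text)}</a></li>'
        for url, text in images
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html>\n<head><title>{escape(title)}</title></head>\n"
        f"<body>\n  <h1>{escape(title)}</h1>\n  <ul>\n{items}\n  </ul>\n</body>\n</html>\n"
    )


# Hypothetical entries: replace with the real image URLs and descriptions.
example_images = [
    ("/community/photos/images/example-1.jpg", "Example photo one"),
    ("/community/photos/images/example-2.jpg", "Example photo two & notes"),
]

page = build_image_sitemap(example_images)
print(page)
```

The descriptions become the anchor text, so the sketch is only as useful as the descriptions fed into it.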
-
Thanks Matt for your time and assistance! Leona
-
Hi Leona - what you have done is along the lines of what I thought would work for you. Sorry if I wasn't clear in my original response: I thought you meant that if you created a robots.txt that disallowed only Googlebot, Googlebot-Image would still pick up the photos. As I said, this wouldn't be the case, as Googlebot-Image will follow what is set out for Googlebot unless you specify otherwise using the Allow directive, as I mentioned. Glad it has worked for you - keep us posted on your results.
-
Hi Matt,
Thanks for your feedback!
It is not my belief that Googlebot overrides Googlebot-Image; otherwise, specifying rules for one of Google's specific bots wouldn't work, correct?
I set up the following:
User-agent: googlebot
Disallow: /community/photos/
User-agent: googlebot-Image
Allow: /community/photos/
I tested the results in Google Webmaster Tools which indicated:
Googlebot: Blocked by line 26: Disallow: /community/photos/ (Detected as a directory; specific files may have different restrictions)
Googlebot-Image: Allowed by line 29: Allow: /community/photos/ (Detected as a directory; specific files may have different restrictions)
Thanks for your help!
Leona
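The test result above is consistent with how Google documents group selection, as far as I understand it: a crawler obeys only the most specific User-agent group that names it, and Googlebot-Image falls back to the Googlebot group only when it has no group of its own. Below is a minimal Python sketch of that selection. It assumes the robots.txt has already been parsed by hand into a dict of groups; `select_group` and the token lists are hypothetical names for illustration, not part of any Google tool.

```python
def select_group(groups, tokens):
    """Return the rules of the first group that matches the crawler's tokens.

    groups: dict mapping a User-agent name to its list of rule lines.
    tokens: the crawler's user-agent tokens, most specific first
            (assumption: Googlebot-Image also answers to Googlebot).
    """
    # User-agent names are compared case-insensitively.
    lowered = {name.lower(): rules for name, rules in groups.items()}
    for token in tokens:
        if token.lower() in lowered:
            return lowered[token.lower()]
    # No named group applies, so fall back to the wildcard group if present.
    return lowered.get("*", [])


# The robots.txt from this post, parsed by hand into groups.
both_groups = {
    "googlebot": ["Disallow: /community/photos/"],
    "googlebot-Image": ["Allow: /community/photos/"],
}
# The same file without the image-specific group.
googlebot_only = {"googlebot": ["Disallow: /community/photos/"]}

IMAGE_TOKENS = ["Googlebot-Image", "Googlebot"]
WEB_TOKENS = ["Googlebot"]

print(select_group(both_groups, WEB_TOKENS))
print(select_group(both_groups, IMAGE_TOKENS))
print(select_group(googlebot_only, IMAGE_TOKENS))
```

In this model the image crawler only inherits the Disallow when the image-specific group is missing, which is the case both sides of the thread are describing.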
-
Hi Leona
Googlebot-Image and the other bots that Google uses follow the rules set out for Googlebot, so blocking Googlebot would block your images too, as it overrides Googlebot-Image. I don't think there is a way around this using the Disallow directive alone: you are blocking the directory that contains your images, so the images in it won't be indexed.
Something you may want to consider is the Allow directive:
Disallow: /community/photos/
Allow: /community/photos/~username~/
That is, if Google is already indexing images under the username path.
The Allow directive will only be successful if its path contains at least as many characters as the Disallow path, so bear in mind that if you had the following:
Disallow: /community/photos/
Allow: /community/photos/
the Allow will win out and nothing will be blocked. Please note that I haven't used the Allow directive myself, but I looked into it in depth when I studied robots.txt for my own sites. It would be good to hear from someone who has experience with this directive. Hope this helps.
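The length rule described above can be sketched in a few lines of Python. This is a simplified model, not Google's implementation: it assumes plain path prefixes (no * or $ wildcards) and that Allow wins when an Allow and a Disallow of equal length both match. `is_allowed` and the `/other/` path are hypothetical, used only for illustration.

```python
def is_allowed(rules, path):
    """Decide whether `path` may be crawled under one group's rules.

    rules: list of (directive, prefix) pairs, e.g. ("Disallow", "/photos/").
    The longest matching prefix wins; on a tie, Allow beats Disallow.
    """
    best_len = -1
    allowed = True  # a path that matches no rule is crawlable
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty or non-matching rules are ignored
        is_allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# The first example: the longer Allow path carves out the username folder.
carve_out = [
    ("Disallow", "/community/photos/"),
    ("Allow", "/community/photos/~username~/"),
]
# The second example: identical paths, so the Allow wins the tie.
tie = [
    ("Disallow", "/community/photos/"),
    ("Allow", "/community/photos/"),
]

photo = "/community/photos/~username~/picture111111.aspx"
print(is_allowed(carve_out, photo))
print(is_allowed(carve_out, "/community/photos/other/picture.aspx"))
print(is_allowed(tie, photo))
```

Under these assumptions the first and third checks come out as allowed and the second as blocked, which matches the behaviour described in the post.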