Handle duplicate content indexing for SEO with the rel="canonical" attribute

If you care about SEO and duplicate content indexing (the same page indexed under several URLs, which hurts its page rank), there is good news: Google, Yahoo and Microsoft now all support the canonical relation on link elements.

Background

It is outlined in more detail in Google’s blog post Specify your canonical, but the gist is that a single page of yours could be indexed under many different URLs. For example:

http://www.robertnyman.com/2008/05/27/the-ultimate-getelementsbyclassname-anno-2008/

https://robertnyman.com/2008/05/27/the-ultimate-getelementsbyclassname-anno-2008/ (Note: www omitted)

http://www.robertnyman.com/?p=761 (Note: WordPress nowadays redirects this sort of URL)

What you want is to have just one URL per page indexed by the various search engines, to help it get a higher ranking, and to publicly promote only that URL.
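
To make the idea concrete, here is a minimal Python sketch of collapsing host and scheme variants into one preferred form. The function name to_preferred and the constants are my own illustrative choices, not something from Google's post, and the sketch cannot resolve a ?p=761 style URL to its permalink, since that needs a lookup in the blog's database.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration: the preferred scheme and host, and the
# set of hosts that are known to serve the same content.
PREFERRED_SCHEME = "http"
PREFERRED_HOST = "www.robertnyman.com"
KNOWN_HOSTS = {"www.robertnyman.com", "robertnyman.com"}


def to_preferred(url: str) -> str:
    """Rewrite scheme and host to the preferred ones.

    URLs on unknown hosts are returned unchanged. Any explicit port is
    dropped, which is fine for this sketch but may not suit every site.
    """
    parts = urlsplit(url)
    if parts.hostname not in KNOWN_HOSTS:
        return url
    return urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
    )
```

With this, the second URL in the list above maps onto the first one, while the ?p=761 variant only gets its host normalized.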

The rel="canonical"

The three major search engine players have announced that they now support a new rel attribute value on link elements, indicating the preferred URL for the currently viewed content. Therefore, if I preferred the first of the URLs listed above for my post, I could add a new element to the head of that page, looking like this:

<link rel="canonical"
	href="http://www.robertnyman.com/2008/05/27/the-ultimate-getelementsbyclassname-anno-2008/">
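
To check that a page really carries the element, something like the following Python sketch could read it back out. It only uses the standard library's html.parser; the names CanonicalFinder and find_canonical are hypothetical, and this is of course not how any search engine actually implements it.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only link elements matter, and only the first canonical one.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonical = attr["href"]


def find_canonical(html: str):
    """Return the canonical URL declared in the markup, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Feeding it the markup above should give back the preferred URL, and a page without the element gives None.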

What’s good

Naturally, anyone working with, or stumbling across, an SEO strategy will appreciate a way to actively promote the one URL they want for a page, instead of having people link to the same content through different URLs.

It’s also good that all three of Google, Yahoo and Microsoft support this, because it adds a wider meaning to it.

Another upside, though perhaps a given, is Google’s answer that rel="canonical" is treated as a strongly honored hint rather than an actual directive. If it had been a directive, I’m sure it would have been abused almost instantly.

What’s maybe not so good

An interesting reply from Google is that rel="canonical" can point to a redirecting URL (but only on the same domain). I’m not sure about this, but it sounds like a possible opening for redirecting URLs to something completely different, boosting content that maybe doesn’t deserve as much attention. Sure, all search engines must validate this, but I’m sure there will be arguments about old content having been updated, extremely extensive algorithms trying to take such factors into account, etc.

Maybe I’m completely wrong here, but that, in conjunction with people who will definitely try their hardest to misuse this, gives me an ominous feeling.

Another interesting aspect is mentioned by Anne in rel="canonical": this was implemented outside of any standardization process or registry. But hey, if you’re big enough, you do something and others have to follow, right? Sometimes such behavior results in great things, and other times in not so wonderful consequences…

Adding this to your web site

If you use WordPress, which a lot of people seem to do, or Drupal, Joost de Valk has released solutions for both. Please read more in Canonical URL links if this sounds interesting (Disclaimer: I have not tested these myself).

14 Comments

  • Ricky says:

    Wow, this is great news. It opens the door to a lot of solutions and I'm glad to see that the attribute is being supported by all the big players.

  • Matt Robin says:

    I'm intrigued by this, Robert, as it seems to make good sense (from an SEO side of things). Would this be a good idea for main navigational links on a site?

  • Robert Nyman says:

    Ricky,

    Yes, it's great that they, for once, have implemented the same thing.

    Matt,

    For each navigation link's result page, sure. Just to make sure the main parts of the web site are covered.

  • Gerben says:

    I think that in most cases a redirect (301) is much more useful. Most people linking to your page copy the URL from the address bar. If all variations of URLs describing the same page are redirected to your preferred version, people can only copy your preferred URL.

    It might be useful for people that don't have access to the HTTP redirect, because of server limitations for example, but those people probably don't even know what SEO is.

    The other use might be for a forum to strip down the query string to its minimum. One might link to a forum post with an added query parameter to e.g. highlight all instances of a keyword in the text.

    e.g. example.com/forum?thread=481&highlight=apache

    Redirecting this URL to example.com/forum?thread=481 will result in lost functionality. So here one could use <link rel="canonical" href="http://example.com/forum?thread=481" />

    In this case I can see some added value, but I think that for most sites there's no need for this.

  • Robert Nyman says:

    Gerben,

    I agree, it should not really be used as a redirect replacement. In my eyes, it rather seems like a good way to complement it in some cases.

    Also, good point about address bar URL copying!

  • Dan says:

    This is indeed interesting for full page duplicate content.

    However, an issue I face is that I have a section of mandatory duplicate content at the bottom of every page of the site (think long, lengthy medical disclaimer text). It was suggested to us by an SEO consultant to make the home page have this as text, but have all other pages use an image of the text. This way the text version is still spidered, but the image version will not be, and won't dilute the rankings. This is horrible to maintain, and just leaves a nasty, filmy taste in my mouth (seriously, text in an image?). I wonder if this canonical method could be applied somehow. Suggestions are welcomed!

  • Robert Nyman says:

    Dan,

    Interesting question!

    To me, the image with text sounds like a terrible idea. Sure, maybe it won't get indexed, but I'm fairly sure Google would see a disclaimer text as what it is, and not duplicate content.

    In my world, at least, I would have the disclaimer text in every page, just as it's intended to be.

  • oliver says:

    A neat little Firefox extension has been written (not by me, I might add, but by a friend) which displays the canonical URL in the toolbar.

    The description and download link are here:

    Canonical uri firefox extension

  • Robert Nyman says:

    oliver,

    Cool, thanks!

  • I used to publish my articles, but now I wonder whether I should stop because of the risk of a duplicate content penalty. Should I stop publishing my articles on article directories?
