What is a canonical?

A canonical link is an HTML element used to distinguish the “original” page from derivative pages carrying the same content. It prevents duplicate content issues on a site by telling search engines which page they should index.

 

Figure: canonical link diagram

 

How are canonical links used?

How a canonical is used depends on the site and the types of content it contains. Here are the six common instances where canonicals should be used:

  1. Self-Referring Canonicals
  2. Duplicate Pages
  3. View All Pages
  4. Faceted Navigation
  5. Non-HTML Content
  6. Cross-Domain Syndication

Best Practices

  • Self-referring vs canonicalization
  • Use one canonical per page
    • Plugins can sometimes conflict with hard-coded canonicals, duplicating them
  • Use absolute URLs, not relative URLs
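
The absolute URL practice above can be checked mechanically. The following is a minimal sketch in Python, assuming that "absolute" means the href carries both an http or https scheme and a host. The function name and the sample hrefs are hypothetical and only illustrate the check.

```python
from urllib.parse import urlparse


def is_absolute_url(href: str) -> bool:
    """Return True when href carries both an http(s) scheme and a host."""
    parts = urlparse(href)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical hrefs, used only to illustrate the check.
print(is_absolute_url("http://www.example.com/services"))  # True
print(is_absolute_url("/services"))                        # False
print(is_absolute_url("//www.example.com/services"))       # False (no scheme)
```

A check like this could run over every canonical href found during a site crawl to flag relative values before search engines see them.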

Self-Referring

This type of canonical points to the page itself. It acts as a confidence indicator, confirming that the page the search engine has found is indeed the page that should be indexed. It is particularly useful when redirecting pages to a new location: search engines follow the 301 redirect and use the self-referring canonical to confirm that the page they have arrived on is the new page that should be indexed.

Example of a self-referring canonical:

URL
http://www.example.com/breakdancing-grizzly-bear

Canonical
<link rel="canonical" href="http://www.example.com/breakdancing-grizzly-bear" />
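
As a rough illustration, a template or build script could generate the self-referring tag from the page's own URL. This Python sketch assumes the caller already knows the page's preferred absolute URL. The helper name canonical_tag is hypothetical.

```python
from html import escape


def canonical_tag(url: str) -> str:
    """Build a canonical <link> element pointing at url."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


# A canonical is self-referring when the page passes in its own URL.
page_url = "http://www.example.com/breakdancing-grizzly-bear"
print(canonical_tag(page_url))
```

Passing a different page's URL to the same helper would produce an ordinary (non-self-referring) canonical.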

Duplicate Content

In its most basic form, duplicate content means that two or more URLs have the same content. Normally this is not done on purpose; rather, the Content Management System (CMS) generates multiple URLs that render the same content.

An important thing to remember about duplicate content is that if a URL can be modified and the site still renders the content on the original URL, then you have a potential duplicate content issue.

Common ways modifying a URL can produce duplicate content:

http vs https

These would technically be considered duplicates

http://www.example.com/services
https://www.example.com/services

www vs non-www. This happens when a CMS does not force the domain to use either www or non-www. The www in a URL is technically a subdomain, so a site that renders the same content with and without www is serving that content from two different hostnames.

These would technically be considered duplicates

http://example.com/services
http://www.example.com/services

Capitalization. If you can modify a URL by capitalizing one or more of its characters and the content still renders, that is considered duplicate content. It would be rare to see this type of duplicate being indexed by search engines, but it can affect the way a page accumulates authority. If another site links to a piece of content using capitalization, authority will be passed to that URL instead of to the lowercase version.

These would technically be considered duplicates

http://www.example.com/services
http://www.example.com/seRvices

Development Sites. When a site is undergoing a redesign, a development site is typically set up to test the new site in a live environment. If the developers fail to add a noindex tag to its pages, there is potential for duplicate content issues. Development sites are usually hosted on a subdomain or a separate domain. In either case, developers should include a noindex tag and block all search engines from crawling that content.

These would technically be considered duplicates

http://www.example.com/
http://dev.example.com/
http://www.development-domain.com/
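
The http/https, www, and capitalization variants above can all be collapsed into one preferred URL, which is the value a canonical (or a redirect) would point to. The sketch below is in Python and assumes a site policy that is not stated in this article: https, a www host, and lowercase paths. It is only meant for URLs on the main site host. Development hosts need the noindex treatment instead, and lowercasing paths is only safe when the site treats paths as case-insensitive.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url: str) -> str:
    """Collapse scheme, www, and capitalization variants into one form.

    Assumed policy (hypothetical): https, www host, lowercase path.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host
    return urlunsplit(("https", host, parts.path.lower(), parts.query, ""))


variants = [
    "http://www.example.com/services",
    "https://www.example.com/services",
    "http://example.com/services",
    "http://www.example.com/seRvices",
]
# Every variant maps to the same preferred URL.
print({preferred_url(v) for v in variants})
```

If the set printed at the end contains more than one URL, the site still has variants that are not being consolidated.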

View All Pages

Duplicate content can be created when a website has a single view all page alongside individual pages that each contain a piece of the content from the view all page. This is common with publishers who produce list-type content, where the view all page has all ten items on one page but each item is also broken out onto its own page.

Figure: view all canonical diagram

The problem with this type of content is that it often competes with itself in organic rankings. To prevent this, the site should add a canonical from breakout pages to the view all page. This eliminates duplicate content issues and consolidates link metrics, making the view all page the one page that will be indexed and ranked.

Examples of a view all page canonical: 

View All URL 
http://www.example.com/top-5-bill-murray-movies 

Individual Pages 
URL: http://www.example.com/top-5-bill-murray-movies/groundhog-day 
Canonical: http://www.example.com/top-5-bill-murray-movies 

URL: http://www.example.com/top-5-bill-murray-movies/ghostbusters 
Canonical: http://www.example.com/top-5-bill-murray-movies 

URL: http://www.example.com/top-5-bill-murray-movies/lost-in-translation 
Canonical: http://www.example.com/top-5-bill-murray-movies 

URL: http://www.example.com/top-5-bill-murray-movies/caddyshack 
Canonical: http://www.example.com/top-5-bill-murray-movies 

URL: http://www.example.com/top-5-bill-murray-movies/scrooged 
Canonical: http://www.example.com/top-5-bill-murray-movies
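
The mapping in the examples above follows a simple pattern: each breakout page canonicalizes to its parent path. Here is a minimal Python sketch of that pattern. It assumes breakout pages always sit exactly one path segment below the view all page, as they do in this example, and the function name is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def view_all_canonical(breakout_url: str) -> str:
    """Return the view all URL by dropping the last path segment."""
    parts = urlsplit(breakout_url)
    parent_path = parts.path.rstrip("/").rsplit("/", 1)[0]
    return urlunsplit((parts.scheme, parts.netloc, parent_path, "", ""))


breakout = "http://www.example.com/top-5-bill-murray-movies/groundhog-day"
print(view_all_canonical(breakout))
```

A site with a different URL structure would need an explicit lookup from breakout page to view all page rather than this path-based shortcut.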

What if a site has multiple canonicals?

Another issue is when pages include multiple rel=canonical links to different URLs. This happens frequently in conjunction with SEO plugins that often insert a default rel=canonical link, possibly unbeknownst to the webmaster who installed the plugin. In cases of multiple declarations of rel=canonical, Google will likely ignore all the rel=canonical hints. Any benefit that a legitimate rel=canonical might have offered will be lost.

Source: 5 Common Mistakes with rel=canonical by Google Webmasters Blog
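
One way to catch this problem is to collect every rel=canonical link in a page's markup and flag pages that declare more than one distinct URL. The Python sketch below uses only the standard library. The class and function names are hypothetical, and the sample markup is invented to mimic a hard-coded canonical plus one added by a plugin.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel:
            self.hrefs.append(attr.get("href"))


def find_canonicals(html: str) -> list:
    """Return all canonical hrefs in the order they appear."""
    collector = CanonicalCollector()
    collector.feed(html)
    return collector.hrefs


# Hypothetical page: one hard-coded canonical, one inserted by a plugin.
sample = """
<head>
  <link rel="canonical" href="http://www.example.com/services" />
  <link rel="canonical" href="http://www.example.com/services/" />
</head>
"""
hrefs = find_canonicals(sample)
if len(set(hrefs)) > 1:
    print("Conflicting canonicals:", hrefs)
```

The check compares distinct href values, so two identical tags would not be reported as a conflict, although they still break the one canonical per page practice.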

Faceted navigation

Infinite scroll

Non-HTML Content

Cross-domain Duplicate Content

Sometimes content is cross-published on multiple sites that are owned by the same company. This is still duplicate content, and each copy can compete for rankings. To ensure the correct domain ranks for an article or piece of content, a cross-domain canonical pointing to the preferred version can be added to each of the other copies.

Resources:

Cross-domain URL selection – Search Console Help

Handling Legitimate Cross-Domain Canonicals – Google Webmaster Blog

Cross-Domain Canonical The New 301? – Whiteboard Friday – Moz

Does Google support cross-domain rel=”canonical”? – Google Webmaster on YouTube

How is it implemented?

There are two ways to implement a canonical link. The first, and most common, is by adding a <link> HTML tag to the <head> of a page.
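
To illustrate the <link> approach, the Python sketch below inserts a canonical tag just before the closing </head> of an HTML string and leaves the page untouched if it already declares one. The function name and sample page are hypothetical, and the regular expressions are a deliberate simplification. A production tool would more likely use a real HTML parser.

```python
import re
from html import escape


def add_canonical(html: str, url: str) -> str:
    """Insert a canonical <link> before </head> unless one already exists."""
    if re.search(r'<link[^>]*rel=["\']canonical["\']', html, re.IGNORECASE):
        return html  # keep the page at one canonical
    tag = f'<link rel="canonical" href="{escape(url, quote=True)}" />'
    # Replace only the first </head>; the lambda avoids backslash handling
    # in the replacement string.
    return re.sub(r"</head>", lambda m: tag + m.group(0), html,
                  count=1, flags=re.IGNORECASE)


page = "<html><head><title>Services</title></head><body></body></html>"
print(add_canonical(page, "http://www.example.com/services"))
```

Skipping pages that already have a canonical keeps this step from creating the multiple-canonical problem described earlier.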

Additional Resources

Ecommerce SEO: Product Variation, Colors, and Sizes – Merkle

rel=canonical: the ultimate guide – Yoast
