Alexander Lushin
Nov 26, 2020 | 14 min read

When someone speaks of setting up canonical tags, it essentially means adding a <link> tag to the <head> section of a page, with the rel attribute set to canonical and the href attribute set to the URL of the main version of the page.

The source code looks like this:

<link rel="canonical" href="https://seranking.com/" />
Canonical sample on SE Ranking webpage

Hence, canonical is not a tag but the value of the rel attribute, and its purpose is to make it clear to the search engines which version of the page they should rank.
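
To see which canonical a page declares, you can read the rel and href attributes of its link tags. Below is a minimal sketch that uses only the Python standard library; the names CanonicalFinder and find_canonicals are our own and are used purely for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link> whose rel attribute contains "canonical"."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values, so split before comparing
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return all canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


sample = '<html><head><link rel="canonical" href="https://seranking.com/" /></head></html>'
print(find_canonicals(sample))
```

The function returns a list rather than a single value, so a page that declares more than one canonical by mistake is easy to spot.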

What should you use canonical for?

Canonical is used for some URLs when the site has a main version of the document along with other, additional documents with similar content. What canonical does is point the search engines to the main version of the page.

Pay special attention to the word “similar”—we will come back to this definition later.

Here’s what Google Help has to say about canonical:

If you have a single page accessible by multiple URLs, or different pages with similar content (for example, a page with both a mobile and a desktop version), Google sees these as duplicate versions of the same page. Google will choose one URL as the canonical version and crawl that, and all other URLs will be considered duplicate URLs and crawled less often. 

If you don’t explicitly tell Google which URL is canonical, Google will make the choice for you, or might consider them both of equal weight, which might lead to unwanted behavior.

How can canonical URLs be specified?

There are several ways to specify the main version of a page. All of them are described in more detail in Google Help. The most common way is to use the <link> tag, and we will use it in the examples. The full list of options is:

  • The <link> tag with the rel="canonical" attribute
  • The rel="canonical" HTTP header
  • A Sitemap file
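
The HTTP header option is sent as a Link response header, for example Link: <https://site.com/file.pdf>; rel="canonical", which is handy for non-HTML documents. Here is a simplified sketch of reading such a header value. The function name and the regular expression are our own assumptions, and the parser does not cover every case of the Link header syntax (for instance, commas inside quoted parameter values).

```python
import re

# Matches one link value such as: <https://site.com/file.pdf>; rel="canonical"
LINK_VALUE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')


def canonical_from_link_header(header_value):
    """Return the URL marked rel="canonical" in a Link header value, or None."""
    for match in LINK_VALUE.finditer(header_value):
        url, params = match.group(1), match.group(2)
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            rel_values = value.strip().strip('"').lower().split()
            if name.strip().lower() == "rel" and "canonical" in rel_values:
                return url
    return None


print(canonical_from_link_header('<https://site.com/file.pdf>; rel="canonical"'))
```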

As an alternative to canonical, you can also use 301 redirects, but keep in mind that redirects work entirely differently: they make only one version of the page available to search engines and users. Therefore, choose the method according to the results you expect.

Best practices of using Canonical

Canonical is normally used to avoid similar or duplicate content appearing in search results. We will discuss below why the content can be duplicated.

Important! Using <link rel="canonical" href="" /> does not prevent search bots from crawling and indexing the document. Canonical is a recommendation and may be ignored by the search engine. Canonical indicates which version of the document should appear in the search results and which, in your opinion, is the main one.

If you want to block indexation, use the following:

  • <meta name="robots" content="noindex" />
  • X-Robots-Tag: noindex HTTP header
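
Both directives can be checked the same way once you have the content value of the robots meta tag or the value of the X-Robots-Tag header. The sketch below is only an illustration: has_noindex is a hypothetical helper, and it ignores X-Robots-Tag values that are prefixed with a specific user agent.

```python
def has_noindex(meta_robots=None, x_robots_tag=None):
    """Return True if either value contains a directive that blocks indexing."""
    for value in (meta_robots, x_robots_tag):
        if not value:
            continue
        directives = [d.strip().lower() for d in value.split(",")]
        # "none" is shorthand for "noindex, nofollow"
        if "noindex" in directives or "none" in directives:
            return True
    return False


print(has_noindex(meta_robots="noindex, follow"))
print(has_noindex(meta_robots="index, follow"))
```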

Read more in Google Help.

Why it’s considered good to use Canonical

The use of canonical has been recognised as a helpful practice. It is implemented to prevent potential duplicate content issues, even if there’s no hint of duplicate or similar content.

It’s considered good practice to put canonical for each major page version, whether they have duplicate or similar content or not, and point it to that very same page.

Technically speaking, the page https://seranking.com/subscription.html has this canonical:

<link rel="canonical" href="https://seranking.com/subscription.html" /> 

Here, the value of the href attribute of the link tag contains the URL of the page where this link tag is located.

This self-referencing canonical helps you avoid possible problems: if someone adds parameters to the page URL, the resulting versions still point to the main page, which makes it less likely that pages with additional parameters get indexed.
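
In a template, a self-referencing canonical is usually built from the requested URL with the query string and fragment removed. The following sketch assumes that no parameter on the site changes the content in a way that deserves its own indexed page; self_canonical_tag is a hypothetical helper name.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def self_canonical_tag(requested_url):
    """Build a canonical link tag that points to the URL without query or fragment."""
    parts = urlsplit(requested_url)
    clean_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # Escape the URL so that it is safe inside an HTML attribute
    return f'<link rel="canonical" href="{escape(clean_url, quote=True)}" />'


print(self_canonical_tag("https://seranking.com/subscription.html?ref=test"))
```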

Sorting options

The classic use of canonical is to specify the main document when using filtering, sorting, and other actions that result in a URL change.

Let’s look at the laptop category on eBay as an example:

https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031.

This is the main category that contains laptops for work and is optimized for this keyword cluster. Canonical for this page looks like this:

<link rel="canonical" href="https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031" /> 

The page is pointing to itself.

The page contains classic navigational elements:

  • sorting by parameters
  • changing the view type
Sorting by parameters option

Let’s change the view type. Products are now shown in a column:

Page View Type

But what matters the most for SEO specialists is the changed URL. Now it looks like this: https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031?rt=nc&_dmd=1.

Notice the parameters that appear at the end of the URL. Sorting options or other actions are implemented using these parameters. An infinite number of such pages can be generated depending on the possible sorting options. And from a search engine’s point of view, each variation with a new parameter is a separate URL.

If such pages end up in the index, pages with the same or very similar content—and most likely with the same <title>—will compete with each other. This will lead to keyword cannibalization and lower rankings.

It is to prevent such problems that you need to use canonical—it allows you to indicate the main version of the document that you want to see in the SERP. In the example, the sorting page has the following canonical:

<link rel="canonical" href="https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031" /> 

That is, the page points to the main version of the document without parameters.
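
If you have a list of crawled URLs, a quick way to find pages that need this kind of canonical is to group them by the URL without parameters. This is a rough sketch under the assumption that everything after the question mark is a sorting or view option; group_parameter_variants is a name we made up for the example.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_parameter_variants(urls):
    """Group URLs by scheme, host, and path; keep only groups with several variants."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        base = f"{parts.scheme}://{parts.netloc}{parts.path}"
        groups[base].append(url)
    return {base: variants for base, variants in groups.items() if len(variants) > 1}


crawled = [
    "https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031",
    "https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031?rt=nc&_dmd=1",
    "https://www.ebay.com/b/Laptops-Netbooks/175672/bn_1648276?_dmd=1",
]
for base, variants in group_parameter_variants(crawled).items():
    print(base, len(variants))
```

Each group that comes back is a candidate for one shared canonical pointing to the base URL.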

Unoptimized filtering

When a variety of filtering parameters are applied, an online store can generate many pages that are not optimized for any keyword cluster.

When we talk about optimization for a keyword cluster, we mean that the document has:

  • a <title> optimized for specific search queries, different from the <title> of the page where the filter was applied;
  • a unique and optimized H1 header;
  • products filtered according to a specific keyword cluster.

It’s worth noting that the HTML5 standard allows for any level of heading. By referring to H1, we are considering a classic situation in the context of the HTML4 standard.

The main thing you need to understand is that one landing page equals one user intent. For example, the Laptops & Netbooks category https://www.ebay.com/b/Laptops-Netbooks/175672/bn_1648276?_dmd=1 has filters that create separate pages for different user needs.

By selecting the Workstation filter, we’ll see a separate page https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031, which we’ve examined earlier.

But let’s get back to the situation when the filters aren’t or can’t be optimized. For example, we want to view products by two specific brands. Obviously, there is no point in optimizing the page for such requests, since each brand must have its own page. This is when filtering results should be “hidden” by canonical.

Important! The use of canonical must be analyzed for each individual case taking into account your particular needs.

Duplicate items

In some popular CMSs, for example Shopify, a product URL may contain the full path to the category in which the product is located. If you add a product to several categories, it gets duplicated across several URLs.

For example:

https://site.com/phone/iphone12/
https://site.com/phone/apple/iphone12/
https://site.com/iphone12/ 

The third URL is the preferred one, and the first two should point to it as the main, canonical document.

Important! It’s recommended to avoid such situations and link categories to the main versions of documents without using canonical.

UTM tags and tracking parameters

Parameters in the URL can be used to collect certain information, but at the same time can create pages with duplicate content. For example, a URL like https://site.com/page/  may have a version with parameters like https://site.com/page/?fbclid=IwAR3cnDV4ERw24pQNVLTFlwKzchPDA1

A link like this is generated when a visitor comes to the page from Facebook. Here, canonical would be a great solution.
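
The canonical URL for such a page can be derived by dropping only the tracking parameters and keeping the rest. The sketch below treats fbclid and anything starting with utm_ as tracking parameters; that list is our assumption and should be adjusted to the parameters your site actually receives.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters; extend it for your own site
TRACKING_PARAMS = {"fbclid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url):
    """Return the URL without tracking parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(strip_tracking("https://site.com/page/?fbclid=IwAR3cnDV4ERw24pQNVLTFlwKzchPDA1"))
```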

Specifying the main site version

A site that is accessible via HTTP and HTTPS protocols at the same time is seen by a search engine as two different sites, similarly to sites with and without www.

The variations below are 4 different sites:

https://site.com/
http://site.com/
https://www.site.com/
http://www.site.com/

You can use canonical to specify the main website version.

Canonical to main website version

For example, if the main version is https://site.com/, then the rest must contain <link rel="canonical" href="https://site.com/" />.
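
The same rule applies to every page of the site, not only to the home page: each variation points to the same path on the main version. Here is a small sketch of that mapping, assuming the main version is https://site.com/ and that www.site.com is the only other host; the constant and function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

MAIN_SCHEME = "https"                        # assumed main protocol
MAIN_HOST = "site.com"                       # assumed main host
KNOWN_HOSTS = {"site.com", "www.site.com"}   # hosts that serve the same site


def main_version_url(url):
    """Map any protocol or www variation of a URL to the main site version."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in KNOWN_HOSTS:
        raise ValueError(f"{url} does not belong to this site")
    path = parts.path or "/"
    return urlunsplit((MAIN_SCHEME, MAIN_HOST, path, parts.query, ""))


for variant in ("http://site.com/", "https://www.site.com/", "http://www.site.com"):
    print(variant, "->", main_version_url(variant))
```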

To choose which version you want to prioritize, use the site: search operator in Google, for example site:site.com. It will help you determine which version of the site has been indexed by Google and understand which pages are more present in the search.

When choosing between the HTTP and HTTPS protocols, you should definitely opt for the latter. To learn why HTTPS is preferable and how to move your site to HTTPS without losing rankings and traffic, read this guide. If you are deciding between www and non-www, see this article for details.

It’s easier to manage the website if you use 301 redirects to specify its main version.

Canonicalization of cross-domain duplicates

If duplicate pages belong to different domains and you control both, you can choose the main, canonical version of the page on a different domain.

Common mistakes with canonical

Canonicalizing different types of pages

Let’s go back to how search engines describe canonical. Google recommends using canonical, “If you have a single page accessible by multiple URLs, or different pages with similar content.”

A common mistake is to set a product page as the canonical for a category page, or vice versa. In this case, search engines can ignore canonical. It also makes no sense to set a product page as the canonical for a blog post.

The key here is that content on a canonical and non-canonical page must be of the same type.

Canonical chains

When typing a URL in the href attribute, make sure that the page you are pointing to does not itself have a canonical that points to yet another page or back to the page you are canonicalizing.

Here’s an example. The page you want to canonicalize is https://site.com/phone/iphone12/. The page you want to set as canonical is https://site.com/iphone12/. It already contains the following canonical:

<link rel="canonical" href="https://site.com/phone/apple/iphone12/" />
Canonical chain

This use case for canonical is incorrect because it creates a canonical chain.

The last in this chain is the page https://site.com/phone/apple/iphone12/, which means that most likely it will be considered canonical by search engines. In order not to confuse search robots, indicate only one canonical page.
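
Chains like this are easy to detect once you have a mapping from each page to the canonical it declares. The sketch below follows the canonicals hop by hop. The mapping is built from the example above, with the added assumption that https://site.com/phone/apple/iphone12/ points to itself; resolve_canonical is a hypothetical helper.

```python
def resolve_canonical(url, canonical_of, max_hops=10):
    """Follow canonicals starting at url; return (final_url, number_of_hops)."""
    seen = [url]
    current = url
    while True:
        # A page without a declared canonical is treated as pointing to itself
        target = canonical_of.get(current, current)
        if target == current:
            return current, len(seen) - 1
        if target in seen:
            raise ValueError("canonical loop: " + " -> ".join(seen + [target]))
        if len(seen) > max_hops:
            raise ValueError("canonical chain is too long")
        seen.append(target)
        current = target


canonical_of = {
    "https://site.com/phone/iphone12/": "https://site.com/iphone12/",
    "https://site.com/iphone12/": "https://site.com/phone/apple/iphone12/",
    "https://site.com/phone/apple/iphone12/": "https://site.com/phone/apple/iphone12/",
}
final_url, hops = resolve_canonical("https://site.com/phone/iphone12/", canonical_of)
print(final_url, hops)
```

More than one hop means there is a chain, and the final URL is the page that the intermediate pages should point to directly.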

In our example, this means that you need to decide which page you want to set as canonical: https://site.com/iphone12/ or https://site.com/phone/apple/iphone12/. 

For the first option, you need to replace canonical on the https://site.com/iphone12/ page so that it points to itself, and point the canonicals of https://site.com/phone/iphone12/ and https://site.com/phone/apple/iphone12/ to it.

Proper canonical usage

To leave the page https://site.com/phone/apple/iphone12/ as canonical, you need to make sure that the canonicals of the other similar pages point to it and that the page points to itself.

Correct canonical usage

Important! Be careful when modifying canonical URLs. Be sure to find out why certain values are used.

Pointing to a URL that is not crawlable or indexable

When choosing a canonical URL, make sure that the document is crawlable and indexable, that is, it is not blocked in the robots.txt file, by the X-Robots-Tag: noindex HTTP header, or by <meta name="robots" content="noindex" />.
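
The robots.txt part of this check can be sketched with the robots.txt parser from the Python standard library. The rules and URLs below are made up for the example. Keep in mind that this parser does not reproduce every detail of how Google matches rules (wildcards, for instance), so treat the result as a first approximation.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(url, robots_txt, user_agent="Googlebot"):
    """Return True if the given robots.txt content allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content
robots_txt = """\
User-agent: *
Disallow: /cart/
"""

print(is_crawlable("https://site.com/iphone12/", robots_txt))
print(is_crawlable("https://site.com/cart/checkout", robots_txt))
```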

You can check if the page is indexed in Google Search Console or by using SE Ranking’s Website Audit tool.

Crawled pages in SE Ranking's Website Audit

Pointing to a URL that returns a status code other than 200

When choosing a canonical URL, make sure that the document is available and returns the server response 200. You can check this by analyzing your site in SE Ranking.

Pointing to a URL with an invalid protocol

When specifying the canonical page, refer to the protocol that is used in the main version of the site. If the main version uses the HTTPS protocol, then you should specify the HTTPS version of the page in the href attribute.

Non-canonical pages in the sitemap

Make sure that only main page versions appear in the sitemap. In other words, the sitemap file only needs to include those pages that point to themselves with the help of canonical.
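
Given a mapping from each crawled page to its canonical, the list of URLs that belong in the sitemap can be filtered like this. It is a sketch with hypothetical data and a hypothetical helper name, and it assumes the canonical URLs are written exactly like the page URLs (same protocol, host, and trailing slash).

```python
def sitemap_urls(canonical_of):
    """Return the pages whose canonical points to themselves, sorted alphabetically."""
    return sorted(url for url, canonical in canonical_of.items() if url == canonical)


pages = {
    "https://site.com/iphone12/": "https://site.com/iphone12/",
    "https://site.com/phone/iphone12/": "https://site.com/iphone12/",
    "https://site.com/phone/apple/iphone12/": "https://site.com/iphone12/",
}
print(sitemap_urls(pages))
```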

Internal links to canonicalized URLs

Internal links must point to the main version of the page. As an exception, you can refer to a canonicalized version for improving user experience or another valuable reason.

Pagination canonicalization

Opinion on this issue is divided.

Option 1. You can follow the good practice we’ve mentioned earlier: each pagination page links to itself. For instance:

https://site.com/catalog/page/2/ contains <link rel="canonical" href="https://site.com/catalog/page/2/" />. 

I stick to this method, because I believe that pagination should be open to crawlers.

Option 1 is included to give a full picture. Following it wouldn’t be a mistake with canonical.

Option 2. The second option boils down to blocking pagination from the search engine using canonical that points to the first page. For instance:

https://site.com/catalog/page/2/ contains <link rel="canonical" href="https://site.com/catalog/" />.

In this case, search engines might ignore the canonical because pagination pages differ in content.

Option 3. Finally, there’s a third option: don’t use canonical and block pagination from indexation using <meta name="robots" content="noindex, follow" />.

Pagination canonicalization

Case study

Let’s examine a case with a canonicalization mistake. A site based on Shopify had a mistake in the Duplicate products section. The site structure looked like this:

Website with plenty of canonicalized pages

As you can see, the structure is dominated by red dots, which are canonicalized pages. After the problem was fixed, the green color became dominant:

Website structure with proper canonical usage

And we got a significant growth in search visibility as a result:

website growth

Is it worth using canonical?

The proper use of canonical tags is a part of SEO basics. However, if set incorrectly, canonicalization may fail to bring the desired result and can lead to lower rankings due to duplicate content issues.

To use canonical correctly, avoid creating duplicate and similar content by specifying the main version of the page as canonical. But there are exceptions to all the rules as there might be situations where duplicate content is not harmful. Follow the best practices but evaluate each situation individually.
