When someone speaks of setting up canonical tags, it essentially means adding a <link> tag to the <head> section of the page, with the rel attribute set to canonical and the href attribute set to the URL of the main page version.
The source code looks like this:
<link rel="canonical" href="https://seranking.com/" />
Hence, canonical is not a tag but the value of the rel attribute, and its purpose is to make it clear to the search engines which version of the page they should rank.
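To check which canonical a page actually declares, you can read the <link> tag out of its HTML. Below is a minimal sketch that uses only Python's standard library; the class and function names are our own, and it assumes you already have the page source as a string.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel is case-insensitive and may hold several space-separated values
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html):
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page = '<html><head><link rel="canonical" href="https://seranking.com/" /></head></html>'
print(find_canonicals(page))
```

If the returned list holds more than one URL, the page declares several canonicals, which is worth investigating.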
What should you use canonical for?
Canonical is used for some URLs when the site has a main version of the document along with other, additional documents with similar content. What canonical does is point the search engines to the main version of the page.
Pay special attention to the word “similar”—we will come back to this definition later.
Here’s what Google Help has to say about canonical:
If you have a single page accessible by multiple URLs, or different pages with similar content (for example, a page with both a mobile and a desktop version), Google sees these as duplicate versions of the same page. Google will choose one URL as the canonical version and crawl that, and all other URLs will be considered duplicate URLs and crawled less often.
If you don’t explicitly tell Google which URL is canonical, Google will make the choice for you, or might consider them both of equal weight, which might lead to unwanted behavior.
How can canonical URLs be specified?
There are several ways to specify the main version of a page. All of them are described in more detail in Google Help. The most common way is to use the <link> tag, and we will use it in the examples. The possible options are:
- The <link> tag with the rel="canonical" attribute
- The rel="canonical" HTTP header
- A Sitemap file
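The HTTP header option is handy for documents that have no <head> section, such as PDF files. Here is a minimal sketch of what the header looks like; the function name and the file URL are hypothetical, and how you attach the header to a response depends on your server or framework.

```python
def canonical_link_header(canonical_url):
    """Return a (name, value) pair for a rel="canonical" HTTP Link header."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


# Hypothetical PDF that should be treated as the main version
name, value = canonical_link_header("https://site.com/files/price-list.pdf")
print(f"{name}: {value}")
```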
As an alternative to canonical, you can also use 301 redirects, but keep in mind that redirects function entirely differently, as they make only one version of the page available to search engines and users. Therefore, choose the method according to the results you expect.
Best practices for using canonical
Canonical is normally used to avoid similar or duplicate content appearing in search results. We will discuss below why the content can be duplicated.
Important! Using <link rel="canonical" href="" /> does not prevent search bots from crawling and indexing the document. Canonical is a recommendation and may be ignored by the search engine. It indicates which version of the document you consider the main one and want to appear in the search results.
If you want to block indexation, use the following:
- <meta name="robots" content="noindex" />
- X-Robots-Tag: noindex HTTP header
Read more in Google Help.
Why it’s considered good to use Canonical
The use of canonical has been recognized as a helpful practice. It is implemented to prevent potential duplicate content issues, even if there’s no hint of duplicate or similar content.
It’s considered good practice to add a canonical to each main page version, whether or not it has duplicate or similar content, and point it to that very same page.
Technically speaking, the page https://seranking.com/subscription.html has this canonical:
<link rel="canonical" href="https://seranking.com/subscription.html" />
Here, the value of the href attribute of the link tag contains the URL of the page where this link tag is located.
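This solution helps you avoid possible problems: if someone adds parameters to the page URL, the canonical still points to the version without parameters, which signals to search engines that the parameterized versions should not be indexed as separate pages.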
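A self-referencing canonical is usually produced by the page template. The sketch below shows one way to do it, assuming the main version of a page is always its URL without the query string and fragment; the function name and the ?ref=newsletter parameter are made up for illustration.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def self_canonical_tag(requested_url):
    """Build a canonical <link> tag that ignores the query string and fragment."""
    parts = urlsplit(requested_url)
    clean_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    # Escape the URL so it is safe inside an HTML attribute
    return f'<link rel="canonical" href="{escape(clean_url, quote=True)}" />'


print(self_canonical_tag("https://seranking.com/subscription.html?ref=newsletter"))
```

If some parameters on your site do change the content in a meaningful way, this blanket approach is too aggressive and those parameters need to be kept.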
The classic use of canonical is to specify the main document when using filtering, sorting, and other actions that result in a URL change.
Let’s look at the laptop category on eBay as an example:
This is the main category that contains laptops for work and is optimized for this keyword cluster. Canonical for this page looks like this:
<link rel="canonical" href="https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031" />
The page is pointing to itself.
The page contains classic navigational elements:
- sorting by parameters
- changing the view type
Let’s change the view type. Products are now shown in a column:
But what matters the most for SEO specialists is the changed URL. Now it looks like this: https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031?rt=nc&_dmd=1.
Notice the parameters that appear at the end of the URL. Sorting options or other actions are implemented using these parameters. An infinite number of such pages can be generated depending on the possible sorting options. And from a search engine’s point of view, each variation with a new parameter is a separate URL.
If such pages end up in the index, pages with the same or very similar content—and most likely with the same <title>—will compete with each other. This will lead to keyword cannibalization and lower rankings.
It is to prevent such problems that you need to use canonical—it allows you to indicate the main version of the document that you want to see in the SERP. In the example, the sorting page has the following canonical:
<link rel="canonical" href="https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031" />
That is, the page points to the main version of the document without parameters.
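To see how many parameterized variants of a page exist on your own site, you can group a list of crawled URLs by their parameter-free version. This is a rough sketch under the assumption that every parameter only changes sorting or view type; the function name is our own.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def group_by_main_version(urls):
    """Map each parameter-free URL to the crawled URLs that reduce to it."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        main_version = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        groups[main_version].append(url)
    return dict(groups)


crawled = [
    "https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031",
    "https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031?rt=nc&_dmd=1",
]
for main_version, variants in group_by_main_version(crawled).items():
    print(main_version, len(variants))
```

Every group with more than one entry is a candidate for a canonical pointing to the main version.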
When a variety of filtering parameters are applied, an online store website can have many pages created that are not optimized for any keyword cluster.
When we talk about optimization for a keyword cluster, we mean that the document has:
- a <title> optimized for specific search queries, different from the <title> of the page where the filter was applied;
- a unique and optimized H1 header;
- products filtered according to a specific keyword cluster.
It’s worth noting that the HTML5 standard allows for any level of heading. By referring to H1, we are considering a classic situation in the context of the HTML4 standard.
The main thing you need to understand is that one landing page equals one user intent. For example, the Laptops & Netbooks category https://www.ebay.com/b/Laptops-Netbooks/175672/bn_1648276?_dmd=1 has filters that create separate pages for different user needs.
By selecting the Workstation filter, we’ll see a separate page https://www.ebay.com/b/Workstation-Laptops-Netbooks/175672/bn_7116632031, which we’ve examined earlier.
But let’s get back to the situation when the filters aren’t or can’t be optimized. For example, we want to view products by two specific brands. Obviously, there is no point in optimizing the page for such requests, since each brand must have its own page. This is when filtering results should be “hidden” by canonical.
Important! The use of canonical must be analyzed for each individual case taking into account your particular needs.
In some popular CMSs, for example Shopify, a product URL may contain the full path to the category in which the product is located. If you add a product to several categories, it gets duplicated across several URLs.
https://site.com/phone/iphone12/
https://site.com/phone/apple/iphone12/
https://site.com/iphone12/
The third URL is the preferred one, and the first two should point to it as the main, canonical document.
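For the three URLs above, the canonical can be derived from the product URL itself. The sketch below assumes that the last path segment is always the product slug and that the preferred URL has the form https://site.com/<slug>/; both are assumptions about this example site, not a rule for every CMS.

```python
from urllib.parse import urlsplit


def product_canonical(url):
    """Return the category-free product URL (assumed to be the main version)."""
    parts = urlsplit(url)
    slug = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return f"{parts.scheme}://{parts.netloc}/{slug}/"


for duplicate in (
    "https://site.com/phone/iphone12/",
    "https://site.com/phone/apple/iphone12/",
    "https://site.com/iphone12/",
):
    print(f'{duplicate} -> <link rel="canonical" href="{product_canonical(duplicate)}" />')
```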
Important! It’s recommended to avoid such situations and link categories to the main versions of documents without using canonical.
UTM tags and tracking parameters
Parameters in the URL can be used to collect certain information, but at the same time can create pages with duplicate content. For example, a URL like https://site.com/page/ may have a version with parameters like https://site.com/page/?fbclid=IwAR3cnDV4ERw24pQNVLTFlwKzchPDA1.
A link like this is generated when a user comes to the page from Facebook. Here, canonical would be a great solution.
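One way to compute the canonical URL for such pages is to drop only the tracking parameters and keep everything else. This is a minimal sketch; the list of parameters (fbclid, gclid, and anything starting with utm_) is our assumption and should be adjusted to your own analytics setup.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; extend the list to match your own setup
TRACKING_PARAMS = {"fbclid", "gclid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking(url):
    """Return the URL without tracking parameters (and without the fragment)."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS and not key.startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(strip_tracking("https://site.com/page/?fbclid=IwAR3cnDV4ERw24pQNVLTFlwKzchPDA1"))
```

Parameters that are not on the list stay in the URL, so a filter such as ?color=red is not silently removed.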
Specifying the main site version
A site that is accessible via HTTP and HTTPS protocols at the same time is seen by a search engine as two different sites, similarly to sites with and without www.
The variations below are 4 different sites:
https://site.com/
http://site.com/
https://www.site.com/
http://www.site.com
You can use canonical to specify the main website version.
For example, if the main version is https://site.com/, then the rest must contain <link rel="canonical" href="https://site.com/" />.
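The same rule can be applied to every page of the site by rewriting the protocol and host to the main ones. The sketch below assumes that https://site.com/ is the main version and that the site does not use a custom port; the constant and function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

MAIN_SCHEME = "https"   # assumed main protocol
MAIN_HOST = "site.com"  # assumed main host (non-www)


def to_main_version(url):
    """Rewrite any protocol/www variant of the site to the main version."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host not in (MAIN_HOST, "www." + MAIN_HOST):
        return url  # a different site, leave it untouched
    path = parts.path or "/"
    return urlunsplit((MAIN_SCHEME, MAIN_HOST, path, parts.query, ""))


variants = ["https://site.com/", "http://site.com/", "https://www.site.com/", "http://www.site.com"]
for variant in variants:
    print(f'{variant} -> <link rel="canonical" href="{to_main_version(variant)}" />')
```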
To choose which version you want to prioritize, use the site: search operator in Google, for example site:site.com. It will help you determine which version of the site has been indexed by Google and understand which pages are more present in the search.
When choosing between the HTTP and HTTPS protocols, you should definitely opt for the latter. To learn why HTTPS is preferable and how to move your site to HTTPS without losing rankings and traffic, read this guide. As for deciding between www and non-www, this article covers the details.
It’s easier to manage the website if you use 301 redirects to specify its main version.
Canonicalization of cross-domain duplicates
If duplicate pages belong to different domains and you control both, you can choose the main, canonical version of the page on a different domain.
Common mistakes with canonical
Canonicalizing different types of pages
Let’s go back to how search engines describe canonical. Google recommends using canonical, “If you have a single page accessible by multiple URLs, or different pages with similar content.”
A common mistake is to specify the canonical product page for the category page, or vice versa. In this case, search engines can ignore canonical. It also makes no sense to specify a canonical product page for a blog post.
The key here is that content on a canonical and non-canonical page must be of the same type.
When typing a URL in the href attribute, make sure that the page you are pointing to does not have a canonical pointing to another page or back to the page you are canonicalizing.
Here’s an example. The page you want to canonicalize is https://site.com/phone/iphone12/. The page you want to set as canonical is https://site.com/iphone12/. It already contains the following canonical:
<link rel="canonical" href="https://site.com/phone/apple/iphone12/" />
This use case for canonical is incorrect because it creates a canonical chain.
The last in this chain is the page https://site.com/phone/apple/iphone12/, which means that most likely it will be considered canonical by search engines. In order not to confuse search robots, indicate only one canonical page.
In our example, this means that you need to decide which page you want to set as canonical: https://site.com/iphone12/ or https://site.com/phone/apple/iphone12/.
For the first option, you need to replace canonical on the https://site.com/iphone12/ page so that it points to itself and canonicalize https://site.com/phone/iphone12/ and https://site.com/phone/apple/iphone12/.
To leave the page https://site.com/phone/apple/iphone12/ as canonical, you need to make sure that the other similar pages point to it with canonical and that the page points to itself.
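Chains like this are easy to miss by eye but simple to detect once you have a map of which canonical each page declares. Below is a rough sketch that follows the declarations and counts the hops. The function name is ours, and we assume for the example that https://site.com/phone/apple/iphone12/ points to itself.

```python
def resolve_canonical(url, canonical_of):
    """Follow canonical declarations from url and return (final_url, hops).

    canonical_of maps each page URL to the canonical URL it declares.
    Pages missing from the map are treated as pointing to themselves.
    Raises ValueError if the declarations form a loop.
    """
    seen = [url]
    while True:
        target = canonical_of.get(seen[-1], seen[-1])
        if target == seen[-1]:
            return seen[-1], len(seen) - 1
        if target in seen:
            raise ValueError("canonical loop: " + " -> ".join(seen + [target]))
        seen.append(target)


canonical_of = {
    "https://site.com/phone/iphone12/": "https://site.com/iphone12/",
    "https://site.com/iphone12/": "https://site.com/phone/apple/iphone12/",
    # Assumption for this example: the last page points to itself
    "https://site.com/phone/apple/iphone12/": "https://site.com/phone/apple/iphone12/",
}
final_url, hops = resolve_canonical("https://site.com/phone/iphone12/", canonical_of)
print(final_url, hops)
```

A result with more than one hop means the page is part of a chain and its canonical should be changed to point at the final URL directly.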
Important! Be careful when modifying canonical URLs. Be sure to find out why certain values are used.
Pointing to a URL that is not crawlable or indexable
When choosing a canonical URL, make sure that the document is crawlable and indexable, that is, it is not blocked in the robots.txt file or by the X-Robots-Tag header or <meta name="robots" content="noindex" />.
You can check if the page is indexed in Google Search Console or by using SE Ranking’s Website Audit tool.
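If you prefer to script this check, the sketch below combines the robots.txt rules with the robots directives of the target page. It is a simplification: the function name and the sample robots.txt are ours, the directives are passed in as plain strings that you have already fetched, and user-agent-specific values inside X-Robots-Tag are not handled.

```python
from urllib.robotparser import RobotFileParser


def is_indexable_target(url, robots_txt, meta_robots="", x_robots_tag="", user_agent="Googlebot"):
    """Return True if url is neither blocked in robots.txt nor marked noindex."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(user_agent, url):
        return False  # blocked from crawling
    directives = {d.strip().lower() for d in f"{meta_robots},{x_robots_tag}".split(",")}
    # "none" is shorthand for "noindex, nofollow"
    return not ({"noindex", "none"} & directives)


robots_txt = "User-agent: *\nDisallow: /cart/"  # hypothetical rules
print(is_indexable_target("https://site.com/iphone12/", robots_txt))
print(is_indexable_target("https://site.com/iphone12/", robots_txt, meta_robots="noindex, follow"))
```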
Pointing to a URL that returns a status code other than 200
When choosing a canonical URL, make sure that the document is available and returns the server response 200. You can check this by analyzing your site in SE Ranking.
Pointing to a URL with an invalid protocol
When specifying the canonical page, refer to the protocol that is used in the main version of the site. If the main version uses the HTTPS protocol, then you should specify the HTTPS version of the page in the href attribute.
Non-canonical pages in the sitemap
Make sure that only main page versions appear in the sitemap. In other words, the sitemap file only needs to include those pages that point to themselves with the help of canonical.
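This rule translates directly into a filter. Here is a minimal sketch that builds a sitemap only from pages whose canonical is the page itself; it assumes you already have a map of pages to their declared canonicals, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_sitemap(canonical_of):
    """Build sitemap XML that lists only pages with a self-referencing canonical."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url, canonical in sorted(canonical_of.items()):
        if url == canonical:  # the page points to itself, so it is a main version
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


pages = {
    "https://site.com/iphone12/": "https://site.com/iphone12/",
    "https://site.com/phone/iphone12/": "https://site.com/iphone12/",
}
print(build_sitemap(pages))
```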
Internal links to canonicalized URLs
Internal links must point to the main version of the page. As an exception, you can refer to a canonicalized version for improving user experience or another valuable reason.
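To find such links, compare every internal link target with the canonical it declares. The sketch below assumes you have exported two things from a crawl: a list of (source, target) link pairs and a map of pages to their canonicals. The names are hypothetical.

```python
def links_to_canonicalized(internal_links, canonical_of):
    """Return (source, target, canonical) for links whose target is not a main version."""
    report = []
    for source, target in internal_links:
        canonical = canonical_of.get(target, target)
        if canonical != target:
            report.append((source, target, canonical))
    return report


canonical_of = {"https://site.com/phone/iphone12/": "https://site.com/iphone12/"}
internal_links = [
    ("https://site.com/", "https://site.com/phone/iphone12/"),
    ("https://site.com/", "https://site.com/iphone12/"),
]
for source, target, canonical in links_to_canonicalized(internal_links, canonical_of):
    print(f"{source}: replace {target} with {canonical}")
```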
Opinion on how to use canonical on pagination pages is divided.
Option 1. You can follow the good practice we’ve mentioned earlier: each pagination page links to itself. For instance:
https://site.com/catalog/page/2/ contains <link rel="canonical" href="https://site.com/catalog/page/2/" />.
I stick to this method, because I believe that pagination should be open to crawlers.
Option 1 is included to give a full picture. Following it wouldn’t be a mistake with canonical.
Option 2. The second option boils down to blocking pagination from the search engine using canonical that points to the first page. For instance:
https://site.com/catalog/page/2/ contains <link rel="canonical" href="https://site.com/catalog/" />.
In this case, search engines might ignore the canonical because pagination pages differ in content.
Option 3. Finally, there’s a third option: don’t use canonical and block pagination from indexation using <meta name="robots" content="noindex, follow" />.
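The three options differ only in the tag that ends up in the <head> of a pagination page. The sketch below puts them side by side. It assumes the URL pattern from the examples above and that the first page always points to itself; the function name is hypothetical.

```python
def pagination_head_tag(page_number, option, base="https://site.com/catalog/"):
    """Return the head tag for a pagination page under option 1, 2 or 3."""
    if option not in (1, 2, 3):
        raise ValueError("option must be 1, 2 or 3")
    if page_number == 1:
        # Assumption: the first page is the main version under every option
        return f'<link rel="canonical" href="{base}" />'
    if option == 1:  # each pagination page points to itself
        return f'<link rel="canonical" href="{base}page/{page_number}/" />'
    if option == 2:  # every pagination page points to the first page
        return f'<link rel="canonical" href="{base}" />'
    return '<meta name="robots" content="noindex, follow" />'  # option 3


for option in (1, 2, 3):
    print(option, pagination_head_tag(2, option))
```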
Let’s examine a case with a canonicalization mistake. A site based on Shopify had a mistake in the Duplicate products section. The site structure looked like this:
As you can see, the structure is dominated by red dots, which are canonicalized pages. After the problem was fixed, the green color became dominant:
And we got a significant growth in search visibility as a result:
Is it worth using canonical?
The proper use of canonical tags is a part of SEO basics. However, if set incorrectly, canonicalization may fail to bring the desired result and can lead to lower rankings due to duplicate content issues.
To use canonical correctly, avoid creating duplicate and similar content by specifying the main version of the page as canonical. But there are exceptions to all the rules as there might be situations where duplicate content is not harmful. Follow the best practices but evaluate each situation individually.