Content duplication has never been appreciated, either in the real world or in the virtual one. Google’s crawlers track down on-site content duplication fast, and the result can be a significant drop in search engine rankings. That’s two steps away from a heavy traffic loss. One way to fight on-site content duplication is canonicalization. Can the canonical link element be applied on your website, and how do you do it? Keep reading!
Regardless of its type, be it a one-page business site or an ecommerce store, your website can have internal duplicate content issues. They occur any time the same or very similar content is found on several pages within one site. Generally speaking, it’s best to simply avoid such situations, but webmasters don’t always have full control over them. This is where the canonical link element comes in: it points Google, Yahoo or Bing crawlers to the URL with the original (‘canonical’) content and marks the rest as known duplicates.
Canonical link element – what’s that?
A canonical link element is an HTML tag inserted in the <head> section of your website code. Soon you will be able to dig deeper into HTML tags in general, as we’re preparing a dedicated entry for you: Meta tags – what are they, how to build them and how they affect SEO.
Across the internet, canonicals are also referred to as “rel canonical” or “rel=canonical”.
The main purpose of this tag is to give search engine robots a heads-up about which URL holds the original content and which URLs hold only copies. From the search engine’s standpoint, it’s best if every page with duplicated content points to a single canonical URL. However, this is not always possible, and that may be the case on your website too. With the canonical link element you can resolve cases where crawlers find the same content under several different URLs. The URL named in a canonical link element will be indexed by search engines, while the pages referring to it will not (there are, however, some exceptions).
When to use a canonical link element?
As a robot crawls your website, it needs to process all the links on the page, one by one. In many cases, there are more links than actual content-rich pages. Sometimes you may not even realize that there are pages with duplicate content! After all, having basic SEO awareness, you make sure that each of your pages features only unique content.
To be really sure, use free internal duplication detection tools, such as SiteLiner. With them, you can find the right places to apply canonical link elements.
Here we’ll discuss a few cases of internal duplication:
1. HTTP and HTTPS website versions
Having two versions of the same page is one situation where you can face internal duplication. From the standpoint of a Google robot, the HTTP and HTTPS URLs are two entirely separate sites.
Setting up 301 redirects is necessary to avoid duplication issues, and canonicalization additionally helps the robots interpret those URLs correctly. With the canonical link element you can tip the robots off about which version contains the original content to be displayed in the SERPs. You can also use canonical URLs instead of redirects. However, this is not a recommended solution, and it makes indexing take much longer.
Mind also that canonical link elements only give the crawlers a hint. Nothing can give you a 100% guarantee that the robots will read the URLs as you please and act accordingly. In the end, the search engine’s algorithms determine which pages are actually shown to users in the search results.
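To make the HTTP and HTTPS case concrete, here is a minimal sketch. The domain example.com and the /shoes/ path are placeholders, not taken from any real site. The idea is that both versions of the page carry the same tag and both name the HTTPS address as the original:

```html
<!-- Hypothetical page available as http://example.com/shoes/ and https://example.com/shoes/ -->
<head>
  <title>Shoes</title>
  <!-- Both versions name the HTTPS address as the canonical one -->
  <link rel="canonical" href="https://example.com/shoes/" />
</head>
```

The tag complements the 301 redirect rather than replacing it.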
2. Sorting products within categories
Canonical link elements are enormously useful in ecommerce optimization, mostly for pages that let users sort products by different criteria. With every selection the URL changes, but the content stays the same, hence the content duplication. The resulting links may look like this:
In this case, point to the top-level link, preferably the one for the main product category page. And one more thing: 301 redirects will be of no use here. There are two ways of dealing with this: adjusting the robots.txt file or inserting a canonical HTML tag.
The screenshot shows the Open SeoStats plugin, which allows you to check the canonical links set on the site.
If the canonicalization goes well, Google’s crawlers will refrain from processing all the pages with products sorted “from lowest price to highest” or “by size”, etc. They will focus all their attention on the main category.
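As a rough illustration of the canonical tag option, assume a hypothetical shop at example.com where sorting adds a query parameter, for instance https://example.com/shoes/?sort=price_asc or https://example.com/shoes/?sort=size. The parameter name and URLs are made up for this sketch. Each sorted view would then carry the same tag in its head:

```html
<!-- Head of a sorted view, e.g. https://example.com/shoes/?sort=price_asc (hypothetical URL) -->
<head>
  <title>Shoes, sorted by price</title>
  <!-- Every sorted variant names the unsorted category page as the original -->
  <link rel="canonical" href="https://example.com/shoes/" />
</head>
```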
3. Ecommerce pagination
Issues very similar to those with product sorting reappear with pagination. The more pages are created in the process, the higher the rate of content duplication. All the pages created by pagination therefore need a canonical tag referring to the main product category page.
4. URLs long and short
Your ecommerce site may also support two URLs per product page: a full, long one and a shortened one. Google’s crawlers will read them as two separate pages with duplicated content. The shorter link will be more “friendly” for the search engine, so it should be made the canonical one. The page under the longer URL should therefore include a canonical link element referring to the shorter one.
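A minimal sketch of the long and short URL case follows. Both addresses are invented for illustration: the long one is assumed to be https://example.com/catalog/product.php?id=1234&category=shoes and the short one https://example.com/shoes/red-sneakers/.

```html
<!-- Head of the page served under the long, parameter-based URL (hypothetical) -->
<head>
  <title>Red sneakers</title>
  <!-- Points crawlers to the short, friendly URL as the original -->
  <link rel="canonical" href="https://example.com/shoes/red-sneakers/" />
</head>
```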
5. External duplication
The canonical link element can also significantly help with any external duplication you may have. If it’s likely that your blog entry, product or category description will be copied outright and featured on another site (e.g. through content syndication), insert the canonical tag into these pages. Also, add the date of the content update, so that Google knows that you were the first to get it out there.
Conversely, when syndicating content from elsewhere, link to the source with a canonical tag. This way you’ll avoid having the content indexed on your site and your page being treated as plagiarism.
Why are canonical links so important?
URL canonicalization is big for your SEO. You don’t have to use canonical links if you don’t want to, but they help search engine crawlers make sense of what’s the original content and what’s only a copy. Their advantages are plenty, and the best way to see them is to optimize your site with canonical tags and feel the difference yourself.
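Both directions of the syndication case can be sketched with one tag. The domains below are placeholders: example.com stands for the site that published the article first, and example.org for the site that republishes it.

```html
<!-- Head of the republished copy at https://example.org/news/canonical-guide/ (hypothetical) -->
<head>
  <title>Canonical guide (syndicated)</title>
  <!-- Cross-domain canonical: credits the original publisher's URL -->
  <link rel="canonical" href="https://example.com/blog/canonical-guide/" />
</head>
```

The original page on example.com would carry a canonical tag with its own address, so both copies name the same URL.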
- proper on-site canonical link element implementation makes the internal duplication rate drop, which in turn helps your rankings go up;
- search engine crawlers are able to find the valuable on-site content fast, which means faster processing and indexing of your page, also after any subsequent updates;
- accumulated Page Authority does not disperse as easily if there are fewer links for the crawlers to process;
- more control over which of your URLs are actually displayed to users in the search results.
When are canonical links not a necessity?
The upsides listed above do not necessarily mean that your site needs link canonicalization. There are many situations in which canonical links are not a necessity.
- If there are no URLs duplicating the content entirely or partially, there’s no need for canonical URLs. You don’t need a canonical link for a page to be in the SERPs, as Google’s crawlers will read your code anyway!
- Properly implemented 301 redirects may also render the rel=canonical tags redundant.
- Canonical links may also be unnecessary if your site has a robots.txt file. That file lists the URLs that search engine crawlers should skip, so you can clearly indicate there which pages are not to be processed or displayed in the search results. A meta “robots” tag works similarly.
And do not overdo canonicalization on your site. For Google to see a page as a valuable place on the internet, its content needs to be not only unique but also substantial enough. If the duplication on your site goes beyond the technical causes mentioned earlier, a much better solution in terms of SEO is to write brand new content for the pages where duplicated text was detected, rather than canonicalizing them. Canonicalizing may lower the page’s content-to-code ratio, which may also have an adverse impact on your rankings. Also, if a canonical is implemented incorrectly, Google’s crawlers will simply ignore it and it will have no effect at all.
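The meta “robots” tag mentioned in the last point can be sketched as follows. This is only an illustration of the general form; the noindex value asks crawlers that honor it to keep the page out of the search results.

```html
<head>
  <title>Internal search results</title>
  <!-- Asks search engine crawlers not to index this page -->
  <meta name="robots" content="noindex" />
</head>
```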
How to implement a canonical link element?
Incorrectly inserted canonical link elements are ignored by Google’s crawlers and have no effect on optimization, so it’s essential to canonicalize correctly. It’s not rocket science; let us walk you through it. The canonical link tag should be inserted in the <head> section of the page code. It looks like this: <link rel="canonical" href="URL" />, where URL means the final website address, i.e. the full address, including HTTP or HTTPS, where the original content is located.
Where to insert the code in your website source:
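As a minimal sketch, assuming a hypothetical page at https://example.com/shoes/ (the address, title and body text are placeholders), the tag sits between the opening and closing head tags, next to the title and other meta tags:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <title>Shoes</title>
  <!-- Canonical tag: inside <head>, with a full absolute URL -->
  <link rel="canonical" href="https://example.com/shoes/" />
</head>
<body>
  <h1>Shoes</h1>
  <p>Page content goes here.</p>
</body>
</html>
```

A canonical tag placed in the body instead of the head is likely to be ignored, so it is worth checking the rendered page source after any template change.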
Many people implement rel=canonical tags wrongly. So often, in fact, that Google dedicated a blog entry to the most common mistakes. Read that, too, and you should not harm your rankings.
Even if you don’t think there’s any duplicated content on your website, make sure that there really isn’t, and if there is, act accordingly. One way to keep any Google penalty resulting from duplicated content at bay is the canonical link element. It comes in handy in fighting unintentional content duplication and helps to keep your rankings high. All that is for the sake of your traffic and your conversions.