Martin Splitt Explains How Google Selects Canonical Pages

5 November, 2020
Jason Ferry
SEO Services

Martin Splitt recently told providers of SEO services how Google identifies duplicate content and web pages, as well as how it decides which canonical page to include in the search engine results pages (SERPs). This information gives small business SEO service providers important insight into how the Google algorithm handles canonicalisation.

In a podcast, Splitt explained that there are more than 20 different signals, which are weighted in order to select the canonical page. He also went into detail about why machine learning is used to adjust the weights.

Splitt first described how websites are crawled and how documents are indexed. He then went into detail about how Google detects duplicate pages and handles canonicalisation.

He said that Google collects the signals first, then detects duplicate pages and clusters them together, and finally finds a leader page for each cluster. To detect the duplicates, Google reduces the content of each page into a checksum or hash and compares it with the checksums of other pages.

Comparing checksums makes the task much faster and easier than comparing the thousands of words that a page may contain.

One reason Google reduces content into a checksum is that it does not want to spend too much time and resources scanning the whole text. Instead, it calculates several kinds of checksums from the textual content of the page and compares them with the checksums of other pages.
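
As a rough illustration of the checksum idea described above, here is a minimal Python sketch. It is not Google's implementation: the choice of SHA-256, the whitespace and case normalisation, and the function names `content_checksum` and `cluster_by_checksum` are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def content_checksum(text: str) -> str:
    """Reduce page text to a fixed-size checksum (SHA-256 is an assumption)."""
    # Normalise case and whitespace so trivial formatting differences are ignored.
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def cluster_by_checksum(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose text produces the same checksum."""
    clusters = defaultdict(list)
    for url, text in pages.items():
        clusters[content_checksum(text)].append(url)
    return list(clusters.values())


# Hypothetical pages: the first two differ only in case and spacing.
pages = {
    "http://example.com/a": "Blue  widgets for sale",
    "https://example.com/a": "blue widgets for sale",
    "https://example.com/b": "Red widgets for sale",
}
clusters = cluster_by_checksum(pages)
```

In this sketch the two versions of `/a` end up in one cluster and `/b` in another, and each comparison works on a short fixed-length string rather than the full text.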

When it comes to exact duplicates and near-duplicates, Splitt says Google can catch both. It has several algorithms for this, including ones that detect and then remove the boilerplate from pages. The algorithms check whether the checksums are identical or fairly similar to each other before bringing the pages together in a duplicate cluster.
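
Splitt did not describe the exact algorithms, so the following Python sketch swaps in a common textbook technique for near-duplicate detection: word shingles compared with Jaccard similarity, after stripping known boilerplate lines. The boilerplate list, the shingle size, the 0.5 threshold, and all function names are hypothetical.

```python
def strip_boilerplate(text: str, boilerplate: set[str]) -> str:
    """Drop lines that match a known list of site-wide boilerplate lines."""
    kept = [line for line in text.splitlines() if line.strip() not in boilerplate]
    return "\n".join(kept)


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a: str, text_b: str, boilerplate: set[str],
                      threshold: float = 0.5) -> bool:
    """Compare the main content of two pages; the threshold is an assumption."""
    a = shingles(strip_boilerplate(text_a, boilerplate))
    b = shingles(strip_boilerplate(text_b, boilerplate))
    return jaccard(a, b) >= threshold


# Hypothetical navigation and footer lines shared by every page on the site.
BOILERPLATE = {"Home | About | Contact", "Copyright Example Ltd"}

page_a = "Home | About | Contact\nOur blue widget ships free across the UK today\nCopyright Example Ltd"
page_b = "Home | About | Contact\nOur blue widget ships free across the UK now\nCopyright Example Ltd"
page_c = "Home | About | Contact\nRead our guide to choosing garden furniture this summer\nCopyright Example Ltd"

print(is_near_duplicate(page_a, page_b, BOILERPLATE))  # True
print(is_near_duplicate(page_a, page_c, BOILERPLATE))  # False
```

Removing the navigation and footer first matters here: without that step, every page on the site would share some shingles and unrelated pages would look more alike than they are.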

Once all the duplicates form one big cluster, Google selects only one document to display in the SERP.

Providers of SEO services may wonder why Google avoids showing duplicate web pages in the SERP. The reason is that users dislike seeing the same content repeated across many search results. Moreover, keeping only one version saves storage space in the index.

The hardest part is choosing the leader of the cluster, which is why they use more than twenty signals to select which web page to show as canonical from the group of duplicates.

These signals are factors that help determine which page among the duplicates is the best one to show in the SERP. For instance, one signal is the webpage content. Another is PageRank: the higher a page's PageRank, the more likely it is to be chosen.

Each signal has its own weight, and Google calculates and adjusts these weights. It uses machine learning to do this, which keeps the weights more accurate than adjusting them manually would.

Redirects are usually given a heavier weight than http/https URL signals. Splitt explains that a redirect must carry more weight than the http/https signal because users will eventually see the redirect target. Because of this, Google does not include the redirect source in the SERP.
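
To make the idea of weighted signals more concrete, here is a small Python sketch that scores each page in a duplicate cluster and picks the highest scorer as the leader. The signal names, the weight values, and the simple weighted sum are assumptions for illustration only. Google's real signals and weights are not public, and in practice the weights are adjusted by machine learning rather than fixed by hand.

```python
from dataclasses import dataclass

# Hypothetical weights: the redirect signal outweighs the https signal,
# as described in the text. The actual values are invented.
WEIGHTS = {"redirect_target": 3.0, "pagerank": 2.0, "https": 1.0}


@dataclass
class Candidate:
    url: str
    signals: dict[str, float]  # each signal scaled to the range 0..1


def score(candidate: Candidate) -> float:
    """Weighted sum of the candidate's signals; missing signals count as 0."""
    return sum(weight * candidate.signals.get(name, 0.0)
               for name, weight in WEIGHTS.items())


def pick_canonical(cluster: list[Candidate]) -> Candidate:
    """Choose the cluster leader: the candidate with the highest score."""
    return max(cluster, key=score)


# Hypothetical cluster: the http URL redirects to the https URL.
cluster = [
    Candidate("http://example.com/page", {"pagerank": 0.6}),
    Candidate("https://example.com/page",
              {"pagerank": 0.5, "https": 1.0, "redirect_target": 1.0}),
]
leader = pick_canonical(cluster)
print(leader.url)  # https://example.com/page
```

Even though the http URL has a slightly higher PageRank value in this made-up data, the redirect and https signals push the https URL ahead.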

Canonical links are essential for businesses and small business SEO services because they tell search engines which URL should be shown to users in the SERP. Moreover, search engines do not like duplicate content, and canonical tags help them identify which page should be ranked or shown to the users.
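
A canonical link is declared in a page's HTML head with a `<link rel="canonical" href="...">` tag. As a quick way to check which canonical URL a page declares, the following Python sketch reads that tag using only the standard library. The class name `CanonicalFinder`, the helper `find_canonical`, and the sample HTML are made up for the example.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page source.
html = """<html><head>
<title>Blue widgets</title>
<link rel="canonical" href="https://example.com/widgets/blue">
</head><body>Blue widgets for sale</body></html>"""

print(find_canonical(html))  # https://example.com/widgets/blue
```

A check like this can be run across duplicate URLs to confirm that they all point to the same preferred version.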

Here at Position1SEO, we make sure that your website is filled with high-quality content that is both authoritative and compelling. If you choose to work with us, you can be assured of unique content that engages your users and effectively promotes your products and services.

Work with our SEO professionals today! Send us an email at office@position1seo.co.uk or call us on 0141 404 7515.

Author: Jason Ferry

Jason Ferry is an SEO specialist based in the United Kingdom. Taking pride in his many years of experience in the search engine optimisation industry, he has honed his skills in various digital marketing processes. He is well-versed in everything from keyword marketing, website auditing, and link building campaigns to social media monitoring. Jason Ferry’s excellent skills in SEO, combined with his vast experience, make him one of the best professionals to work with in the industry today. Not only does he guarantee outstanding output for everyone he works with, but he also values a deep relationship with them.

