30-second summary:
- Reducing reliance on canonical tags can improve product URL discovery on Shopify
- How you structure your products on Shopify can determine how well these pages perform
- Shifting reliance from canonical tags to rich internal anchor text helps build relevancy
Can anything stop the relentless rise of Shopify? Back in 2012, the landscape was dominated by WordPress, Magento, and Joomla. Fast-forward 10 years and many in the industry now see Shopify as the leading ecommerce platform, with the others going from leaders to laggards.
There are of course multiple reasons for Shopify’s rise to prominence, but arguably one of the biggest factors is that the platform is much more technically accessible than other ecommerce infrastructure providers. Getting your head around a fresh Magento install or working out how Joomla works (which is still a mystery to me to this day!) often requires a certain level of technical know-how. And, if you don’t possess it, then you need to spend extra resources outsourcing that work to someone who does.
Shopify understood that baking simplicity and an “it just works” ethos into their platform would allow everyday entrepreneurs to get their sites up and running quickly, without needing a degree in computer science or a huge budget to maintain their online presence. However, as user-friendly as it might be, there are still a few technical and SEO hurdles to overcome if you want your Shopify site to succeed on the SERPs.
In this article, I’ll take a closer look at a key “out of the box” SEO issue that often limits the relevance of product pages within Shopify and creates significant site bloat. More importantly, I’ll also share four potential solutions that can be used to fix the problem and maximize your product page potential. Let’s dive in.
The cost of inefficiency
Something that we often discuss with our clients is ensuring that Google can crawl their websites as efficiently as possible. We explain this by breaking down the cost to Google of crawling the web. Every time Google visits a webpage on the Internet there is a physical cost to Google: the price of electricity consumption, water consumption, hardware, software, and all the other assets needed to visit that page. While this cost might be a thousandth of a penny per URL, given the sheer number of URLs crawled by Google each day, the total cost is likely staggering.
Therefore, if you are serving Google webpages that are duplicated or not relevant, you are wasting resources. Google has made a point of stating that in their article on managing crawl budget:
“Without guidance from you, Googlebot will try to crawl all or most of the URLs that it knows about on your site. If many of these URLs are duplicates, or you don’t want them crawled for some other reason (removed, unimportant, and so on), this wastes a lot of Google crawling time on your site. This is the factor that you can positively control the most.”
The key message here is that you can control how much of Google’s crawl time is wasted. By aiming to reduce this waste, you are ensuring that the time Google spends on your website is as productive as possible. This means Google will spend more time crawling URLs that have true value, picking up changes to existing URLs, and discovering new pages much faster.
Use canonicals as a temporary solution and not the final fix
A canonical tag is used when there are multiple duplicate pages, allowing you to define which of the duplicates should be deemed the correct page for Google to index.
While canonical tags are effective in the short term, the existence of a canonical tag highlights that there are structural issues within a website, and this can impact crawl efficiency. Even though the canonical tag tells Google which URL you would prefer to have indexed, the search engine still needs to crawl every duplicate that carries the tag before it can reach the conclusion you have set.
Rather than using a canonical tag as a permanent solution, it’s important to take steps to fix the underlying structural problem, which removes the need for the canonical tag in the first place. This in turn will have a positive impact on crawl efficiency.
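To make the mechanics concrete, here is a minimal Python sketch of what any crawler has to do with a duplicate page: download it, parse it, and only then read the canonical tag. The sample HTML and the example.com URL are hypothetical, and the parser is a simplified illustration rather than a description of how Googlebot works.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical is kept.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical duplicate page that points at its preferred URL.
duplicate_page = """
<html><head>
  <title>Polo shirt</title>
  <link rel="canonical" href="https://example.com/products/polo-shirt">
</head><body></body></html>
"""

print(find_canonical(duplicate_page))
```

The point of the sketch is that the preferred URL is only known after the duplicate has been fetched and parsed, which is exactly the crawl cost described above.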
What does this have to do with Shopify product pages?
Put simply, product URLs on Shopify rely on canonical tags to be discovered. Let’s look at the two main causes of this.
Products in multiple collections
The URL below is a product page from a Shopify website.
You will notice that the URL includes the collection the product is in. If this product is in multiple collections, Shopify creates multiple product URLs. As these are duplicates, Shopify handles this by using canonical tags. These canonical tags point to the preferred product URL, which does not contain a collection:
The product highlighted above is currently in four collections, meaning there are now five different product URLs for Google to crawl to find this one product that it needs to index. There is, however, another issue that further increases this number: product variants.
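The relationship between the collection-scoped URLs and the preferred URL can be sketched in a few lines of Python. The path shapes `/collections/<collection>/products/<handle>` and `/products/<handle>` follow Shopify's URL structure; the store domain, collection names, and product handle are hypothetical.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches a product URL path that is scoped to a collection.
COLLECTION_PRODUCT_PATH = re.compile(
    r"^/collections/[^/]+/products/(?P<handle>[^/]+)/?$"
)


def canonical_product_url(url):
    """Drop the collection segment from a product URL, if there is one."""
    parts = urlsplit(url)
    match = COLLECTION_PRODUCT_PATH.match(parts.path)
    if match is None:
        return url  # already the preferred form, or not a product URL
    canonical_path = "/products/" + match.group("handle")
    return urlunsplit(
        (parts.scheme, parts.netloc, canonical_path, parts.query, parts.fragment)
    )


# Hypothetical product that sits in four collections.
collections = ["mens", "polo-shirts", "new-in", "sale"]
duplicates = [
    f"https://example.com/collections/{name}/products/polo-shirt"
    for name in collections
]
preferred = {canonical_product_url(url) for url in duplicates}
crawlable = set(duplicates) | preferred

print(preferred)       # a single preferred URL
print(len(crawlable))  # four duplicates plus the preferred URL
```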
Product variants
A product variant is a product attribute that you can implement within Shopify. This could be color, size, weight, or any other type of attribute that a product may have. Creating variants of a product within Shopify allows a user to select attributes on the product page. This can be seen below on our example product URL as “size”:
In this setup, Shopify adds a parameter to the product URL called ?variant. This contains an ID that references the selected variant. The URL below is our example product URL with the medium variant selected:
This is of course another duplicate, which is handled via a canonical tag. If we begin to calculate the total number of URLs this single product has that rely on canonical tags, you will notice how this can have a detrimental impact on crawl efficiency.
Based on this product being in four collections and having four variants, there are a total of 20 product URLs that rely on a canonical tag. This means Google needs to regularly crawl 21 product URLs to discover the single product URL that needs indexing.
10,000 URLs crawled to index 600
When you factor in the sheer number of products across an entire website, it’s easy to see how this figure can add up. If our example website has 600 products, and each product appears in four collections with four variants each, then Google will need to regularly crawl in excess of 10,000 product URLs to find the 600 that have been requested to be indexed.
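One way to arrive at the figures above is to count one URL for every combination of path (the preferred path plus the four collection paths) and variant, and then add the preferred URL itself. The Python sketch below enumerates them under that assumption; the collection names, variant IDs, and product handle are hypothetical. Counting the bare collection-scoped URLs separately as well would push the totals higher still.

```python
from itertools import product as cartesian_product

COLLECTIONS = ["mens", "polo-shirts", "new-in", "sale"]  # hypothetical names
VARIANT_IDS = ["1001", "1002", "1003", "1004"]           # hypothetical IDs
HANDLE = "polo-shirt"                                    # hypothetical handle
PRODUCTS_ON_SITE = 600


def product_urls(handle, collections, variant_ids):
    """Return the preferred path and every ?variant duplicate of it."""
    canonical = f"/products/{handle}"
    base_paths = [canonical] + [
        f"/collections/{name}/products/{handle}" for name in collections
    ]
    duplicates = [
        f"{path}?variant={variant_id}"
        for path, variant_id in cartesian_product(base_paths, variant_ids)
    ]
    return canonical, duplicates


canonical, duplicates = product_urls(HANDLE, COLLECTIONS, VARIANT_IDS)
urls_per_product = len(duplicates) + 1  # duplicates plus the preferred URL
urls_sitewide = PRODUCTS_ON_SITE * urls_per_product

print(len(duplicates), urls_per_product, urls_sitewide)  # 20 21 12600
```

Under these assumptions the sitewide total is 12,600 URLs, which is where the "in excess of 10,000" figure comes from.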
How do you fix this on Shopify?
There are two distinct problems we need to fix here: the issue with products appearing in multiple collections, and the issue with product variants. There are solutions for both — however, implementing them will require compromise in certain areas.
Products in multiple collections: The fix
This fix works by removing links to product URLs with the collection name in the product URL. The main culprit here is the collection URL — specifically the theme file that powers collection URLs. On Shopify, this file is called product-grid-item.liquid.
You can navigate to this file via the following route within your Shopify admin.
Online Store > Themes > Customize > Theme Actions > Edit Code > Snippets
Within this file there are HTML hyperlinks that reference product URLs containing the collection name:
The “within: collection” element is what is responsible for pulling the collection name into the product URL. Removing this ensures that the collection name no longer appears in the product URL.
However, before you jump in, there are a few things you’ll need to bear in mind:
- It is recommended that you consult with your web development team before making this change.
- Apps that you use may need the “within: collection” functionality, so it is worth checking with app support on whether or not this can be changed.
- This change impacts the breadcrumb on product URLs. If this is problematic, then I’d suggest building breadcrumbs manually using metafields with a dedicated metafields app.
- You will also need to ensure that manual links that use this format are changed.
- There may be other template files that contain “within: collection” so it is worth liaising with your development team to identify these.
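As a rough aid for the last point, the Python sketch below looks for the “within: collection” filter in theme source and previews what each file would look like without it. It works on in-memory strings rather than a real theme, the link markup is an assumption about what a typical product link looks like (themes differ), and `product-card.liquid` is a hypothetical file name. Treat it as a way to locate candidates for review, not as a replacement for checking the change with your developers.

```python
import re

# Matches the Liquid filter together with the pipe that introduces it.
WITHIN_COLLECTION = re.compile(r"\s*\|\s*within:\s*collection\b")


def find_within_collection(files):
    """Map each file name to the line numbers where the filter is used."""
    hits = {}
    for name, source in files.items():
        line_numbers = [
            number
            for number, line in enumerate(source.splitlines(), start=1)
            if WITHIN_COLLECTION.search(line)
        ]
        if line_numbers:
            hits[name] = line_numbers
    return hits


def strip_within_collection(source):
    """Return the template source with the filter removed."""
    return WITHIN_COLLECTION.sub("", source)


# Hypothetical theme files; the markup is illustrative only.
theme_files = {
    "snippets/product-grid-item.liquid": (
        '<a href="{{ product.url | within: collection }}">'
        "{{ product.title }}</a>\n"
    ),
    "snippets/product-card.liquid": (
        '<a href="{{ product.url }}">{{ product.title }}</a>\n'
    ),
}

print(find_within_collection(theme_files))
print(strip_within_collection(theme_files["snippets/product-grid-item.liquid"]))
```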
Product variants: The fix (or is it?)
Unfortunately, the solution to product variants is more complex and ultimately depends on how much SEO value you are getting from your existing product variants. The recommendation here is to first find out how viable product variant keywords are in terms of search volume and market opportunity.
For example, if our imaginary Shopify store sells Ralph Lauren polo shirts, then its variants are likely to be color and size. A quick search for the product type plus these variants shows that there is search volume, so it will be important that these variants are indexable and optimized.
Fix Option #1: Optimize ?variant URLs
This first option is viable if you believe that there is search volume opportunity across a wide range of your product variants. The premise of this fix is to build logic into your theme code so that, when a variant is selected, the variant name is appended to the page title tag and, where possible, the product description.
This change will likely depend on your theme setup and, as with any change, it is recommended that you consult with your web development team. More details on how to do this can be found via the Shopify community thread below:
Another thing to bear in mind with this solution is that you will need to remove the canonical tag that is currently in place on ?variant URLs. The main drawback to this approach is that you may need to implement it sitewide across all product variants — but not all variants will necessarily have available search volume.
Fix Option #2: Optimize main product URL for variants
If you want more control over which product sets have optimized variants, then this option might be for you. By optimizing the main product URL for variants, for example by including variant keywords in the product description and metadata, you will stand a chance of being visible for these product variant keywords.
The drawback here is that product URLs could become over-optimized and not as relevant as a dedicated, optimized product variant URL.
Fix Option #3: Disallow ?variant parameter
If it turns out that your product variants have minimal or no search value, then disallowing the ?variant parameter in your robots.txt file might be the best option. This will stop Google from crawling ?variant URLs, therefore making crawl activity more efficient.
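A rule along the lines of `Disallow: /*?variant=` is one way to express this. The Python sketch below checks which URLs such a rule would block, using a simplified reading of Google-style wildcard matching (`*` matches any run of characters, a trailing `$` anchors the end). The rule, the matcher, and the example.com URLs are all illustrative assumptions. Note that this particular pattern would not cover a variant parameter that appears later in the query string as `&variant=`.

```python
import re
from urllib.parse import urlsplit

# Hypothetical rule set; the pattern is an assumption, not Shopify's default.
DISALLOW_PATTERNS = ["/*?variant="]


def robots_pattern_to_regex(pattern):
    """Translate a wildcard robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then let each * match any characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(url, patterns=DISALLOW_PATTERNS):
    """Return True if the URL's path and query match any disallow pattern."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return any(robots_pattern_to_regex(p).match(target) for p in patterns)


for url in [
    "https://example.com/products/polo-shirt",
    "https://example.com/products/polo-shirt?variant=1002",
    "https://example.com/collections/sale/products/polo-shirt?variant=1002",
]:
    print(url, is_disallowed(url))
```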
Fix Option #4: Individual products per variant
If your product variants do have search viability, then creating individual products per variant might be an effective option. This is something we have seen retailers like Gymshark do with color. The product below comes in a number of different colors, each of which has its own product URL and does not rely on variants, e.g.:
https://www.gymshark.com/products/gymshark-element-baselayer-t-shirt-black-aw21
With more control over both metadata and optimized content, this approach makes it easier to build deeper relevance for product variants. The downside here is that there are simply more products to manage within the CMS.
Shopify & SEO issues: Final thoughts
As I mentioned earlier, one of the reasons for Shopify’s meteoric rise has been the “it just works” ethos that makes the platform such a cinch to use. But that’s not to say that the platform doesn’t suffer from a few SEO snags.
In addition to the canonical issue, Google’s Core Web Vitals can be another source of headaches for SEOs who work with the platform. But there are generally workarounds for those who are willing to take the time to implement them. You can learn more about how to navigate these in our ultimate guide to Shopify SEO (2022).
There are also hopeful signs that the Shopify team are increasingly receptive to the needs of the SEO community. The team have regularly taken on board feedback from SEOs to improve their product, from allowing users to edit the robots.txt file, to allowing for sub-folder international structures. So, we can hope that easy-to-implement solutions around the use of canonicals and other issues will be rolled out before too long.
Marc Swann is Director of Search at Glass Digital, a digital marketing agency offering SEO, affiliate marketing, and paid search services. Marc has been working in digital marketing for 12 years and specializes in technical SEO. At Glass Digital, his focus is on the organic search service, ensuring our teams are delivering maximum value for their clients.
https://www.searchenginewatch.com/2022/08/29/shopify-seo-how-to-limit-your-reliance-on-canonicals-and-boost-crawl-efficiency/