Technical SEO testing: How Googlebot handles iframes


Earlier this year at SMX Advanced, I presented results from our Peak Ace test lab. These tests shed some light on several technical implementation points and how Googlebot would deal with them. 

One of my favorite tests examined Google’s indexing of iFramed URLs and their content. In my SMX Advanced presentation, I touched on various scenarios that may lead Google to index the content inside an iFrame, while “assigning” that content to its parent URL.

iFrame content will be attributed to its parent URL post-render.

The parent URL can, in some instances, rank for content that only exists in the iFramed URL and not in the parent URL.

Post-render, the parent page can now be found for content within the iFrame.

Naturally, this excited people – and all sorts of follow-up questions arose. Here are a few of them with my answers.

In the iFrame test, was the iFramed content coming from the same domain or a different one? 

My example showed two URLs on the same domain: domain.com/test.html iFramed domain.com/tobeframedA.html, so that test.html could rank for content that only exists in tobeframedA.html.

The same works across domains: iFraming externaldomain.com/tobeframedB.html can still cause test.html to rank for content only present in tobeframedB.html. It also works for iFrames hosted on subdomains. We tested every combination we could think of and concluded that it made no difference where the iFrame content was hosted.

If you want to prevent someone from loading (and ranking for) your content in an iFrame, it would be a good idea to look into the X-Frame-Options HTTP response header, which indicates whether a browser should be allowed to render a page in an iFrame.
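As a rough sketch of what sending that header could look like, the Python below builds the response headers and attaches them in a minimal WSGI app. The names `anti_framing_headers` and `app` are hypothetical. The `Content-Security-Policy: frame-ancestors` directive is an addition that is not mentioned in this article; it is the newer, standardized way of expressing the same restriction. Neither header was part of the tests described here, so treat this as an illustration of the mechanism, not as a verified way to influence Googlebot.

```python
def anti_framing_headers(allow_same_origin: bool = False) -> dict[str, str]:
    """Build response headers that tell browsers not to render this page in an iFrame.

    allow_same_origin=True still permits framing by pages on the same origin.
    """
    if allow_same_origin:
        return {
            "X-Frame-Options": "SAMEORIGIN",
            # Assumption: CSP is added alongside the header named in the article.
            "Content-Security-Policy": "frame-ancestors 'self'",
        }
    return {
        "X-Frame-Options": "DENY",
        "Content-Security-Policy": "frame-ancestors 'none'",
    }


def app(environ, start_response):
    """Minimal WSGI app (hypothetical) that serves one page with the headers attached."""
    headers = [("Content-Type", "text/html; charset=utf-8")]
    headers.extend(anti_framing_headers().items())
    start_response("200 OK", headers)
    return [b"<p>This page asks browsers not to frame it.</p>"]
```

The app can be served locally with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` to inspect the headers in a browser's developer tools.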

If we used iFrames with a noindexed content page to improve page speed, would the parent page still rank for the iFramed content?

As soon as the iFramed URL contains a meta robots noindex directive, the parent URL won’t be able to rank for the content from the iFramed URL.

iFramed URL containing meta robots noindex directive.

The same is true if you iFrame a URL that is served with an X-Robots-Tag: noindex HTTP header or is actively blocked using robots.txt.
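The three blockers described above (meta robots noindex, an X-Robots-Tag noindex header, and a robots.txt disallow) can be checked for a framed URL before relying on it. The following is a minimal sketch using only the Python standard library. The function name `indexing_blockers` and its parameters are hypothetical, the inputs are passed in as strings so nothing is fetched, and the logic is a simplification that does not replicate how Google evaluates these signals.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def indexing_blockers(url, html, headers, robots_txt, user_agent="Googlebot"):
    """Return the reasons (if any) why the framed URL's content would be withheld."""
    reasons = []

    meta = RobotsMetaParser()
    meta.feed(html)
    if "noindex" in meta.directives:
        reasons.append("meta robots noindex")

    # Header names are case-insensitive. Simplification: any "noindex" counts,
    # even if the header scopes it to a different crawler.
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            reasons.append("X-Robots-Tag noindex")

    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    if not robots.can_fetch(user_agent, url):
        reasons.append("disallowed in robots.txt")

    return reasons


print(indexing_blockers(
    "https://domain.com/tobeframedA.html",
    '<html><head><meta name="robots" content="noindex"></head><body>Framed text</body></html>',
    {"Content-Type": "text/html"},
    "User-agent: *\nDisallow: /private/",
))
```

An empty list means none of the three blockers was found in the supplied inputs; it does not guarantee that the parent URL will rank for the framed content.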

As far as page speed is concerned, iFrames support the loading="lazy" attribute, which defers the loading of offscreen iFrames until a user scrolls near them. This is an elegant solution for speeding up loading times for URLs that depend on iFramed content.
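For illustration, the markup in question is a single extra attribute on the iFrame element. The small helper below generates it from Python; the function name `lazy_iframe` and the example URL and title are assumptions, not part of the original test.

```python
from html import escape


def lazy_iframe(src: str, title: str) -> str:
    """Return an <iframe> tag that asks the browser to defer loading while offscreen."""
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'title="{escape(title, quote=True)}" loading="lazy"></iframe>'
    )


print(lazy_iframe("https://domain.com/tobeframedA.html", "Framed content"))
```

How lazy loading interacts with Googlebot's rendering of the framed content was not covered by the test described here, so that would need to be verified separately.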

Does Google give full value to semi-hidden content (content that typically comes after ‘Read more’)?

There doesn’t seem to be too much love for using “Read more” functionality within the ranks of Google. John Mueller has gone on record a couple of times questioning the use of the functionality in its entirety. Mueller added, “I don’t think you’d see a noticeable, direct change in SEO, […]”.

The purpose of our test was to understand what difference the technical implementation could make and whether, in general, content behind a “Read more” would be indexed (if correctly set up).

The short answer: whether or not it was visible, the content would be indexed, found and returned.

However, content that was invisible during loading didn’t get highlighted in the snippet. The technical implementation didn’t make a difference (as long as the content was part of the HTML DOM at load), leaving you free to use display:none, opacity:0, visibility:hidden, etc.

That said, in my opinion, too many factors are outside of our control to create a test setup whose results could accurately answer the “full value” part of the question.

Did you mention that duplication in certain areas of the content can be fixed by CSS implementation since it is not indexed?

I did present some behavior that I find fairly interesting regarding CSS selectors. What technically happens is that selectors such as ::before create a pseudo-element that is the first child of the selected element. In practice, this is often used to add cosmetic content to an HTML element.

This could also be useful from an SEO point of view because Googlebot seems to handle this just as Chrome on desktop or smartphone does. The rendered DOM remains unchanged (which is to be expected since it’s a pseudo-element). As a result, content from within said selectors won’t be indexed.

So, ultimately you could use this to prevent certain content from being indexed without keeping it from being displayed on the website. Maybe you have to display certain content that gets classified as “boilerplate” (e.g., shipping info, or legal info) or you want to create a certain content footprint. This opens up a great many possibilities to explore further.
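The difference between CSS-hidden content and pseudo-element content can be made concrete with a small static check. The sketch below collects only the text nodes of an HTML document, which is a loose stand-in for "what is in the DOM". The class name `TextNodes`, the function `dom_text` and the sample markup are all hypothetical. It is a plain parser, not a renderer, so it mirrors the observation described above and does not reproduce Googlebot's pipeline.

```python
from html.parser import HTMLParser

SAMPLE = """
<style>.shipping::before { content: "Free shipping on all orders. "; }</style>
<p class="shipping">Delivered in 2 days.</p>
<div style="display:none">Full product description.</div>
"""


class TextNodes(HTMLParser):
    """Collects text nodes, skipping the contents of <style> and <script>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def dom_text(html: str) -> str:
    """Return the document's text nodes joined by single spaces."""
    parser = TextNodes()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


print(dom_text(SAMPLE))
```

In this sample, the display:none text is a real text node and shows up in the result, while the ::before string exists only in the stylesheet and does not.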

Watch: Technical SEO testing in 2022: Separating fact from fiction

Below is the complete video of my SMX Advanced presentation.


Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.



About The Author

Bastian Grimm is the CEO of Peak Ace and a renowned expert in large-scale, international SEO, managing sites of almost any size and in highly competitive industries. With more than 20 years’ experience in online marketing, technical and global SEO, Bastian was named “Search Personality of the Year” at the 2019 European Search Awards: a welcome acknowledgement of his contributions to a rapidly evolving industry. Bastian believes that understanding a target market means not only getting to grips with the language, but also the culture. This has given him a unique perspective on how to reach global audiences. Bastian leads a thriving team of expert native speakers, equipped to serve clients in 25+ languages, and the results speak for themselves. With a technology-driven approach, Peak Ace is a one-stop shop for highly flexible, data-driven solutions for all relevant digital marketing channels. Working closely with world-renowned brands such as Airbnb, TUI, Sage and McKinsey & Company, Peak Ace is also celebrated in the marketing industry. In 2022, Peak Ace was recognised for its exceptional standard as an agency multiple times, named Best Large Integrated Agency by multiple industry awards bodies. Bastian is proud to lead such an innovative, ever-expanding company. His secret? Dynamic, decisive processes, a phenomenal team and always going to bed with “inbox zero”.

https://searchengineland.com/how-googlebot-handles-iframes-388243