Google has confirmed it: changes to the layout can affect the ranking even if everything else remains the same. The question this raised for me: how does Google measure and rate web design? Since we, unfortunately, do not get any specific information, all we can do is speculate, which is something I love to do and then share my thoughts with you:
The video by John Müller
First of all, here is the video from John Müller, fast-forwarded to the relevant point. A webmaster had asked him whether changes to the layout, if everything else stayed the same, could affect the ranking on Google.
Does the design have a specific effect on the ranking?
John’s answer is “yes.” In the video, he speaks more of the title tag, internal linking, and other on-page factors, so by layout, he means much more than just the design.
He also addresses the so-called supplementary content. This describes everything that is not the main content. It is only logical that this additional content can also have an impact on the ranking.
But does that also apply to the design? Our observations have suggested this for many years. In my opinion, the design has been an integral part of the ranking calculation since Google Panda (2011) at the latest.
A bad design for the user can lead to bad rankings. A good design can have a positive effect. And, as John explains, even if only the layout changes and all the content and code on the website remains the same. This is supported by the fact that Google recommends rendering URLs in the Search Console and checking whether the rendered version matches the actual version.
How does the design actually affect the ranking?
Now we leave the obvious track and play a few mind games about what exactly could happen and how Google could “measure” design.
There are different possibilities:
Manual ratings by quality raters, in the sense of “Page Quality”: Google could trigger these precisely when the design changes across many URLs. Such a change could raise a so-called “flag,” a signal that something unusual is happening on a domain. After a flag, Google sends several different quality raters to the website, who then rate the URL. This is what Google (according to rumors) sometimes does with unnatural links, but it would also work for a redesign. If the rating of the URL changes, the overall rating changes with it.
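To make the “flag” idea concrete, here is a tiny sketch. It is pure speculation: the layout fingerprint, the function names, and the 50% threshold are all invented for illustration, not anything Google has documented.

```python
# Hypothetical sketch of the "flag" idea: if the layout fingerprint
# changes on a large share of a domain's URLs, raise a flag that could
# queue the domain for a manual quality-rater review.
# All names and thresholds here are invented for illustration.

def layout_changed(old_fingerprint: str, new_fingerprint: str) -> bool:
    """A layout fingerprint could be, say, a hash of the rendered DOM structure."""
    return old_fingerprint != new_fingerprint

def should_flag_domain(old: dict, new: dict, threshold: float = 0.5) -> bool:
    """Flag the domain if more than `threshold` of its URLs changed layout."""
    common_urls = old.keys() & new.keys()
    if not common_urls:
        return False
    changed = sum(layout_changed(old[u], new[u]) for u in common_urls)
    return changed / len(common_urls) > threshold

# Example: 3 of 4 URLs got a new layout after a redesign -> flag raised.
old = {"/": "a1", "/shop": "b2", "/blog": "c3", "/about": "d4"}
new = {"/": "x9", "/shop": "y8", "/blog": "z7", "/about": "d4"}
print(should_flag_domain(old, new))  # True
```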
Measurement of “good” by means of an AI: in my opinion, the worst possible variant, as only uniform designs would be rated as good. Novel solutions that are good for the user would lose. I myself once ran such a test at SMX Munich, in which a Google AI was supposed to rate shops as good or bad. The system worked wonderfully for the majority of shops.
But anyone who broke with traditional conventions and had an extraordinary but good design had no chance. AI wants to see conventions; machine learning thinks very little of innovation. With this method, Google would virtually freeze the web. In addition, AI is resource-intensive and, therefore, expensive. It would be possible, but hardly scalable or practical.
Use of user data / browser data / short clicks: Google has repeatedly confirmed that data from Google Chrome, Android, or plain click data are not directly included in the ranking. But they are used indirectly. The new Google Core Web Vitals also explicitly rely on field data. From the very deep insight into how First Input Delay or Largest Contentful Paint are calculated, you get a good idea of how hard Google tries to obtain objectively measurable values: if Google says “this is not good,” the verdict can be justified with hard numbers.
How could it be? My theory
I think it’s a mix of all three. The most important thing is objective measurability. When has a design failed? That must not be left to the judgment of one or a few people.
Manual ratings may still be used. Although they are officially only used to collect data for the engineers, this data is really very good, comes from trained quality raters, and there is a lot of it. The web is not as big as you think, and if Google were to rate every URL three times a year, they would have a good idea of how good each URL really is.
But I could also be grossly wrong here, and such data was and is never directly included in the page quality. I’ve heard rumors pointing in that direction, but I strongly doubt there’s anything to them. It would simply be a very fragile system: semi-automatic and somewhat makeshift.
AI data is probably not used at all. If it is, then far more rudimentarily than we SEOs imagine. Really complex AI is too expensive for such matters: it simply costs too much computing capacity. But I can imagine that an AI checks a single data point, for example “Is the main content front and center?”, and assigns a yes or no after rendering. AI can do this very reliably today, and it is not judgmental, so there is no risk of innovations being suppressed. If the main content is not immediately visible, all users are dissatisfied. With the Largest Contentful Paint, we already have a similar metric that Google has communicated publicly. All you would have to do is include a check of the form “the Largest Contentful Paint element is not the main content,” and the review is done.
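A yes/no check like that could look something like the sketch below. To be clear, this is my speculation made concrete: the element IDs, the viewport height, and the “half visible” threshold are all made up for illustration.

```python
# Speculative sketch of the "is the main content front and center?" check.
# We assume a renderer tells us two things: which element painted as the
# Largest Contentful Paint, and where the main-content element sits on
# the page. Every name and threshold here is a made-up illustration.

def lcp_is_main_content(lcp_element_id: str, main_content_id: str) -> bool:
    """The check suggested in the text: the largest painted block
    should be the main content itself."""
    return lcp_element_id == main_content_id

def front_and_center(main_top: int, main_height: int,
                     viewport_height: int = 900) -> bool:
    """Yes/no after rendering: is at least half of the main content
    visible without scrolling?"""
    visible = max(0, min(main_top + main_height, viewport_height) - max(main_top, 0))
    return visible >= main_height / 2

# An article whose body paints as the LCP and starts near the top passes:
print(lcp_is_main_content("article-body", "article-body"))   # True
print(front_and_center(main_top=200, main_height=1000))      # True
# A page where a cookie banner paints largest fails the first check:
print(lcp_is_main_content("cookie-banner", "article-body"))  # False
```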
User data / browser data / short clicks: what you keep hearing from Google engineers is that user data is very noisy, meaning there is a lot of background noise that is difficult to separate out. I don’t think this data is used for keyword-related ranking.
No algorithm is likely to say, “We’re banishing this page to page two for the keyword ‘buying handbags’ because the short clicks are too high.” But maybe there is a metric like overall page quality that says, “The short-click rate of this domain is very high, so it’s probably not of high quality.”
Perhaps once more, to explain: it is a short click when a user searches for “buy handbags,” clicks on the first result, visits the website, and then goes back to the Google search and clicks on the second (or another) result. In that case, you can say very reliably that the user was not satisfied with the first result.
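That definition is simple enough to express in a few lines of code. Again, a hedged sketch: the event format and the 30-second return window are my assumptions for illustration, not anything Google has published.

```python
# Sketch of the short-click definition given above: a click counts as
# "short" if the user returns to the results page and clicks another
# result shortly afterwards. Event format and the 30-second window are
# assumptions for illustration only.

def count_short_clicks(events, window_seconds=30):
    """events: list of (timestamp, action, result_rank), sorted by time.
    action is 'click' (result clicked) or 'return' (back to the results page)."""
    short = 0
    for i, (t, action, rank) in enumerate(events):
        if action != "click":
            continue
        # Look ahead: did the user come back and click a different result soon?
        for t2, action2, rank2 in events[i + 1:]:
            if t2 - t > window_seconds:
                break
            if action2 == "click" and rank2 != rank:
                short += 1
                break
    return short

session = [
    (0, "click", 1),     # clicks result 1 ...
    (8, "return", None),
    (9, "click", 2),     # ... comes back and tries result 2: short click on #1
    (120, "click", 2),   # much later revisit: not a short click
]
print(count_short_clicks(session))  # 1
```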
And here, like Google, I can make very objective observations when the web design changes. If the short clicks decrease after a relaunch, the users are generally more satisfied. The overall page quality increases, and with it, the rankings in the long term. Another factor in favor of browser and user data is that after an unsuccessful redesign, the rankings do not fall straight into the basement but rather decline steadily, in a staircase-like manner – provided no serious technical errors such as missing redirects, robots.txt blocks, or unwanted noindex tags have crept in.
In a nutshell:
In a nutshell, there could be two types of data with which Google evaluates web design. Objectively and directly measurable data that Google can easily collect, such as “Is the main content front and center?”. And then indirectly observable data such as short clicks and browser data. From these, a quality value is probably calculated for the entire domain, or perhaps more likely per page type, and then included in the website’s quality assessment.
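Folding those two data types into one per-domain quality value could look like this. To be very clear: the weights, the formula, and the function names are purely invented to illustrate the shape of my theory, nothing more.

```python
# Purely speculative sketch of how the two data types could be combined:
# a directly measurable layout signal and an indirectly observed
# short-click rate. Weights and formula are invented for illustration.

def page_quality(front_and_center: bool, short_click_rate: float) -> float:
    """Combine one hard layout signal with one user signal per URL."""
    layout_score = 1.0 if front_and_center else 0.0
    user_score = 1.0 - short_click_rate  # fewer short clicks -> happier users
    return 0.4 * layout_score + 0.6 * user_score

def domain_quality(pages):
    """pages: list of (front_and_center, short_click_rate), one per URL.
    The domain value is just the average over its pages."""
    scores = [page_quality(f, r) for f, r in pages]
    return sum(scores) / len(scores)

pages = [(True, 0.2), (True, 0.5), (False, 0.3)]
print(round(domain_quality(pages), 3))  # 0.667
```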
There are already examples of both of my theories that work similarly and that Google will – at least soon – incorporate into the ranking. Firstly, the Largest Contentful Paint, a value that automatically records when the largest visible block of content has loaded. To do this, of course, Google needs to know what the largest block of content is – and that can already be done automatically.
Secondly, there are those Core Web Vitals that only work really well with so-called field data. These are data from real users that are much more accurate than projections. First Input Delay, also part of the Core Web Vitals, works with such field data. The Chrome User Experience Report works with this field data too, more precisely with Google Chrome data.
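Field data like this is typically summarized at a percentile rather than an average; Google assesses the Core Web Vitals at the 75th percentile of field measurements. A minimal sketch of that summary, using the simple nearest-rank method (the sample values are made up):

```python
import math

def percentile_75(samples):
    """75th percentile of field samples using the nearest-rank method:
    the value below which 75% of measurements fall."""
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

# Hypothetical LCP field samples in milliseconds from real page loads:
lcp_samples = [1200, 1800, 2100, 2600, 3400, 1500, 2900, 2200]
print(percentile_75(lcp_samples))  # 2600
```

A single slow outlier therefore barely moves the value, while a design that is slow for a quarter of real users shows up immediately, which is exactly why field data is so much harder to game than lab measurements.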
There it says: “The Chrome User Experience Report provides user experience metrics for how real-world Chrome users experience popular destinations on the web.” So Chrome user data is not necessarily used for ranking, but it is used for evaluating the quality of a website.
What does that mean for your redesign/relaunch?
The design has an impact on ranking. It is presumably measured objectively: on the one hand based on hard facts, while on the other hand Google Chrome user data probably plays a role.
The latter explains why the ranking after an unsuccessful design relaunch often slips toward the basement step by step, like a staircase: the user data must first be updated, so changes only become noticeable over time. The same thing happens with good relaunches.
Often the ranking is almost the same at the beginning of a good relaunch; the positive consequences usually only come in the following weeks and not immediately afterward.
In principle, this means nothing new for your redesign: keep the user in focus and take care of an appealing and, above all, usable design. Nonetheless, in the future you should keep a close eye on the Google Core Web Vitals – but you should do that anyway.
What is your opinion on the topic? Am I right? Am I wrong? I am looking forward to your input!