Absolute vs. Relative URLs

You can use both relative and absolute URLs for internal linking. Both forms serve their purpose; however, in addition to their advantages, relative URLs also have disadvantages that can negatively affect the SEO performance of your site. In this article, we will show you these problems and provide the appropriate solutions.

What are relative and absolute URLs?

If absolute URLs are used for internal links, the complete URL including protocol and domain is specified:

<a href="https://www.digitalarg.com/seo">Search engine optimization</a>

In the case of relative URLs, however, only the path to the linked document is specified:

<a href="/seo">Search engine optimization</a>

Both forms can be used equally for internal linking.

Advantages of relative URLs

Relative URLs have several advantages over absolute URLs:

  • They are faster and easier to program.
  • If relative URLs are used, moving a new page from a test system to the actual domain is easier because the internal links do not have to be adjusted.
  • Pages load slightly faster with relative URLs because fewer characters have to be transferred. However, the difference amounts to fractions of a second that can hardly be measured on normal pages.

For the reasons mentioned above, using relative URLs in internal linking is a common practice in web development.

Disadvantages of relative URLs

Relative URLs can also have negative effects. This is especially the case if the programming is not clean:

  • Duplicate content arises because the domain can be reached under different protocols (http:// vs. https://) or with different subdomains (often with and without www.). This creates a massive obstacle to indexing if Google crawls the wrong variants.
  • The crawl budget is used up on URLs that should not end up in the Google index. In case of doubt, Google will then not find the pages that are important to you.
  • Incorrect URLs can be generated if Google (whose crawler currently renders pages with a recent version of the Chrome browser) interprets the relative paths differently than intended.

In the following, we will look at each of these factors in more detail and also show ways to reduce or even eliminate the adverse effects.

Duplicate content through relative URLs

Duplicate content always occurs when the same content can be found under different URLs. If a website is technically not set up properly, it can happen, for example, that the same content can be found under the following URLs:

https://www.example.com
https://example.com

http://www.example.com
http://example.com

Using relative URLs increases the risk that a non-standard version of the website will be completely crawled and indexed. Imagine that the standard version you would like to have indexed is https://www.example.com. If, while crawling the web, the crawler now lands on a URL of the non-standard version http://example.com and follows the relative internal links from there, it inevitably stays on the non-standard version the whole time – precisely because only the path, and no absolute URL, is specified for the individual sub-pages.

If, on the other hand, absolute URLs are used in the internal links, the crawler automatically lands on the standard version of the domain as soon as it follows the first internal link. The risk of a non-standard version being completely crawled and indexed is therefore much higher with relative URLs.
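
To illustrate (a sketch using the example.com variants from above, with https://www.example.com as the standard version): suppose the crawler has reached a page on the non-standard host http://example.com. The two link types then behave differently:

<!-- Relative link: resolved against whatever host the crawler is currently on.
     On http://example.com it becomes http://example.com/seo,
     so the crawler stays on the non-standard version. -->
<a href="/seo">Search engine optimization</a>

<!-- Absolute link: always points to the standard version,
     no matter which host variant the crawler arrived on. -->
<a href="https://www.example.com/seo">Search engine optimization</a>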

The problem with this: In this case, it is up to Google to find out which of the four versions is the standard version that is to be indexed and found by the users. So there is a risk that the search engine will index a page other than the one you perceive to be the standard version.

Ideally, there is an automatic 301 redirect to the correct one of these four variants. This is defined on the server side and should cover the three wrong variants. The setup itself is done quickly by technically savvy people, but you should test whether all components (e.g. internal search, extensions and plug-ins, integrated forms, …) continue to work afterwards. In practice, this setting is often not made, so that all four variants remain reachable.

Crawl budget waste

Far worse, however, is wasting valuable crawl budget. Google defines a certain crawl budget for every website, which depends, among other things, on the authority and the complexity of the page as well as on how often new content is published. The crawl budget, in turn, determines how many pages of the domain are actually read by the search engine crawler and can accordingly get into the index.

If each piece of content can be reached under several URLs, the search engine has to spend considerably more crawl budget to crawl all of them. However, since the crawl budget is not unlimited, there is a risk that the same content will be read several times (under different URLs) while other important URLs are not crawled at all.

If the above solution with automatic redirects has already been implemented, there is little left to do here. For large sites, it is advisable to check the log files to see whether the crawler is still visiting the old variants despite the redirects that have been set up.

Incorrect URL interpretation

Unfortunately, in addition to the two errors above, we often find another one that is mainly caused by sloppy implementation. If no base URL is specified, it is up to each program how it interprets the respective links.

Example: Suppose the user is currently on the following URL:

www.example.com/category1

A relative link could look like this:

<a href="product1.html">Product 1</a>

Depending on whether the current URL is interpreted as ending in /category1 or /category1/ (and on which base URL applies), this becomes one of the following two URLs:

www.example.com/category1/product1.html

www.example.com/product1.html

If the resolved URL is wrong, the crawler, already on the wrong track, can get further lost through the subsequent relative links. We have seen sites where the number of incorrectly crawled URLs was a multiple of the correct number of pages. You can check in Google Search Console whether your own domain is affected by such a misinterpretation: under the menu item “Coverage”, among other things, the status “Duplicate without user-selected canonical” is listed.

As a first step, this error should be dealt with using the HTML base element. Compared to the next option, this is information that can usually be inserted easily and with relatively little risk of error.
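
A minimal sketch of what this could look like (www.example.com again only stands in for your standard version):

<!-- Placed in the <head> of every page: relative links on the page are then
     resolved against this base URL instead of against the current page's URL. -->
<base href="https://www.example.com/">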

A canonical tag can also be used. However, the site-wide implementation is often flawed: the information is not always correct, especially for extensions such as news sections or automatically generated job listings, and without testing this would introduce additional errors. On the other hand, a correctly set up canonical also has a positive effect on other technical SEO problems.
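
For illustration, a canonical tag on the product page from the example above could look like this (assuming www.example.com/category1/product1.html is the intended URL):

<!-- Placed in the <head> of the page: tells Google which URL is the preferred
     (canonical) version, even if the page was reached under a duplicate URL. -->
<link rel="canonical" href="https://www.example.com/category1/product1.html">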

A side effect of absolute URLs: A small hurdle for scrapers

There are scrapers that copy the content of other people's domains to a new domain. Usually, the goal is to achieve good rankings with little effort and then earn money through advertising.

If you use relative links for internal linking, you will make it unnecessarily easy for scrapers: the internal links would also work perfectly on the scraped version.

With absolute links, the scraper first has to replace them for its own site. This is done quickly, since only the domain name has to be swapped across the entire page, but quite a few scrapers do not even go to this (very small) amount of trouble. With absolute URLs, you can at least ensure that not every scraper domain works out of the box.

Tip: Report scraper pages directly to Google as webspam at https://support.google.com/webmasters/answer/93713?hl=en

Summary: Do this if you are using relative URLs

What can you do to solve these problems?

  • First of all, the duplicate content problem must be resolved on the server side: all URLs of the alternative versions must be redirected to the corresponding URLs of the standard version via 301 redirects.
  • In the next step, the relative links should be converted into absolute links.
  • If for some reason it is not possible to convert the relative URLs into absolute URLs, it is advisable to use the HTML base element as an alternative.
  • In addition, a canonical tag pointing to the respective standard URL itself can be set on every URL of the standard version.

Google is now pretty good at figuring out which of several duplicate URLs is the actual standard version that should be indexed. Nonetheless, you should avoid duplicate content if only because of the crawl budget. Relative links should always be treated with caution; therefore, whenever possible, use absolute URLs in your internal links unless there are specific reasons against it.