Being found in search engines is a basic requirement for a successful online business. And yet there are situations where it is better to keep specific pages out of the search results. In this article, you can find out why, how exactly this works, and what the “noindex” directive has to do with it.
- What does No Index mean?
- The procedure of the search engines: crawling and indexing
- How to implement a no-index statement
- Why Google no longer allows no index for robots.txt
Today we have an incredible amount of information online. To still deliver the best results for search queries, the search engines build an index of all important content.
In doing so, they store websites, their content, keywords for that content, and links, and they also remove pages from the index again. The index is therefore very dynamic and changes every day.
To be found on the internet at all, indexing is essential for you. Incidentally, this applies not only to Google Search but also to other search engines such as Bing or Yahoo.
Now, there is also content that is not relevant for search engines and their queries, or that you simply do not want indexed. That is exactly what the noindex directive is for.
It belongs to the so-called robots meta tags and helps make your site more SEO-friendly. Above all, the noindex tag is used to keep duplicate content out of the index, so that search engines and users get the best possible content in the search results. Ultimately, it helps avoid confusion and frustration among users, and with that a poor ranking for you. It’s all about the best possible Google positioning.
Noindex can be useful for the following content, for example:
- Privacy policies
- Data protection notices
- Thank-you pages after an order
The procedure of the search engines: crawling and indexing
So that the search engines can deliver the best and most up-to-date results, the index must, of course, also be constantly up to date.
However, searching through this mass of information by hand is simply not possible. This is why so-called robots, or bots, are used. They scour the internet independently and continuously, following the links and content of websites. This is known as crawling.
Each bot can only take in a certain amount of information, which it then stores in the search engine’s register. This process is called indexing. So in order for you to be found at all, your page has to be indexed.
You can imagine the whole thing in a similar way to a library. Instead of searching through every book again for each request, a catalog is created.
The information about each book is then stored under the terms people search for, such as title or author. Collecting the information from the books corresponds to crawling online. Recording it is the indexing, and the finished catalog is the search engine’s register. You will only be found if you are in it.
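To make the library picture a little more concrete, here is a minimal Python sketch of such a catalog. It is a toy model, not how a real search engine works: the pages in `PAGES`, their texts, and the `build_index` helper are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical mini "web": URL -> (page text, whether the page asks for noindex)
PAGES = {
    "/shop": ("red shoes and blue shoes", False),
    "/blog": ("how to clean shoes", False),
    "/thank-you": ("thank you for your order", True),
}

def build_index(pages):
    """Map each word to the sorted list of URLs containing it, skipping noindex pages."""
    index = defaultdict(set)
    for url, (text, noindex) in pages.items():   # "crawling": visit every page
        if noindex:
            continue                             # the page stays out of the catalog
        for word in text.lower().split():        # "indexing": file the page under its terms
            index[word].add(url)
    return {word: sorted(urls) for word, urls in index.items()}

index = build_index(PAGES)
print(index["shoes"])      # the pages filed under "shoes"
print("order" in index)    # the noindex page contributed no entries
```

A search for "shoes" only has to look up one catalog entry, and the thank-you page cannot be found because it was never filed.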
Because the bots’ capacity is limited, it pays to make their work as easy as possible. On the one hand, you can do this by removing irrelevant pages from the search. On the other hand, you can deliberately mark links that lead to such pages with a “do not follow” instruction.
The way to do this is the “nofollow” directive. Nofollow slims down your site for the bots, so that they skip unnecessary paths and use their resources more sensibly, which also supports a better SEO rating for you. The combination of the two directives, noindex and nofollow, plays an especially important role.
Bots always check first which rules apply to the respective website. And that is exactly where your nofollow and noindex tags come in.
How to implement a no-index statement
Of course, it all depends on which system you are working with. For example, if you use a website builder, most of these systems already have this function built in.
If you want to check this in detail or use it for other sites, it is best to contact your provider. If you use WordPress, you can use the “Yoast SEO” plugin, for example. There you can open the dropdown menu for the option “Meta robots index” under Settings and select the respective directive. If, on the other hand, you program your website yourself, it depends on the variant you use: the robots meta tag or the X-Robots-Tag. Both are described below.
Of course, you can also specify several directives with either variant. Use the same meta name and list the directives, separated by commas, in “content.” The bot reads them and acts accordingly. When using multiple directives, pay attention to how they combine.
For example, combining noindex with blocking a page in robots.txt would hurt you. Because the page is blocked, the bot never reads the “noindex” directive, and the page may still end up in the search results.
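The conflict described above can be checked with a few lines of Python from the standard library. This is only a rough sketch: `check_page` and `RobotsMetaParser` are hypothetical helper names, the example.com URLs are placeholders, and Python’s `urllib.robotparser` does not necessarily interpret every robots.txt exactly as Googlebot does.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots" ...> tag."""
    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            # Several directives are separated by commas, e.g. "noindex, nofollow"
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )

def check_page(robots_txt, url, html, user_agent="*"):
    """Report whether a page carries noindex but is blocked from crawling."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    crawlable = rules.can_fetch(user_agent, url)
    meta = RobotsMetaParser()
    meta.feed(html)
    noindex = "noindex" in meta.directives
    return {"crawlable": crawlable, "noindex": noindex,
            "conflict": noindex and not crawlable}

# Illustrative input: the page is disallowed AND carries noindex
robots_txt = "User-agent: *\nDisallow: /thank-you\n"
html = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(check_page(robots_txt, "https://example.com/thank-you", html))
```

If `conflict` is true, a bot that obeys the robots.txt rule never fetches the page and therefore never sees the noindex directive.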
Using robots meta tag
The meta tags are placed in the head section of a page and carry the specific instructions. You can also decide whether a rule should apply to all bots or only to certain ones.
For example, you might want to appear in the regular search results but not in Google News. To achieve this, do not enter “robots” as the value of the “name” attribute, but the specific bot name. In our example, this would be “googlebot-news.” The entry looks like this:
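<!DOCTYPE html>
<meta name="googlebot-news" content="noindex">
Using X-Robots-Tag
With the X-Robots-Tag, the same instruction is sent in the HTTP response header instead of in the page itself:
HTTP/1.1 200 OK
X-Robots-Tag: noindex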
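If you program your site yourself, the header has to be set by your server or application. The following Python sketch shows the idea with a tiny WSGI application. The rule that only PDF files receive the header is an assumption chosen for illustration (the header is handy for files that cannot carry a meta tag), and `app` and `response_headers` are hypothetical names.

```python
def app(environ, start_response):
    """Tiny WSGI app: mark PDF downloads as noindex via an HTTP header."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if path.lower().endswith(".pdf"):        # assumed rule, purely illustrative
        headers.append(("X-Robots-Tag", "noindex"))
    start_response("200 OK", headers)
    return [b"Hello"]

def response_headers(path):
    """Call the app directly (no network needed) and return its headers as a dict."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    b"".join(app({"PATH_INFO": path}, start_response))
    return captured["headers"]

print(response_headers("/files/price-list.pdf"))   # includes X-Robots-Tag
print(response_headers("/index.html"))             # no X-Robots-Tag
```

To try it in a browser, the same `app` could be served with `wsgiref.simple_server.make_server` from the standard library. On a real site, the header is more commonly set in the web server configuration.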
Why Google no longer allows no index for robots.txt
The robots.txt file implements the Robots Exclusion Protocol (REP), which has been used since the 1990s to tell crawlers which areas of a website they may crawl. In 2019, Google proposed making the REP an official internet standard and, at the same time, introduced a few changes that took effect in September 2019.
Since then, Google’s bots no longer recognize the noindex, nofollow, and crawl-delay rules in robots.txt, so they are not executed. According to Google, these rules were never officially documented and were hardly used: in all but 0.001 percent of robots.txt files, they were contradicted by other rules anyway.
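If you are not sure whether your own robots.txt still contains such rules, a small script can point them out. The following Python sketch is a simple line-based check, not a full robots.txt parser, and `unsupported_rules` as well as the sample file are made up for this example.

```python
# Rules that Google no longer honors inside robots.txt
UNSUPPORTED = ("noindex", "nofollow", "crawl-delay")

def unsupported_rules(robots_txt):
    """Return (line number, line) pairs for rules Google ignores in robots.txt."""
    found = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()          # drop comments and whitespace
        field = line.split(":", 1)[0].strip().lower()
        if ":" in line and field in UNSUPPORTED:
            found.append((number, line))
    return found

# Illustrative robots.txt content
sample = (
    "User-agent: *\n"
    "Disallow: /private/\n"
    "Noindex: /thank-you\n"
    "Crawl-delay: 10  # slow down\n"
)
for number, line in unsupported_rules(sample):
    print(f"line {number}: {line}")
```

Every line the script reports should be replaced by one of the alternatives that still work.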
If you still want to apply these rules, Google points to the following alternatives:
- Use the X-Robots-Tag in the HTTP header or the robots meta tag for your noindex directive. This is the most effective variant and the one Google itself recommends.
- Protect the content with a password. Content behind a login is generally not included in the index.
- Use the disallow rule in robots.txt as a kind of stop sign, so that crawlers do not crawl the content. Your page will then appear much less often in the search results, but it can still show up through links from other sites.
- Use the “Remove URL” tool in the Google Search Console. This allows you to temporarily remove a URL from the search results.
- Return the 404 or 410 status code to tell the crawler that the page does not exist. What does not exist is not indexed.
This way, you can continue to apply these rules and use them for better SEO on your site.