An XML sitemap provides an overview of all the URLs on your website and helps Google crawl it. Have you heard of XML sitemaps before, but don’t know what to do with them or whether you need one? In this blog post, you will learn how a sitemap is structured and what needs to be in it. Then I’ll show you how to create a sitemap and how to submit it to Google.
What is a sitemap?
A sitemap lists all the subpages of your website that should be indexed by Google. Since it is written in the standardized XML format, it is also called an XML sitemap. With a sitemap, you help Google crawl your website. It is usually located in the main directory of a domain and can be accessed there. The file consists of XML code with one entry per URL.
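To make the structure concrete, here is a minimal Python sketch that builds such a file with the standard library. The URL https://www.example.com/, the priority of 0.8, and the function name build_sitemap are assumptions chosen for illustration; the lastmod and changefreq values follow the example discussed in this article.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)  # write the sitemap namespace without a prefix


def build_sitemap(entries):
    """Return sitemap XML (bytes) for a list of dicts with 'loc' and optional tags."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:  # only loc is required, the rest is optional
                ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}").text = entry[tag]
    ET.indent(urlset)  # pretty-print; requires Python 3.9+
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


sitemap_xml = build_sitemap([
    {
        "loc": "https://www.example.com/",  # placeholder URL
        "lastmod": "2005-01-01",
        "changefreq": "monthly",
        "priority": "0.8",  # assumed value
    },
])
print(sitemap_xml.decode("utf-8"))
```

The printed file starts with the XML declaration, followed by a urlset element that holds one url entry with loc, lastmod, changefreq, and priority as children.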
Which elements does Google consider?
- The first two lines of the file specify that UTF-8 encoding is used and define the XML schema for the sitemap.
- loc denotes the URL (location) that is listed in the sitemap.
- lastmod shows when the URL was last changed. A value of 2005-01-01, for example, stands for January 1st, 2005. With this entry, the search engine recognizes when the page was updated and, above all, whether it is worth crawling the page again. The notation follows the W3C datetime format, with the year first and the day last. Google takes the lastmod element into account when crawling.
- changefreq is optional and describes the change frequency, i.e. how often the URL is likely to change. The following values are accepted: always, hourly, daily, weekly, monthly, yearly, never. With always, the URL changes with each request; never is only used for archived URLs. A value of monthly, for example, means that the URL is (most likely) updated every month. changefreq is only a recommendation and not an order for Google: a URL with hourly can be crawled less often, and a URL with yearly can be crawled more than once a year. Even URLs marked never are visited by the crawler every now and then. If you are not sure what to enter here, you can leave this element out.
- priority indicates how important the URL is within the entire domain. The values range from 0.0 to 1.0, and the default is 0.5. Google ignores this value, so it can also be omitted.
John Mueller from Google himself says in his tweet that lastmod is considered by Google, while priority is not relevant. You can find more details about the XML elements of a sitemap at sitemaps.org.
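Since lastmod has to follow the W3C datetime format, it can help to generate the value in code rather than typing it. The following sketch shows one way to do this in Python; the function names lastmod_date and lastmod_datetime are hypothetical.

```python
from datetime import date, datetime, timezone


def lastmod_date(day):
    """Date-only form of the W3C datetime format: YYYY-MM-DD."""
    return day.isoformat()


def lastmod_datetime(moment):
    """Full form with time and UTC offset; expects a timezone-aware datetime."""
    return moment.astimezone(timezone.utc).isoformat(timespec="seconds")


print(lastmod_date(date(2005, 1, 1)))
print(lastmod_datetime(datetime(2005, 1, 1, 12, 30, tzinfo=timezone.utc)))
```

The date-only form is usually enough; the longer form is only useful if pages change several times a day.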
XML, RSS, Text: What other formats are there?
Google accepts various sitemap formats. The most common is the XML format described above. In addition, Google can also read other formats.
- RSS: If you have a blog with an RSS feed, you can also submit the URL of your feed. RSS 2.0 and Atom 1.0 feeds are accepted. With a media RSS feed, you can provide Google with information about videos on your website. Remember that the RSS feed only contains current URLs.
- Text file: If your sitemap should only contain website URLs and no other information, you can also create a text file. There is only one URL in each line, and the whole thing is saved in .txt format.
- Google Sites: If you have created your website with Google Sites, the sitemap is generated automatically. However, it is not submitted to Google automatically; you have to do that yourself. I explain how to submit a sitemap further down in this article.
- XML: In my opinion, the best solution for your sitemap. The standardized format ensures that Google receives all the information it needs.
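The text file variant from the list above is simple enough to sketch in a few lines of Python. The URLs and the function name build_text_sitemap are placeholders for illustration.

```python
def build_text_sitemap(urls):
    """Return the content of a text sitemap: exactly one URL per line."""
    return "\n".join(urls) + "\n"


text_sitemap = build_text_sitemap([
    "https://www.example.com/",
    "https://www.example.com/blog/",
])
print(text_sitemap, end="")

# To save it, write the string to a UTF-8 encoded .txt file, for example:
# open("sitemap.txt", "w", encoding="utf-8").write(text_sitemap)
```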
A sitemap is good for ranking, right?
A sitemap is not a direct ranking factor, but it helps Google to find your content more easily and to recognize changes quickly. Especially with new websites, it pays to notify Google as soon as possible that there are new URLs. Your pages will be indexed faster, and you can also control directly which pages should be included in the index.
Do I also need an HTML sitemap?
In contrast to the XML sitemap, the HTML sitemap enables your users to find their way around the page, like a kind of table of contents. The HTML sitemap does not replace the XML sitemap but can be seen as a supplement to it.
The HTML sitemap gives your users information about the categories and the structure of the site. Unlike the XML sitemap, the HTML sitemap is its own subpage, which is usually linked in the footer and is therefore visible to your users.
Why do I need an XML sitemap?
As you have already read above, a sitemap is not a direct ranking factor, and some websites do not have one at all. Nevertheless, a sitemap has a few advantages:
- Google recognizes changes faster: If new URLs are added, you can inform Google of this with a sitemap. This will help Google crawl.
- New websites are indexed faster: Your website is still fresh, and Google first has to know that it exists. With your sitemap, you can actively tell Google that there is something new.
- Your pages are not linked to one another: If your content pages are not linked to one another, you can use a sitemap to ensure that Google finds them anyway, so they won’t be overlooked when crawling. Of course, a sitemap does not replace well-thought-out internal linking!
- Your site is extensive: If you have a lot of URLs on your site, a sitemap reduces the likelihood that something will not be crawled.
- You have rich media content or want to appear in Google News: Google can also take additional information from the sitemap into account here. For Google News, you will need a separate sitemap.
So you see, there are good reasons for a sitemap. Google itself says that a sitemap does not guarantee that everything will actually be crawled, but there are definitely no disadvantages.
What if I don’t have a sitemap yet?
Then create one! You can either easily create it in your content management system (CMS), or you can create it manually. But I recommend the first variant. You will find out why in a moment.
How big can a sitemap be?
A sitemap can contain a maximum of 50,000 URLs and can be a maximum of 50 megabytes in size (uncompressed). If you have a larger website, your sitemap needs to be split into several files. Depending on which CMS you are using, this can happen automatically. The individual sitemaps are then linked in a sitemap index file.
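As a rough illustration of the splitting, the following Python sketch cuts a URL list into chunks of at most 50,000 entries and builds the index file that links the individual sitemaps. The file names sitemap-1.xml and so on, the example.com URLs, and the function names are assumptions; a CMS plugin may use a different naming scheme.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)
MAX_URLS = 50_000  # limit per sitemap file


def chunk_urls(urls, size=MAX_URLS):
    """Split the URL list into pieces that each fit into one sitemap."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return the XML (bytes) of an index file that links the given sitemaps."""
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)


all_urls = [f"https://www.example.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(all_urls)
child_sitemaps = [
    f"https://www.example.com/sitemap-{i}.xml"  # hypothetical file names
    for i in range(1, len(chunks) + 1)
]
index_xml = build_sitemap_index(child_sitemaps)
print(len(chunks), "sitemap files are needed")
```

Each chunk would then be written to its own sitemap file, and only the index file has to be submitted.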
What criteria does a sitemap have to meet in order for Google to be happy?
For your sitemap to be error-free and accepted by Google, some requirements must be met. I’ll tell you what these are.
- Your sitemap file must be UTF-8 encoded, and special characters in URLs (such as &, <, >, and quotation marks) must be replaced with the appropriate escape codes.
- Make sure that your sitemap only contains URLs from the same domain. If you have multiple domains, each domain needs its own sitemap.
- Your sitemap may only contain URLs that should be indexed and are actually accessible. You can see possible errors in the sitemap in the Google Search Console.
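Two of these requirements are easy to check in code. The sketch below escapes special characters in a URL and lists URLs that belong to a different host than the sitemap itself. The function names and URLs are hypothetical; note that XML libraries such as ElementTree already escape text for you, so manual escaping is only needed when you write the XML as plain strings.

```python
from urllib.parse import urlsplit
from xml.sax.saxutils import escape


def escape_loc(url):
    """Replace &, <, >, and quotation marks with their XML escape codes."""
    return escape(url, {"'": "&apos;", '"': "&quot;"})


def foreign_urls(urls, sitemap_url):
    """Return all URLs whose host differs from the host of the sitemap."""
    host = urlsplit(sitemap_url).netloc
    return [url for url in urls if urlsplit(url).netloc != host]


print(escape_loc("https://www.example.com/search?q=shoes&page=2"))
print(foreign_urls(
    ["https://www.example.com/blog/", "https://shop.example.org/"],
    "https://www.example.com/sitemap.xml",
))
```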
Use consistent URLs
Google crawls your URLs exactly as you list them in your sitemap. So be consistent and don’t mix different spellings, for example with and without www or with and without a trailing slash.
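One way to spot inconsistent spellings is to group URLs that differ only in the www prefix or a trailing slash. This is a minimal sketch under that assumption; the function names are made up for the example, and which variant you keep is up to you.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def variant_key(url):
    """Reduce a URL to a key that ignores 'www.' and a trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")  # Python 3.9+
    path = parts.path.rstrip("/") or "/"
    return (host, path, parts.query)


def find_inconsistent(urls):
    """Return groups of URLs that are different spellings of the same page."""
    groups = defaultdict(set)
    for url in urls:
        groups[variant_key(url)].add(url)
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(find_inconsistent([
    "https://www.example.com/blog",
    "https://www.example.com/blog/",
    "https://www.example.com/about",
]))
```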
How to create an XML sitemap in the content management system
Most content management systems like WordPress have a corresponding extension with which you can easily create a sitemap for your website. Let’s take a look at how you generate a sitemap in WordPress.
First, you need a plugin that will help you create it. When choosing the plugin, make sure that it is well written. You can tell, for example, from the ratings and the number of active users. Well-written plugins respect rel=canonical and noindex, while bad plugins simply add everything to the sitemap. In this example, we are using the Yoast plugin. It’s that easy:
- Go to “General” in the Yoast settings and then click on “Features” at the top.
- Activate “XML sitemaps”. By clicking on the question mark, you can display further information. If you have activated the function, Yoast automatically creates an XML sitemap for your site.
With a click on “View the XML sitemap” your sitemap will be opened in a new tab. You will need the link to this later if you want to submit the sitemap to Google.
If you would like posts to be excluded from the search results, you can set this under “Display in search” and there under “Content types”. Your posts then get the robots meta tag noindex and are not included in the sitemap. Since I would very much like my posts to appear in the search results, the switch remains on “Yes”.
Under Taxonomies, you can decide whether categories should also be displayed in the search results. The same applies here: If the switch is off, category pages are set to noindex and are therefore not listed in the sitemap.
The practical thing about having your sitemap created by the CMS: it is always up to date and less prone to errors. The bigger your site gets, the more difficult it is to keep track of the content, especially if you make changes manually. That is why I recommend that you always have XML sitemaps created automatically.
How to manually create an XML sitemap for your website
Alternatively, you can also create your sitemap manually. You should really only do this if you are not using a CMS. But remember: if you generate your sitemap manually, you will have to recreate it every time a URL changes. That is why I recommend using a tool for this too and under no circumstances assembling your sitemap by hand.
You can use this tool for this, for example. It even recognizes Noindex and Canonical elements and does not add the corresponding URLs to the sitemap. There is also a pro version of this tool that automatically updates the sitemap when changes are made. If you only use the normal version, you will have to generate a new sitemap every time your site changes. You can quickly lose track of things.
It is best to always create your sitemap automatically. Otherwise, errors can creep in quickly, which then lead to indexing problems. You can find out which elements lead to errors in the next paragraph.
What doesn’t belong in your sitemap
Unfortunately, it happens again and again that sitemaps contain elements that do not belong there. I have already written that your sitemap can only contain information that is intended to be indexed and actually accessible. If you have faulty pages or redirects in your sitemap, there are problems with crawling. These elements have no place in your sitemap:
- Duplicates of a URL: Only the correct version of each URL should be indexed. So there is no point in including digitalgarg.com/blog and digitalgarg.com/blog/ in the sitemap at the same time. Choose one of the two versions.
- URLs with a canonical tag pointing elsewhere: If a page has a canonical tag that points to a different URL, this is a sign for Google that the page itself should not be indexed. If it appears in your sitemap anyway, you send contradictory signals, because everything that is listed in a sitemap should also be indexed. List only the canonical URL in your sitemap to avoid crawling conflicts.
- Session IDs: If there are session IDs in the URL of a page, a unique link is generated each time the page is visited. Since the link changes every time you visit the page, it looks like duplicate content to the Google bot.
- Pages with status code 404/410: These pages give an error and have no place in your sitemap. Either delete the relevant entries from the sitemap or make sure that the links work again.
- Redirects: Only unique URLs should be listed in your sitemap. Redirects lead the Google bot astray.
- Pages with a noindex tag: As with the canonical tag, conflicting signals are sent when you include pages with a noindex tag in your sitemap. These pages have to stay out.
- Images: Your normal sitemap only lists URLs of content pages. If you have a lot of images that you want to have indexed, use an image sitemap. I’ll explain more about this below.
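The exclusion rules from this list can be expressed as a simple filter. The sketch below assumes you already have crawl data for each page (status code, noindex flag, canonical target); the Page class, its field names, and the session parameter names are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlsplit

SESSION_PARAMS = {"sessionid", "sid", "phpsessid"}  # assumed parameter names


@dataclass
class Page:
    url: str
    status: int = 200                # HTTP status code of the page
    noindex: bool = False            # True if the page carries a noindex tag
    canonical: Optional[str] = None  # target of rel=canonical, if any


def has_session_id(url):
    """Check whether the query string contains a session parameter."""
    query = parse_qs(urlsplit(url).query)
    return any(name.lower() in SESSION_PARAMS for name in query)


def belongs_in_sitemap(page):
    if page.status != 200:  # drops 404/410 pages and redirects
        return False
    if page.noindex:
        return False
    if page.canonical is not None and page.canonical != page.url:
        return False        # canonical tag points to another URL
    return not has_session_id(page.url)


pages = [
    Page("https://www.example.com/"),
    Page("https://www.example.com/old", status=404),
    Page("https://www.example.com/moved", status=301),
    Page("https://www.example.com/private", noindex=True),
    Page("https://www.example.com/shoes?color=red",
         canonical="https://www.example.com/shoes"),
    Page("https://www.example.com/cart?sid=abc123"),
]
sitemap_urls = [page.url for page in pages if belongs_in_sitemap(page)]
print(sitemap_urls)
```

Duplicate spellings of the same URL are not covered by this filter and need a separate check.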
How do you submit your sitemap to Google?
Now you’ve successfully created a sitemap for your website, but how does Google know you have one?
References in the robots.txt to your sitemap
First of all, reference your sitemap in robots.txt. This file helps crawlers find their way around your website, and a reference to the sitemap there tells them where to find the list of your URLs. The reference is a single line that starts with Sitemap: followed by the full URL of your sitemap.
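To illustrate what such a reference looks like and how a crawler reads it, here is a small Python sketch. The robots.txt content and the example.com URL are assumptions; the parsing uses urllib.robotparser from the standard library (site_maps() requires Python 3.8 or newer).

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: allow everything and point to the sitemap.
robots_txt = """\
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
sitemaps = parser.site_maps()  # list of sitemap URLs, or None if there are none
print(sitemaps)
```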
How to submit your sitemap to Google
In order to submit your sitemap to Google, you first have to connect your website to the Google Search Console. There you can submit your sitemap under the menu item Sitemaps.
Here you can also see whether you have already submitted a sitemap and whether it was successfully submitted or whether there were any problems. You can also easily enter the URL of your sitemap so that it is submitted.
Is your sitemap incorrect?
In the Search Console, you can find out whether your submitted sitemap has errors. This is shown in the sitemap report in the “Status” column. Our sitemap has the status of “Successful”. If your sitemap is incorrect, the Search Console will show you this with the status “Sitemap contains errors”. If Google cannot retrieve your sitemap, you will see this under the status “Could not retrieve”. You can find a list of all possible error codes in Google Help by scrolling down. Here you will also get suggested solutions.
Take a look at this report regularly so that you can see whether your sitemaps are still free of errors. Errors in the sitemap can lead to problems with indexing and should, therefore, be corrected. Ideally, you should have your sitemap created automatically and have thus already reduced the susceptibility to errors.
Do you need to update your sitemap?
It makes sense to let Google know as soon as there is new content on your website. If your sitemap is generated via the CMS, it is updated automatically when changes are made. By now at the latest, you will see why it makes sense not to create the sitemap manually.
Especially when new content is added frequently, a plugin does a lot of work for you, so you don’t have to worry about updating the sitemap yourself. If, on the other hand, you create your sitemap manually, you have to update it every time your content changes, and this can quickly become confusing.
How to handle multiple language versions in the XML sitemap
If your website is available in several languages, you also have to tell Google. One way to do this is hreflang markup in the sitemap. There are two other methods of including the hreflang attribute: link elements in the HTML head and HTTP headers.
To specify the language versions via the XML sitemap, xhtml:link elements are added to the url entry of each URL, next to its loc element, and define the different language versions. This information must be provided for each individual URL of the website, and each entry must list all language versions, including the URL itself.
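The following Python sketch shows how such entries could be generated. The language codes, the example.com URLs, and the function name build_hreflang_sitemap are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_hreflang_sitemap(pages):
    """pages: list of dicts that map a language code to the URL of that version."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for versions in pages:
        for url in versions.values():  # one url entry per language version
            entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
            # every entry lists all versions, including the URL itself
            for lang, alternate in versions.items():
                ET.SubElement(entry, f"{{{XHTML_NS}}}link",
                              rel="alternate", hreflang=lang, href=alternate)
    ET.indent(urlset)  # requires Python 3.9+
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


hreflang_xml = build_hreflang_sitemap([
    {"en": "https://www.example.com/en/", "de": "https://www.example.com/de/"},
])
print(hreflang_xml.decode("utf-8"))
```

With two language versions of one page, the output already contains two url entries with two xhtml:link elements each, which shows how quickly the file grows.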
As you can see, the whole thing gets very large very quickly. Therefore, be sure to check your sitemap for errors before submitting it to Google. You can find more information about the hreflang attribute directly on the Google support page.
Video sitemaps
If you are also using Yoast to create the video sitemap, first install the “Yoast SEO: Video” plugin. This adds another menu item entitled “Video SEO” to the Yoast settings.
To create a video sitemap, you don’t really have to do anything else; Yoast does that for you. You can, of course, define other settings for your videos, but the standard settings are usually sufficient.
You can call up your sitemap by clicking on the link or via www.yourdomain.com/sitemap.xml.
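If you are curious what a video entry contains, this Python sketch builds one with Google's video sitemap namespace. It is not what Yoast generates internally; the URLs, texts, and the function name build_video_sitemap are placeholders, and only a basic set of video tags is shown.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("video", VIDEO_NS)

VIDEO_TAGS = ("thumbnail_loc", "title", "description", "content_loc")


def build_video_sitemap(videos):
    """videos: list of dicts with 'page_url' plus one value per tag in VIDEO_TAGS."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for item in videos:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = item["page_url"]
        video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
        for tag in VIDEO_TAGS:
            ET.SubElement(video, f"{{{VIDEO_NS}}}{tag}").text = item[tag]
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


video_xml = build_video_sitemap([{
    "page_url": "https://www.example.com/videos/sitemap-tutorial",
    "thumbnail_loc": "https://www.example.com/thumbs/sitemap-tutorial.jpg",
    "title": "Sitemap tutorial",
    "description": "How to create and submit an XML sitemap.",
    "content_loc": "https://www.example.com/media/sitemap-tutorial.mp4",
}])
print(video_xml.decode("utf-8"))
```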
Image sitemaps
It is also possible to include your images in a sitemap. For images (as well as for videos or Google News), there are special criteria that keep changing. An image sitemap is not necessary for normal website owners, but if you run a large image portal, you will also want to be found in the image search. An image sitemap can contain information such as captions, geographic location, title, or the image license.
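A minimal image sitemap only needs the page URL and the locations of the images on that page. The sketch below covers just that basic case with Google's image sitemap namespace; because the criteria keep changing, the optional tags are left out. The URLs and the function name build_image_sitemap are assumptions.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """pages maps a page URL to the list of image URLs shown on that page."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


image_xml = build_image_sitemap({
    "https://www.example.com/gallery/": [
        "https://www.example.com/images/photo-1.jpg",
        "https://www.example.com/images/photo-2.jpg",
    ],
})
print(image_xml.decode("utf-8"))
```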
Google News Sitemaps
If you have a news portal, then it would be conceivable that you would also like to be listed on Google News. First, you have to be signed in to Google News as a publisher so that your content can be displayed there. There are also special requirements for a Google News sitemap, which you can read here at Google Support. The special thing about it: If your Google News sitemap is incorrect, you will be removed from Google News until the errors are corrected. So make sure that your Google News sitemap is always clean.
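For orientation, a Google News sitemap entry could be generated roughly like this. The sketch uses Google's news sitemap namespace with the publication name and language, the publication date, and the title; the publication name, URLs, and the function name build_news_sitemap are placeholders, and the current requirements should always be checked against Google Support.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("news", NEWS_NS)


def build_news_sitemap(articles, publication_name, language):
    """articles: list of dicts with 'url', 'title', and 'published' (W3C datetime)."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for article in articles:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = article["url"]
        news = ET.SubElement(url, f"{{{NEWS_NS}}}news")
        publication = ET.SubElement(news, f"{{{NEWS_NS}}}publication")
        ET.SubElement(publication, f"{{{NEWS_NS}}}name").text = publication_name
        ET.SubElement(publication, f"{{{NEWS_NS}}}language").text = language
        ET.SubElement(news, f"{{{NEWS_NS}}}publication_date").text = article["published"]
        ET.SubElement(news, f"{{{NEWS_NS}}}title").text = article["title"]
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


news_xml = build_news_sitemap(
    [{
        "url": "https://www.example.com/news/sitemaps-explained",
        "title": "Sitemaps explained",
        "published": "2024-01-15",
    }],
    publication_name="Example News",  # placeholder publication
    language="en",
)
print(news_xml.decode("utf-8"))
```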
What you learned today
You got an overview of what an XML sitemap is and why you need one too. You also now know how to create a sitemap for your website and what special forms there are. If you would like to know more about SEO, please take a look at our blog. If you have any questions in the SEO area, please feel free to contact us.