Julia Jung
Jan 30, 2019 | 7 min read

When you are not sure about directions, your best bet is to look at a certain place on the map and figure out how to get there using geo navigation. The same applies to search robots – when they are scanning your site, they start with a sitemap. So if you want your content to be explored, crawled and indexed, it’s better to have a sitemap on your website.

What we are calling a sitemap

A sitemap is a file with a list of all the web pages accessible to crawlers or users. It may look like a book’s table of contents, except the sections are the links.

There are 2 main types of sitemaps: HTML sitemap and XML sitemap.

An HTML sitemap is a web page that lists links. Usually, these are links to the most important sections and pages of the website. Here are some nice examples of HTML sitemaps: DHL, Lufthansa, SmartFares.

The HTML sitemap is designed mainly for people, not robots, and helps them quickly navigate across the main sections of the site.

An XML sitemap is an XML file (e.g. sitemap.xml) that’s located in the website root folder. The XML format lets you specify links along with their scanning priority and update frequency. XML sitemaps all look pretty much the same, but there’s great SEO value hiding behind such an unappealing look. It’s the tool for helping search robots understand the logic of your website and find new content on it. And that is why we’ll focus on this type of sitemap in this post.

Let’s review all the benefits of having an XML sitemap, find out how to satisfy your needs and comply with search engine rules.

What are the benefits of having an XML sitemap

As you’ve already found out, a sitemap is a file with information about the pages that need to be indexed. So it’s definitely the right place where you should list the web pages of your site to tell search engines about your content and help them find exactly what they need. Moreover, it can provide valuable page metadata: last page update, update frequency, and page importance level in your content hierarchy.

As Google says, you can benefit from adding a sitemap to your website and never get penalized for having one. That’s already a good enough reason to create one, right? But that’s not it – here are more benefits of having an XML sitemap on your website:

  • XML sitemaps help search engines understand what you would like to have indexed on your website and prioritize the crawling process by indicating which pages are more and less important (this matters most for bulky websites).
  • A sitemap can help your website recover if its web pages were hit by the Google Panda update (especially useful for large websites).
  • If your site has a deep directory structure, a sitemap will act like a guide for search engines so they don’t miss valuable content.
  • If it’s a brand new site, adding a sitemap will be a good way to let search robots (and thus the whole world) know about it and index it accordingly.
  • Sitemaps help you control indexing of certain pages in Google Search Console.
  • An XML sitemap is your legal helper in confirming your content rights as it mentions the page publication and update time.

How many sitemaps do you need and how to create the right one

This practical part of our sitemapping crash course is extremely important – read carefully!

Before creating a sitemap, you may wonder how many sitemaps you need. Usually, one is enough. But if the file would be larger than 50 MB uncompressed or contain more than 50,000 URLs, or if you want to track groups of web pages in Search Console separately, it’s better to break it into several sitemaps. Also, if you have subdomains, you will need separate sitemaps simply because you can’t include subdomain pages in the root sitemap.
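Here is a minimal sketch of the splitting step, assuming you already have the full list of URLs in memory. The `split_urls` helper and the example.com addresses are made up for illustration, and the sketch only handles the URL-count limit, not the file-size limit.

```python
from typing import Iterator, List

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit mentioned above


def split_urls(urls: List[str], limit: int = MAX_URLS_PER_SITEMAP) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `limit` URLs, one chunk per sitemap file."""
    for start in range(0, len(urls), limit):
        yield urls[start:start + limit]


# Hypothetical site with 120,000 pages (placeholder URLs)
urls = [f"https://example.com/page-{i}" for i in range(120_000)]
chunks = list(split_urls(urls))
print([len(chunk) for chunk in chunks])  # -> [50000, 50000, 20000]
```

Each chunk would then be written to its own sitemap file.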

To submit all of your sitemaps to Google at once, you can create an index file with all the sitemaps.

What is a sitemap index file?

A sitemap index file is a file that lists multiple sitemaps and helps you manage them. It serves as a directory, written in the XML format, that points search engines to the sitemaps describing your website pages. Note: a sitemap index file can’t list other sitemap index files – it lists only sitemap files.
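To make this more concrete, here is a rough sketch that builds a sitemap index with Python’s standard library. The `build_sitemap_index` helper and the file names are assumptions for illustration; the `<sitemapindex>`, `<sitemap>` and `<loc>` tags and the namespace come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a UTF-8 encoded sitemap index that lists the given sitemap files."""
    # Setting xmlns as a plain attribute puts every child in the sitemap namespace.
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = url
    # xml_declaration requires Python 3.8 or newer
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


# Placeholder sitemap locations
index_xml = build_sitemap_index([
    "https://example.com/sitemap-pages.xml",
    "https://example.com/sitemap-blog.xml",
])
print(index_xml.decode("utf-8"))
```

The resulting file is the one you would submit to Google instead of submitting each sitemap separately.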

Tip: To improve indexing, describe your hard-to-parse content using the sitemap extensions for video, images and news that Google supports.

There are 3 ways to create a sitemap for a website:

  • Manually (requires some time and skills).
  • Using special WP plugins for creating sitemaps like Google XML Sitemap.
  • Using a free sitemap generator like this one (usually all the free options have a limited number of pages); a paid sitemap generator; or something that I call a freemium sitemap generator with an unlimited number of pages that’s available as part of the SE Ranking Website Audit tool (free with the 2-week trial and then available via a subscription plan).

Sitemap XML tags and their settings

The sitemap protocol format consists of XML tags. If you’ve decided to create a sitemap manually or fill in the settings yourself, you should know how to set these tags correctly.

Here are some of the most common XML tags:

  • <sitemapindex> – a parent tag at the beginning and end of the file;
  • <sitemap> – a parent tag for each sitemap file. At the same time, this tag is a child tag relative to the sitemap index tag;
  • <url> – a block that contains the URL and other elements;
  • <loc> – the page URL itself;
  • <changefreq> – how often this page can change;
  • <priority> – the priority of structural elements (this helps to determine which pages have higher crawling priority);
  • <lastmod> – the last time the page content was updated.

Tip: Make sure that you use the same syntax when specifying a URL. Also, sitemap files should be UTF-8 encoded.
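If you’d rather script the file than type it by hand, the tags above can be put together roughly like this. Treat it as a sketch: the `build_sitemap` helper, the page data and the example.com URLs are invented for illustration, and the `<urlset>` root element comes from the sitemap protocol rather than from the list above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
OPTIONAL_TAGS = ("lastmod", "changefreq", "priority")


def build_sitemap(pages):
    """Return a UTF-8 encoded sitemap; each page is a dict with 'loc' plus optional tags."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)  # root element of a regular sitemap
    for page in pages:
        url = ET.SubElement(root, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        for tag in OPTIONAL_TAGS:
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    # The output is bytes in UTF-8, as the tip above requires (Python 3.8+)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


# Placeholder pages
pages = [
    {"loc": "https://example.com/", "lastmod": "2019-01-30",
     "changefreq": "daily", "priority": 1.0},
    {"loc": "https://example.com/blog/", "changefreq": "weekly", "priority": 0.8},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml.decode("utf-8"))
```

Only `<loc>` is written for every page; the other tags are added when the page dict provides them.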

How to set sitemap priority and frequency

To prioritize page crawling, set the priority and frequency for search engine crawlers so that they understand which content you consider more important. You know those XML tags already – the <priority> and <changefreq> tags.

To set the priority, use the <priority> tag, which specifies the priority of one URL in relation to the other URLs on your site (valid values range from 0.0 to 1.0).

To set the frequency, use the <changefreq> tag, which indicates how frequently the page is likely to be updated: always, hourly, daily, weekly, monthly, yearly, never.

For the main page, it’s recommended to use the “1.0” priority (that is, 100%) and the “daily” frequency.

When it comes to the main sections or promoted pages, use the “0.8-0.6” priority and the “weekly” frequency. For lower priority pages (for example, forum pages), use the “0.6-0.4” priority and the “weekly” or “monthly” frequency.

Tip: Don’t set equal priority for all pages, since identical values tell crawlers nothing about which pages matter more. Keep in mind that the priority is relative: it shows the page importance in relation to the other pages of your site.
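One way to apply these recommendations automatically is to derive the values from how deep a page sits in the URL structure. This is only an illustrative heuristic, not an official rule: the `crawl_hints` function, the depth thresholds and the exact values picked from the ranges above are all assumptions.

```python
from urllib.parse import urlparse


def crawl_hints(url: str) -> dict:
    """Pick <priority> and <changefreq> values from URL depth (illustrative heuristic)."""
    path = urlparse(url).path.strip("/")
    depth = len(path.split("/")) if path else 0
    if depth == 0:  # main page
        return {"priority": 1.0, "changefreq": "daily"}
    if depth == 1:  # main sections or promoted pages
        return {"priority": 0.8, "changefreq": "weekly"}
    # deeper, lower priority pages such as forum threads
    return {"priority": 0.5, "changefreq": "monthly"}


# Placeholder URLs
for url in ("https://example.com/",
            "https://example.com/blog/",
            "https://example.com/forum/thread-42"):
    print(url, crawl_hints(url))
```

Whatever rule you choose, the point is that the values differ between page groups.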

How to validate a sitemap and place it on your website

This is the final part of our short sitemapping course. After you’ve generated your sitemap, check its validity.

For this, add the sitemap file to your website. Use an FTP client, for example, Total Commander or FileZilla. After putting the sitemap file into your website root folder, submit it to Google Search Console (GSC). Also, you can use some of the free tools like this one to validate your sitemap and make sure it’s operating as you intend.

Make sure that your sitemap does not have pages that redirect or pages that are not selected as the main ones in the canonical tag. In addition, the URLs should not be duplicated.
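Some of these checks are easy to script before you upload anything. The sketch below covers only duplicate URLs and the URL count; detecting redirects and non-canonical pages would require fetching each page, which is left out here. The `find_sitemap_problems` helper and the sample sitemap are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def find_sitemap_problems(xml_text: str, max_urls: int = 50_000) -> list:
    """Return human-readable problems found in a sitemap (duplicates, too many URLs)."""
    root = ET.fromstring(xml_text)
    locs = [(el.text or "").strip() for el in root.findall("sm:url/sm:loc", SITEMAP_NS)]
    problems = []
    if len(locs) > max_urls:
        problems.append(f"too many URLs: {len(locs)}")
    for loc, count in Counter(locs).items():
        if count > 1:
            problems.append(f"duplicate URL: {loc}")
    return problems


# Placeholder sitemap with one URL listed twice
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/</loc></url>
  <url><loc>https://example.com/</loc></url>
</urlset>"""
print(find_sitemap_problems(sample))  # -> ['duplicate URL: https://example.com/']
```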

If the sitemap report contains errors, please correct them first and then submit an adjusted sitemap to GSC.

Then, add a reference to your sitemap to your robots.txt file, so that crawlers can find the sitemap of your website.

Tip: If you disallow some of the pages from being indexed via the robots.txt or use the “noindex” meta tag, don’t include these pages in the sitemap file.
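Both points can be checked with Python’s built-in robots.txt parser. In this sketch the robots.txt content, the `/admin/` rule and the example.com URLs are hypothetical. It reads the sitemap reference from the file and drops disallowed URLs from a list of sitemap candidates. It does not look at “noindex” meta tags, which live in the pages themselves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with one disallowed folder and a sitemap reference
robots_txt = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Sitemap references declared in robots.txt (site_maps() needs Python 3.8+)
print(parser.site_maps())

# Keep only the URLs that crawlers are allowed to fetch
candidates = [
    "https://example.com/",
    "https://example.com/blog/",
    "https://example.com/admin/login",
]
allowed = [url for url in candidates if parser.can_fetch("*", url)]
print(allowed)
```

For a live site you could load the real file with `set_url()` and `read()` instead of parsing a string.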

To conclude

Congrats! You’ve finished our crash course on SEO sitemapping. On a final note, I want to leave you with the strongest piece of advice I hold – if you want the journey through your site to be valuable both to search crawlers and users, provide them with a correct sitemap – you definitely should know how to do it after this mini-course. There is no magic – just follow the recommendations above. Of course, it doesn’t guarantee indexing by the search engines, but it does highly increase your chances!

Have I missed something important about sitemaps? Please let me know in the comments, and I’ll add them in!

9 comments
  1. Thanks a lot for the info! I was confused how to create sitemaps, but your post is very precise and easy to understand.
    It helped me a lot.

    1. Thank you, great that you’ve found this info useful! The schema and the sitemaps are not the same. While sitemaps provide a list of URLs for search engine crawlers, schema markup tells the crawlers more details about your content and helps them return more informative results for users.

  2. I am a little confused. Can someone explain whether both the sitemap and the sitemap index need to be generated in a sitemap generator?

    1. The sitemap index file is not necessarily required, it will be really useful if you have a large number of sitemaps.
      First of all, the index file is handy as there is no need to add all your sitemaps to Google Search Console one by one. Simply send the index file to GSC, and the search engine will find all your sitemaps. Secondly, it helps not to lose any of your sitemaps. Here you can find more information on its syntax.

  3. I am not getting any correct information about the procedure for generating sitemap.
    provide proper link.

    1. Hey! Which link do you mean? All the links mentioned in this article are active and lead to the proper pages.
