Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

Sitemaps FAQs

Tuesday, January 15, 2008 at 10:35 AM



Last month, Trevor spoke on the "Sitemaps: Oversold, Misused or On The Money?" panel at Search Engine Strategies in Chicago. After receiving a lot of great questions at the conference, in addition to all the feedback we receive in our Help Group, we've pulled together a FAQ:

Q: I submitted a Sitemap, but my URLs haven't been [crawled/indexed] yet. Isn't that what a Sitemap is for?
A: Submitting a Sitemap helps you make sure Google knows about the URLs on your site. It can be especially helpful if your content is not easily discoverable by our crawler (such as pages accessible only through a form). It is not, however, a guarantee that those URLs will be crawled or indexed. We use information from Sitemaps to augment our usual crawl and discovery processes. Learn more.

Q: If it doesn't get me automatically crawled and indexed, what does a Sitemap do?
A: Sitemaps give information to Google to help us better understand your site. This can include making sure we know about all your URLs, how often and when they're updated, and what their relative importance is. Also, if you submit your Sitemap via Webmaster Tools, we'll show you stats such as how many of your Sitemap's URLs are indexed. Learn more.

Q: Will a Sitemap help me rank better?
A: A Sitemap does not affect the actual ranking of your pages. However, if it helps get more of your site crawled (by notifying us of URLs we didn't previously know about, and/or by helping us prioritize the URLs on your site), that can lead to increased presence and visibility of your site in our index. Learn more.

Q: If I set all of my pages to have priority 1.0, will that make them rank higher (or get crawled faster) than someone else's pages that have priority 0.8?
A: No. As stated in our Help Center, "priority only indicates the importance of a particular URL relative to other URLs on your site, and doesn't impact the ranking of your pages in search results." Indicating that all of your pages have the same priority is the same as not providing any priority information at all.

Q: Is there any point in submitting a Sitemap if all the metadata (<changefreq>, <priority>, etc.) is the same for each URL, or if I'm not sure it's accurate?
A: If the value of a particular tag is the same for 100% of the URLs in your Sitemap, you don't need to include that tag in your Sitemap. Including it won't hurt you, but it's essentially the same as not submitting any information, since it doesn't help distinguish between your URLs. If you're not sure whether your metadata is accurate (for example, you don't know when a particular URL was last modified), it's better to omit that tag for that particular URL than to just make up a value which may be inaccurate.
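
As a rough illustration of the advice above, here is a minimal Python sketch that writes a Sitemap while leaving out any optional tag whose value is identical for every URL, and skipping values that are simply unknown. The function name `build_sitemap` and the dictionary layout of the entries are assumptions made for this example; they are not part of any Google tool.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
OPTIONAL_TAGS = ("lastmod", "changefreq", "priority")


def build_sitemap(entries):
    """Return Sitemap XML for a list of dicts with a 'loc' key plus optional metadata.

    Hypothetical helper: the entry layout is an assumption for this sketch.
    """
    # A tag is only worth emitting if its value differs between URLs.
    useful = set()
    for tag in OPTIONAL_TAGS:
        values = {entry.get(tag) for entry in entries}
        if len(values) > 1:
            useful.add(tag)

    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        for tag in OPTIONAL_TAGS:
            value = entry.get(tag)
            # Skip unknown values rather than making one up.
            if tag in useful and value is not None:
                ET.SubElement(url, tag).text = str(value)
    return ET.tostring(urlset, encoding="unicode")
```

With two entries that share the same priority but only one known modification date, the output would contain no `<priority>` tags and a single `<lastmod>` tag. Note that with a single entry, every optional tag is treated as uniform and omitted.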

Q: I've heard about people who submitted a Sitemap and got penalized shortly afterward. Can a Sitemap hurt you?
A: Only if it falls on you from a great height. (Seriously, though: if it ever happened that someone was penalized after submitting a Sitemap, it would have been purely coincidental. Google does not penalize you for submitting a Sitemap.)

Q: Where can I put my Sitemap? Does it have to be at the root of my site?
A: We recently enabled Sitemap cross-submissions, which means that you can put your Sitemap just about anywhere as long as you have the following sites verified in your Webmaster Tools account:
  • the site on which the Sitemap is located
  • the site(s) whose URLs are referenced in the Sitemap
Note that cross-submissions may not work for search engines other than Google. Learn more about Sitemap cross-submissions.

Q: Can I just submit the site map that my webmaster made of my site? I don't get this whole XML thing.
A: There's a difference between a (usually HTML) site map built to help humans navigate around your site, and an XML Sitemap built for search engines. Both of them are useful, and it's great to have both. A site map on your domain can also help search engines find your content (since crawlers can follow the links on the page). However, if you submit an HTML site map in place of a Sitemap, Webmaster Tools will report an error because an HTML page isn't one of our recognized Sitemap formats. Also, if you create an XML Sitemap, you'll be able to give us more information than you can with an HTML site map (which is just a collection of links). Learn more about supported Sitemap formats.

Q: Which Sitemap format is the best?
A: We recommend the XML Sitemap protocol as defined by sitemaps.org. XML Sitemaps have the advantage of being upgradeable: you can start simple if you want (by just listing your URLs), but—unlike a text file Sitemap—you can easily upgrade an XML Sitemap later on to include more metadata. XML Sitemaps are also more comprehensive than an Atom or RSS feed submitted as a Sitemap, since feeds usually only list your most recent URLs (rather than all the URLs you want search engines to know about).
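
To make the "start simple" path concrete, the following Python sketch converts a text file Sitemap (one URL per line) into a minimal XML Sitemap that lists only the URLs. The function name `text_sitemap_to_xml` is made up for this example; metadata tags could be added to each `<url>` element later.

```python
from xml.sax.saxutils import escape


def text_sitemap_to_xml(text):
    """Convert a one-URL-per-line text Sitemap into a minimal XML Sitemap."""
    # Ignore blank lines and surrounding whitespace.
    urls = [line.strip() for line in text.splitlines() if line.strip()]
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # Characters such as & must be entity-escaped inside XML.
        lines.append("  <url><loc>%s</loc></url>" % escape(url))
    lines.append("</urlset>")
    return "\n".join(lines)
```

The escaping step matters for URLs with query strings: an ampersand in `?x=1&y=2` has to appear as `&amp;` in the XML file.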

Q: If I have multiple URLs that point to the same content, can I use my Sitemap to indicate my preferred URL for that content?
A: Yes. While we can't guarantee that our algorithms will display that particular URL in search results, it's still helpful for you to indicate your preference by including that URL in your Sitemap. We take this into consideration, along with other signals, when deciding which URL to display in search results. Learn more about duplicate content.

Q: Does the placement of a URL within a Sitemap file matter? Will the URLs at the beginning of the file get better treatment than the URLs near the end?
A: No, and no.

Q: If my site has multiple sections (e.g. a blog, a forum, and a photo gallery), should I submit one Sitemap for the site, or multiple Sitemaps (one for each section)?
A: You may submit as few or as many Sitemaps as you like (up to these limits). Organize them in whatever way you find easiest to maintain. If you create multiple Sitemaps, you can use a Sitemap Index file to list them all. Learn more.
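
For larger sites, a sketch along these lines could split a URL list across several Sitemaps and produce a Sitemap Index that lists them. The 50,000 figure is the per-file URL limit from the sitemaps.org protocol, used here as an assumption (check the current limits before relying on it), and the helper names and file URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumption: per-file URL limit taken from the sitemaps.org protocol.
MAX_URLS_PER_SITEMAP = 50000


def split_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks small enough for one Sitemap each."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return Sitemap Index XML listing the given Sitemap file URLs."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")
```

The same index builder works if you split by section (blog, forum, gallery) rather than by size; only the list of Sitemap URLs passed in changes.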

If your question isn't covered here, you can find even more questions and answers in our Sitemaps Help Group.

62 comments:

Jennifer Mathews Somogyi said...

I get these questions from people all the time - Thanks so much for posting an FAQ for me to refer them to.

Jen

Hagrin said...

Thank you for the post.

I was wondering if you could tell me if Google is actively using the Sitemap Autodiscovery entry in the robots.txt file. I see that the Webmaster Tools section correctly parses the Sitemap: statement, but I don't see any mention of that method in your post.

Thanks for the clarification in advance.

Susan Moskwa said...

Yes, Hagrin, we do support autodiscovery of Sitemaps through robots.txt. We've got a help article about it here.
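
For reference, autodiscovery relies on a line of the form "Sitemap: <full URL>" in robots.txt. The Python sketch below shows one way to pull those entries out of a robots.txt body; the function name `find_sitemaps` is an assumption, and this is not how Googlebot itself is implemented.

```python
def find_sitemaps(robots_txt):
    """Return the URLs declared on 'Sitemap:' lines of a robots.txt body."""
    sitemaps = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split "field: value" at the first colon.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps
```

The field name is matched case-insensitively, and only the first colon is treated as the separator so that the colon in "http://" stays part of the URL.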

Stran The Man said...

An excellent one stop page to refer to in the future. Thanks. nic the culture player guy.

sa_1709 said...

Thanks for sharing

My site statistics are shown below:

Sitemap statistics:
---------------------
Total URLs: 15447
Indexed URLs: 3658

Could you explain why there is such a big difference between total and indexed URLs? How can the indexed number be increased?

Thanks

Susan Moskwa said...

Hi, SA:
Check out our help article about those Sitemap statistics, and this article about indexing.

If you want more individual help with your particular site, I'd recommend taking your questions to our Webmaster Help Group.

Missy said...

Thanks for the FAQs. I still have a question... I work for a travel company, and our site is gigantic, with thousands of pages of travel journals and photos. I need to update my sitemap, as it's out of date. I was thinking this time around of just including the more important pages of the site that I most want to be crawled -- the ones that describe our trips -- and leave out the urls for all the journals and photos. Is there any reason why this would be a bad idea?

Thanks.

Susan Moskwa said...

Hi Missy:
Not a problem; it's fine to start small and just add as many URLs as you have time for. Adding the most important ones is a great way to start; you can always add more URLs later if you want more information (such as the Sitemap details that we recently added).

You could also check out an automated Sitemap generator to help you save time. There's a Google tool available and some third-party tools as well.

linkerlinks said...

It's good to have a formal, easy to read blog on the subject of sitemaps, confirming what we've been practicing.

I've posted it on my new Search Marketing Group, where I hope to collect data like this on a regular basis: Search Marketing PPC SEO Group

Is there a newsletter that emails these kinds of updates?

Susan Moskwa said...

Thanks, glad you've found it helpful! This blog is our "newsletter", so you could subscribe to the feed if you want updates: http://googlewebmastercentral.blogspot.com/atom.xml

Gaurav Aggarwal said...

Is there any way by which I can get listed on google search quickly without spending much of time and money
Chelaram
IIT JEE
AIEEE

Jennifer Mathews Somogyi said...

One addition to the webmaster tools that would be nice is pages indexed (such as the number of pages that appear when you do a site:www.mysite.com search) with a daily trend graph so we can track the indexing of our pages over time.

Thanks,
Jenn

Stuart said...

I have a serious problem with my sitemaps due to Google having 800+ network unreachable errors from workfriendly.net URLs in Google's system. (workfriendly rewrite URLs to look like they are part of my site, which they are not)

I am concerned that this number of errors, which I don't seem to be able to stop, will affect my ranking.

I would be really grateful if you could help or point me in the right direction, thanks

Stuart

Uber Bill said...

I'm finding conflicting information on the location of sitemaps. You said in this post that they can now be posted "just about anywhere," with the new cross-submission system. Does this mean that sitemaps no longer need to be at the top of the hierarchy? We're having server trouble with posting ours, and can only currently put them at the bottom of the hierarchy, and if we could just put it there it would potentially save us weeks of headache. Thanks!

Joey said...

Great job on the FAQ!

I want to know more about the xmlns attribute in the urlset tag. The reason is because I want to convert my sitemap.xml to display as an HTML page. I researched XSLT and know how to turn XML into HTML.

When I try with my sitemap.xml created for Google it doesn't work. I found when I take the attribute xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" out of the urlset tag it works. Does the XML Namespace have to be in the sitemap.xml file for Googles crawler to process it? What am I doing wrong? Thank you!

Lian Felani said...

I always fail to add sitemap in my site. error. What should I do?
How to add sitemap in my blog? I use blogger.

Susan Moskwa said...

Hi Lian:

Since the /atom.xml and /rss.xml feeds go through Feedburner, what I did was to use the feed at example.blogspot.com/feeds/posts/default?orderby=updated (replace "example" with your blog's real name, of course). You'll need to add example.blogspot.com/feeds/posts as a site in your Webmaster Tools account, and then add default?orderby=updated as the Sitemap name.

Susan Moskwa said...

Uber Bill:

You can place a Sitemap at any level in your site's hierarchy. When you submit the Sitemap in Webmaster Tools, you'll need to add that particular subfolder to Webmaster Tools (as an individual site) and then submit the Sitemap for that site. For example, add www.example.com/1/2/3 to your account in order to submit a Sitemap located at www.example.com/1/2/3/sitemap.xml.

If you want that Sitemap to include URLs from anywhere in the site, just make sure www.example.com is verified in that same account.
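
As a sketch of the default rule that cross-submission relaxes: under the sitemaps.org protocol, a Sitemap normally covers only URLs on the same host, at or below its own directory. The Python check below illustrates that rule; the function name `in_sitemap_scope` is hypothetical, and it deliberately ignores the verified-site exception described above.

```python
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url, page_url):
    """Check whether page_url falls under the directory that hosts the Sitemap."""
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)
    # Scheme and host must match exactly (www.example.com is not example.com).
    if (sitemap.scheme, sitemap.netloc.lower()) != (page.scheme, page.netloc.lower()):
        return False
    # The Sitemap's directory is its path up to and including the last slash.
    directory = sitemap.path.rsplit("/", 1)[0] + "/"
    page_path = page.path or "/"
    return page_path.startswith(directory)
```

A Sitemap at www.example.com/1/2/3/sitemap.xml would pass this check for pages under /1/2/3/ only, which is why URLs elsewhere on the site need www.example.com verified in the same account.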

Kellyn said...

I am sorry, but I really need help in this and I have no idea where to put this.

I signed up for an email account ([email protected]) using my friend's laptop. Apparently, when I wanted to use the features Google provides, it brought me to my friend's account ([email protected]), only to find out that my email account has become her primary account! We are separate users, and I would like to ask if there's anything you can do to remove my email as her primary address so I can also enjoy Google's features to the fullest?

Do get back to me, thank you.

jpprufino said...

Hi Kellyn,

In the Google Accounts Help Center there is this entry [ http://www.google.com/support/accounts/bin/answer.py?answer=70206 ] that I believe answers your query (the third example mentioned). So you just need to log in to your Google Account, choose to edit your information, and you'll be able to remove the additional Yahoo username for that account.

Hope it helps.

João Rufino

Uber Bill said...

Thanks, Susan! Any chance that a similar system for robots files is coming, too? Oddly, I find it a pain to manually put nofollows on internal links on ten thousand pages, but we can't put it in a sane place with our servers.

Susan Moskwa said...

I can't speculate about future features, but robots.txt is probably something that's worth taking the time to set up correctly (especially if there are areas of your content that you don't want crawled). Even if Google did implement some sort of "robots.txt cross-submission", there are still many other crawlers and search engines that only look for robots.txt at the root of your site (since that's what the spec specifies).

Foot In Mouth said...

Hey, I am a rabid site maps / webmaster tools user, but I am frustrated because neither webmaster tools nor site maps are available from iGoogle as a gadget, or even on the list of normal links. I always have to go searching for it. Is there any plan to make a gadget for iGoogle? That would make it a lot more convenient!

Gounder said...

I use Blogger on our website, and we use labels for every post, which are sometimes two to three words long. Blogger makes a separate page for each label, with the spaces included, like www.mydomain.com/blog/label/my keyword label.html, but if someone clicks it they should find my%20keyword%20label.html. Now I have seen Google Webmaster Tools showing a 404 error for labels/my; it seems Googlebot reads up to the first space and stops there instead of reading the complete URL. That is why many of my pages aren't getting cached, and it has also resulted in a drop in my rankings.

nshelton said...

We're working with a CMS and so a lot of our pages fall under a template page.
(http://www.indianapolismonthly.com/article.aspx)

So we have a lot of pages as such
(http://www.indianapolismonthly.com/article.aspx?id=20224)

will listing just the aspx page be good enough in our sitemap, or should I update the sitemap with id numbers as we add content to better crawl all of our article pages past and present?

Thank you!

Brick Marketing said...

Great FAQ. Always a pleasure reading these informative answers all in one place as opposed to searching everywhere for each answer.

Referring to nshelton's post:

When it comes to crawling, I would change the article id URL's into keyword rich domain names. As you add new rich content, it will help the crawl rate and also your long term SEO strategy as well.

For example, instead of http://www.indianapolismonthly.com/entertainment/article.aspx?id=20498 - I would have it something like

http://www.indianapolismonthly.com/entertainment/bestbets.htm

Good luck!

Roteirista diletante said...

I'm using blogger and my feeds are redirected through Feedburner too.

What's the difference between the method explained for Lian Felani (where you add example.blogspot.com/feeds/posts as your site, and default?orderby=updated as your Sitemap name) and just adding atom.xml as my Sitemap name after I've added www.example.blogspot.com as my site?

One more question, if I add example.blogspot.com/feeds/posts as my site, I can't (I don't know how to) verify it, is that all right if I leave it unverified?

thanks

Susan Moskwa said...

If you add atom.xml as a Sitemap for example.blogspot.com, you'll see errors reported in Webmaster Tools because the URLs in atom.xml are on feedburner.com. In order to submit URLs for feedburner.com in a Sitemap hosted on blogspot.com, you'd need to follow the instructions for Sitemap cross-submissions and verify feedburner.com in your account, which I don't believe is possible (unless you actually own Feedburner!).

Once you've added example.blogspot.com to your account and verified it, when you add example.blogspot.com/feeds/posts it should become automatically verified because it's a subfolder of example.blogspot.com (if you can prove ownership of example.blogspot.com, we assume that you have ownership of all its subfolders).

Woodmeisterflex said...

I have an issue with PHPSESSID appearing as duplicate links to my content in the SERPs. By including the correct link in my Sitemap, will Google take the sitemapped version of the link as the priority link? It frustrates me that Google doesn't filter out PHPSESSID parameters by default... anyone got any tips on this?

thanks

LeotheSEO

Susan Moskwa said...

Hi, Leo:

As stated in this blog post, you can use your Sitemap to indicate your preference for which URL should be the canonical URL. Although we can't guarantee that our algorithms will display that URL in search results, it's helpful for you to indicate your preference.

You can also use pattern matching in your robots.txt file to block crawling of any URLs with the PHPSESSID parameter.
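
To illustrate how such pattern matching behaves, the Python sketch below tests a URL path against a robots.txt pattern, assuming the wildcard semantics Google documents for robots.txt: "*" matches any sequence of characters and a trailing "$" anchors the end of the URL. The function name and the example rule "Disallow: /*PHPSESSID=" are illustrative assumptions, so test any real rule in Webmaster Tools before relying on it.

```python
import re


def robots_pattern_matches(pattern, url_path):
    """Return True if url_path matches a robots.txt pattern using * and $."""
    # A trailing $ means the pattern must match through the end of the URL.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and turn each * into "any run of characters".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, like a robots.txt path prefix.
    return re.match(regex, url_path) is not None
```

Under these assumptions, the pattern "/*PHPSESSID=" would match "/page.php?PHPSESSID=abc123" but not "/page.php", so only the session-ID variants would be blocked from crawling.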

BlogmasterPg said...

Sitemaps are a great webmaster utility; I think it's the best one. I use Blogger and I'm thinking of moving to another host. Do you know whether the (currently limited) number of URLs will be expanded on another hosting service? I hope so. I know the Sitemap format is the same for many search engines, and that Google signed an agreement with MSN Search, Yahoo and Ask.com. I know where to find Yahoo's webmaster utility, called "Site Explorer", a very small tool, but I haven't found the equivalents for MSN Search and Ask.com. Can someone help me? Thanks

AlyscoTravel said...

Thanks for the FAQs. I was looking for an answer to a question and actually learned a few things that I did not even think of.

Scott
http://alyscotravel.com

blr044 said...

I am still a bit new to all of this, but get some help, please. Left message in forum, but no replies yet.

My error message: URL restricted by robots.txt
We encountered an error while trying to access your Sitemap. Please ensure your Sitemap.....

Just point me in the direction of someone I can ask for an answer. Thanks.

Bennett

St.Frame │ İnternet Adına Herşey... said...

Thanks !

Bronzer Tanning Lotion | Tanning Beds said...

I love the Google blog. It is so full of great information. Thanks for keeping us posted.

Richard & Elsa Laplante said...

It would be nice to have a pictorial tutorial showing people like myself how to create and install a site map. One picture saves a thousand words.

Gabi said...

I added a new sitemap and the analysis gives no errors at all; however, the sitemap shows 95 URLs but 0 indexed, and no search brings up our site exclusiveafricanart.com. It did work in the past, then stopped. I also use AdWords, which works fine, but none of the organic searches do. Any help will be much appreciated! (AdWords are expensive)

Woodmeisterflex said...

Hi Gabi, how old is the website...if its new...even if u have added a sitemap its going to take some time to index all the pages. Check in Webmaster central to see if you have any penalties

Gabi said...

Hi Woodmeisterflex,

Thanks for the response. The site has been up for two months. Webmaster Tools shows no errors, no messages, everything looks OK. But still the site pages are not indexed. I can get it in the search only if I search for the site URL exclusiveafricanart.com. Any ideas?

Woodmeisterflex said...

HI Gabi

If you do the site: command in Google (site:exclusiveafricanart.com) you can see that you have 26 pages currently indexed. Two months is still quite a young website. I recently added a sitemap to one of my sites and it's taken at least 6 weeks, if not more, for it to index only half of the sitemap files. It was also reading 0 for around 4 weeks of that time frame. Give it a couple more weeks. Are you getting any traffic according to your analytics?

Lee

sathya said...

Hi..
I have problems submitting my sitemap. When I submitted it using atom.xml, it showed me errors stating that
"Paths don't match
We've detected that you submitted your Sitemap using a URL path that includes the www prefix (for instance, http://www.example.com/sitemap.xml). However, the URLs listed inside your Sitemap don't use the www prefix (for instance, http://example.com/myfile.htm)"

I had submitted my blog (www.satzcorner.blogspot.com) with the www when I added it to Google. Is that the problem, or is it something else? I need your guidance to sort out this problem.

btr said...

I am very new to website....with that stated. I am linking better rated sites to mine. how much if any will this help my page rank? Will this alone help or does it all depend on the traffic coming from their site to mine? Thanks for the help.
Brent

btr said...

My site is www.raft1.com I am having everyone I know take a look at it to see what they think. The only problem is that as I change this site I can't help but think that it will affect its ranking. as long as I don't switch the address will it still start all over again, or would that be more up changes involving my site map?
Thanks again
Brent

lolypop said...

Please reply my posting, cause i need confirmation from what i've done. Thank's a lot.

I'm just new blogger fellow.
The first time I made my Blogger sitemap there was an error because of the Feedburner issue. Then I read some blogs that advise their readers to add atom.xml?redirect=false as the sitemap.

At first, changing my sitemap this way seemed to give a good result: no more sitemap errors in my Webmaster Tools.

But one day latter, i got PSA (Adsense) from all of my blog.

Previously even got error on my sitemap, the ads still fine. (No PSA)

Is it possible this problem caused by that command (atom.xml?redirect=false) on my sitemap?

Now I have changed everything according to your advice; hopefully tomorrow or soon I will see normal ads again.

Now my question are:

If it's true, how long normal ads will be appeared?

If it's wrong, how can i got PSA that running well previously?

OyUN said...

Please reply my posting, cause i need confirmation from what i've done. Thank's a lot.

smartin said...

Hi..
I have problems submitting my sitemap. When I submitted it using atom.xml, it showed me errors stating that
"Paths don't match
We've detected that you submitted your Sitemap using a URL path that includes the www prefix (for instance, http://www.example.com/sitemap.xml). However, the URLs listed inside your Sitemap don't use the www prefix (for instance, http://example.com/myfile.htm)"

my site is www.topmob.blogspot.com

Gillon said...

Can you let me know whether having both http://www.mydomain.com and http://www.mydomain.com/default.asp in my sitemap will cause Google to list duplicated URLs, or even think that I am trying to duplicate pages?

Pantheratigristigris said...

Oh, I feel so stupid, because I've read a lot about sitemaps but I haven't figured out how to put one in my blog :( I guess that it should be included in my HTML?

digitaltracker said...

I'm curious...

Typically when I submit my sitemap Google is super prompt at updating. Lately however it is taking 3-6 days to update...

Is there anything as a webmaster I could be doing to slow down the process? It's just one site in particular. My other sites when submitted are updated quickly.

Also, in my webmaster account no errors are being indicated.

Any advice?

Umesh said...

The sitemap has been indexed in Google. We are afraid that the sitemap can be accessed by website scrapers and used for illegitimate purposes.

What can we do in htaccess files so that only google can access the sitemap?

Your urgent help will be greatly appreciated.

Mycoolestgifts said...

Thanks a lot for this page. I have been struggling to add site map for my blogs repeatedly, but you guyz made it possible. Thanks again.

Kim
electronics345.blogspot.com

ZZLLRR said...

Is there a Google Webmaster Tool for automatically generating my sitemap? Thanks!

horse trailers said...

What is this warning about dynamic content in my sitemap?

All the URLs in your Sitemap are marked as having dynamic content.
All the URLs in your Sitemap are marked as having dynamic content. Because dynamic content is difficult for search engines to crawl and index, this may impact your site's performance in search results. Check your Sitemap to make sure your site information is correct.

How do I correct this?

Uncle Bobs Trailers

MAR said...

Hi there, I need help; I'm a newbie. I created my blog two weeks ago and submitted two sitemaps, atom.xml and rss.xml. They were OK before, but after signing up for Feedburner they now show ERROR (empty url). Google has already crawled and indexed the site.

John said...

This guide was extremely helpful for me. Thanks.
Thank you.

John Milletics
Grand Rapids, MI
Email: [email protected]
Website: http://www.geocities.com/johnmilletics/

Christina_Media said...

Do Sitemaps need to have the "www"?? I mean, even if it isn't 100% necessary, isn't it protocol of some kind and just the right thing to do? Please someone offer me a reason for/reference for why it is the right thing to do, so that I can have my web company make these changes without complaining and treating me like an idiot! Thank you!

Woodmeisterflex said...

Hi Christina, no, they don't have to have the www in there, although it depends on how you display the URL to the end user. If you want the user to eventually end up at the www version of the address, then by leaving it out you're just making Google go through a redirect when it follows the link. You don't want to end up with duplicate content: for example, http://www.mysite.co.uk and http://mysite.co.uk serve the same content but count as different URLs. This can be addressed with a canonical redirect in your .htaccess file, which will look something like this:
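Options +FollowSymLinks
RewriteEngine On
# Redirect the bare domain to the www host (dots escaped, case-insensitive match)
RewriteCond %{HTTP_HOST} ^mysite\.co\.uk$ [NC]
RewriteRule ^(.*)$ http://www.mysite.co.uk/$1 [R=301,L]
# Redirect /index.php to the site root
RewriteRule ^index\.php$ http://www.mysite.co.uk/ [R=301,L]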

William Fence said...

I've changed my sitemap slightly. Do I need to resubmit via Webmaster Tools? Google seems to download it every couple of days so why resubmit via Webmaster Tools?

fred said...

Thanks for the info. I can't believe that I understood most of what I read; it was in layman's terms.
fred

Maile Ohye said...

Hi everyone,

Since some time has passed since we published this post, we're closing the comments to help us focus on the work ahead. If you still have a question or comment you'd like to discuss, feel free to visit and/or post your topic in our Webmaster Help Forum.

Thanks and take care,
The Webmaster Central Team