The robots.txt file
Websites use the robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.
At MyOrg we break the file up into multiple sections:
- For the GoogleBot - This allows us to give specific instructions to Googlebot only.
- For the BingBot - This allows us to give specific instructions to Bingbot only.
- For all other bots - This gives all other search bots their instructions.
- Finally, we include our Sitemap. Using the sitemap directive you can tell search engines – specifically Bing, Yandex, and Google – the location of your XML sitemap.
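Putting those sections together, a robots.txt file along these lines illustrates the structure (the paths and the sitemap URL here are hypothetical examples, not MyOrg's actual rules):

```
# Section for Googlebot only
User-agent: Googlebot
Disallow: /bing-only/

# Section for Bingbot only
User-agent: Bingbot
Disallow: /google-only/

# Section for all other bots
User-agent: *
Disallow: /google-only/
Disallow: /bing-only/
Disallow: /private/

# Location of the XML sitemap
Sitemap: https://www.example.com/sitemap.xml
```

Each User-agent group applies only to the named crawler; a bot reads the most specific group that matches it and ignores the rest, which is what makes per-bot sections work.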
Why we set up our robots.txt file this way at MyOrg
The reason we do not just use the generic bot section is that we may want Google or Bing to index something that we don't want other crawlers to index. This gives you fine-grained control over what gets spidered. And of course, it is always best to include the sitemap so the search engines know to index every link in your sitemap.
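You can see this per-bot control in action with Python's standard-library robots.txt parser. This sketch uses a small hypothetical robots.txt where Googlebot may fetch everything but all other bots are kept out of /private/:

```python
from urllib import robotparser

# Hypothetical robots.txt: Googlebot is allowed everywhere,
# every other bot is blocked from /private/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /private/
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Googlebot matches its own group, whose empty Disallow allows everything.
print(rp.can_fetch("Googlebot", "https://example.com/private/page.html"))   # True

# Any other bot falls through to the * group and is blocked.
print(rp.can_fetch("ExampleBot", "https://example.com/private/page.html"))  # False
```

The same check is what well-behaved crawlers perform before requesting a URL, which is why splitting the file into per-bot sections actually changes what each engine sees.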
To better understand why you use certain directives in the robots.txt file, vary.com has published one of the fullest explanations of robots.txt files I've ever read; follow the link to vary.com's article titled "The robots.txt File".
A brief thought on why to use proper site structure
When you are setting up your website, if you create a structured site and then make proper use of that structure, it will make your life easier. Not only is it easier to find things, but it is also easier for the search engines to index your files. So instead of just dumping all your files in the root directory or a single subfolder, employ a proper site structure: start with the basic folders (such as Images), then add any additional folders you may need. Examples of additional directories might be includes, processors, or templates / skins.
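As a sketch, a structured site along these lines keeps assets, server-side pieces, and presentation separate (the folder names here are illustrative, not a required layout):

```
/                  <- root: index page and robots.txt live here
  robots.txt
  index.html
  Images/          <- site imagery
  includes/        <- shared fragments pulled into pages
  processors/      <- server-side scripts
  templates/       <- page templates / skins
  sitemap.xml      <- the sitemap referenced from robots.txt
```

A layout like this also makes robots.txt rules simpler to write, since whole directories (for example /includes/ or /processors/) can be disallowed with a single line instead of listing individual files.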
I hope you find this article useful and that it gives you a little insight into why you need a robots.txt file.