One often overlooked file that every Web site should have is robots.txt. It is not a requirement, but it is recommended because ALL robots that index the Web are supposed to request this file before they index any document found on your Web site. If the file is not found on your server when a robot visits, your Web server returns a 404 (Page Not Found) error or even a redirect to some other page. Robots are not supposed to follow a redirect or read a 404 (Page Not Found) error page when requesting robots.txt, but that doesn't mean they won't, and that can cause problems. Besides, it makes little sense to have your server turn out a 404 (Page Not Found) error every single time a robot visits the site. In one day Google requested this file from our server 18 times! To prevent possible problems with robots, the best thing you can do is provide this simple robots.txt file.

The Standard for Robots Exclusion (SRE) was first proposed in 1994 as a mechanism for keeping robots out of areas of your Web site where they are not wanted. These areas may include the following:

a. sensitive "for your eyes only" material
b. infinite URL spaces in which robots could get trapped ("black holes")
c. resource-intensive URL spaces, e.g. dynamically generated pages
d. documents which would attract unmanageable traffic, e.g. erotic material
e. documents which could represent a site unfavorably, e.g. bug archives
f. documents which aren't useful for world-wide indexing, e.g. local information

Even if you do not wish to restrict robots from specific areas or files on your server, it is still recommended that you provide this file and simply instruct the robot that it is OK to index whatever it finds.

WARNING UPDATE!!! (December 11, 2003) There is a major flaw in Google's robot when it comes to the robots.txt file, so you MUST follow these instructions or your home page could be dropped from Google's index.

The construction of the robots.txt file is actually very simple.
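To make the robot's side of this exchange more concrete, here is a minimal Python sketch of how a well-behaved robot might consult a site's rules before fetching a page. It uses the standard library's urllib.robotparser. The sample rules, the robot name "ExampleBot", and the www.example.com URLs are made up for illustration and are not taken from any real site or search engine.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: every robot is kept out of /private/ only.
SAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""

# An ordinary page is allowed, a page under /private/ is not.
print(is_allowed(SAMPLE_RULES, "ExampleBot", "http://www.example.com/index.html"))          # True
print(is_allowed(SAMPLE_RULES, "ExampleBot", "http://www.example.com/private/notes.html"))  # False
```

A real robot would download the file from the site first. This sketch only shows how the rules are interpreted once the file has been read, and individual search engines may behave differently.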
To create a robots.txt file, all you need is a simple ASCII text editor like Notepad (NOT Wordpad). Create a text file called robots.txt. Here are a couple of examples of what you might place in this file:

To allow all robots complete access

    User-agent: *
    Disallow: /azr94v2hh2lg/

Some say you can create an empty robots.txt file instead of the above, or leave the "Disallow:" area empty, but if you do, you risk being dropped from Google. We have confirmed that at least one member has fallen victim to this, so please follow these instructions.

So why did we enter "/azr94v2hh2lg/" for the Disallow: area? This tells Google, as well as any other robot-type search engine, to stay out of your "/azr94v2hh2lg/" directory. Of course you do not have this directory, and that is our point. You can replace "/azr94v2hh2lg/" with any other directory name you wish, as long as that directory does not exist and never will be part of your Web site. You have been warned....

If you would like a copy of the robots.txt above, save it to your local computer. Then upload this file to your Web server as instructed below.

To exclude all robots from part of the server

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /tmp/
    Disallow: /private/

Where does the robots.txt file go?

The robots.txt file is to be placed in the root "/" of your HTML document tree. Example: