About the Robots <META> tag In a nutshell You can use a special HTML <META> tag to tell robots not to index the content of a page, and/or not scan it for links to follow. For example: <html> <head> <title>...</title> <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> </head> There are two important considerations when using the robots <META> tag: * robots can ignore your <META> tag. Especially malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers will pay no attention. * the NOFOLLOW directive only applies to links on this page. It's entirely likely that a robot might find the same links on some other page without a NOFOLLOW (perhaps on some other site), and so still arrives at your undesired page. Don't confuse this NOFOLLOW with the rel="nofollow" link attribute. The details Like the
About /robots.txt In a nutshell Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol. It works likes this: a robot wants to vists a Web site URL, say http://www.example.com/welcome.html. Before it does so, it firsts checks for http://www.example.com/robots.txt, and finds: User-agent: * Disallow: / The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site. There are two important considerations when using /robots.txt: * robots can ignore your /robots.txt. Especially malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers will pay no attention. * the /robots.txt file is a publicly available file. Anyone can see what sections of your server you don't want robots to use. So don't try to use /