CURRENT VERSION IS SEY 2003
Robots.txt and the Robots Meta Tag
André le Roux
This page is NOT maintained. For an updated discussion on the Robots.txt
file and the Robots Meta Tag, please refer to the current version of
the Search Engine Yearbook.
What does the "robots" text file do?
Most sites contain pages that should not be indexed
by the search engines, such as administrative pages. One example is
Pandecta Magazine's "contact" page, "contact.html". There's no need
to have it indexed, so we use the robots.txt file to tell the search
engine spider (robot) to ignore it.
The robots.txt file must be in your root directory.
Like this: www.pandecta.com/robots.txt
Not like this: www.pandecta.com/admin/robots.txt
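The location rule can be sketched in a few lines of Python: whatever page
a robot is looking at, it asks for robots.txt at the root of that host.
This is only an illustration. The function name robots_txt_url, the
http:// scheme and the /admin/index.html path are assumptions made for the
example and do not come from the original page.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path with /robots.txt,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page inside a subdirectory of the site.
print(robots_txt_url("http://www.pandecta.com/admin/index.html"))
```

Even for a page inside /admin/, the result points at the root directory,
which is why a copy at www.pandecta.com/admin/robots.txt is never consulted.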
The syntax of the robots.txt file
The first line, "User-agent: *", specifies which robots should ignore
/images/, /contact.html and /privacy/privacy.html. The asterisk * is
a wildcard, so all robots should ignore the directories and files listed
below it. If I only wanted Googlebot to ignore those directories and
files, I'd type "User-agent: Googlebot" instead.
The second line, "Disallow: /images/", refers to an entire directory.
Nothing in that directory will be indexed.
The third line, "Disallow: /contact.html", refers to a specific page in
the root directory - in this case the contact.html file.
The fourth line, "Disallow: /privacy/privacy.html", refers to a specific
file in a specific directory.
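The four lines described above can be checked with Python's standard
urllib.robotparser module. The file text below is reconstructed from the
line-by-line description and is not copied from the original listing. The
robot name "ExampleBot", the http:// scheme, and the /images/logo.gif and
/index.html paths are hypothetical and used only for the test.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the description: one User-agent line, three Disallow lines.
ALL_ROBOTS = """\
User-agent: *
Disallow: /images/
Disallow: /contact.html
Disallow: /privacy/privacy.html
"""

# Same rules, but addressed to Googlebot only.
GOOGLEBOT_ONLY = ALL_ROBOTS.replace("User-agent: *", "User-agent: Googlebot")


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


base = "http://www.pandecta.com"  # scheme is an assumption
rules = load_rules(ALL_ROBOTS)
for path in ("/images/logo.gif", "/contact.html",
             "/privacy/privacy.html", "/index.html"):
    # True means the robot may fetch the URL, False means it should stay away.
    print(path, rules.can_fetch("ExampleBot", base + path))

# With the Googlebot-only version, other robots are not restricted.
google_rules = load_rules(GOOGLEBOT_ONLY)
print(google_rules.can_fetch("Googlebot", base + "/contact.html"))
print(google_rules.can_fetch("ExampleBot", base + "/contact.html"))
```

The three disallowed paths are refused for every robot under the wildcard
version, while /index.html stays open. Under the Googlebot-only version the
same paths are refused for Googlebot alone.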
The robots meta tag
The Robots META tag serves the same purpose as the robots.txt file,
keeping pages out of the index, but it works one page at a time and is
not as reliable. Not all robots honor the robots meta tag.
Use it if your site is in a subdirectory like www.freewebspace.com/users/mycoolhomepage/
and you can't get the server administrator to add (or change)
a robots.txt file.
If you have access to your root directory, forget
about the robots meta tag. Use the robots.txt file. No need to have
The syntax of
the robots meta tag is:
<META NAME="ROBOTS" CONTENT="NOINDEX,
Type that between the <HEAD> and </HEAD>
tags on each page you do not want to be indexed.
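To see how a robot that honors the tag might read it, here is a minimal
sketch using Python's standard html.parser module. The tag shown above is
cut off after NOINDEX, so the sample page below uses only that value and
does not guess at the rest. The class name RobotsMetaParser, the function
robots_directives and the sample page itself are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <META NAME="ROBOTS"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated and case-insensitive.
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page carrying only the NOINDEX value.
SAMPLE = """<HTML><HEAD>
<TITLE>Contact</TITLE>
<META NAME="ROBOTS" CONTENT="NOINDEX">
</HEAD><BODY>Contact details go here.</BODY></HTML>"""

directives = robots_directives(SAMPLE)
print("noindex" in directives)  # a compliant robot would skip indexing this page
```

For brevity the sketch does not check that the tag sits inside the head
section. It also shows why the tag is a per-page measure: a robot has to
fetch and parse each page before it learns that the page should not be
indexed.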
This page is based on information contained in the Search Engine Yearbook 2003.
For more detailed search engine information & help, please refer to the
current version of the book.