robots.txt
Tells search robots which directories must not be indexed. If the file is empty or does not exist, everything may be indexed.
Generate robots.txt file

Search engines look for a file called "robots.txt" in the root directory of your domain (http://www.mydomain.com/robots.txt).
This file tells robots (spiders, indexers) which files they may index and which they may not.
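
Because the file always sits at the root of the domain, its address can be derived from the address of any page on the site. The following is a minimal Python sketch of that derivation; the function name robots_url and the sample page URL are illustrative assumptions, not part of any standard.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address on the example domain used above.
print(robots_url("http://www.mydomain.com/news/2024/item.html?id=7"))
# prints http://www.mydomain.com/robots.txt
```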

robots.txt consists of two fields, plus optional comments:

  1. User-agent - the name of the robot,
  2. Disallow - prohibits the indexing of a file or directory,
  3. comments - start on a new line with #.
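
As a rough illustration of the line types listed above, here is a small Python sketch that classifies one line at a time. The function name parse_line and the returned labels are assumptions made for this example; treating field names case-insensitively is also an assumption of the sketch.

```python
def parse_line(line: str):
    """Classify one robots.txt line as blank, comment, or a (field, value) pair."""
    stripped = line.strip()
    if not stripped:
        return ("blank", None)
    if stripped.startswith("#"):
        # Comment: drop the leading # and any space that follows it.
        return ("comment", stripped[1:].strip())
    field, separator, value = stripped.partition(":")
    if not separator:
        return ("unknown", stripped)
    # Spaces around the ":" are ignored; the field name is lowercased.
    return (field.strip().lower(), value.strip())


print(parse_line("User-agent: StackRambler"))  # ('user-agent', 'StackRambler')
print(parse_line("Disallow:/eng"))             # ('disallow', '/eng')
print(parse_line("#Yandex allow all."))        # ('comment', 'Yandex allow all.')
```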

Rules

Editors
robots.txt must be created as a plain-text file.
You can edit it with Notepad, an FTP client, or some HTML editors.

File name
The file must be named robots.txt, in lowercase, not robot.txt or Robots.txt; otherwise it will not work.

Location
The robots.txt file should be located in the root directory.

Spaces
Spaces around the ":" separator do not matter.

Comments
Comments start on a new line with #. A space after # is optional.

Order
The first line is User-agent, which names the robot.
The next line is Disallow, which specifies a file or folder that must not be indexed.

If a ban applies to several robots, list them one per line, followed by the ban or list of bans, for example:

User-agent: StackRambler
User-agent: Aport
Disallow:/eng
Disallow:/news

# Rambler and Aport may not index links
# that begin with /news and /eng

The same applies to Disallow: each ban goes on its own line.

If different robots need different bans, separate their records with a blank line, for example:

User-agent: *
Disallow:/news
# No robot may index links
# that begin with /news

User-agent: StackRambler
User-agent: Aport
Disallow:/eng
Disallow:/news
# Rambler and Aport may not index links
# that begin with /news and /eng

User-agent: Yandex
Disallow:

# Yandex may index everything.
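
To make the record structure concrete, the sketch below splits a robots.txt text into records, each holding a list of robot names and a list of banned prefixes. It is a simplified illustration in Python: the function name parse_groups is an assumption, only whole-line comments are handled, and a blank line is treated as the end of a record as described above.

```python
def parse_groups(text: str):
    """Split robots.txt text into records of (user_agents, disallowed_prefixes)."""
    groups = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # whole-line comment
        if not line:
            # A blank line closes the current record, if one is open.
            if agents:
                groups.append((agents, rules))
                agents, rules = [], []
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            agents.append(value)
        elif field == "disallow":
            rules.append(value)
    if agents:
        groups.append((agents, rules))
    return groups


EXAMPLE = """\
User-agent: *
Disallow:/news

User-agent: StackRambler
User-agent: Aport
Disallow:/eng
Disallow:/news

User-agent: Yandex
Disallow:
"""

for user_agents, prefixes in parse_groups(EXAMPLE):
    print(user_agents, prefixes)
# ['*'] ['/news']
# ['StackRambler', 'Aport'] ['/eng', '/news']
# ['Yandex'] ['']
```

An empty string in the last record corresponds to an empty Disallow line, which bans nothing.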

Prevent all robots from indexing files with the .doc and .pdf extensions:

User-Agent: *
Disallow:/*.doc$
Disallow:/*.pdf$
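
The sketch below shows one way such patterns could be matched in Python. It assumes the commonly used interpretation in which * stands for any run of characters and a trailing $ marks the end of the URL; not every robot necessarily understands these characters. The function names pattern_to_regex and is_disallowed are made up for this example.

```python
import re


def pattern_to_regex(pattern: str):
    """Translate a Disallow pattern with * and a trailing $ into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" in place of "*".
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns) -> bool:
    """Return True if path matches any pattern from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


PATTERNS = ["/*.doc$", "/*.pdf$"]
print(is_disallowed("/files/report.doc", PATTERNS))   # True
print(is_disallowed("/files/report.docx", PATTERNS))  # False
print(is_disallowed("/index.html", PATTERNS))         # False
```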

Examples

Disallows Roverdog from indexing the email.htm file:
User-agent: Roverdog
Disallow: /email.htm

Allows all robots to index everything:
User-agent: *
Disallow:

Disallows all robots from indexing anything:
User-agent: *
Disallow:/

Disallows all robots from indexing the email.htm file, all files in the "cgi-bin" folder, and the "images" folder:
User-agent: *
Disallow: /email.htm
Disallow:/cgi-bin/
Disallow:/images/

Disallows Roverdog from indexing any file on the server:
User-agent: Roverdog
Disallow:/
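
Rules like the ones above can be checked with the urllib.robotparser module from the Python standard library. The sketch below is only an illustration: the combined rule text, the robot name ExampleBot, and the page addresses are assumptions, and every path is written with a leading slash.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file combining two of the examples above.
RULES = """\
User-agent: Roverdog
Disallow: /

User-agent: *
Disallow: /email.htm
Disallow: /cgi-bin/
Disallow: /images/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(RULES)
site = "http://www.mydomain.com"
print(parser.can_fetch("Roverdog", site + "/index.html"))         # False
print(parser.can_fetch("ExampleBot", site + "/index.html"))       # True
print(parser.can_fetch("ExampleBot", site + "/email.htm"))        # False
print(parser.can_fetch("ExampleBot", site + "/images/logo.png"))  # False
```

To check a live site instead, the same class offers set_url() and read(), which download the file before can_fetch() is called.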

One more example:
User-agent: *
Disallow:/cgi-bin/moshkow
Disallow:/cgi-bin/html-KOI/AQUARIUM/songs
Disallow:/cgi-bin/html-KOI/AQUARIUM/history
Disallow:/cgi-bin/html-windows/AQUARIUM/songs
Disallow:/cgi-bin/html-windows/AQUARIUM/history

META tag ROBOTS

The robots META tag allows or forbids robots that visit the site to index the page it appears on. It also tells robots whether to follow the links on the page to other pages of the site and index them. This tag is becoming more important.

<HTML>
<HEAD>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<META NAME="DESCRIPTION" CONTENT="This page ….">
<TITLE>...</TITLE>
</HEAD>
<BODY>

NOINDEX - forbids indexing of the document;
NOFOLLOW - forbids following the links in the document;
INDEX - allows indexing of the document;
FOLLOW - allows following the links in the document;
ALL - index everything; equivalent to INDEX, FOLLOW;
NONE - index nothing; equivalent to NOINDEX, NOFOLLOW.
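
The directives listed above can be read from a page with Python's standard html.parser module. This is a minimal sketch: the class name RobotsMetaParser and the helper robots_directives are invented for the example, and treating a page without the tag as "index and follow" is an assumption.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the index/follow decision from a robots META tag."""

    def __init__(self):
        super().__init__()
        # Assumed default when no robots META tag is present.
        self.index = True
        self.follow = True

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)  # attribute names arrive lowercased
        if (attributes.get("name") or "").lower() != "robots":
            return
        for token in (attributes.get("content") or "").split(","):
            token = token.strip().upper()
            if token in ("INDEX", "ALL"):
                self.index = True
            if token in ("FOLLOW", "ALL"):
                self.follow = True
            if token in ("NOINDEX", "NONE"):
                self.index = False
            if token in ("NOFOLLOW", "NONE"):
                self.follow = False


def robots_directives(html: str):
    """Return (may_index, may_follow) for the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.index, parser.follow


print(robots_directives('<META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW">'))  # (False, True)
print(robots_directives('<META NAME="ROBOTS" CONTENT="NONE">'))             # (False, False)
```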

Robots meta tag examples:

<META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW">
<META NAME="ROBOTS" CONTENT="INDEX, NOFOLLOW">
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">

©2005-2024, Web studio Ph4 - Internet Catalog for user, web-master and designer v. 6.0.3