Search engine robots and how they work.

Some problems connected with search robots:

Incompleteness of the standard (Standard for Robot Exclusion)

Unfortunately, because search engines appeared fairly recently, the standard for robots is still being worked out and refined. This means that search engines can be expected to follow it in the future rather than reliably today.

Traffic increase:

Not all search robots use /robots.txt

At present, this file is always requested only by the robots of systems such as AltaVista, Excite, Infoseek, Lycos, OpenText and WebCrawler.
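To make the role of /robots.txt concrete, here is a minimal sketch of how a well-behaved robot might consult the file before fetching a page. It uses `urllib.robotparser` from the Python standard library. The sample rules, the `ExampleBot` name, the `example.com` URLs and the `may_fetch` helper are all invented for illustration and do not come from any of the systems named above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical /robots.txt contents, invented for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/a.html"):
        print(url, may_fetch(SAMPLE_ROBOTS_TXT, "ExampleBot", url))
```

A robot that never requests the file simply skips this check, which is why the rules only protect a site from robots that choose to honor them.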

Use of HTML meta tags:

Below is the initial draft, which resulted from agreements among programmers from several commercial indexing organizations (Excite, Infoseek, Lycos, OpenText and WebCrawler) at the recent Distributed Indexing Workshop.

At this meeting the use of HTML meta tags to control the behavior of search robots was discussed, but no final agreement was reached. The following issues were identified for future discussion:

* Ambiguities in the specification of the /robots.txt file

* A precise definition of the use of HTML meta tags, or of additional fields in the /robots.txt file

* “Please visit” information

* Information flow control: a retrieval interval or a maximum number of open connections to the server under which the server may be indexed.

ROBOTS meta-tag:

This tag is intended for users who cannot control the /robots.txt file on their web sites. The tag lets you set the behavior of the search robot for each HTML page individually; however, it cannot prevent the robot from requesting the page in the first place.

The tag has the form <META NAME="ROBOTS" CONTENT="robot_terms">, where robot_terms is a comma-separated list of the following keywords (case is not significant): ALL, NONE, INDEX, NOINDEX, FOLLOW, NOFOLLOW.
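As a rough illustration of how a robot could read these keywords from a page, the sketch below pulls robot_terms out of a ROBOTS meta tag using Python's standard `html.parser` module. The class name `RobotsMetaParser`, the helper `extract_robot_terms` and the sample page are assumptions made for this example, not part of any standard.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the keywords of a <META NAME="ROBOTS"> tag, if one is present."""

    def __init__(self):
        super().__init__()
        self.terms = None  # None means no ROBOTS meta tag was seen

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not attribute values.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Keywords are comma-separated and case-insensitive.
            self.terms = [t.strip().upper() for t in content.split(",") if t.strip()]


def extract_robot_terms(html):
    """Return the list of robot_terms found in html, or None if the tag is absent."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.terms


if __name__ == "__main__":
    page = '<html><head><META NAME="ROBOTS" CONTENT="NoIndex, follow"></head></html>'
    print(extract_robot_terms(page))
```

Note that the robot has to download the page before it can see the tag at all.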

NONE – tells all robots to ignore this page when indexing (equivalent to using the keywords NOINDEX, NOFOLLOW together).

ALL – allows indexing of this page and of all links from it (equivalent to using the keywords INDEX, FOLLOW together).

INDEX – allows indexing of this page

NOINDEX – does not allow indexing of this page

FOLLOW – allows indexing of all links from this page

NOFOLLOW – does not allow indexing of links from this page

If this meta tag is omitted, or robot_terms is not specified, the search robot behaves by default as if robot_terms = INDEX, FOLLOW (i.e. ALL) had been specified. If the keyword ALL is found in CONTENT, the robot acts accordingly and ignores any other keywords specified. If CONTENT contains keywords with opposite meanings, for example FOLLOW, NOFOLLOW, the robot acts at its own discretion (in this case, FOLLOW).

If robot_terms contains only NOINDEX, the page is not indexed, although its links may still be followed. If robot_terms contains only NOFOLLOW, the page is indexed and its links are ignored.
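The rules above can be sketched as a small function that turns a CONTENT value into two decisions: whether to index the page and whether to follow its links. This is only one possible reading. The function name `resolve_robot_terms` is invented here, and two choices are assumptions rather than anything the text states: a lone NOINDEX leaves FOLLOW at its default, and a conflicting INDEX, NOINDEX pair is resolved to INDEX in the same way the text resolves FOLLOW, NOFOLLOW to FOLLOW.

```python
def resolve_robot_terms(content):
    """Map a ROBOTS CONTENT string to an (index, follow) pair of booleans."""
    terms = {t.strip().upper() for t in (content or "").split(",") if t.strip()}

    # ALL wins over every other keyword.
    if "ALL" in terms:
        return (True, True)
    # NONE is shorthand for NOINDEX, NOFOLLOW.
    if "NONE" in terms:
        return (False, False)

    # Defaults are INDEX and FOLLOW. A NO* keyword switches the behavior off
    # unless its positive counterpart is also present (assumption for INDEX,
    # stated in the text for FOLLOW).
    index = "INDEX" in terms or "NOINDEX" not in terms
    follow = "FOLLOW" in terms or "NOFOLLOW" not in terms
    return (index, follow)


if __name__ == "__main__":
    for value in (None, "noindex", "NOFOLLOW", "FOLLOW, NOFOLLOW", "ALL, NOINDEX", "none"):
        print(repr(value), resolve_robot_terms(value))
```

Real robots are free to resolve conflicts differently, so the sketch should be read as an illustration of the defaults, not as a description of any particular search engine.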

Web technologies have become very popular. The Internet is not only a place for entertainment but also a place to earn money. Whatever the reason, to have a presence on the Internet you need a site, and that is when the question of how to make a website arises. Those looking for information on how to build a website should turn to the Internet itself, where there are plenty of tutorials on making a website and on related topics.

In any case, it would be unwise not to take advantage of the opportunity that digital technologies give us. Google and other search engines, social networks, forums and blogs can all help you find information on “make a flash website” and similar topics.
