Wednesday, August 8, 2012

What is Robots.txt

A robots.txt file tells search engine robots which files and directories of a website they may crawl and which they should ignore. With it, you can ask robots to stay away from specific files or folders.

Search engines periodically visit your website and index your pages according to their content and quality. This becomes a problem when there are pages you don't want search engines to crawl or cache. Some pages are useful internally but are not meant for the public. For example, your website may have an admin panel for managing the site at "sitename.com/admin/administrator", used for user login and online work. You probably don't want it to show up in search engine results, so you should create a text file named "robots.txt" that tells Google and other search engines not to crawl it. Likewise, if there are files or folders on your website that you would rather keep out of search results, you can list them so that search engines do not crawl them. Keep in mind that robots.txt is publicly readable and is only a request to crawlers, so truly sensitive content should also be protected by a login.



Robots.txt File


The robots.txt file is very simple to create. You can list as many rules as you need. After you upload the file to the root of your website, search engine spiders that honor it will not crawl the files or folders you have restricted.
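
Crawlers look for the file at a fixed location, "/robots.txt" at the root of the host, so a copy placed in a subfolder has no effect. As a small illustration, here is a Python sketch that works out where a crawler would look for the file, given any page address. The function name robots_txt_url is made up for this example, and "sitename.com" is the placeholder domain used above.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt address for the site that serves page_url.

    Hypothetical helper for illustration: it keeps the scheme and host
    and replaces everything else with the fixed path /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("http://sitename.com/admin/administrator"))
```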

Here is the content of the "robots.txt" file:


User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /tmp/
Disallow: /includes/
Disallow: /installation/image
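
If you want to check the rules before uploading the file, Python's standard library ships a robots.txt parser, urllib.robotparser. The sketch below feeds it the rules from this post and asks whether a given path may be crawled. The helper name is_allowed and the sample paths are assumptions made for this example. Real search engines may interpret some rules slightly differently from this parser, so treat it as a quick sanity check only.

```python
from urllib.robotparser import RobotFileParser

# The rules from this post. The Disallow lines directly follow the
# User-agent line so that they belong to the same group.
RULES = """\
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /tmp/
Disallow: /includes/
Disallow: /installation/image
"""


def is_allowed(path, user_agent="*"):
    """Return True if the rules above let user_agent crawl path.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Sample paths chosen for this example.
    for path in ("/index.html", "/administrator/index.php", "/tmp/file.txt"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Rules are matched as path prefixes, so "Disallow: /installation/image" also blocks a path such as "/installation/images/logo.png".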

1 comment:

  1. This is valuable information about the robots.txt file. Good job.

