View Full Version : Robots.txt

09-12-2015, 01:17 AM
What is a robots.txt file?

09-12-2015, 01:56 AM
Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.

It works like this: a robot wants to visit a Web site URL, say http://www.example.com/welcome.html. Before it does so, it first checks for http://www.example.com/robots.txt, and finds:

User-agent: *
Disallow: /
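
To see how a well-behaved robot would interpret those two lines, here is a minimal sketch using Python's standard urllib.robotparser module. The user-agent name "ExampleBot" is made up for illustration, and the rules are parsed from a string rather than downloaded, so nothing is actually fetched from example.com.

```python
from urllib.robotparser import RobotFileParser

# The same two rules as above, held in a string instead of fetched over HTTP.
rules = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "ExampleBot" is a hypothetical robot name; "*" in the rules matches any robot.
allowed = parser.can_fetch("ExampleBot", "http://www.example.com/welcome.html")
print(allowed)
```

Because "Disallow: /" covers every path on the site, the check comes back negative and a compliant robot would skip welcome.html.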

09-12-2015, 03:21 AM
Are robots.txt files important for a website? Do we need to keep one?

09-12-2015, 04:03 AM
Robots.txt is very important for crawling your website. The robots exclusion standard, also known as the robots exclusion protocol or robots.txt protocol, is a standard used by websites to communicate with web crawlers and other web robots.

09-14-2015, 02:09 AM
Robots.txt is a text file that tells crawlers which pages of a site they may or may not crawl. For example, if you do not want certain pages of your site crawled, you can list them in a robots.txt file:

User-agent: *
Disallow: /backend/
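
A quick way to check what that rule blocks is to run it through Python's standard urllib.robotparser. This is only a sketch: the host www.example.com, the two page paths and the robot name "ExampleBot" are assumptions added for the demonstration.

```python
from urllib.robotparser import RobotFileParser

# The rule from the post above, parsed from a string for demonstration.
rules = """\
User-agent: *
Disallow: /backend/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Hypothetical URLs: one inside /backend/, one outside it.
backend_ok = parser.can_fetch("ExampleBot", "http://www.example.com/backend/login")
public_ok = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")
print(backend_ok, public_ok)
```

Only paths that start with /backend/ are off limits; everything else on the site stays crawlable.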

09-14-2015, 03:09 AM
Robots.txt is used to tell Google which pages of your website should be crawled and which should not.

09-15-2015, 12:58 AM
The robots.txt file defines how a search engine spider like Googlebot should interact with the pages and files of your web site. If there are files and directories you do not want indexed by search engines, you can use a robots.txt file to define where the robots should not go. The robots.txt is a very simple text file placed on your web server.

09-15-2015, 02:43 AM
It is a text file that tells search engine crawlers how to behave on your site. It contains instructions that allow or disallow crawling of certain pages on your website.

09-21-2015, 03:18 AM
The robots.txt file is used when a website owner or webmaster does not want search engine robots to visit particular pages.

09-21-2015, 06:11 AM
It is a text file that gives crawlers instructions about which parts of a site (a whole domain, a directory, or a single file) they may crawl.

09-21-2015, 06:41 AM
It is great when search engines frequently visit your site and index your content, but there are often cases where you do not want parts of your online content indexed.

09-21-2015, 07:23 AM
You can use it to block web pages and folders from being crawled and indexed by search engines.

09-21-2015, 07:39 AM
The robots.txt protocol is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies the instruction format to be used to inform the robot about which areas of the website should not be processed or scanned. Robots are often used by search engines to categorize and archive web sites, or by webmasters to proofread source code.

09-22-2015, 12:04 AM
Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Obeying robots.txt is by no means mandatory for search engines, but generally they do what they are asked. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note “Please, do not enter” on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.
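
The "unlocked door" point can be illustrated with a small sketch: the check happens inside the robot, so a robot that skips it is not stopped by anything in the file. Everything named here (the /private/ rule, the URLs, the function urls_to_visit and the robot names "GoodBot" and "BadBot") is hypothetical and only meant to show the idea.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for the demonstration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def urls_to_visit(urls, parser, agent, obey_robots=True):
    """Return the URLs a crawler would request."""
    if not obey_robots:
        # A badly behaved robot simply never asks; robots.txt cannot stop it.
        return list(urls)
    return [u for u in urls if parser.can_fetch(agent, u)]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

urls = [
    "http://www.example.com/",
    "http://www.example.com/private/report.html",
]

polite = urls_to_visit(urls, parser, "GoodBot")
rude = urls_to_visit(urls, parser, "BadBot", obey_robots=False)
print(polite)
print(rude)
```

The polite robot drops the /private/ URL on its own initiative, while the rude one keeps both, which is why sensitive pages need real access control such as a login rather than a robots.txt entry.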