View Full Version : What is the robots.txt file?



mackhunt88
06-11-2012, 11:58 AM
Can anyone please explain what a robots.txt file is and what its correct format looks like?

delhi
06-11-2012, 12:16 PM
Robots.txt is a plain text file (you can create it in Notepad) that tells search engines which areas of a website they are allowed to crawl. Example content for a robots.txt file that gives all search engines permission to crawl the entire website is given below:
User-agent: *
Disallow:
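
To check what a file like the one above actually permits, one option is Python's standard urllib.robotparser module. This is only a sketch: the helper name is_allowed, the bot name "AnyBot" and the page URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The allow-all file from the post above: an empty Disallow blocks nothing.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# For contrast: "Disallow: /" asks every robot to stay away from the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical bot name and page URL, used only for the demonstration.
page = "http://www.abc.com/any/page.html"
allow_result = is_allowed(ALLOW_ALL, "AnyBot", page)
block_result = is_allowed(BLOCK_ALL, "AnyBot", page)
print(allow_result, block_result)
```

The only difference between the two files is the single slash after Disallow, so it is worth double-checking that line before uploading.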

watson123
06-12-2012, 01:16 AM
Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note “Please, do not enter” on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.
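
The "note on an unlocked door" idea can be seen from the crawler's side. Below is a rough sketch of what a cooperating robot might do, using Python's standard urllib.robotparser; the bot name "PoliteBot", the example.com URLs and the /private/ rule are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all robots to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def polite_filter(robots_txt, user_agent, urls):
    """Keep only the URLs a cooperating robot would agree to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


urls = [
    "http://www.example.com/index.html",
    "http://www.example.com/private/report.html",
]

# The check is voluntary: a robot that skips this step can still request
# /private/report.html, because the server itself does not block anything.
allowed = polite_filter(ROBOTS_TXT, "PoliteBot", urls)
print(allowed)
```

The filtering happens entirely inside the robot, which is why anything truly sensitive needs real access control on the server.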

sabrinasai
06-12-2012, 01:31 AM
The concept and structure of robots.txt were developed more than a decade ago. If you are interested in learning more about it, visit http://www.robotstxt.org/ or go straight to the Standard for Robot Exclusion, because in this article we will deal only with the most important aspects of a robots.txt file. Next we will continue with the structure of a robots.txt file.

williamlukee
06-12-2012, 01:59 AM
Actually, this is meant for making big changes to an existing, well-developed site. Some people want to make changes without any SEO risk...

andhrareporter
06-12-2012, 04:21 AM
The robots.txt file is used to ask the Google spider not to crawl a privacy page or any other private page, so that we can maintain our privacy.

AllenSantiago
06-12-2012, 06:46 AM
Robots.txt is a file mainly used to instruct search engine robots about your website, so that they visit it frequently and index your content. Each website has its own robots.txt file at the root of the site, for example http://www.abc.com/robots.txt .
Example:
User-Agent: *
Disallow:
Hence, robots visit all pages, because * matches all robots and the empty Disallow line blocks nothing.
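
On the location: robots look for the file at the root of the host, so one file covers the whole site rather than a single page. A small sketch of how the file's address follows from any page address is below; the function name robots_txt_url and the sample page URL are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the robots.txt address for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page on the site mentioned in the post above.
result = robots_txt_url("http://www.abc.com/blog/post.html?id=3")
print(result)
```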

terrijhon
06-12-2012, 06:51 AM
Thanks for sharing your experiences with us.

absmoving143
06-12-2012, 07:41 AM
The Robot Exclusion Standard, also known as the Robots Exclusion Protocol or robots.txt protocol, is a convention to prevent cooperating web crawlers and other web robots from accessing all or part of a website which is otherwise publicly viewable. Robots are often used by search engines to categorize and archive web sites, or by webmasters to proofread source code.

jamsen
02-03-2013, 04:07 PM
Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note “Please, do not enter” on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.

jaysh4922
02-04-2013, 01:39 AM
Robots.txt is an on-page SEO technique, basically used to give instructions to web robots, also known as web wanderers, crawlers or spiders. A robot is a program that traverses websites automatically, which helps popular search engines like Google index a website and its content.

lucyrai1
02-04-2013, 05:12 AM
The robots.txt file is used to disallow crawling of duplicate pages or images.

Rajdeep Bose
02-04-2013, 05:25 AM
Robots.txt is a file through which you can guide search engines to crawl or not to crawl certain sections of your website.

mobileweb003
03-07-2013, 10:52 AM
The robots.txt file is used to specify which pages you want crawlers to crawl and which you do not.

mobileweb003
03-07-2013, 11:12 AM
The robots.txt file of a website works as a request to specific robots to ignore the directories or files listed in it.

bhadriram
03-07-2013, 01:19 PM
The robots.txt file tells Google whether or not to crawl the webpages.

webcreations
03-08-2013, 12:13 AM
Web robots are frequently used by search engines to categorize and archive web pages, or by webmasters to proofread source code. Robots.txt is a text file present in the root directory of a website. The robots.txt file is a convention created to direct the activity of search engine crawlers or web spiders. The file tells the search engine crawlers which parts of a website to crawl and which parts to leave alone, distinguishing between what is meant for the public and what is meant for the creators of the website alone.

alaxhooper
03-08-2013, 02:22 AM
Robots.txt is a file placed in the root of a domain that is used to inform search robots about the structure of your website. This is often used to block robots from specific folders or pages of your site.
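
As a sketch of blocking specific folders or pages, the rules below can be tested with Python's standard urllib.robotparser. The /admin/ folder, the /drafts/secret.html page, the bot name and the example.com host are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one whole folder and one single page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /drafts/secret.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "http://www.example.com"  # placeholder host
paths = ["/index.html", "/admin/login.html",
         "/drafts/secret.html", "/drafts/public.html"]

# Map each path to whether a cooperating robot may fetch it.
results = {path: parser.can_fetch("AnyBot", BASE + path) for path in paths}
for path, ok in results.items():
    print(path, "allowed" if ok else "blocked")
```

A rule ending in a slash covers everything under that folder, while a rule naming a file covers only paths that start with that exact name.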

yepionline
03-12-2013, 09:19 AM
I really like this post, very informative and very useful

Vamp1re
03-13-2013, 02:57 AM
A robots.txt file tells search engines how to crawl and index your site or web pages.
It is great when search engines frequently visit your site and index your content but often there are cases when indexing parts of your online content is not what you want.

rsewak
03-13-2013, 08:42 AM
Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called The Robots Exclusion Protocol.

blessy_smith
05-17-2013, 06:19 AM
It is great when search engines frequently visit your site and index your content, but often there are cases when indexing parts of your online content is not what you want. For instance, if you have two versions of a page (one for viewing in the browser and one for printing), you'd rather have the printing version excluded from crawling, otherwise you risk incurring a duplicate content penalty. Also, if you happen to have sensitive data on your site that you do not want the world to see, you will also prefer that search engines do not index these pages (although in this case the only sure way to keep sensitive data from being indexed is to keep it offline on a separate machine). Additionally, if you want to save some bandwidth by excluding images, stylesheets and JavaScript from indexing, you also need a way to tell spiders to keep away from these items.
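
For the cases above (print versions, images, stylesheets, JavaScript), a robots.txt could be generated from a list of folders and then checked with Python's standard urllib.robotparser. This is just a sketch: the helper build_robots_txt and the folder names /print/, /images/, /css/ and /js/ are assumptions, so substitute whatever your site really uses.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(disallowed_paths, user_agent="*"):
    """Return robots.txt text with one Disallow line per path."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    return "\n".join(lines) + "\n"


# Hypothetical folder layout for print pages and static assets.
robots_txt = build_robots_txt(["/print/", "/images/", "/css/", "/js/"])
print(robots_txt)

# Parse the generated text back to confirm it does what was intended.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

BASE = "http://www.example.com"  # placeholder host
normal_ok = parser.can_fetch("AnyBot", BASE + "/articles/robots.html")
print_ok = parser.can_fetch("AnyBot", BASE + "/print/robots.html")
image_ok = parser.can_fetch("AnyBot", BASE + "/images/logo.png")
```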

Markgibson431
05-18-2013, 04:18 AM
Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note “Please, do not enter” on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.

saiyadgulbaz
05-18-2013, 08:25 AM
Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall or a kind of password protection). Putting up a robots.txt file is something like putting a note “Please, do not enter” on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.

Manov
05-20-2013, 02:03 AM
The link below gives detailed information about robots.txt:
http://www.javascriptkit.com/howto/robots.shtml

barbu5
05-20-2013, 03:49 AM
When you add your website at google.com/webmasters, it generates the robots.txt itself; you don't need to add any file or change anything. If you don't want meta, category, or archive pages to be indexed, download the "All in One SEO Pack" plugin and set them not to be indexed.

Jitendra Shukla
05-20-2013, 08:24 AM
Robots.txt is a file that asks Google's crawler not to read particular pages. It is basically used to keep pages that we don't want indexed out of Google.

SANTOSH123
05-20-2013, 08:29 AM
Robots.txt is a plain text file (you can create it in Notepad) that tells search engines which areas of a website they are allowed to crawl.
The robots.txt file is used to specify which pages you want crawlers to crawl and which you do not.

Jitendra Shukla
05-22-2013, 09:21 AM
Most e-commerce sites use robots.txt to hide a few important pages, such as the payment and thank-you pages. :)

sptechnolab
05-23-2013, 06:40 AM
The robots.txt file is used to tell web bots which pages and directories to index and which not to. This file must be placed in the root directory.

sharprobert
05-25-2013, 06:44 AM
"Robots.txt" is a regular text file that, through its name, has special meaning to the majority of "honorable" robots on the web. By defining a few rules in this text file, you can instruct robots not to crawl and index certain files or directories within your site, or the site at all. For example, you may not want Google to crawl the /images directory of your site, as it's both meaningless to you and a waste of your site's bandwidth.
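
The /images example could look roughly like the rules below, checked here with Python's standard urllib.robotparser. Treat it as a sketch: "Googlebot" is the name Google's crawler announces, while "OtherBot" and the example.com host are invented for the comparison.

```python
from urllib.robotparser import RobotFileParser

# One group of rules just for Googlebot, and an allow-all group for the rest.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /images/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "http://www.example.com"  # placeholder host
google_images = parser.can_fetch("Googlebot", BASE + "/images/photo.jpg")
google_pages = parser.can_fetch("Googlebot", BASE + "/index.html")
other_images = parser.can_fetch("OtherBot", BASE + "/images/photo.jpg")
print(google_images, google_pages, other_images)
```

A robot follows the group that names it and falls back to the * group only when no group matches, so the Googlebot rule does not affect other robots.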

SASA techno
05-27-2013, 05:59 AM
Robots.txt specifies which files search engines should not crawl: online visitors can still see such a file, but search engines do not crawl it.

jayanta1
05-27-2013, 07:30 AM
It is great when search engines frequently visit your site and index your content, but often there are cases when indexing parts of your online content is not what you want. A robots.txt file asks the search engine robots that crawl the web to stay out of parts of your site. Robots.txt is a text (not HTML) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site.