What is robots.txt?



Henry Ford
09-20-2011, 06:37 AM
Robots.txt is used to control how a site is crawled: it tells search engines which pages you would like them to spider and which you would not. It is great when search engines frequently visit your site and index your content, but there are often cases when indexing parts of your online content is not what you want. For instance, if you have two versions of a page (one for viewing in the browser and one for printing), you'd rather have the printing version excluded from crawling; otherwise you risk a duplicate content penalty. Also, if you happen to have sensitive data on your site that you do not want the world to see, you will prefer that search engines do not index those pages (although in this case the only sure way to keep sensitive data out of an index is to keep it offline on a separate machine).
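
To make the print-version case concrete, here is a minimal sketch in Python using the standard library's urllib.robotparser. The /print/ folder and the example.com URLs are assumptions for illustration only; use whatever path your printable pages actually live under.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: printable copies are assumed to live under /print/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The browser version stays crawlable, the printable copy does not.
browser_ok = parser.can_fetch("*", "http://www.example.com/articles/page.html")
print_ok = parser.can_fetch("*", "http://www.example.com/print/page.html")

print("browser version allowed:", browser_ok)
print("print version allowed:", print_ok)
```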

Liliane
06-12-2012, 07:42 AM
Many websites have a robots.txt file that tells the major search engine spiders which pages they may or may not crawl. That in turn influences whether a page ends up in a search engine's index. The file is created in a text editor, saved as robots.txt (all lowercase), and uploaded through the website control panel to the root folder. It is helpful for keeping pages you don't want displayed in search results away from crawlers, although it does not actually protect confidential material.

theshail
06-12-2012, 07:58 AM
Generally, webmasters use this file to block "Terms & Conditions" and "Privacy Policy" pages, as they're nearly identical on every website and could create a risk of duplicate content.

Robbin07
06-12-2012, 09:06 AM
Robots.txt is a Notepad (plain text) file that holds the crawling instructions for a site.

android45
06-12-2012, 11:56 PM
Robots.txt is a plain text file in the root folder that tells search engines not to crawl particular pages we don't want crawled.

sabrinasai
06-13-2012, 12:52 AM
Robots.txt is a text (not html) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines, but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall, or a kind of password protection). Putting up a robots.txt file is something like putting a note “Please, do not enter” on an unlocked door: you cannot prevent thieves from coming in, but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.
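
The "note on an unlocked door" idea can be sketched in Python. The check below is something a well-behaved crawler chooses to run before fetching; nothing on the server enforces it. The bot name ExampleBot, the /private/ path and the URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules and URL, for illustration only.
ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"
url = "http://www.example.com/private/report.html"

# A polite crawler asks first and skips the page when told to stay out.
if is_allowed(ROBOTS_TXT, "ExampleBot", url):
    decision = "fetch"
else:
    decision = "skip"
print(decision, url)
```

A crawler that simply leaves out the is_allowed call can still request the page, which is why the file is no substitute for real access control.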

rajjan011
06-13-2012, 01:58 AM
The robots.txt file is very useful for controlling how a website is crawled. Without a robots.txt file, search engines still crawl your site; they simply treat nothing as off limits.

shoaib
06-13-2012, 02:39 AM
It helps you specify which pages you want indexed by search engines and which you don't.

andhrareporter
06-13-2012, 04:00 AM
The robots.txt file is used to keep the private pages of a website away from Google's spider, so that the spider does not crawl them.

terrijhon
06-13-2012, 04:01 AM
I really want to thank you for this post. Thanks!

adumpaul
06-13-2012, 06:18 AM
Robots.txt is a text file present in the root directory of a website. The Robots.txt file is a convention created to direct the activity of search engine crawlers or web spiders.

henrycw
06-13-2012, 06:22 AM
The robots.txt file is a text file that tells search engine crawlers which portions of your website they should NOT index. If you don't want to restrict search engine crawlers, you should simply create an empty robots.txt file (e.g., touch robots.txt) or one that looks like this:

User-agent: *
Disallow:

Once you have created a robots.txt file, you store it in the root directory of your Web server. To test if you've done this correctly, visit http://yoursite.com/robots.txt (where you replace yoursite.com with your actual website). If you see the robots.txt file you created, you're good to go.
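
As a rough offline check of the two "allow everything" variants described above, the sketch below feeds both an empty file and the User-agent/Disallow pair to Python's urllib.robotparser. The bot name and URL are placeholders, not anything specific to your site.

```python
from urllib.robotparser import RobotFileParser

# The two unrestricted variants: an empty file, and an empty Disallow rule.
EMPTY_FILE = ""
ALLOW_ALL = "User-agent: *\nDisallow:\n"


def can_crawl(robots_txt, url, user_agent="ExampleBot"):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "http://yoursite.com/any/page.html"  # placeholder URL
results = {
    "empty file": can_crawl(EMPTY_FILE, url),
    "empty Disallow": can_crawl(ALLOW_ALL, url),
}
for name, allowed in results.items():
    print(name, "->", "allowed" if allowed else "blocked")
```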

matthew
06-13-2012, 07:00 AM
Robots.txt is a text file we put on our website to tell search engine bots which pages we would like them not to visit.

jamsen
01-31-2013, 04:24 PM
Robots.txt, the file behind the robots exclusion protocol (REP), is a text file webmasters create to instruct robots (typically search engine robots) on how to crawl and index pages on their website.

webcreations
02-01-2013, 02:34 AM
The robots.txt file is a simple text file on your web site that informs search engine bots how to crawl and index the site or its pages. By default, search engine bots crawl everything possible unless they are forbidden from doing so. They scan the robots.txt file before crawling the web site. The robots.txt file is usually placed in the root folder of your web site. Using the robots.txt file, you can keep pages such as user profiles and temp folders from being indexed, so your SEO effort is not diluted by junk or by pages that are useless in search results. In general, your results will be more precise and better valued.
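
A small sketch of the profiles-and-temp-folders case, in Python with urllib.robotparser: several Disallow lines under one User-agent group, then a filter over candidate URLs. The folder names /profiles/ and /tmp/, the bot name and the URLs are assumptions chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep user profiles and temp folders out of the crawl.
ROBOTS_TXT = """\
User-agent: *
Disallow: /profiles/
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

candidates = [
    "http://www.example.com/",
    "http://www.example.com/articles/seo-basics.html",
    "http://www.example.com/profiles/user42",
    "http://www.example.com/tmp/cache.html",
]

# Keep only the URLs a rule-following bot would be willing to visit.
crawlable = [u for u in candidates if parser.can_fetch("ExampleBot", u)]
for u in crawlable:
    print(u)
```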

cookaltony
02-01-2013, 12:04 PM
Robots.txt is one of the best ways to keep web pages from being crawled by search engines. It is useful for keeping your important pages away from crawlers.

theavi
02-02-2013, 01:42 AM
The robots.txt file is a Notepad (plain text) file that instructs search engine spiders not to crawl the pages listed in it.

thearav
02-02-2013, 06:43 AM
Robots.txt is a file that instructs Google and other search engines not to crawl the pages listed in it. The robots.txt file should be on the server, in the root directory of the website.
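
Rules can also be addressed to one crawler by name instead of to every bot. The sketch below, in Python with urllib.robotparser, assumes a group aimed only at Googlebot; the /private/ path and the second bot name OtherBot are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file with a single group that applies to Googlebot only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "http://www.example.com/private/notes.html"
googlebot_ok = parser.can_fetch("Googlebot", url)
otherbot_ok = parser.can_fetch("OtherBot", url)

# Only the named crawler is asked to stay out; with no "*" group present,
# other bots have no rule that applies to them.
print("Googlebot allowed:", googlebot_ok)
print("OtherBot allowed:", otherbot_ok)
```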

bivin007
04-28-2022, 10:51 PM
Robots.txt is a text file located in a website’s root directory that specifies what website pages and files you want (or don’t want) search engine crawlers and spiders to visit. Usually, website owners want to be noticed by search engines; however, there are cases when it’s not needed.

taxiongo
04-29-2022, 07:13 AM
A robots.txt file is a set of instructions for bots. This file is included in the source files of most websites. Robots.txt files are mostly intended for managing the activities of good bots like web crawlers, since bad bots aren't likely to follow the instructions.

tbsind
04-29-2022, 07:52 AM
We can say that robots.txt is a file that lets you control the crawling of your website. It tells search engines which URLs of the website they may or may not crawl.