Robots.txt file



jesicawillss
02-15-2012, 05:49 AM
I would like to know which parts of a website an SEO expert should disallow in the robots.txt file. Which pages are better left hidden from search engines?

rupeshgupta
02-15-2012, 06:59 AM
The robots.txt file is used to block pages and URLs.

dpfocjames
02-15-2012, 07:11 AM
Robots.txt lets you block files from Google, so it depends on which files you do not want to show to Google.

stevemack
02-15-2012, 07:11 AM
It is better to restrict folders like cgi, images, scripts, inc, functions, lib, etc., which do not need to be indexed by any search engine.
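For example, a robots.txt placed in the site root could block those folders like this (the folder names below are just the ones mentioned above; adjust them to your own site's layout):

```
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /scripts/
Disallow: /inc/
Disallow: /functions/
Disallow: /lib/
```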

Amy.Sarin
02-17-2012, 02:46 AM
I would disallow pages containing:
- Personal/legal information
- Admin login details
- Duplicate files
- cgi-bin pages

Venus Brown
02-17-2012, 03:55 AM
You may use robots.txt for duplicate pages, comment pages, older web pages that are no longer required, etc.

tinilxyz
02-18-2012, 12:29 AM
The robots.txt file is the first file checked by search engine crawlers (robots); based on it, the search engine decides which pages to index and which not to. If you don't want search engines to access certain folders, you can use the simple command "Disallow: /cgi-bin/" (without the quotes), and that directory will not be accessible to search engines. Some SEO experts claim that bots do not follow these rules, but the major crawlers do respect them. You can also declare your sitemap in the robots.txt file, without creating a Google or Yahoo account and submitting it manually.
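To illustrate declaring a sitemap alongside a Disallow rule (the domain below is a placeholder), a minimal robots.txt might look like:

```
User-agent: *
Disallow: /cgi-bin/

Sitemap: http://www.example.com/sitemap.xml
```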

Liliane
06-15-2012, 08:14 AM
With the Disallow directive of robots.txt, you can restrict crawlers from accessing a certain web page as well as a whole category. Critical information that does not need to appear in search engine result pages should be listed in robots.txt with a Disallow rule.
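A sketch of both cases, using hypothetical paths:

```
User-agent: *
# Block a single page
Disallow: /legal/terms-draft.html
# Block an entire category/directory
Disallow: /private/
```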

henrycw
06-15-2012, 08:52 AM
It is mostly used for duplicate content pages, payment pages, and admin login pages.

sabrinasai
06-18-2012, 01:21 AM
A robots.txt file on a website functions as a request that the specified robots ignore particular files or directories when crawling the site.

safrine
06-18-2012, 01:33 AM
The discussions are very helpful

john515
06-18-2012, 04:27 AM
You can disallow pages or folders like admin folders, cgi-bin, or image folders that are not relevant to search engines. Robots.txt tells spiders what is useful and public for sharing in the search engine indexes and what is not.

yoscommerce
06-19-2012, 02:40 AM
As far as I know, it is great when search engines visit your site often and index your content, but there are cases where indexing parts of your online content is not what you want.

Mandy46
06-19-2012, 02:49 AM
Robots.txt is a simple plain-text file (you can create it in Notepad) that gives crawlers instructions about crawling and caching.

chat2vishakha
06-19-2012, 04:16 AM
If we want to hide some pages from Google, we create a robots.txt file.

alfiecharleng
06-19-2012, 04:25 AM
Hi,

When Google crawls a site and the site owner does not want certain pages crawled, the robots.txt file is used.

terrijhon
06-19-2012, 05:39 AM
Thanks for sharing the great answers.

nanan699
06-19-2012, 07:45 AM
Robots.txt is a file used to exclude content from search engine bots. It is also called the Robots Exclusion Protocol.

In general, we want our website pages to be indexed by search engines, but there may be some content that we don't want crawled by search engine bots, like a personal images folder, cgi-bin, and more. The main idea is that we don't want them indexed.
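Putting the thread's suggestions together, a minimal robots.txt along these lines (all folder names here are placeholders) could be:

```
User-agent: *
Disallow: /cgi-bin/
Disallow: /admin/
Disallow: /personal-images/
```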