Results 1 to 12 of 12
  1. #1
    Registered User
    Join Date
    Jun 2013
    Posts
    1

    Why is a Robots.txt File Used?

    Why should I have a robots.txt file?

    Thanks...

  2. #2
    Registered User
    Join Date
    Jan 2013
    Posts
    734
    This section has a few handy examples.

    To prevent FreeFind from indexing your site at all:

    user-agent: FreeFind
    disallow: /

    To prevent FreeFind from indexing common FrontPage image-map junk:

    user-agent: FreeFind
    disallow: /_vti_bin/shtml.exe/

    To prevent FreeFind from indexing a test directory and a private file:

    user-agent: FreeFind
    disallow: /test/
    disallow: /private.html
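    As a quick sanity check, rules like the last example can be tested with Python's standard-library robots.txt parser, `urllib.robotparser` (a sketch; note that disallow paths need a leading slash to match, and the paths here are the illustrative ones from above):

    ```python
    import urllib.robotparser

    # The rules from the example above, as a robots.txt body.
    rules = [
        "user-agent: FreeFind",
        "disallow: /test/",
        "disallow: /private.html",
    ]

    parser = urllib.robotparser.RobotFileParser()
    parser.parse(rules)

    # FreeFind is barred from the test directory and the private file...
    print(parser.can_fetch("FreeFind", "/test/page.html"))   # False
    print(parser.can_fetch("FreeFind", "/private.html"))     # False
    # ...but other pages, and other crawlers, are unaffected.
    print(parser.can_fetch("FreeFind", "/index.html"))       # True
    print(parser.can_fetch("Googlebot", "/test/page.html"))  # True
    ```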

  3. #3
    Senior Member
    Join Date
    Apr 2012
    Posts
    1,019
    A robots.txt file is also helpful for blocking bots from indexing directories that contain scripts. If you have a very plain site and you are fine with the engines crawling everything, then you do not need a robots.txt.
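    For instance, a minimal robots.txt along those lines (assuming the scripts live in a /cgi-bin/ directory, which is just an illustrative path) would be:

    ```
    User-agent: *
    Disallow: /cgi-bin/
    ```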

  4. #4
    Registered User
    Join Date
    May 2013
    Location
    India
    Posts
    37
    Robots.txt is a simple text (not HTML) file you put in your website's root directory to tell search robots which pages you would like them not to visit. By defining a few rules in this text file, you can instruct robots not to crawl certain files or directories within your site, or not to crawl your site at all.

  5. #5
    Registered User
    Join Date
    May 2013
    Location
    India
    Posts
    117
    Robots.txt is a text file containing rules that match the URL paths of pages you don't want search engines to crawl.

  6. #6
    Junior Member
    Join Date
    Apr 2013
    Posts
    29
    If you don't want search engines to crawl something, just create a robots.txt file and list it there. Compliant search engines will then skip it (robots.txt is advisory, though, not enforced).

  7. #7
    Registered User
    Join Date
    Jun 2013
    Posts
    4
    It tells each search engine crawler which pages it may and may not crawl and index.

  8. #8
    Senior Member
    Join Date
    Mar 2020
    Posts
    1,214
    The robots exclusion standard, also known as the robots exclusion protocol or simply robots.txt, is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies how to inform the web robot about which areas of the website should not be processed or scanned.
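    As a sketch of the standard's format (the path here is illustrative), a file that closes one area to all robots but exempts a specific crawler looks like this; an empty Disallow line means "nothing is disallowed":

    ```
    User-agent: *
    Disallow: /private/

    User-agent: Googlebot
    Disallow:
    ```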

  9. #9
    Senior Member
    Join Date
    Jun 2013
    Location
    Forum
    Posts
    5,019
    Robots.txt is a text file containing instructions for search engine robots. It lists which webpages are allowed and disallowed for search engine crawling.

  12. #12
    Senior Member
    Join Date
    Dec 2019
    Posts
    1,837
    A robots.txt file tells search engine crawlers which pages or files the crawler can or can't request from your site. This is used mainly to avoid overloading your site with requests; it is not a mechanism for keeping a web page out of Google.
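    The usual mechanism for actually keeping a page out of search results is a robots meta tag in the page itself (a sketch; note the page must remain crawlable for the tag to be seen, so it should not also be blocked in robots.txt):

    ```html
    <!DOCTYPE html>
    <html>
    <head>
      <!-- Tells compliant crawlers not to index this page or follow its links -->
      <meta name="robots" content="noindex, nofollow">
      <title>Example page kept out of search results</title>
    </head>
    <body>...</body>
    </html>
    ```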
