Results 1 to 10 of 10
  1. #1
    Registered User
    Join Date
    May 2019
    Location
    USA
    Posts
    211

    How do I use robots.txt?

    hi

    How do I use a robots.txt file on my site?

  2. #2
    Junior Member
    Join Date
    Jun 2019
    Posts
    20
    The main robots.txt directives:
    User-agent
    Disallow
    Allow
    Common uses:
    Blocking sensitive information
    Blocking low-quality pages
    Blocking duplicate content
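    Put together, the directives above might look like this in a robots.txt file (all paths here are made-up examples):

    User-agent: *
    Disallow: /admin/
    Disallow: /search-results/
    Allow: /admin/docs/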

  3. #3
    Member
    Join Date
    Dec 2018
    Location
    Chennai, India
    Posts
    99
    Robots.txt is a text file webmasters create to instruct web robots (typically search engine robots) how to crawl pages on their website. The robots.txt file is part of the robots exclusion protocol (REP), a group of web standards that regulate how robots crawl the web, access and index content, and serve that content up to users. The REP also includes directives like meta robots, as well as page-, subdirectory-, or site-wide instructions for how search engines should treat links (such as “follow” or “nofollow”).
    Basic format:
    User-agent: [user-agent name]
    Disallow: [URL string not to be crawled]
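    Filled in with an example crawler name and a hypothetical path, that basic format reads:

    User-agent: Googlebot
    Disallow: /private/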

  4. #4
    Senior Member
    Join Date
    Nov 2018
    Posts
    527
    A robots.txt file shows which pages or files crawlers such as Googlebot can or can't request from a website. Webmasters usually use it to avoid overloading the website with crawl requests.
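    For example, some crawlers accept a non-standard Crawl-delay directive to space out requests; note that Googlebot ignores it (Google's crawl rate is managed in Search Console instead), so treat this as a sketch for crawlers that honor it:

    User-agent: Bingbot
    Crawl-delay: 10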

  5. #5
    Registered User
    Join Date
    Jun 2019
    Posts
    7
    Robots are applications that crawl through websites.
    Syntax:
    1. Define the user agent: name the crawler the rules apply to (User-agent: * applies to all robots).
    2. Disallow: state the URL path here to block access to pages or a section of your website.
    3. Allow: if you want to unblock a URL path within a blocked parent directory, enter the URL subdirectory path.
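    The three steps above can be sanity-checked with Python's standard-library urllib.robotparser; the rules below are a made-up example parsed from a string rather than fetched over the network (note that Python's parser applies rules in file order, so the Allow line is listed before the broader Disallow):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents illustrating user-agent, allow, and disallow
rules = """\
User-agent: *
Allow: /private/help/
Disallow: /private/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)  # parse the rules directly, no network access needed

print(rp.can_fetch("*", "https://example.com/private/data.html"))      # False: blocked
print(rp.can_fetch("*", "https://example.com/private/help/faq.html"))  # True: re-allowed
print(rp.can_fetch("*", "https://example.com/index.html"))             # True: no rule matches
```

    Real crawlers differ in how they resolve conflicting rules (Google uses the most specific match rather than file order), so always test against the crawler you care about.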

  6. #6
    Registered User
    Join Date
    Jun 2018
    Posts
    719
    How to use a robots.txt file?

    Define the User-agent. State the name of the robot the rules apply to.
    Disallow. If you want to block access to pages or a section of your website, state the URL path here.
    Allow. State any URL path within a blocked section that should remain crawlable.
    Blocking sensitive information.
    Blocking low-quality pages.
    Blocking duplicate content.

  8. #8
    Senior Member
    Join Date
    Jul 2006
    Location
    IndiaMDM
    Posts
    354
    The robots.txt file sits at the root of the website and lists the sections of your site you don’t want search engine crawlers to reach. Webmasters use a robots.txt file to instruct search engine robots on how to crawl and index their web pages.
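    Since crawlers only look for the file at the site root, the robots.txt URL for any page can be derived with Python's standard library (example.com is a placeholder and robots_url is a hypothetical helper name):

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site serving page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; robots.txt must live at the root path
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("https://example.com/blog/2019/post.html"))  # https://example.com/robots.txt
```

    Each host, including each subdomain, serves its own robots.txt file.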

  9. #9
    Registered User
    Join Date
    Feb 2019
    Posts
    1,526
    Define the User-agent. State the name of the robot you are referring to. ...
    Disallow. If you want to block access to pages or a section of your website, state the URL path here.
    Allow. ...
    Blocking sensitive information. ...
    Blocking low quality pages. ...
    Blocking duplicate content.

  10. #10
    Senior Member
    Join Date
    Feb 2018
    Location
    Bangalore
    Posts
    123
    Web site owners use the /robots.txt file to give instructions about their site to web robots; this is called the Robots Exclusion Protocol.
