
View Full Version : How to use robots.txt file to block a directory?



OPI
03-15-2017, 08:22 AM
How to use robots.txt file to block a directory?

fayeseom
03-16-2017, 12:40 AM
Sometimes you have a directory containing decorative images or temp files that you don't want indexed by search engines. You might also have a directory of private files that you don't want crawled.

The following rules block crawler access to a directory called "images" as well as its subdirectories:
User-agent: *
Disallow: /images/

Multiple directories? No problem!
User-agent: *
Disallow: /images/
Disallow: /temp/
Disallow: /cgi-bin/
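If you want to sanity-check rules like the ones above before deploying them, Python's standard-library `urllib.robotparser` evaluates them the same way a well-behaved crawler would (the URLs below are just the thread's example.com placeholders):

```python
from urllib.robotparser import RobotFileParser

# Parse the multi-directory rules from the post directly, without fetching anything.
rules = """User-agent: *
Disallow: /images/
Disallow: /temp/
Disallow: /cgi-bin/
"""
rp = RobotFileParser()
rp.parse(rules.splitlines())

# Anything under a disallowed directory is blocked for all user agents.
print(rp.can_fetch("*", "http://www.example.com/images/logo.png"))  # False
print(rp.can_fetch("*", "http://www.example.com/temp/cache.tmp"))   # False

# Everything else is still crawlable.
print(rp.can_fetch("*", "http://www.example.com/about.html"))       # True
```

You can also point the parser at a live file with `rp.set_url(...)` followed by `rp.read()` instead of `parse()`.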

friendhrm
03-16-2017, 12:47 AM
Web crawlers can be prevented from accessing certain directories of your website by using the Disallow directive in your robots.txt file.

Website owners use the /robots.txt file to give instructions about their site to web robots. It works like this: a robot wants to visit a URL on your site (for example, http://www.example.com/welcome.html). Before it does, it checks for http://www.example.com/robots.txt and finds:

User-agent: *
Disallow: /
The "User-agent: *" means this section applies to all robots, and "Disallow: /" tells the robot that it should not visit any page on the site. If you only need to prevent robots from accessing the cgi-bin directory, use the following lines in your robots.txt file instead:

User-agent: *
Disallow: /cgi-bin/
Note that the paths in Disallow/Allow rules are case-sensitive (the directive names themselves are not), so match the capitalization your site's URLs actually use.
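The case-sensitivity of paths is easy to demonstrate with the same standard-library parser (the /CGI-Bin/ spelling here is just a made-up example of a miscapitalized rule):

```python
from urllib.robotparser import RobotFileParser

# A rule whose capitalization does NOT match the real directory name.
rp = RobotFileParser()
rp.parse("User-agent: *\nDisallow: /CGI-Bin/".splitlines())

# Path matching is case-sensitive: the lowercase directory slips through,
# while only the exact /CGI-Bin/ spelling is actually blocked.
print(rp.can_fetch("*", "http://www.example.com/cgi-bin/test.cgi"))  # True
print(rp.can_fetch("*", "http://www.example.com/CGI-Bin/test.cgi"))  # False
```

So a miscapitalized rule silently fails to protect the directory you meant.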

OPI
03-16-2017, 01:18 AM
Thank you for helping me.

OPI
03-16-2017, 01:20 AM

Yeah, I got it now. Thanks for the help!