Thursday, June 16, 2011

Robots.txt

A robots.txt file will not, by itself, improve your rankings, but improper use can prevent your site from being crawled successfully. That means one or more of your web pages (or your entire site) may never be indexed and ranked by search engines.

The most important thing to understand about the robots.txt file is that you do not need one at all! Many administrators create one (incorrectly) and upload it to their website, only to find their site missing after Google's next update.

My recommendation is not to use a robots.txt file unless you absolutely have to. After all, it is impossible to misconfigure a nonexistent file.

When should you use a robots.txt file? Only when you have folders or files that you do not want robots to crawl. A good example is a download page for software or an ebook that you sell.

You obviously do not want your download page listed in Google, where people could find it and download your product for free.

You can avoid this by blocking this page with a robots.txt file.

Note: Create the robots.txt file in Notepad (or any plain-text editor) as a text file, and then upload it to the root directory of your server.

There are several ways to use a robots.txt file, but the simplest, safest, and most effective is simply to block a specific folder.

For example, suppose you have a download page called /software-download.html. You could create a special directory called /secret/ and place the download page in that directory. Then create a robots.txt file with these two lines:

User-agent: *
Disallow: /secret/

The * means that all robots (including Googlebot) should respect the line(s) below it. In this case, every robot that honors robots.txt (not all of them do) will ignore the /secret directory and every file inside it.
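If you want to sanity-check a robots.txt file before uploading it, Python's standard urllib.robotparser module applies the same matching rules that well-behaved crawlers follow. A minimal sketch of checking the two-line file above (the example.com URLs are placeholders):

```python
from urllib.robotparser import RobotFileParser

# Parse the two-line robots.txt from the example above.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /secret/",
])

# The download page hidden inside /secret/ is blocked for every robot...
print(rp.can_fetch("Googlebot", "http://example.com/secret/software-download.html"))  # False

# ...while the rest of the site remains crawlable.
print(rp.can_fetch("Googlebot", "http://example.com/index.html"))  # True
```

Because Disallow matches by prefix, everything under /secret/ is covered by that single line.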

Another way to keep a file from being crawled is to disallow it directly, like this:

User-agent: *
Disallow: /software-download.html

If you want to prevent Googlebot (or any other specific robot) from crawling while allowing all the others, you must name it explicitly. For example, the following robots.txt file would prevent Googlebot from crawling the /secret directory while letting everyone else in:

User-agent: Googlebot
Disallow: /secret/
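You can verify this per-robot behavior with urllib.robotparser as well: a rule applies only to user agents matching its User-agent line, and robots that are not named (and have no * group to fall back on) are allowed by default. A quick sketch (Bingbot is just an example of "any other robot"):

```python
from urllib.robotparser import RobotFileParser

# Parse a robots.txt that names Googlebot explicitly.
rp = RobotFileParser()
rp.parse([
    "User-agent: Googlebot",
    "Disallow: /secret/",
])

# Googlebot is named, so it is kept out of /secret/ ...
print(rp.can_fetch("Googlebot", "http://example.com/secret/page.html"))  # False

# ... but a robot that is not named falls back to the default: allowed.
print(rp.can_fetch("Bingbot", "http://example.com/secret/page.html"))  # True
```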

You can also block crawling of multiple directories and files by adding a Disallow entry for each:

User-agent: *
Disallow: /secret/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /software-download.html
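As a final check, the multi-entry file above can be fed to urllib.robotparser too; each Disallow line is a path prefix that blocks everything underneath it (the URLs are placeholders):

```python
from urllib.robotparser import RobotFileParser

# Parse the multi-entry robots.txt from the example above.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /secret/",
    "Disallow: /cgi-bin/",
    "Disallow: /images/",
    "Disallow: /software-download.html",
])

# Every listed directory and file is blocked for all robots...
for path in ("/secret/a.zip", "/cgi-bin/form.cgi",
             "/images/logo.png", "/software-download.html"):
    print(path, rp.can_fetch("Googlebot", "http://example.com" + path))  # all False

# ...and everything else stays crawlable.
print(rp.can_fetch("Googlebot", "http://example.com/about.html"))  # True
```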

The robots.txt file can be a powerful tool when used correctly, but when used improperly it can leave your files on public display and/or damage your rankings in search engines. Rule of thumb: use a robots.txt file only if necessary, and make sure you use it correctly.
