How to Prevent Duplicate Content with Effective Use of the Robots.txt and Robots Meta Tag

Duplicate content is one of the problems we regularly come across in the search engine optimization services we offer. If the search engines determine that your site contains duplicate or near-duplicate content, this may result in penalties or even exclusion from their results. Fortunately, it's a problem that is easily rectified.

Your primary weapon against duplicate content is the Robots Exclusion Protocol, which has now been adopted by all the major search engines.

There are two ways to control how the search engine spiders index your site:

1. The Robots Exclusion File, or "robots.txt", and

2. The Robots <meta> tag

The Robots Exclusion File (Robots.txt)
This is a simple text file that can be created in Notepad. Once created, you must upload the file into the root directory of your website, e.g. www.yourwebsite.com/robots.txt. Before a search engine spider crawls your website, it looks for this file, which tells it which parts of your site's content it may access.
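
Because spiders only look for the file at the root of the host, its address can be worked out from any page URL on the site. The following is a minimal Python sketch of that rule; the helper name robots_txt_url is hypothetical, and the domain is the placeholder used in this article.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the root-level robots.txt address for the host serving page_url.

    Hypothetical helper for illustration; not part of any library.
    """
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://www.yourwebsite.com/products/page.html?id=3"))
# prints http://www.yourwebsite.com/robots.txt
```

A robots.txt file placed in a subdirectory, such as /products/robots.txt, would not be found this way, which is why the upload location matters.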

The robots.txt file is best suited to static HTML sites, or to excluding certain files in dynamic sites. If the majority of your site is dynamically created, consider using the Robots tag instead.

Creating your robots.txt file

Example 1 Scenario
If you wanted to make the robots.txt file applicable to all search engine spiders and make the entire site available for indexing, the robots.txt file would look like this:

User-agent: *
Disallow:

Explanation
The asterisk in the "User-agent" line means this robots.txt file applies to all search engine spiders. Leaving the "Disallow" line blank means all parts of the site are available for indexing.
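
One way to check rules like these before uploading them is Python's standard urllib.robotparser module. The sketch below is only an illustration: the helper name is_allowed and the page URL are assumptions, and real spiders may interpret edge cases differently from this parser. It contrasts the blank "Disallow" above with "Disallow: /", which blocks the whole site.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Parse robots.txt text and report whether user_agent may fetch url.

    Hypothetical helper for illustration.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


ALLOW_ALL = "User-agent: *\nDisallow:\n"    # blank Disallow: nothing is excluded
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # a single slash excludes everything
PAGE = "http://www.yourwebsite.com/index.html"  # placeholder URL

print(is_allowed(ALLOW_ALL, "*", PAGE))  # True
print(is_allowed(BLOCK_ALL, "*", PAGE))  # False
```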

Example 2 Scenario
If you wanted to make the robots.txt file applicable to all search engine spiders and stop the spiders from indexing the faq, cgi-bin and images directories, plus a specific page called faqs.html in the root directory, the robots.txt file would look like this:

User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html

Explanation
The asterisk in the "User-agent" line means this robots.txt file applies to all search engine spiders. Access to the directories is prevented by naming them, and the specific page is referenced directly. The named files and directories will not be crawled, and so their content will not be indexed, by any search engine spider that honours the protocol.
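
The rules in Example 2 can be tried against a few sample addresses with Python's standard urllib.robotparser module. This is a rough sketch rather than a definitive test: the file names shipping.html and logo.gif are made up for illustration, and the domain is the placeholder used in this article.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "http://www.yourwebsite.com"  # placeholder domain
# Hypothetical paths: three that match a Disallow rule and one that does not.
for path in ["/faq/shipping.html", "/images/logo.gif", "/faqs.html", "/index.html"]:
    verdict = "allowed" if parser.can_fetch("*", BASE + path) else "blocked"
    print(path, verdict)
```

Each "Disallow" value is matched as a prefix of the URL path, which is why naming a directory also covers every file inside it.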

Example 3 Scenario
If you wanted to make the robots.txt file applicable only to the Google spider, googlebot, and stop it from indexing the faq, cgi-bin and images directories, plus a specific HTML page called faqs.html in the root directory, the robots.txt file would look like this:

User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html

Explanation

By naming a particular search spider in the "User-agent" line, you apply the rules to that spider only. Access to the directories is prevented by simply naming them, and the specific page is referenced directly. The named files and directories will not be crawled by Google, while other spiders are unaffected by these rules.
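
To see that a named "User-agent" only restricts that spider, the Example 3 rules can be checked with Python's standard urllib.robotparser module. This is a sketch under stated assumptions: "ExampleBot" is a made-up spider name standing in for any crawler other than Google's, and the URL uses the placeholder domain from this article.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "http://www.yourwebsite.com/faq/index.html"  # placeholder URL

# The rules name googlebot, so Google's spider is blocked from /faq/ ...
print(parser.can_fetch("Googlebot", url))   # False
# ... but a spider with a different (hypothetical) name is not.
print(parser.can_fetch("ExampleBot", url))  # True
```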

That's all there is to it!

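As mentioned earlier, the robots.txt file can be difficult to implement on dynamic sites, in which case it's probably necessary to use a combination of the robots.txt file and the robots tag. Bear in mind that a spider can only read a page's robots tag if robots.txt allows it to fetch that page, so control each page through one mechanism or the other.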

The Robots Tag
This alternative way of telling the search engines what to do with site content appears in the <head> section of a web page. A simple example would be as follows:

<meta name="robots" content="noindex, nofollow">

In this example we are telling all search engines not to index the page and not to follow any of the links contained within the page.

<meta name="googlebot" content="noarchive">

In this second example I don't want Google to cache the page, because the site contains time-sensitive information. This can be achieved simply by adding the "noarchive" directive.

What could be simpler!
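
If you want to confirm which robots directives a page actually carries, the tag can be read back out of the HTML. The following Python sketch uses the standard html.parser module; the class name RobotsMetaParser, the function name robots_directives and the sample page are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # maps the meta name to a set of directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            # The content attribute is a comma-separated list of directives.
            values = {d.strip().lower() for d in attr.get("content", "").split(",") if d.strip()}
            self.directives.setdefault(name, set()).update(values)


def robots_directives(html):
    """Return the robots directives found in an HTML document (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = """<html><head>
<meta name="robots" content="noindex, nofollow">
<meta name="googlebot" content="noarchive">
</head><body>Time-sensitive content</body></html>"""

found = robots_directives(page)
print("noindex" in found.get("robots", set()))       # True
print("noarchive" in found.get("googlebot", set()))  # True
```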

Although there are other ways of preventing duplicate content from appearing in the search engines, this is the simplest to implement, and all websites should use a robots.txt file, a robots tag, or a combination of the two.

Should you require further information about our search engine marketing or optimization services, please visit us at http://www.e-prominence.co.uk - The search marketing company
