Duplicate content is one of the problems we regularly come across as part of the search engine optimization services we offer. If the search engines determine that your site contains duplicated or near-identical content, this may result in penalties and even exclusion from the search engines. Fortunately, it's a problem that is easily rectified.
Your primary weapon against duplicate content can be found within the "Robots Exclusion Protocol", which has now been adopted by all the major search engines.
There are two ways to control how the search engine spiders index your site.
1. The Robots Exclusion File, or "robots.txt", and
2. The Robots <meta> Tag
The Robots Exclusion File (Robots.txt)
This is a simple text file that can be created in any plain-text editor such as Notepad. Once created, you must upload the file to the root directory of your website, e.g. www.yourwebsite.com/robots.txt. Before a search engine spider crawls your website, it looks for this file, which tells it which parts of your site it may and may not access.
The robots.txt file is best suited to static HTML sites, or to excluding specific files on dynamic sites. If the majority of your site is dynamically created, consider using the Robots Tag instead.
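Because spiders only look for the file at the root of the site, it can help to see how that location follows from any page address. The following is a minimal Python sketch; the helper name robots_txt_url and the sample page address are illustrative assumptions, not part of the protocol.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt location for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; drop the page path, query and fragment,
    # because the file is always requested from the root directory.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address used purely for illustration.
print(robots_txt_url("http://www.yourwebsite.com/products/widgets.html?id=7"))
# prints http://www.yourwebsite.com/robots.txt
```

A robots.txt file uploaded anywhere else, for example inside a subdirectory, would not be found at that address.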
Creating your robots.txt file
Example 1 Scenario
Suppose you want the robots.txt file to apply to all search engine spiders and to make the entire site available for indexing. The robots.txt file would look like this:
User-agent: *
Disallow:
Explanation
The asterisk in the "User-agent" line means this robots.txt file applies to all search engine spiders. Leaving "Disallow" blank means no part of the site is excluded, so the whole site is available for indexing.
Example 2 Scenario
Suppose you want the robots.txt file to apply to all search engine spiders and to stop them from indexing the faq, cgi-bin and images directories, plus a specific page called faqs.html in the root directory. The robots.txt file would look like this:
User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
Explanation
The asterisk in the "User-agent" line means this robots.txt file applies to all search engine spiders. Access to the directories is prevented by naming each one on its own "Disallow" line, and the specific page is referenced directly. The named files and directories will no longer be crawled, and so their content will not be indexed, by any spider that obeys the file.
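Before uploading, the rule sets from Example 1 and Example 2 can be checked locally. The sketch below uses Python's standard urllib.robotparser module; the helper name is_allowed, the agent name "examplebot" and the sample paths are assumptions made for illustration, and real spiders may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Rule set from Example 1: nothing is excluded.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# Rule set from Example 2: three directories and one page are excluded.
BLOCK_SOME = """\
User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""


def is_allowed(rules: str, agent: str, path: str) -> bool:
    """Report whether `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


# "examplebot" is a made-up spider name; any name matches the "*" record.
print(is_allowed(ALLOW_ALL, "examplebot", "/faq/index.html"))   # prints True
print(is_allowed(BLOCK_SOME, "examplebot", "/faq/index.html"))  # prints False
print(is_allowed(BLOCK_SOME, "examplebot", "/faqs.html"))       # prints False
print(is_allowed(BLOCK_SOME, "examplebot", "/index.html"))      # prints True
```

Note that "/faq/" does not cover "/faqs.html", which is why Example 2 lists the page on its own line.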
Example 3 Scenario
Suppose you want the robots.txt file to apply only to the Google spider, googlebot, and to stop it from indexing the faq, cgi-bin and images directories, plus a specific page called faqs.html in the root directory. The robots.txt file would look like this:
User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
Explanation
By naming a particular search spider in the "User-agent" line, you prevent only that spider from indexing the content you specify. Access to the directories is prevented by simply naming them, and the specific page is referenced directly. The named files and directories will not be indexed by Google, while other spiders are unaffected by these rules.
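The effect of naming a single spider can be seen by asking the same question on behalf of two different spiders. This is a rough sketch using Python's standard urllib.robotparser; "examplebot" is a made-up spider name and the helper agents_blocked is an assumption for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Rule set from Example 3: the record applies to googlebot only.
GOOGLE_ONLY = """\
User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""


def agents_blocked(rules: str, agents: list[str], path: str) -> list[str]:
    """Return the agents from `agents` that may NOT fetch `path`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, path)]


# Only the named spider is kept out; the hypothetical "examplebot" is not.
print(agents_blocked(GOOGLE_ONLY, ["googlebot", "examplebot"], "/faq/index.html"))
# prints ['googlebot']
```

To exclude every spider, the record would need "User-agent: *" as in Example 2.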
That's all there is to it!
As mentioned earlier, the robots.txt file can be difficult to implement on dynamic sites. In that case it is probably necessary to use a combination of the robots.txt file and the Robots Tag.
The Robots Tag
This alternative way of telling the search engines what to do with site content appears in the <head> section of a web page. A simple example would be as follows:
<meta name="robots" content="noindex, nofollow">
In this example we are telling all search engines not to index the page and not to follow any of the links contained within the page.
In this second example we don't want Google to cache the page, because the site contains time-sensitive information. This can be achieved simply by adding the "noarchive" directive, for example:
<meta name="googlebot" content="noarchive">
What could be simpler!
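To confirm that a page really carries the directives you intended, the tag can be read back out of the HTML. The following is a minimal sketch using Python's standard html.parser module; the class name RobotsMetaParser, the helper robots_directives and the sample page are assumptions for illustration, and the sketch only looks at tags named "robots" or "googlebot".

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attributes.get("content") or ""
            # Directives are comma separated, e.g. "noindex, nofollow".
            self.directives[name] = [
                item.strip().lower() for item in content.split(",") if item.strip()
            ]


def robots_directives(html: str) -> dict:
    """Return a mapping of tag name to the list of directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used purely for illustration.
PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<meta name="googlebot" content="noarchive">
</head><body>Time-sensitive offer</body></html>"""

print(robots_directives(PAGE))
# prints {'robots': ['noindex', 'nofollow'], 'googlebot': ['noarchive']}
```

Keep in mind that a spider can only read the tag if it is allowed to fetch the page, so a page carrying a Robots Tag should not also be excluded in robots.txt.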
Although there are other ways of preventing duplicate content from appearing in the search engines, this is the simplest to implement, and every website should use a robots.txt file, a Robots Tag, or a combination of the two.
Should you require further information about our search engine marketing or optimization services, please visit us at http://www.e-prominence.co.uk, the search marketing company.