Duplicate content is one of the problems we regularly come across as part of the search engine optimization services we offer. If the search engines determine that your site contains duplicate or near-duplicate content, this may result in penalties and even exclusion from their results. Fortunately, it's a problem that is easily rectified.
Your primary weapon against duplicate content can be found within the "Robots Exclusion Protocol", which has now been adopted by all the major search engines.
There are two ways to control how the search engine spiders index your site:
1. The Robots Exclusion File, or "robots.txt", and
2. The Robots <meta> Tag
The Robots Exclusion File (robots.txt)
This is a simple text file that can be created in Notepad. Once created, you must upload the file into the root directory of your website, e.g. www.yourwebsite.com/robots.txt. Before a search engine spider indexes your website, it looks for this file, which tells it which parts of your site's content it may index.
The robots.txt file is best suited to static HTML sites or to excluding certain files in dynamic sites. If the majority of your site is dynamically created, consider using the Robots Tag.
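Because spiders only look for the file in the root directory, its address can be worked out from any page on the site. The following is a minimal Python sketch of that idea; the function name robots_url and the page address are made up for illustration.

```python
from urllib.parse import urljoin


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves page_url."""
    # The leading slash makes urljoin drop the page's own path and keep
    # only the scheme and host, which is where spiders look for the file.
    return urljoin(page_url, "/robots.txt")


# Hypothetical page address, used only for illustration.
print(robots_url("http://www.yourwebsite.com/products/widgets.html"))
# prints http://www.yourwebsite.com/robots.txt
```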
Creating your robots.txt file
Example 1 Scenario
Suppose you want the robots.txt file to apply to all search engine spiders and to make the entire site available for indexing. The robots.txt file would look like this:
User-agent: *
Disallow:
Explanation
The asterisk in the "User-agent" line means this robots.txt file applies to all search engine spiders. By leaving "Disallow" blank, you make all parts of the site available for indexing.
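If you would like to check the effect of these rules before uploading the file, one option is Python's standard urllib.robotparser module. This is a rough sketch rather than a definitive test: it assumes the module reads the file the way a well-behaved spider would, and the spider name and page paths are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# The Example 1 rules, exactly as they would appear in robots.txt.
rules = """\
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "ExampleSpider" and both paths are hypothetical names for this check.
home_allowed = parser.can_fetch("ExampleSpider", "/index.html")
faq_allowed = parser.can_fetch("ExampleSpider", "/faq/index.html")
print(home_allowed, faq_allowed)
```

Both checks should report that the pages are open to the spider, since a blank "Disallow" excludes nothing.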
Example 2 Scenario
Suppose you want the robots.txt file to apply to all search engine spiders and to stop them from indexing the faq, cgi-bin and images directories, as well as a specific page called faqs.html in the root directory. The robots.txt file would look like this:
User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
Explanation
The asterisk in the "User-agent" line means this robots.txt file applies to all search engine spiders. Access to the directories is prevented by naming them, and the specific page is referenced directly. The named files and directories will not be indexed by any search engine spider that obeys the protocol.
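The Example 2 rules can be checked path by path in the same spirit. The sketch below again uses Python's standard urllib.robotparser and assumes it matches paths the way a compliant spider does; the spider name and the individual file names inside each directory are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The Example 2 rules, exactly as they would appear in robots.txt.
rules = """\
User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Hypothetical pages: one in each excluded area plus one ordinary page.
paths = [
    "/faq/index.html",
    "/cgi-bin/form.cgi",
    "/images/logo.gif",
    "/faqs.html",
    "/index.html",
]

# Map each path to True (may be indexed) or False (excluded).
results = {path: parser.can_fetch("ExampleSpider", path) for path in paths}

for path, allowed in results.items():
    print(path, "allowed" if allowed else "blocked")
```

Only /index.html should come back as allowed; everything under the three named directories, and faqs.html itself, should be reported as blocked.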
Example 3 Scenario
Suppose you want the robots.txt file to apply only to the Google spider, googlebot, and to stop it from indexing the faq, cgi-bin and images directories, as well as a specific HTML page called faqs.html in the root directory. The robots.txt file would look like this:
User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
Explanation
By naming a particular search spider in the "User-agent" line, you prevent that spider from indexing the content you specify. Access to the directories is prevented by simply naming them, and the specific page is referenced directly. The named files and directories will not be indexed by Google; other spiders are unaffected by these rules.
That's all there is to it!
As mentioned earlier, the robots.txt file can be difficult to implement on dynamic sites, and in that case it's probably necessary to use a combination of the robots.txt file and the Robots Tag.
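To see that rules written for one named spider leave the others alone, the Example 3 file can be checked against two different user agents. This is a sketch using Python's standard urllib.robotparser, on the assumption that it matches agent names the way real spiders do; "OtherSpider" is an invented name standing in for any spider other than Google's.

```python
from urllib.robotparser import RobotFileParser

# The Example 3 rules, exactly as they would appear in robots.txt.
rules = """\
User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The same page, requested by the named spider and by a different one.
google_allowed = parser.can_fetch("googlebot", "/faqs.html")
other_allowed = parser.can_fetch("OtherSpider", "/faqs.html")

print("googlebot:", google_allowed)
print("OtherSpider:", other_allowed)
```

The page should be blocked for googlebot but remain open to the other spider, because no rule in the file mentions it.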
The Robots Tag
This alternative way of telling the search engines what to do with site content appears in the <head> section of a web page. A simple example would be as follows:
<meta name="robots" content="noindex,nofollow">
In this example we are telling all search engines not to index the page and not to follow any of the links contained within the page.
In this second example I don't want Google to cache the page, because the site contains time-sensitive information. This can be achieved simply by using the "noarchive" directive:
<meta name="googlebot" content="noarchive">
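To confirm that a page really carries the directives you intended, you can read them back out of its HTML. The sketch below uses Python's standard html.parser module. It assumes the two examples above take the usual form, a robots tag with "noindex,nofollow" and a googlebot tag with "noarchive"; the class name RobotsMetaParser and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in robots-style <meta> tags."""

    def __init__(self):
        super().__init__()
        # Maps the tag's name ("robots", "googlebot") to a set of directives.
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = attributes.get("content") or ""
        # Only these two names are handled in this sketch.
        if name in ("robots", "googlebot"):
            values = {v.strip().lower() for v in content.split(",") if v.strip()}
            self.directives.setdefault(name, set()).update(values)


# Hypothetical page carrying both tags, for illustration only.
page = """<html><head>
<meta name="robots" content="noindex,nofollow">
<meta name="googlebot" content="noarchive">
</head><body></body></html>"""

meta_parser = RobotsMetaParser()
meta_parser.feed(page)
print(meta_parser.directives)
```

Running this against your own pages is a quick way to spot a template that has lost its tag or carries the wrong directive.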
What could be simpler?
Although there are other ways of preventing duplicate content from appearing in the search engines, this is the simplest to implement, and all websites should use a robots.txt file, a Robots Tag, or a combination of the two.
Should you require further information about our search engine marketing or optimization services, please visit us at http://www.e-prominence.co.uk, the search marketing company.