OK, OK, the title was a pathetic attempt at a twist on a phrase. We’ve heard that Content is King so often that we were trying to find a different way to say it!
So, if you can excuse the title for a moment, we will move swiftly to the point of this post. Recently we’ve been asked on several occasions why content can’t just be copied or taken from other websites. For the purpose of this post we’re going to ignore the whole copyright issue (tut tut!) and concentrate solely on duplicate content and why it’s the naughty boy of the Net.
First off, what is duplicate content?
According to Google:
Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar.
To demonstrate the point, the quote above would itself be classed as duplicate content (although on a much smaller scale than Google is referring to), as it originally comes from the Google support website. The first page to be indexed is seen as the original, and anything else is a duplicate (or copy).
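To give a feel for what “appreciably similar” could mean in practice, here is a rough sketch that compares two pieces of text by the three-word phrases they share. This is not how Google measures duplication (we don’t know the details of that); the function names, the phrase length of three words and the sample texts are all our own assumptions for illustration.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences ("shingles") in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text too short for a full shingle: treat it as one short shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the shingle sets: 0.0 (nothing shared) to 1.0 (identical)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample texts.
original = "Duplicate content generally refers to substantive blocks of content"
copied = "duplicate content generally refers to substantive blocks of content!"
unrelated = "Our shop sells hand-made wooden toys for children of all ages"

print(similarity(original, copied))     # copy and paste with cosmetic changes
print(similarity(original, unrelated))  # genuinely different text
```

A score close to 1 suggests a page is largely a copy, while a score close to 0 suggests it is original. Where you would draw the line between the two is a judgment call.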
Why does Google disapprove of duplicate content?
Google is focused on giving its users a useful experience so that they remain users. If a user finds five identical results for a search query, this is not nearly as useful as five different relevant results. It’s as simple as that. Google doesn’t want to return identical results, so it picks the first one it comes across to display in search results.
What causes duplicate content?
A lot of webmasters innocently enough copy and paste from other sites to fill up their own. For example, if you’re selling an established brand’s products on an ecommerce site, it makes sense to have the official product description on your site. However, this will be seen as duplicate content.
A lot of webmasters knowingly scrape content from other sites due to a lack of time available or just through laziness.
Duplicate content can also occur if the same content is displayed in different parts of your site, for example in the month and year categories of a blog.
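If you want to check your own site for that last cause, one simple approach is to fingerprint the text of each page and look for URLs that share a fingerprint. The sketch below assumes you have already collected each page’s text into a dictionary; the URLs, the page text and the function names are hypothetical, and it only catches copies that are identical apart from spacing and capitalisation.

```python
import hashlib
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and case, so trivial differences are ignored."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def find_duplicates(pages: dict) -> list:
    """Return sorted groups of URLs whose text is identical after normalisation."""
    groups = defaultdict(list)
    for url, body in pages.items():
        groups[fingerprint(body)].append(url)
    return sorted(sorted(urls) for urls in groups.values() if len(urls) > 1)


# Hypothetical blog: one post is also reachable through the year and month archives.
pages = {
    "/blog/duplicate-content/": "Content is King.  Keep it original.",
    "/blog/archive/2009/": "content is king. keep it original.",
    "/blog/archive/2009/03/": "Content is King. Keep it original.",
    "/about/": "We are a small web agency.",
}

for group in find_duplicates(pages):
    print(group)
```

Each printed group is a set of URLs serving the same text, which makes them candidates for rewriting, blocking or canonicalising.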
What are the penalties?
It can stop your site from appearing in search results for key terms if you’ve “borrowed” significantly from other sites. If you’re relying on search traffic for your site’s hits and income, then you can see why copy and paste should not be in your vocabulary.
What can you do to stop it?
If the offending content is repeated on your own website over several pages (maybe to describe similar products), then the simple answer is to rewrite it and keep it fresh!
If someone has taken a huge amount of text from your website then…it’s their problem. If you’ve been indexed first then they’re just harming their site’s chances.
Block the offending content from search engines using a robots.txt file, assuming that the content is necessary on the site and you’re not relying on it for ranking.
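Before relying on a robots.txt rule, it is worth checking that it blocks what you think it blocks. The sketch below feeds a hypothetical robots.txt to the parser in Python’s standard library and asks whether two made-up paths may be crawled. The rules and paths are assumptions for illustration, and the result only tells you how this particular parser reads the file; each search engine decides for itself how to honour it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all crawlers out of the date archives,
# which repeat posts that already live at their own URLs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blog/archive/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the parsed rules allow the given crawler to fetch the path."""
    return parser.can_fetch(agent, path)


print(is_crawlable("/blog/archive/2009/03/"))    # duplicate archive page
print(is_crawlable("/blog/duplicate-content/"))  # the post itself
```

The archive path should come back as blocked and the post itself as allowed. If it is the other way round, the Disallow line needs another look.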
Some golden rules.
Keep it original
If you are quoting from another site keep it to a tiny proportion of the page’s text. The odd wee quote won’t kill your site but quoting entire articles will.
Sort out your canonicalisation: use your .htaccess wisely, or the new canonical tag (rel="canonical").
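To illustrate that last rule, the sketch below maps several addresses for the same page onto one preferred URL and builds the matching canonical tag. It is written in Python rather than as .htaccess rules, and the preferred hostname, the choice of https, the decision to drop query strings and the function names are all assumptions for the example; your own site may need different rules.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # hypothetical preferred hostname


def canonical_url(url: str) -> str:
    """Map URL variants of the same page onto one preferred form."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/index.html"):
        # Treat /widgets/index.html and /widgets/ as the same page.
        path = path[: -len("index.html")]
    # Assumption: query strings and fragments never change the content here.
    return urlunsplit(("https", CANONICAL_HOST, path, "", ""))


def canonical_tag(url: str) -> str:
    """Build the <link> element to place in the page's <head>."""
    return '<link rel="canonical" href="%s">' % escape(canonical_url(url), quote=True)


# Hypothetical variants that all serve the same page.
variants = [
    "http://example.com/widgets/index.html",
    "https://www.example.com/widgets/?sort=price",
    "http://www.example.com/widgets/#reviews",
]

for url in variants:
    print(canonical_tag(url))
```

All three variants produce the same tag, which is the point: however a visitor or crawler reaches the page, it names a single preferred address.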