Duplicate content can be a real issue for any site, leading to a disappointing user experience and also reducing the effectiveness of your SEO efforts. In some situations, it can even lead to penalties being applied to your site.
When we think about duplicate content, we often envisage a problem that has been caused by plagiarism. You may have inadvertently included copied content within the pages of your own site, or seen the owner of a competing site make use of your own, original copy.
Either way, there’s potential for your site to run into problems. Fortunately, duplicate content issues of that nature tend to be relatively easy to solve. Unfortunately, they aren’t the usual cause of difficulties in this area. It’s more often the case that site architecture issues create problems.
In this guest post, I’ll be taking a look at common problems and revealing their solutions. Whether you manage your own site, or work on a range of client sites, you may simply be looking for some definitive information that can help you to produce better results.
Let’s start by assuming that you have copied content from elsewhere, whether intentionally or otherwise. It may be that your site has been penalised as a result and that you’ve seen a considerable loss of positioning.
Fortunately, the solution here is pretty simple: you should remove the duplicate content. Once you’ve done that, it’s likely that your site will soon start to make its way back up the rankings. If you believe that you are suffering from a manual penalty, however, review your Google Webmaster Tools account to identify any warning messages that you’ve received. You’ll then also need to submit a Reconsideration Request.
What happens if someone else has been copying your content? I always think that this is a slightly more difficult scenario to deal with, although there certainly is a solution available to you.
To begin with, it’s probably worth writing to the owner of the site and asking to have the duplicate content removed. If that doesn’t work, then a letter from a lawyer may have more of an impact.
Let’s be realistic here though: many people who are happy to plagiarise can also be pretty unreasonable. They may not respond positively to your request.
In this case, you can file a request with their host, making use of the Digital Millennium Copyright Act (DMCA). Finally, you can ask Google to remove the content.
The entire process can be extremely frustrating, but that copied content may be harming your site, particularly if Google is unaware that you provided the original source material.
Two Versions Of Your Site
Within Webmaster Tools, you can point Google to the definitive version of your site. This deals with the issue of appearing to have a www and a non-www version of the same site.
In addition, I would also recommend using a 301 redirect. That way, you have made sure that you have covered all avenues.
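To make the redirect idea concrete, here is a minimal Python sketch of the decision a server would make before issuing a 301. It assumes that the www version is the one you have chosen as definitive, and the host names are placeholders rather than anything from a real configuration. In practice this rule usually lives in your web server settings; the function only illustrates the logic.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www host is the preferred version; swap the values if you prefer non-www.
PREFERRED_HOST = "www.example.com"
ALTERNATE_HOSTS = {"example.com"}


def redirect_target(url):
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALTERNATE_HOSTS:
        return None
    # Keep the scheme, path and query string, and change only the host.
    return urlunsplit(
        (parts.scheme, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
    )


print(redirect_target("http://example.com/shoes?page=2"))
```

Any request that arrives on the alternate host is sent to the same path on the preferred host, so visitors and search engines only ever settle on one version of each page.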
Using URL Parameters
You may currently be using parameters after your standard URLs. A classic example of this is a product listing page, where site visitors are able to order products by particular characteristics.
Typically, this might mean that the same set of products can be ordered by price, size, colour or brand. The problem is that this will usually result in a set of pages that pretty much contain identical content.
How can this situation best be resolved? There are two main routes that can be taken here, the rel=canonical tag and the 301 redirect, and the effect on link juice is pretty much the same for both. With the first route, you use the rel=canonical tag to point Google in the direction of the page that should be given priority.
Some people see this as being the quickest and easiest solution, although my preference is to use 301 redirects where possible. The reason for this is that all search engines will follow 301 redirects reasonably well, whereas they can be a little hit and miss when handling canonical tags.
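Whichever route you choose, you first need to work out which URL is the definitive one for each listing. The Python sketch below shows one way that might be done, by stripping out the parameters that only reorder the same products. The parameter names `sort` and `order` are hypothetical, as are the helper function names, so substitute whatever your own site uses.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical names of parameters that only change the order of the same products.
SORT_PARAMS = {"sort", "order"}


def canonical_url(url):
    """Drop ordering parameters so every sorted view maps to one URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SORT_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the rel=canonical tag for the head of the page."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="{}">'.format(href)


print(canonical_link_tag("http://www.example.com/shoes?sort=price&page=2"))
```

The same `canonical_url` result could equally be used as the target of a 301 redirect, if that is the approach you prefer.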
In some instances, you may also find that it’s useful to make use of the “noindex, follow” entry within a meta robots tag. In effect, this is telling the search engines not to index the current page, while still following the links that it contains.
Some sites like to provide printer-friendly pages, which make a lot of sense from a user perspective. Unfortunately, they’ll often contain the same content as the main page that the user was just considering.
A good solution here is simply to exclude these pages from search engine indexing, although canonical tags could also be used.
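As an illustration, the Python sketch below emits a “noindex, follow” meta robots tag only for printer-friendly pages. It assumes that those pages are identified by a `print=1` query parameter, which is a hypothetical convention chosen for the example; your own site may use a separate path or template instead.

```python
from urllib.parse import parse_qs, urlsplit


def is_printer_friendly(url):
    """Assumption: printer-friendly pages carry a hypothetical print=1 parameter."""
    query = parse_qs(urlsplit(url).query)
    return query.get("print") == ["1"]


def robots_meta_tag(url):
    """Return the meta robots tag for printer-friendly pages, or an empty string."""
    if is_printer_friendly(url):
        return '<meta name="robots" content="noindex, follow">'
    return ""


print(robots_meta_tag("http://www.example.com/guide?print=1"))
```

The main version of the page gets no tag at all, so it remains indexable as normal.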