What is Duplicate Content?
Duplicate content is content that appears on the Internet in more than one place (URL). When multiple identical pieces of content exist at different URLs, it is difficult for search engines to decide which version is more relevant to a given search query. To provide the best search experience, search engines will rarely show multiple copies of the same content, and are thus forced to choose which version is most likely to be the original—or best.
The Three Biggest Issues with Duplicate Content
- Search engines don’t know which version(s) to include/exclude from their indices
- Search engines don’t know whether to consolidate link metrics on one page, or keep them separated across multiple versions
- Search engines don’t know which version(s) to rank for query results
Three Common Solutions to Conquer Duplicate Content
- 301 redirect. Use Open Site Explorer to check which page has the higher Page Authority (PA), then set up a 301 redirect from the duplicate page to the original page. This ensures the two pages no longer compete with one another in the search results.
- Rel=canonical. A rel=canonical tag passes the same amount of ranking power as a 301 redirect, and there’s a bonus: it often takes less development time to implement. Add this tag to the HTML head of a web page to tell search engines that it should be treated as a copy of the “canon,” or original, page: <head> <link rel="canonical" href="http://moz.com/blog/" /> </head>
- noindex, follow. Add the values “noindex, follow” to the meta robots tag to tell search engines not to include the duplicate pages in their indexes, but to crawl the links on them. This also works well with paginated content or if you have a system set up to tag or categorize content (as with a blog). The code should look something like this: <head> <meta name="robots" content="noindex, follow" /> </head>
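The 301-redirect option above is typically set up at the server level rather than in the page’s HTML. As a minimal sketch, assuming an Apache server with mod_rewrite enabled, and using hypothetical paths where /blog-old/ is the duplicate URL and /blog/ is the original:

```apache
# .htaccess — permanently redirect the duplicate URL to the canonical one.
# /blog-old/ and /blog/ are hypothetical paths; substitute your own.
RewriteEngine On
RewriteRule ^blog-old/?$ /blog/ [R=301,L]

# Simpler alternative if mod_rewrite is unavailable (mod_alias):
# Redirect 301 /blog-old/ /blog/
```

After deploying, request the old URL and confirm the server responds with a 301 status and a Location header pointing at the original page, so both visitors and search-engine crawlers land on the version you want indexed.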