DUPLICATE CONTENT AUDIT
DUPLICATE CONTENT CAN NEGATIVELY IMPACT YOUR RANKINGS
A common SEO technique used by spammers up until 2012 was to create a page of content and duplicate it across several URLs. The spammers hoped to dominate the search results by ranking on page 1 several times with different URLs, thereby capturing all of the traffic. Google has cracked down on this technique and decreases the rankings of sites that engage in this practice, or even imposes penalties on these sites. In fact, Google now limits the number of times one website can rank in the search results for a single keyword.
WEBSITE SECTIONS AND SUBDOMAINS
Duplicate content problems are quite common in the life sciences. Companies often have multiple divisions, languages and microsites on separate subdomains. These subdomains are usually created to organize the company's web presence. From an SEO perspective, however, Google treats subdomains as separate websites.
A LIFE SCIENCE EXAMPLE
Wuxi AppTec, for example, has multiple divisions that branch off its corporate website. If one of its subdivisions, the Lab Testing Division, is situated at http://ltd.wuxiapptec.com/, then the main website at http://www.wuxiapptec.com/ won't receive any of the SEO benefit from that subdomain's content. Instead, the two sites compete with each other for the same keyword phrases, making it more difficult for either to rank.
Fig 16. If you are placing subdivisions on subdomains, you may be cannibalizing your own rankings.
It's important to thoroughly check whether the same content resides on completely different URLs. This is often hard to avoid on ecommerce biotech websites that list hundreds to thousands of products: the products can be extremely similar, or the same product can be replicated across different categories.
A LIFE SCIENCE EXAMPLE
Take a look at these two URLs on Vector Labs' website:
Fig 17. Two versions of the same content lead to duplicate content penalties.
These two URLs show the exact same product under two separate categories.
This is duplicate content. These two pages are competing with each other to rank in the search results. As a result, Google could end up not ranking either of them.
Moz Pro's Crawl Test will provide you with reports on duplicate content and duplicate page titles.
If you've recently launched a new website, or a page has been updated, make sure to use 301 redirects, canonical tags or Google Search Console (formerly Google Webmaster Tools) to fix any duplicate content that might be indexed and penalizing your site.
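As a sketch of the canonical-tag approach (the URLs below are hypothetical placeholders, not Vector Labs' actual pages), the tag goes in the <head> of each duplicate page and points at the one preferred URL:

```html
<!-- Placed in the <head> of the duplicate page, e.g. the copy of a product
     that lives under a second category path (hypothetical URLs). -->
<link rel="canonical" href="https://www.example.com/category-a/product-123/" />
```

A 301 redirect configured at the server level achieves the same consolidation when the duplicate URL does not need to remain accessible in its own right.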
To assist with this process, use http://copyscape.com/ to check for duplicates of your content, not only on your own website but on other websites as well.
Other ways to search for duplicate content would be to:
- Take a content snippet, put it in quotes and search for it.
- Does the content show up elsewhere on the domain?
- Has it been scraped? If the content has been scraped, you should file a content removal request with Google.
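The on-site part of this check can also be automated. Here is a minimal sketch in Python (the URLs and product copy are invented for illustration): hash each page's normalized text, and flag any two URLs that share a digest.

```python
import hashlib

def fingerprint(page_text: str) -> str:
    """Hash normalized page text so identical copy maps to the same digest."""
    normalized = " ".join(page_text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Hypothetical crawl results: URL -> extracted body text.
pages = {
    "/catalog/category-a/product-1": "Biotin-conjugated secondary antibody, 1 mg vial.",
    "/catalog/category-b/product-1": "Biotin-conjugated  secondary antibody, 1 mg vial.",
    "/catalog/category-a/product-2": "HRP-conjugated secondary antibody, 0.5 mg vial.",
}

seen = {}
for url, text in pages.items():
    digest = fingerprint(text)
    if digest in seen:
        print(f"Duplicate content: {url} matches {seen[digest]}")
    else:
        seen[digest] = url
```

In practice you would feed this from a crawler's extracted text; near-duplicates (reworded copy) need fuzzier comparison than an exact hash.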
THE DUPLICATE CONTENT AUDIT
31. Check there is one URL for each piece of content.
32. Check for subdomain duplicate content.
Does the same content exist on different subdomains? This includes different languages, staging/development sites, DNS servers, mirror sites and more.
33. Check for a secure version of the site.
Does the content exist on a secure (HTTPS) version of the site? If it does, make sure that version is the preferred version of the website and that other versions redirect to it.
Pro tip: “Google announced they would reward sites using HTTPS encryption with a boost in search results.” So if you aren’t using HTTPS encryption, you should be.
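If the site runs on Apache, one common way to make HTTPS the preferred version is a rewrite rule in .htaccess, sketched below (assumes mod_rewrite is enabled; nginx and other servers have their own equivalents):

```apacheconf
# Redirect every HTTP request to its HTTPS equivalent with a 301.
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
```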
34. Check other sites owned by the company.
Is the content replicated on other domains owned by the company?
35. Check for “print” pages.
If there are “printer friendly” versions of pages, they may be causing duplicate content.
36. Check URL Parameters.
Are product categories filtered using URL parameters? For example:
The second URL above is a duplicate of the first URL listing enzyme substrates, except the second URL is filtered for enzyme substrates with an IHC application (applications=ihc) and alkaline phosphatase detection enzyme (enzyme=ap).
The two URLs are duplicates of one another, as they display most of the same content. Use the URL Parameters tool in Google Search Console to tell Google which of these parameters to ignore.
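To make the underlying idea concrete, here is a small Python sketch of how filtered URLs collapse to one canonical listing URL. The parameter names applications and enzyme come from the example above; the domain, path and the list of ignorable parameters are assumptions you would adjust to your own site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameters that only filter or sort an existing listing (illustrative
# names based on the filtered-category example above).
IGNORABLE_PARAMS = {"applications", "enzyme", "sort", "page"}

def canonical_url(url: str) -> str:
    """Strip filter parameters so duplicate listings collapse to one URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in IGNORABLE_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

print(canonical_url("https://example.com/enzyme-substrates?applications=ihc&enzyme=ap"))
# → https://example.com/enzyme-substrates
```

The same mapping is what you are expressing to Google when you mark a parameter as "doesn't change page content" in the URL Parameters tool, or when the filtered page carries a canonical tag pointing at the unfiltered listing.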
37. Check that your international websites don’t have duplicate content.
If they do, a comprehensive fix is to add each international website as a separate property in Google Search Console and submit a unique XML sitemap for each. Additionally, use the International Targeting report in Google Search Console to specify which country or region each site targets. More on this in the International Chapter.