Duplicate content means that the same or very similar content appears at more than one location, either across the internet or inside a single website. As a result, search engines struggle to decide which version is the original and which is just a copy when choosing what to show in the search index.
Duplicate content can hurt the SEO of your blog. For example, when Googlebot finds the same content at multiple locations, it takes that as a signal that someone may have scraped the content. Google then tries to identify the original source so it can credit that page with the SEO points (link juice), and the pages that copied it can lose visibility because they do not follow Google’s guidelines.
Suppose you write an article on your website “ABC.com” and someone copies it and publishes it on their own website “DEF.com”. If Google mistakenly decides that DEF.com published the original and that you copied it, what do you do? Don’t worry. Today’s article is about duplicate content: how to understand it, how to deal with it, and how to fix it for SEO.
Introduction to Duplicate Content
Many people assume that creating multiple pages or similar copies of the same page will either increase their chances of getting listed in search engines or help them get multiple listings, due to the presence of more keywords. In order to make a search more relevant to a user, search engines use a Duplicate Content Filter that removes the duplicate content pages from the search results, and the spam along with it.
To understand the duplicate content filter, we first have to understand the idea of a duplicate content penalty. When we talk about penalties in search engine rankings, we are really talking about SEO points (link juice) that are deducted from a page when its overall relevancy score is calculated.
But in reality, duplicate content pages are not penalized. Rather they are simply filtered out from the search engine, the way you would use a sieve to remove unwanted particles. Sometimes, “good particles” are accidentally filtered out by the search engine.
The three biggest issues search engines face when dealing with duplicate content are:
- Search engines do not know which version to include in or exclude from their indices.
- Search engines do not know whether to direct the link juice to one page or keep it split between multiple versions.
- Search engines do not know which version(s) to rank for query results.
Difference Between Google Penalty and Filter
Now that you know the difference between a Google penalty and a filter, you can look at what a search engine treats as duplicate content. The following kinds of duplicate content are filtered out:
- Scraped Content – Scraped content is content that is copied from another website, sometimes repackaged to make it look different. Whether you place it on another domain of your own or someone copies it from your domain, if Google finds the same content at two different URLs, it has to choose one of them, and the other page is filtered out. The next section explains how Google picks the best version.
- E-Commerce Product Descriptions – Many eCommerce sites use the manufacturer’s descriptions for their products, and hundreds or thousands of other eCommerce stores in the same competitive markets use them too. This duplicate content, while harder to spot, is still treated as duplicate and can be filtered out.
- Websites with Identical Pages – If Google finds two websites with the same or very similar pages, it may treat them as spam. A typical example is a group of affiliate sites that share the same look and feel and contain identical content.
How Does the Duplicate Content Filter Work?
When a search engine bot crawls a website, it reads the pages and stores the information in its database. It then compares its findings with the other information it already holds. When Google detects duplicate content, it does the following:
- It groups the duplicate URLs into one cluster.
- It selects what it thinks is the “best” URL to represent the cluster in search results.
- It consolidates the properties of the URLs in the cluster, such as link popularity, onto the representative URL.
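The three steps above can be sketched in a few lines of Python. This is only an illustration, not Google’s actual algorithm: the `fingerprint` and `cluster_duplicates` functions, the `inbound_links` field, and the rule that the most-linked page represents the cluster are all assumptions made for this example.

```python
import hashlib
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing it and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cluster_duplicates(pages):
    """Group pages with the same fingerprint and pick one URL per cluster.

    `pages` is a list of dicts with hypothetical keys: url, text, inbound_links.
    """
    clusters = defaultdict(list)
    for page in pages:
        clusters[fingerprint(page["text"])].append(page)

    results = []
    for group in clusters.values():
        # Assumption: the page with the most inbound links represents the cluster.
        best = max(group, key=lambda p: p["inbound_links"])
        results.append({
            "representative": best["url"],
            "filtered_out": sorted(p["url"] for p in group if p is not best),
            # Link popularity of the whole cluster is credited to the representative.
            "combined_links": sum(p["inbound_links"] for p in group),
        })
    return results


pages = [
    {"url": "http://abc.com/post", "text": "How to fix duplicate content", "inbound_links": 12},
    {"url": "http://def.com/copy", "text": "how to  fix duplicate content ", "inbound_links": 3},
    {"url": "http://abc.com/other", "text": "A different article", "inbound_links": 5},
]
results = cluster_duplicates(pages)
for cluster in results:
    print(cluster)
```

Here the two copies of the same article end up in one cluster, the ABC.com page represents it, and the DEF.com copy is the one filtered out. A real search engine would also catch near-duplicates, which an exact hash like this cannot do.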
How To Find Duplicate Content
We can divide duplicate content into two kinds: external duplicate content (on the internet) and internal duplicate content (inside the website).
External Duplicate Content: First of all, you need to know which sites might have copied your content or pages. You can use an online duplicate content checker (copyscape.com) to find duplicate content on the internet. If you find that someone has copied your content, take action against that website. If you find that your content is similar to someone else’s, change your content to make it different from theirs.
In Copyscape you can enter the URL of your page, or the content you want to check, and it searches other websites for copies. This tool can also help you create unique content, or address the issue of someone “borrowing” your content without your permission.
You will see a list of other pages that are similar to your page. Click on each result and check them one by one to see what percentage of the page matches yours.
Second, using the Similar Page Checker tool, you can determine the similarity between two pages and make them as unique as possible.
When you enter the URLs of two pages, this tool compares them and points out how they are similar so that you can make them unique. If you have an eCommerce site, you should write original descriptions for your products. This can be hard to do if you have many products, but it really is necessary if you wish to avoid the duplicate content filter. The Similar Page Checker tool can show you where to change your descriptions so that your site has unique and original content.
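To give a feel for what a “percentage of similarity” between two pages means, here is a minimal Python sketch. It is not how Copyscape or the Similar Page Checker work internally (their methods are not described here). It assumes a simple approach: split each text into overlapping three-word groups (shingles) and measure how many of those groups the two texts share. The function names `shingles` and `similarity_percent` are made up for this example.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word groups of length `size`."""
    words = text.lower().split()
    if len(words) < size:
        # Very short texts are treated as a single group.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity_percent(text_a: str, text_b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, as a percentage."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a and not set_b:
        return 100.0  # two empty texts are trivially identical
    return 100.0 * len(set_a & set_b) / len(set_a | set_b)


manufacturer_text = "blue widget with steel frame and rubber grip"
store_text = "blue widget with steel frame and leather strap"
score = similarity_percent(manufacturer_text, store_text)
print(f"{score:.1f}% similar")
```

The two sample descriptions share four of their eight distinct word groups, so they come out as 50% similar. The more you rewrite a copied description, the lower this number drops.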
Internal Duplicate Content: The second step is to check for duplicate content inside the website. Siteliner is Copyscape’s sibling tool, and it searches for internal duplicate content. This duplicate content checker finds duplicate content on your own site.
The Siteliner duplicate content checker shows you a lot of things, but the free version is limited to 250 pages and 30 days. Again, there is a premium version, but the free one will already give you a good idea. Just run a search and open the overview page.
You can also use the duplicate title tag option under HTML Improvements in the webmaster tool to check for internal duplicate content on your blog.
Currently, I have no HTML duplicate content errors on my website.
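If you want to run a similar duplicate title check on pages you have already downloaded, the idea can be sketched in Python with the standard library. This is not the webmaster tool’s own method. It is a simplified example that assumes you have each page’s HTML in a dictionary keyed by URL, and the names `TitleParser`, `extract_title`, and `duplicate_titles` are made up for illustration.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the <title> tag."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Return the page title with extra whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    return " ".join(parser.title.split())


def duplicate_titles(pages: dict) -> dict:
    """Map each title used by more than one URL to the list of those URLs."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        by_title[extract_title(html).lower()].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


pages = {
    "http://abc.com/post": "<html><head><title>Duplicate Content Guide</title></head></html>",
    "http://abc.com/post-old": "<html><head><title> duplicate content   guide </title></head></html>",
    "http://abc.com/about": "<html><head><title>About</title></head></html>",
}
found = duplicate_titles(pages)
print(found)
```

Titles are compared in lowercase with whitespace collapsed, so small formatting differences do not hide a duplicate.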
Remember, some search engines, like Google, use link popularity to determine the most relevant results. Keep building your link popularity, and use tools like www.copyscape.com to find out how many other sites have the same article. If the author allows it, you may be able to alter the article to make the content unique.
How To Deal With Duplicate Content
There are several ways to deal with duplicate content on the web. Here is a list of some common duplicate content issues, each with a solution to fix it:
- Copied Page
- Unoptimized category
- Duplicate Product Information
- HTML Error
- Cross Domain
Copied Page
The Situation: Let’s start with an example. I’m a site owner who has found a great piece of content on a different website, and I would like to share this content on my website.
The Issue: This content is going to be valued poorly, or counted as scraped content, on your site, and it may contribute to a drop in your overall domain quality score.
The Fix: A cross-domain canonical tag is the usual fix here. You’ll need to add a canonical URL tag to the page indicating that the original source of the content is at a different location.
<link rel="canonical" href="http://www.abcwidgets.com/copied-article.html"/>
This tells the search engines that you know the article is copied, that it is intentionally placed on your site, and that all link juice for that page should pass to the original location of the article.
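After adding the tag, it is worth confirming that the page really carries it and that it points to the other domain. The following Python sketch does that check on a page’s HTML using only the standard library. The names `CanonicalParser`, `find_canonical`, and `is_cross_domain`, and the `www.def.com` page URL, are hypothetical and only serve the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CanonicalParser(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


def is_cross_domain(page_url: str, html: str) -> bool:
    """True when the canonical URL is on a different host than the page itself."""
    canonical = find_canonical(html)
    if canonical is None:
        return False
    return urlparse(canonical).netloc.lower() != urlparse(page_url).netloc.lower()


html = (
    "<html><head>"
    '<link rel="canonical" href="http://www.abcwidgets.com/copied-article.html"/>'
    "</head><body>Copied article</body></html>"
)
print(find_canonical(html))
print(is_cross_domain("http://www.def.com/copied-article.html", html))
```

In practice you would fetch the live page first and pass its HTML to `find_canonical`; the fetching step is left out here to keep the sketch self-contained.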
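Unoptimized Category
The Situation: If one article is filed under two categories, it can be accessed through two different URLs.
The Issue: Google cannot tell which version of the page is the original or where to pass the link juice.
The Fix: Choose one of the URLs as the preferred version and add a canonical URL tag on the other one pointing to it.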
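To spot articles that are reachable through more than one category URL, you can group your URLs by the article part of the address. The Python sketch below assumes that the last path segment is the article slug and that the category comes before it. That is only one common URL layout, so adjust it to match your own site. The helper names `group_by_slug` and `canonical_tag` are made up for this example.

```python
from collections import defaultdict
from html import escape
from urllib.parse import urlparse


def group_by_slug(urls):
    """Return slugs that appear under more than one URL, with those URLs."""
    groups = defaultdict(list)
    for url in urls:
        path = urlparse(url).path.rstrip("/")
        slug = path.rsplit("/", 1)[-1]  # assumption: last path segment is the article
        groups[slug].append(url)
    return {slug: sorted(found) for slug, found in groups.items() if len(found) > 1}


def canonical_tag(preferred_url: str) -> str:
    """Build the canonical tag to place on every duplicate version of the page."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}"/>'


urls = [
    "http://abc.com/seo/duplicate-content/",
    "http://abc.com/blogging/duplicate-content/",
    "http://abc.com/seo/link-building/",
]
duplicates = group_by_slug(urls)
for slug, versions in duplicates.items():
    print(slug, versions)
    print(canonical_tag(versions[0]))
```

Which version you treat as preferred is your own choice. The sketch simply takes the first URL in alphabetical order.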
Duplicated Product Information
The Situation: You run an e-commerce site selling blue widgets from a variety of manufacturers. These manufacturers provide you with product information (titles, descriptions, specs, and images) to post on your site.
The Issue: The manufacturers are also providing the exact same information to everyone else who’s selling their products.
The Fix: While specs remain the same and duplication is acceptable across multiple sites, you need to set your site apart. This will generally involve writing new product descriptions, taking new photos, and hopefully adding content unique to your site such as reviews.
HTML Duplicate Tag
The Situation: You change or edit the URL of an existing post.
The Issue: The page can now be accessed through two different URLs: the old one and the new one.
The Fix: Go to HTML Improvements in the Google webmaster tool and remove the old URL using the removal tool. A 301 redirect from the old URL to the new one also helps, since it sends both visitors and search engines to the current address.
Cross Domain
The Situation: Suppose you have two domains and you publish the same content on both.
The Issue: Google finds the same content on both domains and filters one of them out.
The Fix: Use a canonical URL tag to tell Google which version it should index.
Google has said time and time again that duplicate content issues are rarely a penalty. It is more about Google knowing which page it should rank and which page it should not. Google doesn’t want to show the same content to searchers more than once for the same query, so fixing duplicate content is important from an SEO point of view: it helps the overall domain authority of the website and builds trust with Google, which helps you rank higher in the search results.
If you have any suggestions or problems, please feel free to comment below.