How to avoid duplicate content within the domain?
Content duplication is a big problem in SEO. When multiple copies of identical content are available on different URLs or websites, search engines filter out the duplicates and display only one copy. It is difficult for search engines to identify which copy is the original, so sometimes the copied version is shown in the search results and the original is filtered out. There are two types of duplicate content: duplication within a domain and duplication across domains.
What causes content duplication?
Content duplication within the domain happens due to various reasons:
1) The developer forgets to remove duplicate pages before launching the website
2) Multiple versions of the website exist for different devices, or as printer-friendly pages
3) WWW and non-www versions of the website
4) Session IDs, parameters and pagination in URLs
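To see how causes 3 and 4 turn one page into several URLs, here is a minimal Python sketch that collapses such variants into a single comparable form and groups the URLs that end up identical. The parameter names in IGNORED_PARAMS and the helper names are assumptions for illustration; adjust them to what your own site uses. Pagination parameters are deliberately left alone because they usually do change the content.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names that do not change what the page shows.
IGNORED_PARAMS = {"sessionid", "sid", "phpsessid", "utm_source", "utm_medium", "utm_campaign"}


def normalize_url(url: str) -> str:
    """Collapse www/non-www and tracking/session variants into one form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Keep only parameters that affect content, sorted so order does not matter.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    # Assumption: http and https serve the same content, so the scheme is fixed.
    return urlunsplit(("http", host, path, urlencode(query), ""))


def group_duplicates(urls):
    """Return normalized URL -> original URLs, for groups with more than one member."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize_url(url), []).append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Feeding it a list of URLs from your server logs or sitemap gives a quick list of candidates for a redirect or a canonical tag.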
Why should duplicate content be avoided?
Duplicate content does not help you in any way. It does not improve your sales or your SEO ranking.
Google does not penalize a website for duplicate content within the domain. It simply chooses one URL among the duplicates (whichever it considers most relevant at the time) and indexes it. When different URLs are indexed and shown in the search results for the same query, the SEO weight (search metrics) of your content is diluted across them. If only one page is indexed for that query, all of the traffic and ranking signals go to that page, which helps its ranking.
If Google sees too many URLs for the same content, it may conclude that the duplication is an intentional attempt to manipulate rankings and remove the pages from the search results. This happens very rarely.
Another reason to avoid duplicate content is user experience. If your visitors keep seeing the same content at different URLs, they get annoyed and may not come back to your website.
How to identify content duplication?
There are many easy ways to identify duplicate content.
#1 Google Search
Do a Google search for a distinctive phrase from your webpage, enclosed in quotation marks. The results will show the URLs of your website (and sometimes other domains too) that contain the exact phrase. Adding site:mydomainname.com to the query restricts the results to your own domain. Make a list of these URLs and take the required steps to avoid duplication.
#2 Use free online tools
Copyscape and Siteliner are good online duplicate content checker tools.
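If you want a rough local check before turning to these tools, the sketch below compares two page texts by the word sequences they share. It is only an illustration of the general idea of near-duplicate detection, not how Copyscape or Siteliner work internally; the function names and the three-word window are assumptions.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences ("shingles") in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets: 1.0 means same wording, 0.0 means no overlap."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)
```

Pages whose visible text scores close to 1.0 against each other are good candidates for the fixes described later in this article.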
#3 Google Search Console
Many websites forget to add unique meta tags to all their pages. They change the title and meta tags on important pages and leave the remaining pages as they are. Google Search Console (formerly known as Webmaster Tools) helps in identifying duplicate titles and meta tags. You can easily make a list of the pages with duplicate meta tags and work on them.
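The same kind of list can be built offline from saved copies of your pages. The following is a minimal sketch using only the Python standard library; the class and function names are made up for this example, and it assumes you already have each page's HTML as a string.

```python
from collections import defaultdict
from html.parser import HTMLParser


class HeadTagParser(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def find_duplicates(pages, field="title"):
    """pages maps URL -> HTML; returns value -> URLs for values used on more than one page."""
    by_value = defaultdict(list)
    for url, html in pages.items():
        parser = HeadTagParser()
        parser.feed(html)
        by_value[getattr(parser, field).strip()].append(url)
    return {value: urls for value, urls in by_value.items() if len(urls) > 1}
```

Call find_duplicates(pages, "title") for duplicate titles and find_duplicates(pages, "description") for duplicate meta descriptions.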
How to fix content duplication?
If the content duplication is within the same domain, it can be fixed easily by the webmaster by following these steps:
#1 301 redirect (Permanent redirect)
You can set a preferred URL for your content with a 301 redirect, which is one form of URL canonicalization. If your website is hosted on an Apache server, you can do this in your .htaccess file. Set up a 301 redirect from all other URLs to the preferred URL so that the ranking signals of your content are consolidated on a single URL.
To redirect product1.html to product2.html, add the following lines to your .htaccess file:
Options +FollowSymLinks
RewriteEngine on
# Permanent redirect; assumes both pages sit in the site root
RewriteRule ^product1\.html$ /product2.html [R=301,L]
To redirect the non-www version of your website to the www version, add the following lines to your .htaccess file:
Options +FollowSymLinks
RewriteEngine on
# Match requests for the bare domain, ignoring case
RewriteCond %{HTTP_HOST} ^mydomainname\.com$ [NC]
# Redirect them permanently to the same path on the www host
RewriteRule ^(.*)$ http://www.mydomainname.com/$1 [R=301,L]
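Before deploying the non-www rule, it can help to write down which URLs you expect to be redirected and where. The sketch below mirrors the intent of that rule in Python so you can build such a list; it is an approximation for reasoning about the rule, not a replacement for testing against the real server, and the function name is an assumption.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Mirrors the RewriteCond pattern; re.IGNORECASE plays the role of the [NC] flag.
HOST_PATTERN = re.compile(r"^mydomainname\.com$", re.IGNORECASE)


def redirect_target(url: str):
    """Return the expected 301 target for a non-www URL, or None if the rule does not apply."""
    parts = urlsplit(url)
    if not HOST_PATTERN.match(parts.hostname or ""):
        return None
    # The path and query string are carried over unchanged, as with $1 in the rule.
    return urlunsplit(("http", "www.mydomainname.com", parts.path or "/", parts.query, ""))
```

You can compare these expected targets with the Location header your server actually returns once the rule is live.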
#2 Rel=canonical tag
Another option is to add a rel=canonical tag to the web pages that should be treated as duplicates. It consolidates ranking signals in much the same way as a 301 redirect, but visitors are not redirected and can still open the duplicate page.
Assume two URLs, product1.html and product2.html. If you want to set product2.html as your preferred URL, add a rel=canonical tag to the product1.html page as follows:
<link rel="canonical" href="http://mydomainname.com/product2.html">
Add the above tag inside the <head> section of product1.html.
When search engines see this tag, they will treat product1.html as a duplicate version of product2.html and pass all the ranking signals of product1.html to product2.html.
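To confirm that a duplicate page really declares the preferred URL, you can read the tag back out of its HTML. This is a minimal sketch with the Python standard library; the class and function names are assumptions, and it only looks at the first canonical link it finds.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attrs = dict(attrs)
        # rel may hold several space-separated values, so check membership.
        rel_values = (attrs.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attrs.get("href")


def canonical_url(html: str):
    """Return the canonical URL declared in the page, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical
```

For the example above, running canonical_url on the source of product1.html should return the product2.html address.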
#3 Noindex Meta robots Tag
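You can add <meta name="robots" content="noindex, follow"> to your duplicate pages so that search engines don't index them. However, Google does not recommend using noindex for duplicate content: a noindexed page is simply dropped from the index instead of being consolidated with the original. Blocking duplicate pages from crawling is discouraged for a similar reason. If search engines cannot crawl your duplicate pages, they cannot recognize them as duplicates of your original content.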
Search engines recommend using a 301 redirect or the rel=canonical tag instead. Webmaster Tools also offered a URL Parameters setting for choosing the preferred URL, but Google has since retired it.