Wednesday, December 18, 2013

An interesting clip of Matt Cutts on how Google handles duplicate content

Matt Cutts released another video to help webmasters, and I found the link in his tweets.
The video cleared up the duplicate content issue for me. During an audit we usually check for internal and external duplication to make sure a website's content is unique. Today I learned more by watching this video, in which he answers the question “How does Google handle duplicate content?”

According to Matt Cutts, 25 to 30% of all web content is duplicated. Duplicate content arises, for example, when people quote a paragraph from a blog and link to it. So it is not the case that every piece of duplicate content is spam. Google looks for duplicate content and, when it finds some, often tries to group it all together and treat it as if it were just one piece of content. So most of the time, if Google is about to return a set of search results and two of the pages are essentially identical, it shows one of those pages and crowds the other result out. Someone who gets to the bottom of the search results and wants an exhaustive list can still change the filtering to see every page.
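
To picture the "group it together and show one page" idea, here is a minimal Python sketch. It is only my own illustration, not how Google actually does it: the function names, the normalization step and the rule of keeping the first URL seen are all assumptions, and it only catches pages whose text is identical after lowercasing and collapsing whitespace.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_duplicates(pages: Dict[str, str]) -> List[List[str]]:
    """Group URLs whose normalized text is identical, in first-seen order."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return list(groups.values())


def pick_representatives(pages: Dict[str, str]) -> List[str]:
    """Keep one URL per group (assumption: simply the first one seen)."""
    return [group[0] for group in group_duplicates(pages)]


# Hypothetical pages, made up for this example.
pages = {
    "https://example.com/post": "Duplicate content is  common.",
    "https://example.co.uk/post": "duplicate content is common.",
    "https://example.com/other": "Something else entirely.",
}
print(pick_representatives(pages))
# ['https://example.com/post', 'https://example.com/other']
```

The .co.uk copy is not deleted or punished here; it is just crowded out of the list, which is the behaviour described in the video.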

That said, if one does nothing but duplicate content, and does it in an abusive, deceptive, malicious or manipulative way, Google reserves the right to take action on it as spam.
If someone shows the content of another blog through an RSS feed without adding any extra value, it is more likely to be viewed as spam.

Matt also says,
“If one is automatically generating stuff that’s coming from nothing but an RSS feed one is not adding a lot of value so that duplicate content might be a little bit more likely to be viewed as spam.”
But it is different when one is just making a regular website and is worried about having the same thing on a .com and a .co.uk domain, or about having two versions of the terms and conditions, an older one and a newer one, or something like that. That kind of duplicate content happens all the time on the web. There is really no need to stress about having a little bit of duplicate content, as long as one is not trying to copy massively from journals or blogs.

Now I am a bit confused. As an SEO strategist I always advise my clients to keep their content unique and 100% clean in Copyscape, because post-Panda the impact of duplicate content has become much more severe.
I learned three different types of duplication that can be risk factors in SEO: (1) True Duplication, (2) Near Duplication, and (3) Cross-domain Duplication.
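
To make the three types concrete, here is a rough Python sketch of how an audit script could tell them apart. Everything in it is my own assumption: it compares three-word shingles with Jaccard similarity, the 0.5 threshold for a near duplicate is an arbitrary choice, and it treats cross-domain as a flag on top of true or near duplication rather than as a separate case. Real tools and search engines may work quite differently.

```python
from urllib.parse import urlparse


def shingles(text, size=3):
    """Return the set of overlapping word triples in the text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles that the two sets have in common."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def classify(url_a, text_a, url_b, text_b, near_threshold=0.5):
    """Label a pair of pages; near_threshold is an arbitrary assumption."""
    similarity = jaccard(shingles(text_a), shingles(text_b))
    if similarity == 1.0:
        kind = "true duplicate"
    elif similarity >= near_threshold:
        kind = "near duplicate"
    else:
        return "distinct"
    # Different hosts turn the pair into a cross-domain case.
    if urlparse(url_a).netloc != urlparse(url_b).netloc:
        kind += " (cross-domain)"
    return kind


original = "Google groups duplicate pages and shows one of them in results"
edited = "Google groups duplicate pages and shows one of them in search results"

print(classify("https://example.com/a", original, "https://example.com/b", original))
print(classify("https://example.com/a", original, "https://example.com/b", edited))
print(classify("https://example.com/a", original, "https://example.co.uk/a", original))
```

The first pair comes out as a true duplicate, the second as a near duplicate (8 of 11 distinct shingles are shared), and the third as a true duplicate across domains.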

But in this video Matt states that somewhere between 25% and 30% of the content on the web is duplicated, and that this simply happens. So I think we can quote someone's statement and it will not be counted as duplication, but 30% is still huge.

Another point he makes in the video is that there is no reason to worry if you have multiple domains with the same content. Then why do people use the cross-domain canonical tag? We know that search engines will generally take one version and filter the others out, but we still need to use the canonical tag to avoid cross-domain duplication. If we take this video at face value, we can say that Near Duplication and Cross-domain Duplication are still tolerable, and that we only need to take care about True Duplication.
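
For readers who have not used it, the canonical tag is a `<link rel="canonical">` element placed in the `<head>` of the duplicate page and pointing at the preferred URL. Below is a small Python sketch that builds one; the helper name `canonical_link` and the example URLs are made up for illustration, and search engines treat the tag as a hint rather than a command.

```python
from html import escape


def canonical_link(preferred_url: str) -> str:
    """Build the canonical link element for the <head> of a duplicate page."""
    # Escape the URL so characters such as & stay valid inside the attribute.
    return '<link rel="canonical" href="{}">'.format(escape(preferred_url, quote=True))


# On the hypothetical https://example.co.uk/terms page, point to the .com version.
print(canonical_link("https://example.com/terms"))
# <link rel="canonical" href="https://example.com/terms">
```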
What do you think?
