
Most search engines strive for a certain level of variety; they want to show you ten different results on a search results page, not ten different URLs that all have the same content.
Susan Moskwa, Google Webmaster Trends Analyst
The upcoming WordPress 2.7 introduces new advanced comment features: comment threading, nesting, paging (pagination) and so on. You probably won’t have any issues with WordPress 2.7, as most of the new features are disabled by default. That’s good news.
Unfortunately, for enthusiastic users (like me) who wish to use comment paging, there is one caveat: multiple pages with similar content. The screenshot below is taken from the Google Webmaster diagnostic tools for duplicate content.
Comment paging duplicate content
If you have a Google Webmaster account, check out the Content analysis tool under Diagnostics. There is a high chance that your blog will generate duplicate content if the comment paging feature is enabled.
How to prevent duplicate content
Reducing duplicate content on your website is good SEO practice. I made a small filter script for WordPress 2.7. The script below adds a robots noindex meta tag to paged comment views. This prevents search engine indexers and similar services from indexing your blog’s comment pages.
noIndex meta
Installation: copy and paste the code below into your theme’s functions.php.
/**
 * void wpi_comment_paging_noindex_meta()
 * Add meta noindex rules on Singular comment page section
 *
 * @author Avice D <ck+filter@kaizeku.com>
 * @license http://www.gnu.org/licenses/lgpl.html GNU Lesser General Public License
 * @link http://blog.kaizeku.com/wordpress/prevent-wordpress-27-duplicate-content/
 *
 * @todo Check for duplicate meta-robots tag generated by
 *       meta-tag type plugins (SEO plugins)
 *
 * @uses $wp_query Wp_query object
 * @return void Echoes the HTML meta noindex tag
 */
function wpi_comment_paging_noindex_meta()
{
    global $wp_query;

    // Compare version strings directly; a float cast would misread e.g. "2.10"
    if (version_compare(get_bloginfo('version'), '2.7', '>=')) {
        // Single post or page with comment paging enabled
        if ($wp_query->is_singular && get_option('page_comments')) {
            // 'cpage' is present in the parsed request on paged comment URLs
            if (isset($wp_query->query['cpage']) && absint($wp_query->query['cpage']) >= 1) {
                echo '<meta name="robots" content="noindex" />' . PHP_EOL;
            }
        }
    }
}
add_action('wp_head', 'wpi_comment_paging_noindex_meta');
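To make the condition easier to reason about (and to test outside WordPress), the decision can be pulled into a small pure function. This is only a sketch: `wpi_is_paged_comment_view()` is a hypothetical helper name, not part of the original filter, and `abs((int) ...)` stands in for WordPress's `absint()` so the snippet runs as plain PHP.

```php
<?php
/**
 * Hypothetical helper (not in the original script): decide whether the
 * current view is a paged comment view that should carry "noindex".
 *
 * @param bool  $is_singular    True on a single post or page.
 * @param bool  $paging_enabled Value of the 'page_comments' option.
 * @param mixed $cpage          Raw 'cpage' query value, or null when absent.
 * @return bool
 */
function wpi_is_paged_comment_view($is_singular, $paging_enabled, $cpage)
{
    // Only single posts/pages with comment paging switched on qualify.
    if (!$is_singular || !$paging_enabled) {
        return false;
    }

    // No 'cpage' in the request means this is the plain permalink.
    if ($cpage === null) {
        return false;
    }

    // Same check as absint($cpage) >= 1 in the filter.
    return abs((int) $cpage) >= 1;
}
```

Inside WordPress the three arguments would come from `$wp_query->is_singular`, `get_option('page_comments')` and `$wp_query->query['cpage']`, exactly as in the filter.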
Why is duplicate content bad for your blog?
- Having multiple search crawlers index the same content on your website (over and over) is an absolute waste of bandwidth.
- Bad PR. This is debatable to the point of myth: some search engines penalize websites for duplicate content, but according to the Google webmaster team, “If the duplicate content is not done to game (deceive) the search results there is no penalty”. To be safe, just assume that every major search engine has penalty rules for duplicate content. In my past experience there is no warning or advance notice: “they’ll just drop you, and you can spend some time figuring out why later”.
- Bad SERP. Rationally, the main articles you post should carry more weight than the comment pages or sub-pages. You don’t want to see your archives or comment pages in search engine results while your main articles are nowhere to be found.
Plug-it
I don’t intend to make a WP plugin out of this simple function, as this issue will probably be addressed and solved by WP’s developers or someone else up there. This code is released under an open-source license; do whatever you want with it.
23 Responses to “Prevent Duplicate Content on Comment Paging”
Great code snippet. I’m adding it now.

thx, that was a big problem 4me

It’s a bit extreme to no-comment all your comments isn’t it? They might have some useful content in them.
I have also spotted this problem of duplicate content. I’d just managed to solve the problem for pages 2 & higher of the standard loop: http://www.malcolmcoles.co.uk/blog/avoid-duplicate-meta-descriptions-in-pages-2-and-higher-of-the-wordpress-loop/ when along came the problem for the comment loop!
I presume there is some variable like $paged for the comment loop but I have yet to find it.

I would agree that comments “might have some useful content in them”, but I would rather leave that to the audience. It all depends on context.
Anyway, having the noindex meta tag won’t stop bots from crawling the page. The “noindex” meta rule just excludes the URLs from the search results. “robots.txt” rules, on the other hand, block bots from visiting the page at all (assuming all bots play nice). An example: http://wp.istalker.net/robots.txt
You’ll get duplicate content, meta titles and meta descriptions if the comment pages are not excluded from being indexed.
It would be nice if WordPress had a separate sub-section for comments, like the attachment pages.
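
The robots.txt route mentioned in this reply could be sketched roughly as follows. Everything here is an assumption layered on top of the post: `wpi_robots_txt_skip_comment_pages()` is a hypothetical name, the sketch relies on the `robots_txt` filter (which, as far as I know, is not available in 2.7 and only arrived in later WordPress releases), it only has an effect when WordPress generates robots.txt itself rather than serving a physical file, and the `*` wildcard is honoured by the major crawlers but not guaranteed for every bot.

```php
<?php
/**
 * Hypothetical filter callback: append a rule group asking crawlers to
 * skip paged comment URLs such as /some-post/comment-page-2/.
 *
 * @param string $output Existing robots.txt content.
 * @param mixed  $public Truthy when the blog is visible to search engines.
 * @return string
 */
function wpi_robots_txt_skip_comment_pages($output, $public)
{
    // Leave the output alone when the blog already discourages crawlers.
    if (!$public) {
        return $output;
    }

    // A separate group keeps the rule valid whatever precedes it.
    $rules = "User-agent: *\n" . "Disallow: /*/comment-page-\n";

    return rtrim($output) . "\n\n" . $rules;
}

// Register only inside WordPress, so the file also loads as plain PHP.
if (function_exists('add_filter')) {
    add_filter('robots_txt', 'wpi_robots_txt_skip_comment_pages', 10, 2);
}
```

Note the trade-off: a crawler that is blocked by robots.txt never fetches the page, so it never sees a noindex meta tag placed on it. It is probably best to pick one of the two approaches per URL pattern rather than combine them.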

no-index your comments I meant, not no-comment.

Inevitably, the next post I read gave me the solution to the problem! My solution gives you unique titles:
http://www.malcolmcoles.co.uk/blog/avoid-duplicate-content-paged-comments-wordpress-27/
and the post I link to explains how to show just the excerpts on pages 2 and higher of the comments.

Using the noindex meta means Google doesn’t index all the comments on our blog. That seems fortunate to me.

In theory yes, it’ll give more weight to your articles (without the extra comment sub-pages).
The post comment feeds will still get indexed, though.

Hi! Is it possible to target=”_blank” the rotative wallpapers? Thanks for sharing this…

i guess yes, check out next release

lol off topic

Interesting article, thanks.

I have tried quite a few WordPress SEO plugins to deal with duplicate content. Some of them work very well, but I like to hard code as much as I can into my template. I just feel that at some point in time, I am going to experience plugin hell of some sort.
I have been doing SEO for quite some time, but the more I learn the more I realize what I don’t know. Then, there is all of the stuff that changes to deal with as well. It is never ending and keeps a person thinking, that is for sure.
Are you very experienced with robots.txt file configuration? It is hard to find someone who knows their business in that field. I have contacted quite a few people and they never follow through with the goods.

It is important that the comment page URL be set to noindex.
Take for example these 2 URLs which are different but with the same content.
http://www.lugaluda.com/2009/02/17/a-15-year-old-mad-monkey-or-chimpanzee-attacks-a-woman-causing-serious-injuries/comment-page-1/?
http://www.lugaluda.com/2009/02/17/a-15-year-old-mad-monkey-or-chimpanzee-attacks-a-woman-causing-serious-injuries/?
Technically speaking they are one page, but Google still sees this as duplicate content.
To solve this, you just need to disable paged comments on your blog and wait to see what solution the WordPress team comes up with.

Thanks. It helped me a lot. I was going to swap WordPress for something more SEO friendly but could not find a better solution. After your advice I will probably stay with WP for a while. Too good support to leave it :)