Losing the WordPress spammers in the crowd

WordPress has a notoriously bad relationship with comments.  If a naive WordPress admin turns off the comment approval process on their site, they will be greeted with a wave of spam.  But don’t worry, WordPress has some great spam filtering plugins, so once you install one of those you don’t have to deal with spam, right?  Wrong.

The problem with spam filters is that as soon as one is used by a lot of sites, spammers go to work finding a way around it.  So while these filters may reduce the number of spam comments you receive, they will not rid your site of them.  The most effective filters add a verification step to the commenting process (such as a CAPTCHA).  This is a nuisance for users, and I don’t like making humans prove they are not robots.

The Solution

Write your own damn filter.  Normally I do not try to reinvent the wheel, but in this scenario it makes sense.  Spammers only try to circumvent filters that are used by many sites, so a custom filter used by only a couple of sites keeps you off their radar.  Any competent spammer could easily reverse engineer my filter and figure out a way around it, but no one is going to do that just to spam my website.

The Code

The whole plugin is a whopping 47 lines of code. It alters the comment form (via the ‘comment_form’ hook) and the comment submission process (via the ‘wp_insert_comment’ hook):

<?php
/*
Plugin Name: NCG Spam filter
Description: Adds an extra validation script to prevent spam
Version:     1
Author:      Nathan Gagnon
*/

// Print two hidden fields plus a script that fills in the second one after five seconds.
function add_form_spam_protection(){
  // Encrypt the current time. The key is hard-coded; change it for your own site.
  $encrypt = openssl_encrypt( date('Y-m-d H:i:s'), "aes128", "NCGTESTKEY" );
  // Drop the base64 padding and escape "a", "+" and "/" as "aa", "ab" and "ac".
  $encrypt = strtr( $encrypt, array( "=" => "", "a" => "aa", "+" => "ab", "/" => "ac" ) );

  $str = <<<END
<input type="hidden" id="ncg_part1" name="ncg_part1" value="$encrypt" />
<input type="hidden" id="ncg_part2" name="ncg_part2" value="" />
<script>
setTimeout( function(){
  var myval = document.getElementById('ncg_part1').value;
  document.getElementById('ncg_part2').value = myval.substring(8)+myval.substring(0,8);
}, 5000 );
</script>
END;
  echo $str;
}

// Mark a freshly inserted comment as spam unless ncg_part2 holds a valid, recent timestamp.
function check_form_spam( $comment_id, $comment_object ){
  $encrypt_orig = isset($_REQUEST['ncg_part2']) ? $_REQUEST['ncg_part2'] : '';
  // Undo the rotation: move the last 8 characters back to the front.
  $encrypt = substr($encrypt_orig,-8).substr($encrypt_orig,0,strlen($encrypt_orig)-8);
  // Undo the escaping in one left-to-right pass. Chained replacements would
  // misread "aab" (an escaped "a" followed by "b") as "a+".
  $encrypt = strtr( $encrypt, array( "aa" => "a", "ab" => "+", "ac" => "/" ) );
  $decrypt = openssl_decrypt( $encrypt, "aes128", "NCGTESTKEY" );
  // Spam if decryption failed, or the form is older than 4 hours or younger than 8 seconds.
  if( !$decrypt || strtotime($decrypt) < strtotime('-4 hours') || strtotime($decrypt) > strtotime('-8 second') ){
    $comment_arr = array(
      "comment_ID" => $comment_object->comment_ID,
      "comment_approved" => "spam",
    );
    wp_update_comment( $comment_arr );
  }
}

add_action('comment_form', 'add_form_spam_protection', 2);
// Default priority (10); the final 2 passes both hook arguments to the callback.
add_action('wp_insert_comment', 'check_form_spam', 10, 2);

The comment_form hook adds two hidden inputs to the form, ncg_part1 and ncg_part2. The ncg_part1 value is set to an encrypted timestamp and ncg_part2 is left blank.  The hook then adds a small snippet of JavaScript that sets ncg_part2’s value to a shuffled version of ncg_part1. The snippet waits five seconds to do this in order to filter out comments that are submitted immediately after the page is loaded. Genuine commenters are not going to submit the instant the page loads; spammers probably will.
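To make the shuffling concrete, here is a minimal standalone sketch of the round trip, runnable without WordPress. The function names are made up for illustration; only the escaping rules and the 8-character rotation are taken from the plugin. It uses strtr() so that each escape is decoded in a single left-to-right pass, and the sample token is just a stand-in for a real encrypted timestamp.

```php
<?php
// Hypothetical helper names; only the escaping rules and the 8-character
// rotation come from the plugin.

// Make a base64 string form-friendly: drop "=" padding and escape
// "a", "+" and "/" as "aa", "ab" and "ac".
function ncg_escape(string $b64): string {
    return strtr($b64, ['=' => '', 'a' => 'aa', '+' => 'ab', '/' => 'ac']);
}

// Reverse ncg_escape() in a single left-to-right pass.
function ncg_unescape(string $escaped): string {
    return strtr($escaped, ['aa' => 'a', 'ab' => '+', 'ac' => '/']);
}

// What the browser script does: move the first 8 characters to the end.
function ncg_shuffle(string $s): string {
    return substr($s, 8) . substr($s, 0, 8);
}

// What the server does: move the last 8 characters back to the front.
function ncg_unshuffle(string $s): string {
    return substr($s, -8) . substr($s, 0, strlen($s) - 8);
}

$token     = 'ab+/cdEFGHaaXY/+0123456789==';  // stand-in for the encrypted timestamp
$part1     = ncg_escape($token);              // value of ncg_part1
$part2     = ncg_shuffle($part1);             // value the script writes to ncg_part2
$recovered = ncg_unescape(ncg_unshuffle($part2));

var_dump($recovered === rtrim($token, '='));  // bool(true)
```

The recovered string lacks the "=" padding that was stripped on the way out, which is why the comparison trims it from the original.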

The wp_insert_comment hook runs after the comment is inserted into the database.  It checks that the ncg_part2 value was populated correctly by the JavaScript snippet: it unshuffles and decrypts the value, then checks that the timestamp is at least eight seconds and at most four hours old.  If any of this fails, it marks the comment as spam.
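The time window is the part that actually catches instant submissions, so here it is pulled out as a small sketch. The helper name and the explicit $now parameter are my own additions for illustration (the plugin compares against strtotime() directly); the 8-second and 4-hour bounds are the ones from the plugin's comparisons.

```php
<?php
// Hypothetical helper; the bounds mirror the plugin's strtotime('-8 second')
// and strtotime('-4 hours') comparisons.
function ncg_timestamp_is_acceptable(string $stamp, int $now): bool {
    $issued = strtotime($stamp);
    if ($issued === false) {
        return false;               // not a parseable timestamp
    }
    $age = $now - $issued;          // seconds since the form was rendered
    return $age >= 8 && $age <= 4 * 3600;
}

$now = strtotime('2024-01-01 12:00:00');
var_dump(ncg_timestamp_is_acceptable('2024-01-01 11:59:57', $now)); // bool(false): 3 seconds old
var_dump(ncg_timestamp_is_acceptable('2024-01-01 11:59:30', $now)); // bool(true): 30 seconds old
var_dump(ncg_timestamp_is_acceptable('2024-01-01 07:00:00', $now)); // bool(false): 5 hours old
var_dump(ncg_timestamp_is_acceptable('', $now));                    // bool(false): nothing to parse
```

Passing the current time in as a parameter is only there to make the example repeatable; it does not change the rule being applied.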

As you can see, this filter is far from bulletproof.  Anyone who took the time to look at the script could figure out how to beat it.  In fact, all you have to do is load the page, let the JavaScript snippet run and wait a few more seconds before submitting.  But are spammers doing this? No.  If thousands of people started using this to filter comments, would they? Probably.

The Conclusion

If we were talking about security, it would be a horrible idea to rely on people not bothering to figure out how your protection works.  But spam filtering has low enough stakes for this approach to be okay.  The worst-case scenario for a security breach is that sensitive information is stolen, your website is vandalized, your reputation is damaged and your site is used for malicious purposes by hackers.  The worst-case scenario for your spam filter being breached is that you have spam in your comments.  For this reason, I recommend WordPress site maintainers implement their own spam filter like this.  The chance of this getting onto spammers’ radars is very low, and even if that does happen it will not be a big deal.
