Fighting Spam on the Cheap with CAPTCHA: A Simple Ruby Library for captchas.net
A recent article discussed two free services for cheaply integrating CAPTCHAs into Web applications. One of these services, captchas.net, apparently has no publicly available Ruby library. Given the popularity of Ruby on Rails for building Web applications, and the increasing need for the spam protection offered by services such as captchas.net, it seems only logical that such a library should exist. This article, the first in a series, documents the first step in the development of a simple Rails library for working with captchas.net.
Got Key?
To use captchas.net, you'll need to register for an account. You'll receive a secret key used in decoding the text represented in the CAPTCHA, and a username that will be encoded in your captchas.net URLs.
Building a URL
To get your CAPTCHA image from captchas.net, construct a URL containing the appropriate parameters. The simplest form of a captchas.net URL accepts your user name ('demo') and a random phrase ('my_random_text'), and returns a complete CAPTCHA image. Customization is possible, but we'll just stick with the simple case for now. As an example, this URL:
http://image.captchas.net/?client=demo&random=my_random_text
generates this image:
The URL above is all you need to embed a CAPTCHA into your webpage. The random text we've encoded in the URL ('my_random_text') is processed by the captchas.net server to create the six-character sequence shown in the image. Read on to find out how.
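As a minimal sketch, the URL can be assembled in Ruby with the standard library. The helper name captcha_image_url is hypothetical and not part of any captchas.net library. Generating a fresh random string for every page view is an assumption about sensible usage, not something the service is known to require.

```ruby
require 'uri'
require 'securerandom'

# Hypothetical helper: builds the simplest form of a captchas.net image URL
# from a user name and a random string.
def captcha_image_url(client, random)
  query = URI.encode_www_form(client: client, random: random)
  "http://image.captchas.net/?#{query}"
end

# The example from the text.
puts captcha_image_url('demo', 'my_random_text')
# prints "http://image.captchas.net/?client=demo&random=my_random_text"

# Assumption: in practice a new, hard-to-guess random string is generated
# for each form display. SecureRandom.hex(16) returns 32 hex characters.
random = SecureRandom.hex(16)
puts captcha_image_url('demo', random)
```

The resulting string can be dropped into the src attribute of an img tag. The random string has to be remembered on the server side, since it is needed again when decoding.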
Decoding the CAPTCHA
We've got a CAPTCHA, but how do we know what's written in it? This is where our secret key comes in. Here's the method used by the captchas.net server to generate the image text:
concatenate the secret key and the random string (example: 'secret' and 'my_random_text' become 'secretmy_random_text')
if alphabet or character_count differs from 'abcdefghijklmnopqrstuvwxyz' and 6, respectively, append both separated by ':' (<secret><random>:<alphabet>:<character_count>).
take the MD5-sum of the resulting string
take the first character_count bytes of the resulting 16-byte-long MD5 value
determine the remainders of these character_count bytes when dividing each by the length of alphabet
every number encodes a character from the chosen alphabet (example: "hnrppb")
The captchas.net site has a more complete description of the algorithm and an interactive CAPTCHA generator that is very helpful in understanding how CAPTCHAs are generated.
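The last two steps of the algorithm are plain modular arithmetic. The sketch below illustrates them with made-up byte values rather than a real MD5 digest, and assumes Ruby 1.9 or later, where indexing a string returns a one-character string.

```ruby
alphabet = 'abcdefghijklmnopqrstuvwxyz'

# Illustrative byte values only; these do not come from a real MD5 digest.
sample_bytes = [0, 25, 26, 255]

# Each byte is reduced modulo the alphabet length (26 here) and the
# remainder is used as an index into the alphabet:
#   0 % 26 = 0 -> 'a', 25 % 26 = 25 -> 'z', 26 % 26 = 0 -> 'a', 255 % 26 = 21 -> 'v'
text = sample_bytes.map { |byte| alphabet[byte % alphabet.size] }.join

puts text  # prints "azav"
```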
A Simple Library
Given the algorithm, a short Ruby library can be written to find the text encoded in a captchas.net CAPTCHA:
    require 'digest/md5'

    module Captcha
      def get_text secret, random, alphabet='abcdefghijklmnopqrstuvwxyz', character_count = 6
        # An MD5 digest has only 16 bytes, so at most 16 characters can be derived.
        if character_count < 1 || character_count > 16
          raise "Character count of #{character_count} is outside the range of 1-16"
        end
        input = "#{secret}#{random}"
        # Non-default settings are appended to the string before hashing.
        if alphabet != 'abcdefghijklmnopqrstuvwxyz' || character_count != 6
          input << ":#{alphabet}:#{character_count}"
        end
        # Two hex digits per byte: keep the first character_count bytes.
        bytes = Digest::MD5.hexdigest(input).slice(0..(2*character_count - 1)).scan(/../)
        text = ''
        bytes.each do |byte|
          # Map each byte to a character via its remainder modulo the alphabet length.
          text << alphabet[byte.hex % alphabet.size].chr
        end
        text
      end
    end

    $ irb
    irb(main):001:0> require 'captcha'
    => true
    irb(main):002:0> include Captcha
    => Object
    irb(main):003:0> get_text 'secret', 'my_random_text'
    => "hnrppb"
If we wanted to include numerical digits and require additional characters, the library enables this as well:
    $ irb
    irb(main):001:0> require 'captcha'
    => true
    irb(main):002:0> include Captcha
    => Object
    irb(main):003:0> get_text 'secret', 'my_random_text', 'abcdefghijklmnopqrstuvwxyz0123456789', 7
    => "62m3acs"
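Checking a visitor's answer then comes down to recomputing the text and comparing. The following is only a sketch: the names expected_text and captcha_passed? are hypothetical, the decoding steps are restated compactly so the block runs on its own (without the range check on the character count), and trimming whitespace and ignoring case are assumptions about how lenient a form should be.

```ruby
require 'digest/md5'

# Compact restatement of the decoding steps described in the text.
def expected_text(secret, random, alphabet = 'abcdefghijklmnopqrstuvwxyz', count = 6)
  input = "#{secret}#{random}"
  # Non-default settings are appended before hashing.
  input += ":#{alphabet}:#{count}" unless alphabet == 'abcdefghijklmnopqrstuvwxyz' && count == 6
  # Take the first `count` raw digest bytes and map each into the alphabet.
  Digest::MD5.digest(input).bytes.first(count).map { |b| alphabet[b % alphabet.size] }.join
end

# Hypothetical helper: true when the submitted answer matches the CAPTCHA text.
# Assumption: surrounding whitespace and letter case are ignored.
def captcha_passed?(secret, random, answer)
  answer.to_s.strip.downcase == expected_text(secret, random)
end

correct = expected_text('secret', 'my_random_text')
puts captcha_passed?('secret', 'my_random_text', correct)  # prints "true"
puts captcha_passed?('secret', 'my_random_text', '')       # prints "false"
```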
Conclusions
That's really all there is to the Ruby library. Once we can create a CAPTCHA image and decode its contents, we can begin to think about building an integrated Rails solution. But that's a story for another time.
image credit: Andrew Huff
Fighting Comment Spam on the Cheap with CAPTCHA
If you run a blog or website that allows public input, you've almost certainly been subjected to a spam attack. This is a problem because even one successful attack can eat up hours of time. After a recent spam attack on this blog, comments were disabled altogether. They've now been restored with the help of a more robust kind of protection, which is the subject of this article.
One of the best forms of spam protection is the Completely Automated Public Turing Test to Tell Computers and Humans Apart (CAPTCHA). CAPTCHA comes in many guises but usually consists of a noisy image of some text that a user must enter, like this one from Digg:

There are many CAPTCHA systems. The disadvantage most of them share is that they must be deployed on a server. Depending on your hosting situation and your platform, this may or may not be feasible. D-F is run by the Ruby on Rails blogging software Typo. Most CAPTCHA systems for Ruby require the installation of the C extension RMagick and its dependencies, which is either difficult or impossible on many hosts.
I recently found two solutions to this problem, and have implemented one of them:
- captchas.net This free service generates CAPTCHAs on a remote server, which your own server uses. By writing a small Ruby library and some glue code, I was able to integrate this solution, which is currently running on D-F. Here's an example in action:

- reCAPTCHA Not only does this service generate CAPTCHAs for you, but your users actually help solve OCR problems in the process. Talk about a win-win situation. If this sounds impossible, check out the description here. As an added bonus, reCAPTCHA APIs are available in a number of languages, including Ruby. reCAPTCHA is currently used on popular sites such as Twitter and looks like this:

The struggle against spam is an arms race. Currently, the best weapon for legitimate content producers is CAPTCHA, but even it can be foiled by a determined spammer. If history is any guide, even more sophisticated forms of spam attacks and countermeasures are just around the corner.
image credit: freezelight

