Dear website owner, stop treating me like a spambot, it’s annoying

When I visit your site I really don’t want to spend my time deciphering mangled text, helping you solve OCR problems, unswirling swirled photos, playing infantile games, performing math, matching text to images or answering bizarre multiple choice questions.  Instead what I want to do is just register and then get on with the business of whatever it is you are providing.  In short, if your website is built for humans then I feel it’s only polite for you to treat me like one unless I display signs to the contrary.

Used as the first line of defense, visible CAPTCHA systems are a lazy attempt to prevent spambots from acquiring accounts on your website. Worse still, if implemented incorrectly, these systems may do more harm than good. In this post I’ll try to convince you that there are more effective ways to design your registration process, ways that treat users like people while simultaneously keeping the spambot hordes at bay.

Too many websites slap a CAPTCHA/reCAPTCHA image on every web form and call it a day. If your site falls into this category, let me ask: what are you protecting your website from? Certainly not from a targeted attack and most likely not from a generic attack either. In fact, I’ll argue that all you’re doing is putting an irritating obstacle in front of legitimate users. To highlight the vulnerability of image CAPTCHA systems, let’s look at them from the spambot creator’s perspective. For just $6.95, anyone can use a service like Death By Captcha to solve up to 5000 CAPTCHAs. Claiming an accuracy rate of roughly 90%, the Death By Captcha service returns the solution in about 11 seconds. Not bad for $0.00139 per image. Other services like Shanibpo are cheaper still, even giving away the first 100 CAPTCHA solutions for free.

“Ah”, you say, “but what if we give users an even more complicated CAPTCHA to solve?” [Enter swirly images, math problems, games, etc.] The problems with this strategy are twofold.

The first is that almost as soon as a new CAPTCHA method is announced, it is defeated. Jack Andrews illustrates this by showing how to unswirl images in 23 lines of Python, SecuriTeam writes about Jochem van der Vorm’s work on using speech-to-text techniques to circumvent voice CAPTCHAs, and David J. Hill discusses how AI may be used to defeat games.

The other problem is that as CAPTCHAs become less trivial, real users get caught out, finding them too difficult or frustrating to solve.  Hill notes this in his article “Artificial Intelligence Will Defeat CAPTCHA” when he writes:

“A large-scale Stanford study a few years ago concluded that “CAPTCHAs are often difficult for humans.” It has also been reported that around 1 in 5 visitors will leave a website rather than complete a CAPTCHA.”

By forcing users to solve more complex CAPTCHAs you may actually have made the original problem worse: not only are these systems ineffective, they may even cost you 20% of your potential user base.

Designers and developers need to accept a simple truth:

If someone specifically targets your website, they WILL succeed in writing a script that can bypass any of your anti-spambot tests.

So what are the alternatives? In the next section I’ll introduce five solutions (by no means a definitive list) that I believe to be a superior first line of defense for your website. These alternatives are no more effective at keeping bots at bay than any other method I’ve already mentioned, but they can be distinguished by one very important characteristic: they’re invisible and, as such, most real users won’t even know they’ve performed and passed a test.

The Honeypot

When a user lands on your registration page, their browser downloads the HTML, applies the CSS stylesheet and executes any JavaScript code. Many registration robots are designed to download only the HTML; they then extract the input fields from the form, fill in those fields with fake information and submit the form data back to your site. Using the honeypot technique, we can take advantage of this behavior by hiding certain fields using CSS or JavaScript and, when the form is submitted, checking whether these hidden fields have been filled in. Michael Clarke details this method in his article “Honeypot CAPTCHAs vs. Spambots”.
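
The server-side half of the honeypot check can be very small. The following is a minimal sketch, not Michael Clarke's code: the field name `website_url` and the function name are hypothetical, and it assumes the form data arrives as a simple mapping of field names to strings.

```python
from typing import Mapping

# Hypothetical name of the input that is hidden from humans with CSS
# (for example, display: none). A real visitor never sees or fills it.
HONEYPOT_FIELD = "website_url"


def honeypot_triggered(form: Mapping[str, str]) -> bool:
    """Return True when the hidden honeypot field came back non-empty."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())
```

A submission with `website_url` filled in is most likely from a script that populated every input it found in the raw HTML.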

Dynamic Tokens

Again we take advantage of spambots that just download the HTML and don’t bother to execute the JavaScript. When a typical user submits their registration form, an AJAX call could dynamically add a token to the form. Once the form has been submitted, you simply ensure that the token received matches the token stored on your server. A spambot that doesn’t download and execute this special JavaScript will be unable to submit a valid form to the site.
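
One way the server side of this could look is sketched below. It assumes the AJAX call hits an endpoint that calls `issue_token`, and that the form handler calls `verify_token`. Both function names and the in-memory dictionary are illustrative stand-ins; a real site would keep the token in its session store.

```python
import hmac
import secrets
from typing import Dict

# In-memory stand-in for a real session store (an assumption for this sketch).
_issued_tokens: Dict[str, str] = {}


def issue_token(session_id: str) -> str:
    """Create a random token for this session; the page's script fetches it."""
    token = secrets.token_urlsafe(32)
    _issued_tokens[session_id] = token
    return token


def verify_token(session_id: str, submitted: str) -> bool:
    """Compare the submitted token with the stored one; each token works once."""
    expected = _issued_tokens.pop(session_id, None)
    if expected is None or not submitted:
        return False
    # Constant-time comparison on bytes avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

Removing the token on first use means a bot cannot replay a token it captured from one real browser session.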

Form completion timing

When legitimate users are presented with a new registration page, the general behavior follows a familiar pattern:

  1. They read the form
  2. Understand what information is required to continue
  3. Begin typing in the required information
  4. Submit the form.

What users don’t tend to do is fill in the form in a tenth of a second. By recording the length of time a user takes to complete registration and comparing this duration against the average (plus or minus some error margin), you can filter out a lot of the common spambots. And even if spambots do overcome this method, you will have significantly slowed the rate at which they are able to add new accounts.
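
A timing check of this kind might be sketched as follows. The bounds are placeholder assumptions rather than measured values; in practice you would derive them from the average completion time on your own form. The sketch also assumes the render time is recorded on the server (for example in the session) rather than trusted from a hidden field.

```python
# Assumed bounds, in seconds. Replace them with values derived from your
# own measurements of how long real users take.
MIN_SECONDS = 3.0
MAX_SECONDS = 3600.0


def plausible_fill_time(rendered_at: float, submitted_at: float) -> bool:
    """Return True when the time spent on the form falls inside the expected range."""
    elapsed = submitted_at - rendered_at
    return MIN_SECONDS <= elapsed <= MAX_SECONDS
```

Both arguments are timestamps in seconds, such as values from `time.time()`.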

Keystrokes and mouse clicks

To complete a registration form, most users do some combination of pressing keys and moving the mouse. If the form is filled in without any key up or key down events being fired and with no mouse movements whatsoever, you most likely have a spambot on your hands.
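
On the server, this reduces to inspecting counters that a small client-side script reports along with the form. The sketch below assumes such a script exists and submits two counts. The class and field names are hypothetical, and because the counts come from the client they can be forged, so this is best treated as one signal among several.

```python
from dataclasses import dataclass


@dataclass
class InputActivity:
    """Event counts reported by a (hypothetical) client-side script."""
    key_events: int = 0
    mouse_events: int = 0


def no_human_input(activity: InputActivity) -> bool:
    """Return True when the form arrived with no key or mouse events at all."""
    return activity.key_events == 0 and activity.mouse_events == 0
```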

User Behaviour Analysis

Last but not least, if you haven’t already started using analytics to measure user behavior, now is the time. Ask yourself: what are the typical actions of a “well-behaved” user on my site? Generally, a well-behaved user might sign up to post a comment to a particular forum thread. It is rare, however, that a user will sign up and then, in very quick succession, post to multiple threads. When it comes to contributing content, most users do so in a constructive way. Their comments are sometimes voted down, sometimes voted up and often just left alone. On the other hand, spambot comments are almost always voted down. Watch for these patterns and treat spambots as you would any other badly behaved user: ban them!
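
The "signs up, then posts to many threads in quick succession" pattern can be checked with a few lines. This is only a sketch under assumed thresholds: the five-minute window and the limit of three threads are illustrative, and the `(timestamp, thread_id)` input shape is hypothetical.

```python
from typing import Iterable, Tuple


def posts_too_fast(
    signup_time: float,
    posts: Iterable[Tuple[float, str]],
    window_seconds: float = 300.0,  # assumed window after sign-up
    max_threads: int = 3,           # assumed limit on distinct threads
) -> bool:
    """Return True when a new account posts to too many threads right after sign-up.

    Each post is a (timestamp, thread_id) pair, with timestamps in seconds.
    """
    threads = {
        thread_id
        for timestamp, thread_id in posts
        if 0 <= timestamp - signup_time <= window_seconds
    }
    return len(threads) > max_threads
```

Counting distinct threads rather than raw posts keeps a chatty user in a single conversation from being flagged.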

The beauty of these “aren’t-you-human?” tests is that because they’re invisible to users, you have the option of layering and combining them, which increases the spambot creators’ work factor by making it harder and more expensive to write generic spambot code. Inevitably, rare cases will occur where some users slip through the cracks, but when this happens there is nothing to stop you from reverting to traditional methods such as reCAPTCHA, swirly images, games or dogs.
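
Layering can be as simple as counting how many invisible tests a submission failed and only falling back to a visible challenge in the uncertain middle ground. The sketch below assumes each test has already produced a boolean "suspicious" flag. The check names, the three outcomes and the `reject_at` threshold are illustrative choices, not a prescribed policy.

```python
from typing import Mapping


def decide(flags: Mapping[str, bool], reject_at: int = 2) -> str:
    """Combine per-test results into "accept", "challenge" or "reject".

    flags maps a test name to True when that test found the submission suspicious.
    """
    failed = sum(1 for suspicious in flags.values() if suspicious)
    if failed == 0:
        return "accept"      # passed every invisible test
    if failed < reject_at:
        return "challenge"   # uncertain: fall back to a visible CAPTCHA
    return "reject"          # failed several independent tests
```

With the default threshold, a visitor who trips a single test still gets a chance to prove themselves, while a submission that trips two or more is refused outright.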

Lately it seems that far too much consideration has been given to the spambots. Instead of being lazy, it’s time to innovate and refocus on your users, who will undoubtedly appreciate the effort, either consciously or subconsciously. To get off to a good start, I urge you to take a look at Rob Tuley’s excellent post, where he provides sample code for many of the invisible CAPTCHA methods discussed here.

One final thought: if you force robots to be more human, perhaps someone someday will create a well-behaved bot that provides constructive value to your community. At that point, why wouldn’t you want it to sign up?

3 thoughts on “Dear website owner, stop treating me like a spambot, it’s annoying”

  1. Perhaps the best way to prevent spambots is to keep an active log of the last n IP addresses to participate in a form submission. If an IP address starts to get hyperactive, you can block it. Even the best distributed attack is going to have some repeat IP addresses.

  2. All your “solutions” require JavaScript, which still gets disabled by many people for security reasons. Also, screenreaders and text browsers, as well as many people with mobile browsers, simply cannot use the stuff and so can’t use the website!

    • Hi Marco, thanks for your comment. I agree with you that screenreaders can get caught out by the JavaScript-based methods, but “Form Completion Timing” and “User Behaviour Analysis” do not use JavaScript and could be used instead. You could also flip “The Honeypot” method to have a visible checkbox which reads “Check here if you are a spambot” (although I don’t like this because it’s visible to the user).

      Finally, the point I was trying to make is that we shouldn’t treat ALL users like spambots. So if you’re catering specifically to screenreaders then have a “screenreader mode” and when enabled the site uses standard audio captcha instead while normal users get the invisible captcha experience.
