[Axiom-developer] Nifty anti-spam idea for making use of captchas

From: C Y
Subject: [Axiom-developer] Nifty anti-spam idea for making use of captchas
Date: Thu, 24 May 2007 17:58:31 -0700 (PDT)

I thought I'd mention a really nifty idea for making captchas (images
that humans convert to text in order to gain access to a site) useful.
They're annoying and time consuming (relatively speaking), and automatic
image analysis can defeat the less well designed systems.  Well, someone
came up with the idea of showing a word that OCR systems have trouble
with alongside another (more easily understood) image, and using
the results not only for authentication but also as distributed OCR
assistance in converting texts to electronic form.  This means that
while captchas may still be annoying, they have the potential to be
useful as well :-).
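
To make the mechanism concrete, here is a rough Python sketch of how I
imagine the pairing could work.  It is only my guess at the logic, not
how the real system is implemented: the class name PairedCaptcha, the
normalize helper, and the min_votes threshold are all made up for
illustration.  The known word decides whether the user gets in, and the
answer for the unknown word is only counted as a vote when the known
word was typed correctly.

```python
from collections import Counter


def normalize(text):
    """Collapse whitespace and ignore case so trivial differences don't split votes."""
    return " ".join(text.split()).lower()


class PairedCaptcha:
    """Hypothetical pairing of one known word with one word OCR could not read."""

    def __init__(self, known_answer, min_votes=3):
        self.known_answer = known_answer  # answer we can check (assumed given)
        self.min_votes = min_votes        # assumed threshold, tune as needed
        self.votes = Counter()            # transcriptions of the unknown word

    def submit(self, known_guess, unknown_guess):
        """Authenticate on the known word; record the unknown word only on success."""
        if normalize(known_guess) != normalize(self.known_answer):
            return False
        self.votes[normalize(unknown_guess)] += 1
        return True

    def consensus(self):
        """Return the agreed transcription, or None if there is no clear majority yet."""
        if not self.votes:
            return None
        text, count = self.votes.most_common(1)[0]
        total = sum(self.votes.values())
        if count >= self.min_votes and count > total / 2:
            return text
        return None
```

A wiki would create one such object per unreadable word, call submit()
on each login or edit attempt, and keep the result of consensus() once
it stops returning None.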

Maybe we could take classic mathematical works in the public domain,
scan them, and use this technique to convert them into TeX documents
:-).  If regular OCR text is hard, a mathematical formula -> TeX
translator will be WAY beyond most of these systems. :-)  And we could
use the results to make literate documents for Axiom, since it would be
public domain content being translated.  

The University of Michigan has apparently scanned the 1910 version of
the Principia Mathematica and put it online as PDFs of the scanned pages
(idno=AAT3201.0001.001, idno=AAT3201.0002.001, idno=AAT3201.0003.001).

Based on their site it looks like we would need permission to do
anything with these images:

"These pages may be freely searched and displayed. Permission must be
received for subsequent distribution in print or electronically. Please
go to for more information."

However, if they were interested enough to give us permission to use
the images for the effort, perhaps we could turn the anti-spam measures
for the Axiom wiki into a simultaneous effort to produce a public
domain TeX version of the Principia Mathematica, making the wiki itself
serve (if indirectly) the idea of literate programming :-).  I doubt
the translations would be perfect, since TeX commands may require
context, but I wonder if it could be made useful nonetheless?

Bill, as our resident wiki guru, do you know anything about this idea of
captcha utilization?


P.S. - I didn't know there was so much cool stuff out there of this
sort - also has a version of Newton's Philosophiæ naturalis
principia mathematica

