Anti-Spamming Technology reCAPTCHA Used to Aid Archiving
August 19, 2008
Technology used to prevent spamming bots from accessing websites, whilst still allowing access by human users, is now being used to help archiving efforts. reCAPTCHA is now distributing words that cannot be read by optical character recognition software to the 40,000 sites that use its software.
The text entered by users is then used by the archivers to replace the unrecognized words.
reCAPTCHA technology, which was first developed by Luis von Ahn at Carnegie Mellon University, Pittsburgh, has previously helped with the digital archiving of records such as the New York Times archives. By distributing each unknown word alongside a control word, the system can both admit human users and block bots whilst still gaining feedback on the unknown word; the collected responses are then compared to ensure accuracy. The process is said to have reached an accuracy rate of 99.1%, which is higher than that required by archivists. With CAPTCHA technology being used over 100 million times each day, the process will go a long way towards helping to digitise many more archives, such as those at the Internet Archive, which contain yellowed papers that cannot be recognised automatically by computers.
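The control-word idea described above can be illustrated with a rough sketch in Python. It is not reCAPTCHA's actual implementation: the function names, the minimum vote count and the agreement threshold are all assumptions made for illustration. A guess for the unknown word is only counted when the user also typed the known control word correctly, and a transcription is accepted only once enough users agree on it.

```python
from collections import Counter


def control_matches(control_answer, control_truth):
    """Return True when the typed control word matches the known answer."""
    return control_answer.strip().lower() == control_truth.strip().lower()


def record_response(votes, unknown_id, control_answer, control_truth, unknown_answer):
    """Count a guess for the unknown word only if the control word was right.

    `votes` maps a hypothetical word id to a Counter of guesses.
    Returns True if the user passed the check, False otherwise.
    """
    if not control_matches(control_answer, control_truth):
        return False  # likely a bot or a careless answer: ignore the guess
    guess = unknown_answer.strip().lower()
    votes.setdefault(unknown_id, Counter())[guess] += 1
    return True


def resolve_word(votes, unknown_id, min_votes=3, min_share=0.6):
    """Return the agreed transcription, or None if agreement is too weak.

    min_votes and min_share are illustrative thresholds, not published values.
    """
    counter = votes.get(unknown_id)
    if not counter:
        return None
    total = sum(counter.values())
    word, count = counter.most_common(1)[0]
    if total >= min_votes and count / total >= min_share:
        return word
    return None
```

In this sketch, a failed control word both denies access and discards the accompanying guess, which is how a single challenge can serve the anti-spam and the archiving purpose at once.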
By Wikinews
Used Under a Creative Commons License








