Monday, June 18, 2007

Don't People Like OCR?

The first flatbed scanner I ever owned came with OCR software that worked
just fine and I used it a number of times. I've bought two more scanners
since then and neither one had such software for some reason. Ironically,
the only reason I replaced that first one is that I needed to do a driver
re-install and I had lost the CD (and to add irony, I later found the CD,
after the scanner was gone).

Anyway, I've wanted to do some OCR for web pages for quite a while.
Finally there's this:

http://sourceforge.net/projects/tesseract-ocr

Interestingly, this is an old commercial engine from HP that they dropped
work on a long, long time ago. It's just recently been made open source.
It's reviewed in the June Linux Journal and appears to be just what I
need!
