Monday, June 23, 2008

Desktop OCR Performance and Speed Testing

So which desktop Optical Character Recognition (OCR) software is the fastest? Which has the best overall performance when converting images to Word, and which when converting images to PDF?

I tested four desktop OCR applications to see which has the fastest conversion times. The applications are:

-eCopy Desktop (Uses the ReadIRIS OCR Engine)
-Adobe 8
-PaperPort 11
-OmniPage 15

I ran all the tests on a nine-month-old laptop with a dual-core 2 GHz processor and 2 GB of memory. I used the "out of the box" settings for every application, with no performance tuning, and timed how long each one took to convert a 100-page TIFF image to Word and to Adobe Image and Text PDF.
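For anyone who would rather script the timing than use a stopwatch, here is a minimal Python sketch. It assumes the conversion can be started from a script through some callable, here the hypothetical `convert`. None of the tested applications is assumed to expose such an interface, and this is not how the numbers in this post were taken.

```python
import time


def format_mmss(seconds):
    """Format an elapsed time in seconds as M:SS, rounded to the nearest second."""
    total = int(round(seconds))
    return f"{total // 60}:{total % 60:02d}"


def time_conversion(convert, *args):
    """Run convert(*args) once and return (result, elapsed_seconds).

    `convert` is a hypothetical stand-in for whatever starts the OCR job.
    """
    start = time.perf_counter()
    result = convert(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

`time.perf_counter()` is used because it is a monotonic, high-resolution clock, so the measurement is not thrown off by system clock adjustments during a multi-minute run.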

Results of the OCR speed test, in minutes and seconds (Word/PDF):

eCopy Desktop 4:25/2:58
Adobe 8 3:54/3:22
OmniPage 15 2:16/2:16*
PaperPort 11 2:35**

*With OmniPage you run the conversion process once and then save to your preferred format, so one time covers both outputs.
**PaperPort only offered text conversion.
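To make the times easier to compare, they can be converted to pages per minute. The short Python sketch below does that arithmetic using the timings listed above and the 100-page test file. The dictionary layout and function names are my own, chosen for illustration.

```python
PAGES = 100  # size of the test TIFF

# Timings from the results above, as M:SS strings.
results = {
    "eCopy Desktop": {"Word": "4:25", "PDF": "2:58"},
    "Adobe 8": {"Word": "3:54", "PDF": "3:22"},
    "OmniPage 15": {"Word": "2:16", "PDF": "2:16"},
    "PaperPort 11": {"Text": "2:35"},
}


def to_seconds(mmss):
    """Convert an 'M:SS' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def pages_per_minute(mmss, pages=PAGES):
    """Throughput for a job of `pages` pages that took `mmss`."""
    return pages * 60 / to_seconds(mmss)


for app, timings in results.items():
    for fmt, elapsed in timings.items():
        rate = pages_per_minute(elapsed)
        print(f"{app:14} {fmt:5} {elapsed}  {rate:5.1f} pages/min")
```

On these numbers OmniPage 15 works out to roughly 44 pages per minute, and eCopy Desktop to Word to roughly 23.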

Note that the eCopy Desktop result can be misleading, because it performs auto-orientation on every page before performing OCR. Also note that speed is not the only factor when evaluating an OCR application. You need to decide up front whether you want speed, accuracy, both, or well-preserved formatting. I will write another article in the near future on formatting and which application handles it best. Below is a link to the output files if you would like to see the formatting:

OCR Testing Results

Interested in reading more about the tested applications? Here are some links:

OCR Software Information

Buy OCR Software

Document Management and Scanning Information

Thursday, June 19, 2008

OCR and the Scanning Process

Ah, OCR, also known as Optical Character Recognition. Is it really necessary to use OCR software after scanning files to TIFF or PDF? What are the key benefits of OCR? How can I use OCR to create searchable or editable documents?

OCR technology has come a long way in the past few years, and the OCR engines on the market today quickly and accurately convert scanned paper documents from plain old images into searchable or editable documents. For a quick overview of OCR, ICR and OMR, click here.

When looking at OCR technologies, you need to determine your end goal: is it searchability, or a cleanly formatted, editable document? Is your goal speed, or accuracy?

There are a number of desktop applications (eCopy Desktop, Adobe, OmniPage, ReadIRIS) that can create searchable files as well as word processing files or even spreadsheets. These are perfect for low-volume, daily conversions.

If you are scanning a large volume of paper and need rapid, accurate conversion, most of the Advanced Capture applications on the market can accomplish the task. This capture software uses either the Expervision or OmniPage production OCR engines, and can convert 1,000 pages to searchable PDF in 10 minutes.
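As a rough planning aid, that figure works out to 100 pages per minute, and you can use it to estimate how long a backlog would take. The Python sketch below is just the arithmetic. The 5,000-page batch is a made-up example, and real throughput will depend on hardware, image quality, and settings.

```python
def minutes_for_batch(pages, pages_per_minute):
    """Estimated minutes to convert `pages` pages at a steady rate."""
    return pages / pages_per_minute


# Rate implied by the figure quoted above: 1,000 pages in 10 minutes.
production_rate = 1000 / 10  # pages per minute

# Hypothetical backlog, for illustration only.
backlog_pages = 5000

estimate = minutes_for_batch(backlog_pages, production_rate)
print(f"{backlog_pages} pages at {production_rate:.0f} pages/min: about {estimate:.0f} minutes")
```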

For more info on OCR and how it can work for you, see the links below:

OCR Software Links

Scanning and Document Management Articles and Research