openoffice-dev mailing list archives

From Rory O'Farrell
Subject Re: Using tesseract-ocr from within AOO ?
Date Sun, 03 Jan 2016 13:40:14 GMT
On Sun, 3 Jan 2016 14:28:22 +0100
Joost Andrae wrote:

> Hi there,
> I've just played with the open-source OCR engine
> and it seems to do its job
> very well when running OCR on scanned bitmaps.
> As it comes under the Apache License 2.0 and is available as C++ source
> code, why not integrate its functionality into AOO, or build an
> extension that either connects its API to AOO or drives it
> through its command-line arguments?
> From my perspective both projects would benefit...
> Just my 2 EUR cents....
> Kind regards, Joost

In my experience it was very accurate, perhaps close in accuracy to the best commercial
products under Windows. I was undertaking a major OCR project (ebook preparation of two
out-of-print 220-page books) and found that a combined scan-and-OCR application under Linux
(Linux-Intelligent-Ocr-Solution) made more sense for a project of this size; I later fed the
plain text files into OO Writer for detailed spellchecking and reformatting.

I doubt that full integration with OpenOffice would be a good idea; an extension might be
possible, although I doubt its general usefulness will be worth the effort of writing it.
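
For illustration only, the command-line route raised in the quoted message could look roughly
like the sketch below. It is not an AOO extension and uses no AOO API; it only shows an
external program driving the tesseract command-line tool and collecting plain text. It assumes
a tesseract binary is on the PATH and that the "eng" language data is installed. The function
names build_tesseract_command and ocr_image are hypothetical, chosen for this example.

```python
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Return the argument list for one tesseract run that prints text to stdout."""
    # Passing "stdout" as the output base asks tesseract to write the
    # recognised text to standard output instead of creating a .txt file.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_image(image_path, lang="eng", run=subprocess.run):
    """Run tesseract on one scanned image and return the recognised text.

    `run` is injectable (an assumption of this sketch) so the function can
    be exercised without tesseract installed; it defaults to subprocess.run.
    """
    result = run(
        build_tesseract_command(image_path, lang),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
        check=True,               # raise CalledProcessError on failure
    )
    return result.stdout
```

Calling ocr_image("page1.png") and saving the returned string as a .txt file gives the kind of
plain text file that can then be opened in Writer for spellchecking and reformatting.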

Rory O'Farrell