jackrabbit-users mailing list archives

From Péterfi Balázs <b.pete...@i-deal.hu>
Subject Re: searching in OCRed pdf
Date Mon, 26 Jan 2009 16:47:49 GMT
I think it has already been OCRed because, as I wrote, I can search in the
PDF with Adobe Reader, and it also highlights the result. But what I see is a
scanned page, so I guess there is a text layer "behind" it. Is that possible?
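
One rough way to test that guess outside Jackrabbit is to check whether the file declares any font resources, since a pure image scan normally has none while a hidden OCR text layer needs at least one. The sketch below is only a heuristic under that assumption: the function name `has_text_layer_hint` is made up for illustration, and a font entry does not prove that the text can actually be extracted.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of a PDF object.
_STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def has_text_layer_hint(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the PDF appears to declare font resources."""
    # Page resource dictionaries are often stored uncompressed.
    if b"/Font" in pdf_bytes:
        return True
    # Newer PDFs may keep dictionaries inside Flate-compressed streams,
    # so try to inflate each stream and look again.
    for match in _STREAM_RE.finditer(pdf_bytes):
        try:
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (for example a JPEG scan), skip it
        if b"/Font" in inflated:
            return True
    return False
```

To try it, read the file in binary mode, for example `has_text_layer_hint(open("scanned.pdf", "rb").read())`, where `scanned.pdf` is a placeholder name. A `True` result only suggests that a text layer may exist; whether the text extractor can read it is a separate question.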

Paco Avila wrote:
> You can make a text extractor which performs OCR.
>
> On Mon, Jan 26, 2009 at 5:25 PM, Péterfi Balázs <b.peterfi@i-deal.hu> wrote:
>   
>> Hello,
>>
>> I'm developing an application that uses Jackrabbit and have a problem
>> searching in PDF files. When I search in a PDF that was generated from
>> a Word document, it works. When I search in a PDF that contains a scanned
>> document, I can search through its contents from within Adobe
>> Reader (some sort of Optical Character Recognition), but my application does
>> not return any results. I don't know how this kind of PDF works, but I need
>> to search in it. Does Jackrabbit support it?
>>
>> Thank you!
>> Balazs
