lucene-java-user mailing list archives

From "Jan Tosovsky" <j.tosov...@email.cz>
Subject RE: Link Lucene index with Adobe reader
Date Tue, 06 Feb 2018 19:55:52 GMT
On 2018-02-06 Anuradha Rajaram (RBEI/ETB14) wrote:

> We are using Lucene for indexing the PDF. We need to link generated lucene
> index with Adobe reader.

Adobe Acrobat has a dedicated feature for this task: Embed Index. It builds
a search index and stores it inside the PDF file, which can speed up
searching significantly. In a 2,000-page manual you get results instantly
instead of waiting several seconds without the embedded index. But:
(1) It is supported in Acrobat Reader only.
(2) The mechanism is undocumented, and no specification of the format is
available. That means you are locked in to Adobe software. Some other
companies offer this feature as well, most likely based on reverse
engineering or leaked Acrobat source code, so compatibility with the
original is questionable.
(3) The index cannot be embedded in bulk with standard Adobe tools.
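
To illustrate why any prebuilt index makes such a difference, here is a
minimal sketch in plain Java. It is neither Adobe's embedded index format
(which is undocumented) nor Lucene's. The class name PageIndexSketch, its
methods, and the sample page texts are all made up for illustration. It only
contrasts a one-time term-to-page map with re-scanning every page per query.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only: not Adobe's format, not Lucene's API.
public class PageIndexSketch {

    // Split page text into lower-cased word tokens (deliberately naive).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Build a term -> 1-based page numbers map once, ahead of any search.
    static Map<String, SortedSet<Integer>> buildIndex(List<String> pages) {
        Map<String, SortedSet<Integer>> index = new HashMap<>();
        for (int i = 0; i < pages.size(); i++) {
            for (String term : tokenize(pages.get(i))) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(i + 1);
            }
        }
        return index;
    }

    // Indexed search: a single map lookup, independent of document length.
    static SortedSet<Integer> lookup(Map<String, SortedSet<Integer>> index, String word) {
        return index.getOrDefault(word.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    // Unindexed search: every page is read again for every query.
    static SortedSet<Integer> scan(List<String> pages, String word) {
        String needle = word.toLowerCase(Locale.ROOT);
        SortedSet<Integer> hits = new TreeSet<>();
        for (int i = 0; i < pages.size(); i++) {
            if (tokenize(pages.get(i)).contains(needle)) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up page texts standing in for extracted PDF pages.
        List<String> pages = List.of(
                "Install the pump",
                "Pump maintenance and safety",
                "Warranty terms");
        Map<String, SortedSet<Integer>> index = buildIndex(pages);
        System.out.println("indexed: " + lookup(index, "pump"));
        System.out.println("scanned: " + scan(pages, "pump"));
    }
}
```

Both calls return the same pages. The difference is that lookup() costs one
map access however long the manual is, while scan() grows with the page
count, which matches the "instantly" versus "several seconds" behaviour
described above.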

> Current Approach:
> Placed both the generated lucene index and PDF in the folder. Open the PDF
> and search for a word using Advance search in Adobe reader. Whole PDF is
> searched without using lucene index. Close Adobe reader.
> Open the PDF again in adobe reader and do Advance search. This time PDF is
> searched using index.

Are you sure your index is actually being used? Isn't that just some kind of
search cache? Acrobat Reader would have to understand your index format, and
I doubt that works out of the box.

Jan



