lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ben Litchfield <...@csh.rit.edu>
Subject RE: Highlighting PDF file after the search
Date Mon, 27 Sep 2004 18:37:28 GMT

With some work this is possible with PDFBox.  PDFBox extracts text with
positioning and sizing.  When the text was found you could add to the page
content stream the drawing of a highlighted box.

PDFBox has an open RFE for this functionality, please monitor it for
progress.

http://sourceforge.net/tracker/index.php?func=detail&aid=1035635&group_id=78314&atid=552835

Ben

On Mon, 27 Sep 2004 Balasubramanian.Vijay@epamail.epa.gov wrote:

> Bruce,
> You are right, i tried this morning and when i try to stream the
> higlighter output as pdf, acrobat was not able to read or open it!!
> Which project do you recommend that would do pdf highlighting?
>
> Thanks,
> Vijay Balasubramanian
> DPRA Inc.,
>
>
>
>
>                       Bruce Ritchie
>                       <bruce@jivesoftwa        To:       Lucene Users List <lucene-user@jakarta.apache.org>
>                       re.com>                  cc:
>                                                Subject:  RE: Highlighting PDF file after
the search
>                       09/20/2004 05:35
>                       PM
>                       Please respond to
>                       Lucene Users List
>
>
>
>
>
>
> > From: Balasubramanian.Vijay@epamail.epa.gov
>
> > I can successfully index and search the PDF documents,
> > however i am not able to highlight the searched text in my
> > original PDF file (ie: like dtSearch highlights on original file)
> >
> > I took a look at the highlighter in sandbox, compiled it and
> > have it ready.  I am wondering if this highlighter is for
> > highlighting indexed documents or can it be used for PDF
> > Files as is !  Please enlighten !
>
> The highlighter code in sandbox can facilitate highlighting of text
> *extracted* from the PDF, however it does nothing for you to highlight
> search terms *inside* of the PDF. For that you will need some sort of
> tool
> that can modify the PDF on the fly as the user views it. I know of no
> quick
> and dirty tool that allows you to do this, though there is quite a few
> projects and products which allow you to manipulate PDF files which
> likely
> can be used to obtain the behavior you are looking for (with some effort
> on
> your part).
>
>
> Regards,
>
> Bruce Ritchie
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message