Date: Mon, 27 Sep 2004 14:37:28 -0400 (EDT)
From: Ben Litchfield
To: Lucene Users List
Subject: RE: Highlighting PDF file after the search

With some work this is possible with PDFBox.
PDFBox extracts text with positioning and sizing. Once the text is found,
you could add the drawing of a highlighted box to the page content stream.
PDFBox has an open RFE for this functionality; please monitor it for progress.

http://sourceforge.net/tracker/index.php?func=detail&aid=1035635&group_id=78314&atid=552835

Ben

On Mon, 27 Sep 2004 Balasubramanian.Vijay@epamail.epa.gov wrote:

> Bruce,
> You are right. I tried this morning, and when I tried to stream the
> highlighter output as PDF, Acrobat was not able to read or open it!
> Which project do you recommend that would do PDF highlighting?
>
> Thanks,
> Vijay Balasubramanian
> DPRA Inc.,
>
> On 09/20/2004 05:35 PM, Bruce Ritchie wrote:
> Subject: RE: Highlighting PDF file after the search
>
> > From: Balasubramanian.Vijay@epamail.epa.gov
> >
> > I can successfully index and search the PDF documents;
> > however, I am not able to highlight the searched text in my
> > original PDF file (i.e., the way dtSearch highlights on the original file).
> >
> > I took a look at the highlighter in sandbox, compiled it and
> > have it ready. I am wondering if this highlighter is for
> > highlighting indexed documents or whether it can be used for PDF
> > files as is! Please enlighten!
>
> The highlighter code in sandbox can facilitate highlighting of text
> *extracted* from the PDF; however, it does nothing to highlight
> search terms *inside* the PDF. For that you will need some sort of tool
> that can modify the PDF on the fly as the user views it. I know of no quick
> and dirty tool that allows you to do this, though there are quite a few
> projects and products that allow you to manipulate PDF files and likely
> can be used to obtain the behavior you are looking for (with some effort on
> your part).
>
> Regards,
>
> Bruce Ritchie

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
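
The approach Ben describes above (take the position and size reported for the matched text, then draw a highlight box into the page content stream) might be sketched roughly as follows. This is only an illustration, not PDFBox code: the class HighlightOps and the method highlightBox are hypothetical names, and the coordinates are assumed to come from whatever text-position data the extractor reports. The sketch only builds the standard PDF drawing operators for a filled rectangle; getting them into the page's content stream is left to the PDF library.

```java
import java.util.Locale;

// Hypothetical helper, not part of PDFBox: builds the raw PDF content-stream
// operators that fill a rectangle behind a matched search term.
final class HighlightOps {

    /**
     * x, y          lower-left corner of the box, in PDF user-space units
     * width, height size of the box
     * All four values are assumed to be derived from the position and size
     * that the text extractor reports for the matched term.
     */
    static String highlightBox(double x, double y, double width, double height) {
        // q ... Q  save and restore the graphics state around the drawing
        // rg       set the non-stroking (fill) colour; 1 1 0 is RGB yellow
        // re       append a rectangle (x y width height) to the current path
        // f        fill the current path
        return String.format(Locale.ROOT,
                "q 1 1 0 rg %.2f %.2f %.2f %.2f re f Q\n",
                x, y, width, height);
    }

    public static void main(String[] args) {
        // Example values only: a 48.5 x 12 box with its corner at (72, 700).
        System.out.print(highlightBox(72, 700, 48.5, 12));
    }
}
```

Because a filled rectangle is opaque, these operators would presumably need to be placed ahead of the page's existing content so the box is painted underneath the text rather than over it.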