lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dawn Zoƫ Raison <d...@digitorial.co.uk>
Subject Re: PDF Highlighting using PDF Highlight File
Date Thu, 12 May 2011 15:04:17 GMT

On 12/05/2011 15:47, Wulf Berschin wrote:
> I think support for highlighting documents would be a very welcome 
> feature. Highlighting HTML documents is already possible with the 
> org.apache.solr.analysis.HTMLStripCharFilter and a NullFragmenter, but 
> ther seems to be nothing for highlighting PDF files...

It would be very useful. That said being able to highlight words in any 
pictorial representation of a document would be a huge bonus.

> As starting point I quarried out from 
> org.apache.lucene.search.highlight.Highlighter the class below which 
> just returns the Tokens contributing to the hit.
>
I use a similar (home brew) solution to extract the hit terms, and then 
pass them to the Adobe PDF viewer plugin as a search term via the PDF URL.


-- 

Rgds.
*Dawn Raison*
Technical Director, Digitorial Ltd.

E:dawn@digitorial.co.uk	W:http://www.digitorial.co.uk
M: 07956 609 618	        T: 01428 729 431
Reg: 04644583, England&  Wales
Church Villas Ecchinswell, Newbury, RG20  4TT

This email and any attached files are for the exclusive use of the 
addressee and may contain privileged and/or confidential information. If 
you receive this email in error you should not disclose the contents to 
any other person nor take copies but should delete it immediately. 
Digitorial Ltd makes no warranty as to the accuracy or completeness of 
this email and accepts no liability for its contents or use. Any 
opinions expressed in this email are those of the author and do not 
necessarily reflect the opinions of Digitorial Ltd.


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message