lucene-java-user mailing list archives

From mark harwood <markharw...@yahoo.co.uk>
Subject RE: pdf and highlighting
Date Thu, 08 Dec 2005 12:00:23 GMT
> if it comes from PdfBox, the wrong text is
> highlighted.

Wrong in what sense?

A couple of things to consider from looking at your
code.
* It is preferable to pass a rewritten query to the
highlighter (pass the same rewritten query to the
searcher too if you want to avoid paying the query
rewriting cost twice).

* If you want to force the highlighter to strictly
match query terms against the document field you are
marking up, pass the relevant field name to the
QueryScorer constructor (latest version of the
highlighter from SVN required). It will then only
consider matches for query terms related to that
field. If you don't do this, you could highlight "foo"
in a body field when the query was actually
"title:foo body:bar".
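
To make the second point concrete, here is a rough toy sketch in plain Java. It is not the Lucene highlighter API: FieldAwareHighlightSketch, FieldTerm, parse and highlight are all hypothetical names made up for illustration, and the "query parser" only understands whitespace-separated field:term clauses. It only shows why restricting term matching to one field changes what gets marked up.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every type and method here is hypothetical, NOT a Lucene class.
class FieldAwareHighlightSketch {

    // A query term bound to a field, e.g. title:foo.
    static final class FieldTerm {
        final String field;
        final String text;

        FieldTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    // Parses a whitespace-separated "field:term" query such as "title:foo body:bar".
    // Clauses without a colon are skipped (assumption: no default field).
    static List<FieldTerm> parse(String query) {
        List<FieldTerm> terms = new ArrayList<>();
        for (String clause : query.trim().split("\\s+")) {
            int colon = clause.indexOf(':');
            if (colon < 0) {
                continue;
            }
            terms.add(new FieldTerm(clause.substring(0, colon), clause.substring(colon + 1)));
        }
        return terms;
    }

    // Wraps matching words in <B>..</B>.
    // fieldName == null: every query term counts, whatever field it was written for.
    // fieldName != null: only terms written for that field count.
    static String highlight(List<FieldTerm> terms, String fieldName, String text) {
        StringBuilder out = new StringBuilder();
        for (String word : text.split(" ")) {
            boolean hit = false;
            for (FieldTerm t : terms) {
                boolean fieldOk = fieldName == null || fieldName.equals(t.field);
                if (fieldOk && t.text.equalsIgnoreCase(word)) {
                    hit = true;
                    break;
                }
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(hit ? "<B>" + word + "</B>" : word);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<FieldTerm> terms = parse("title:foo body:bar");
        String body = "foo meets bar";
        System.out.println(highlight(terms, null, body));   // field-agnostic matching
        System.out.println(highlight(terms, "body", body)); // restricted to the body field
    }
}
```

With no field name, "foo" is marked up in the body text even though the query only asked for it in the title; with the field name supplied, only "bar" is.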


Cheers
Mark


		