lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Renaud Waldura" <renaud.wald...@library.ucsf.edu>
Subject RE: More Precise Highlighting
Date Fri, 02 Mar 2007 19:01:44 GMT
Hello Mark:

I apologize for not responding earlier, more urgent stuff took over. I
really appreciate your help with this. Since my first message was somewhat
terse, let me explain again.

My index includes both "regular" fields (stored, indexed, tokenized) and
"search-only" fields used for searching only (unstored, indexed, tokenized).
In particular, I have one such field called "metadata" that's the
accumulation of all metadata about the document (title, authors, date,
etc.). Using this field, I can easily query "all metadata" without searching
the document text (that's a separate field). It has no meaning other than
making this kind of query possible.

Say I query "+metadata:apple +banana". I would like to have the term "apple"
highlighted in all metadata fields; if Fiona Apple authored a document, her
lastname should be highlighted in the "author" field, but not in the
document text.

>From what I've experienced, when I pass NULL as fieldname to the
QueryTermExtractor, "apple" gets highlighted everywhere, both in any
metadata AND the document text. When I pass "author" as fieldname, it
doesn't highlight that field at all -- because the query was actually for
"metadata" not "author". It's not correct in either case, and I'm not sure
what to do. 

Thanks again for your help,

--Renaud


 

-----Original Message-----
From: markharw00d [mailto:markharw00d@yahoo.co.uk] 
Sent: Tuesday, February 13, 2007 3:24 PM
To: java-user@lucene.apache.org
Subject: Re: More Precise Highlighting

Not sure I fully understand the problem. The query is effectively
"allContent:someTitleText" and you want to highlight the string
"someTitleText" in the title field?
If you pass null as a fieldname to the QueryTermExtractor it will use all
term values, regardless of field, as string to highlight.
If you pass a fieldname it will only select highlight term values for that
field.
If you want, you can use QueryTermExtractor to extract just the "allContent"
field values and pass a TokenStream for the "title" field to the highlighter
and it would highlight the appropriate values in the title.

Do any of these options work?



Renaud Waldura wrote:
> The old highlighter code used to highlight found terms in any field 
> (too broad). The new highlighter lets one specify a field when 
> highlighting, but it highlights that field only (too narrow).
>  
> In my case we have an "all" field that is the concatenation of all 
> data about the document. When I highlight e.g. the "title" field, 
> nothing happens, because the Highlighter doesn't know the title is 
> included in this "all" field.
>  
> How can I tell the highlighter that my query fields can map to some 
> document fields? It looks like I'd change the QueryTermExtractor, but 
> it's conveniently all-static and final.
>  
> --Renaud
>  
>
>   



	
	
		
___________________________________________________________
All new Yahoo! Mail "The new Interface is stunning in its simplicity and
ease of use." - PC Magazine http://uk.docs.yahoo.com/nowyoucan.html

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message