lucene-java-user mailing list archives

From Chris Hostetter <>
Subject Re: grouping search results
Date Tue, 09 May 2006 19:47:34 GMT

: Damn. That's no good, then. What about doing it the opposite way:
: make a QueryFilter for each category (these could be cached between
: search sessions), and use those to filter the results from searching
: for the user's query? Would that actually be any faster than the
: original idea of constructing a boolean query for each category?

1) if your categories are defined by simple terms, there's no reason why
it would have to be a QueryFilter, you could make a simple TermFilter
class that would do the same thing

2) i don't think it would be much faster than doing 10 separate
BooleanQueries -- the advantage of the filters would be that the scores
are left alone, whereas in the BooleanQuery approach the resulting scores
will be affected by the category part of the query ... if it's a simple
mandatory term query then there's no harm, because it will affect all of
the scores in that category equally -- but if it's more complex (ie: if
your portable music category is defined by the query
"name:mp3 or name:ipod or name:cd") then the TF/IDF of each term will
affect the score of the resulting documents, and could change the order.
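
to make the filter side of that concrete, here is a minimal sketch in
plain Java (deliberately not the Lucene API) of what filtering amounts to:
a cached set of docids per category is checked against the hits from the
user's query, and the scores pass through unchanged. The class name
CategoryFilterSketch, the method applyFilter, and the docids and scores
are all made up for illustration.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

class CategoryFilterSketch {

    // Keep only the hits whose docid is set in the category's bit set.
    // Scores are copied through untouched, so the relative order inside
    // the category is whatever the user's query alone produced.
    static Map<Integer, Float> applyFilter(Map<Integer, Float> hits, BitSet categoryDocs) {
        Map<Integer, Float> kept = new LinkedHashMap<>();
        for (Map.Entry<Integer, Float> hit : hits.entrySet()) {
            if (categoryDocs.get(hit.getKey())) {
                kept.put(hit.getKey(), hit.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical category membership: docs 1 and 3.
        BitSet portableMusic = new BitSet();
        portableMusic.set(1);
        portableMusic.set(3);

        // Hypothetical hits for the user's query: docid -> score.
        Map<Integer, Float> hits = new LinkedHashMap<>();
        hits.put(1, 0.9f);
        hits.put(2, 0.8f);
        hits.put(3, 0.4f);

        System.out.println(applyFilter(hits, portableMusic));
    }
}
```

the bit set can be built once per category and reused across searches,
which is the caching the question was asking about.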

: > is with a HitCollector that maintains a Bounded PriorityQuery for each
: > category.  as it collects matches, it can look up which category they
: > are in using the FieldCache and add them to the appropriate queue.
: Did you mean PriorityQueue?

yes, sorry ... but i wasn't explicitly referring to the lucene
PriorityQueue class ... i just mean any data structure that will maintain
an ordered list of the N "biggest" items you give it.
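
as one possible shape for such a structure, here is a rough sketch built
on java.util.PriorityQueue (not the Lucene class). It keeps only the N
biggest scores it has been offered by evicting the smallest retained one
when full. The class name BoundedTopN and its methods are hypothetical,
and it tracks bare scores only; a real collector would keep the docid
alongside each score.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class BoundedTopN {
    private final int capacity;
    // Min-heap: the smallest retained score sits at the head, so it is
    // the one to evict when something bigger arrives.
    private final PriorityQueue<Float> heap = new PriorityQueue<>();

    BoundedTopN(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    // Offer a score; it is retained only if it is among the N biggest so far.
    void offer(float score) {
        if (heap.size() < capacity) {
            heap.add(score);
        } else if (score > heap.peek()) {
            heap.poll();
            heap.add(score);
        }
    }

    // Retained scores, biggest first.
    List<Float> descending() {
        List<Float> out = new ArrayList<>(heap);
        out.sort(Collections.reverseOrder());
        return out;
    }

    public static void main(String[] args) {
        BoundedTopN top = new BoundedTopN(3);
        for (float s : new float[] {0.2f, 0.9f, 0.5f, 0.7f, 0.1f}) {
            top.offer(s);
        }
        System.out.println(top.descending()); // prints [0.9, 0.7, 0.5]
    }
}
```

each offer costs at most O(log N), so the work per match stays small no
matter how many documents the query hits.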

: Can you explain what you mean by that? I'm looking at the javadocs
: for FieldCache, but there's no indication of how to obtain one.

FieldCache.DEFAULT.getStringIndex(reader, fieldName) ... or
FieldCache.DEFAULT.getInts(reader, fieldName) depending on how you store
your category info.  The resulting array can be used to look up values by
docid very quickly for comparison (or to be used as a key in a Map of
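
a small sketch of that docid lookup, in plain Java with no Lucene calls: a
plain int[] indexed by docid stands in for the array described above, and
each matching document is dropped into its category's bucket with a single
array read. The class name GroupByCategorySketch, the method group, and
the category codes are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupByCategorySketch {

    // categoryByDoc[docId] holds the category code of that document.
    // Returns category code -> docids of the matches in that category,
    // in the order the matches were supplied.
    static Map<Integer, List<Integer>> group(int[] matchingDocs, int[] categoryByDoc) {
        Map<Integer, List<Integer>> groups = new TreeMap<>();
        for (int docId : matchingDocs) {
            int category = categoryByDoc[docId]; // one array read per match
            groups.computeIfAbsent(category, k -> new ArrayList<>()).add(docId);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Hypothetical index of five documents with category codes 7 and 3.
        int[] categoryByDoc = {7, 7, 3, 3, 7};
        // Hypothetical docids matched by the user's query.
        int[] matchingDocs = {0, 2, 4};
        System.out.println(group(matchingDocs, categoryByDoc));
    }
}
```

in a collector the bucket would more likely be a bounded top-N structure
per category than an unbounded list, so memory stays proportional to the
number of categories rather than the number of matches.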

