lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Pine <>
Subject HitCollector and Sort Objects
Date Thu, 29 Jun 2006 23:53:51 GMT

I've looked at the documentation for:

and it struck me that there are no search methods with
these signatures:

void search(Query query, Filter filter, HitCollector
results, Sort sort)

void search(Query query, HitCollector results, Sort

I have one type of search where I pass in a Query and
a Sort (built with a SortField and Decompresses) and
deal with the Hits object, and another which takes a
Query and a HitCollector, which I then run my own
sorting on the results.

I'd like to combine both into one call so that I can
get the results sorted and have the collected data
available. Is that crazy talk? Does the way document
ids stream through the HitCollector prevent sorting
them first?

Would it be best for me to convert my Decompresses
logic into a Comparator to pass into Arrays.sort() or
something like that and just forget about using the
built in Lucene functionality?

I currently use the HitCollector to count occurrences
of words in the result set, basically trying compute
the TermFrequencies in the result of a search vs.
across the entire index. I've tried using bitsets but
because I can't constrain the number of unique words
people use (currently in the millions in our data),
and the number of document (also currently in the
millions), keeping a cache of bitsets is not feasible
on our hardware (I can save about 10,000 but the cache
hit ratio is very low). If someone thinks my use of
HitCollector is flawed, I'm certainly open to
suggestions. Thanx.


Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message