lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Carlos Pita" <carlosjosep...@gmail.com>
Subject Re: Two differently sorted result sets from the same search
Date Fri, 01 Jun 2007 21:27:35 GMT
Mh, now I found out that latests sdks provide a priority queue too, although
unbounded. But, supposing the queue has already top-n elements, I could
compare the current heap with the head (O(1)), if it is smaller or equal
just discard it, and another way remove the head ((O(log(top-n))) and insert
the current element ((O(log(top-n)) again). In overall, the entire algorithm
will be O(n log(top-n)) as wished. Do you think this is a good idea? Do you
know of a better way to achieve the same recurring to lucene arcana?

Cheers,
Carlos

http://java.sun.com/j2se/1.5.0/docs/api/java/util/PriorityQueue.html

On 6/1/07, Carlos Pita <carlosjosepita@gmail.com> wrote:
>
> Hi all,
>
> I need to find a couple of result sets at the same time from the same
> search-criteria. The two sets are sorted according different sort-criteria.
> From both I need just the few top N results, but anyway, because of business
> rules, I need to process the entire hit set for the search. Till now I have
> implemented this search for just one result set on top of custom
> TopFieldDocCollector, Filter and Sort (w/ custom ScoreDocComparator). But
> now, having to generate the two different sets, I see two choices:
>
> 1) Keep the same implementation, but search twice, switching the Sort on
> each search. Here not only the search is done twice but the HitCollector
> iterates along all documentIds twice also.
>
> 2) Instead of a TopFieldDocCollector use some kind of bounded priority
> queue optimized for top-N results. With a HitCollector, a Filter and two of
> these queues it's easy and efficient to find both result sets on one simple
> pass. Do you know of a good implementation of this data structure? This
> seems nice
> http://www.alias-i.com/lingpipe/docs/api/com/aliasi/util/BoundedPriorityQueue.htmlbut
it has some license issues. Is there something similar that I can borrow
> from lucene sources?
>
> Any other suggestion?
>
> Thank you very much in advance.
> Regards,
> Carlos
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message