lucene-java-user mailing list archives

From "Neeraj Gupta" <neeraj.gupt...@hewitt.com>
Subject Re: Paging & Sorting
Date Thu, 07 Aug 2008 15:57:12 GMT
A gentle reminder: is there any solution to this problem?

Regards
-Neeraj




Thank you for the reply. You said:

> Sure, just iterate over the first 100 entries in your Hits object

That means that before I iterate, Lucene has already spent time and memory
finding all 50K documents and sorting them, and only then do I retrieve 100
documents. So it costs me a search over 50K documents, plus sorting of
those 50K matching documents, plus retrieval of the first 100 documents. Right?

What I want is a way to get the same result as the above process
without paying to search and sort all 50K matching documents.
Is this possible?

Many Thx!
-Neeraj



"Erick Erickson" <erickerickson@gmail.com>
08/05/2008 04:16 PM
Please respond to: java-user@lucene.apache.org
To: java-user@lucene.apache.org
Subject: Re: Paging & Sorting


Sure, just iterate over the first 100 entries in your Hits object
(or topdocs).

If you're asking how to ignore 49,900 of your documents (that
is, not even consider them at all), you're asking the impossible
because you can't know whether to ignore those other docs
unless you sort them first.
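
To illustrate the point about cost: every matching document has to be
examined, but only the best 100 need to be kept and fully ordered. The
sketch below is plain Java, not Lucene code. TopKSketch and smallestK are
hypothetical names, and the bounded heap is an assumption about how such a
selection could be done, not a description of Lucene internals.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

// Hypothetical helper for illustration; not part of Lucene.
final class TopKSketch {

    /** Returns the k smallest keys in ascending order, visiting each input once. */
    static List<Integer> smallestK(Iterable<Integer> matches, int k) {
        List<Integer> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        // Max-heap: the root is the largest of the k best keys kept so far.
        PriorityQueue<Integer> heap = new PriorityQueue<>(k, Collections.reverseOrder());
        for (int key : matches) {
            if (heap.size() < k) {
                heap.add(key);
            } else if (key < heap.peek()) {
                heap.poll();   // evict the current worst of the kept keys
                heap.add(key);
            }
        }
        result.addAll(heap);
        Collections.sort(result); // only k items are ever fully sorted
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for 50K matching documents, identified by an integer key.
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 50000; i++) {
            ids.add(i);
        }
        Collections.shuffle(ids, new Random(42));
        List<Integer> first100 = smallestK(ids, 100);
        System.out.println(first100.get(0) + " .. " + first100.get(99)); // prints "0 .. 99"
    }
}
```

All 50K keys are still looked at once, so the matching cost does not go
away, but the ordering work is limited to a heap of 100 entries.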

If you're asking if you can get the 100 most relevant docs and
then sort only those, then you should look at FieldSortedHitQueue.
What you'd do is feed the top 100 documents into an instance
of FieldSortedHitQueue and then traverse the queue.
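
A minimal sketch of that two-step idea, written in plain Java rather than
against the Lucene API: keep the 100 highest-scoring hits, then order only
those by document id. Hit, RelevanceThenSortSketch and topByScoreThenById
are hypothetical names used for illustration. Note that the result is
generally not the same as sorting all matches by id and taking the first 100.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-ins for illustration; these are not Lucene classes.
final class RelevanceThenSortSketch {

    static final class Hit {
        final int docId;
        final float score;

        Hit(int docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Keeps the k highest-scoring hits, then orders only those k by docId. */
    static List<Hit> topByScoreThenById(Iterable<Hit> matches, int k) {
        List<Hit> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        // Min-heap on score: the root is the weakest hit currently kept.
        PriorityQueue<Hit> heap =
            new PriorityQueue<>(k, Comparator.comparingDouble((Hit h) -> h.score));
        for (Hit h : matches) {
            if (heap.size() < k) {
                heap.add(h);
            } else if (h.score > heap.peek().score) {
                heap.poll();   // drop the weakest kept hit
                heap.add(h);
            }
        }
        result.addAll(heap);
        // Second step: sort just the k survivors by the field of interest.
        result.sort(Comparator.comparingInt((Hit h) -> h.docId));
        return result;
    }
}
```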

If none of this applies, could you explain in a bit more detail?

Best
Erick

On Tue, Aug 5, 2008 at 4:35 PM, Neeraj Gupta
<neeraj.gupta.2@hewitt.com> wrote:

> Hi,
>
> I need the first 100 documents in sorted order, let's say sorted on the
> document id, and there are more than 50K documents in the index. My
> search query matches all of those 50K documents. Is there any way to get
> only the first 100 documents, in sorted order of document id? I mean,
> can Lucene give out only those 100 documents?
>
> Many Thx!











