lucene-java-user mailing list archives

From Kyle Judson
Subject Re: splitting docIds from a search by segment [SEC=UNOFFICIAL]
Date Sat, 02 Nov 2013 13:36:45 GMT

Is the best way to get the docIDs in a case like this to use to get TopDocs and then get the ScoreDoc[] from



On 10/30/13 4:56 AM, "Michael McCandless" wrote:

>You should try MultiDocValues first; it's trivial to use and may not
>be horribly slow.
>It must do a binary-search for every docID lookup.
>And then if this is too slow, assuming you traverse the docIDs in
>order, you can use IndexReader.leaves() to get the sub-readers.  The
>docIDs are just "appended" from these sub-readers, so you'd walk your
>docIDs and also walk your sub-readers, moving to the next sub-reader
>once you have a docID that's beyond its end.  Each sub-reader spans
>AtomicReaderContext.docBase to docBase +
>Mike McCandless
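
A minimal sketch of the walk described above, written in plain Java so it runs without an index. The `docBases` array is a hypothetical stand-in for the `AtomicReaderContext.docBase` values you would collect from `IndexReader.leaves()`, and the class and method names are made up for illustration. It assumes at least one segment and docIDs in ascending order; hits sorted by score would need to be sorted by docID first.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentWalk {
    /**
     * Splits ascending top-level docIDs into one list per segment,
     * converting each to a segment-local docID (docId - docBase).
     *
     * docBases[i] is the first top-level docID of segment i. This is an
     * assumption standing in for AtomicReaderContext.docBase.
     */
    public static List<List<Integer>> splitBySegment(int[] sortedDocIds, int[] docBases) {
        List<List<Integer>> perSegment = new ArrayList<>();
        for (int i = 0; i < docBases.length; i++) {
            perSegment.add(new ArrayList<Integer>());
        }
        int seg = 0;
        for (int docId : sortedDocIds) {
            // Advance while the docID lies beyond the end of the current
            // segment, i.e. at or past the next segment's docBase.
            while (seg + 1 < docBases.length && docId >= docBases[seg + 1]) {
                seg++;
            }
            perSegment.get(seg).add(docId - docBases[seg]);
        }
        return perSegment;
    }
}
```

Because both the docIDs and the segments are visited once, front to back, the cost is linear in the number of hits plus the number of segments. With a real index, each inner list would then be read through that segment's own NumericDocValues instance.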
>On Wed, Oct 30, 2013 at 2:21 AM, Stephen GRAY wrote:
>> Hi everyone,
>> I am trying to write an application that loops through 500,000 -
>>1,000,000 documents returned by a search and calculates some statistics
>>using the value in a stored field. Obviously this needs to be as fast as
>>possible so I am using a NumericDocValues field to store the value.
>> What I don't know is how to get the NumericDocValues value for each
>>docId returned by the search. What I've been told to do in a previous
>>thread was:
>> 1. Split the docIds according to the segment they belong to
>> 2. Get a per-segment NumericDocValues instance and use this to
>>extract the values
>> Can someone tell me how to do 1 and 2? I don't know how to discover
>>what segment a given docId is in, or how to convert a segment into a
>>NumericDocValues array.
>> By the way it's also been suggested that I just use
>>MultiDocValues.getNumericValues, but I gather that this will be much
>> I'd appreciate any help,
>> Thanks,
>> Steve
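
For the "which segment is this docId in" part of the question, here is a hedged sketch of a per-docID lookup by binary search over the segment start offsets. It is plain Java with a hypothetical `docBases` array (the values one would read from each leaf's `docBase`), not Lucene's own code, and it assumes the array is non-empty and ascending. Lucene 4.x appears to offer a comparable helper in `ReaderUtil.subIndex`, but check the javadoc for your version.

```java
public class SegmentLookup {
    /**
     * Returns the index of the segment containing docId: the largest i
     * such that docBases[i] <= docId. Assumes docBases is non-empty,
     * ascending, and starts at 0.
     */
    public static int segmentIndex(int docId, int[] docBases) {
        int lo = 0;
        int hi = docBases.length - 1;
        while (lo < hi) {
            // Round the midpoint up so the loop always makes progress.
            int mid = (lo + hi + 1) >>> 1;
            if (docBases[mid] <= docId) {
                lo = mid;      // docId is in segment mid or a later one
            } else {
                hi = mid - 1;  // docId is in an earlier segment
            }
        }
        return lo;
    }

    /** Converts a top-level docID to a segment-local docID. */
    public static int localDocId(int docId, int[] docBases) {
        return docId - docBases[segmentIndex(docId, docBases)];
    }
}
```

This costs one binary search per docID, which is fine for docIDs in arbitrary order. When the docIDs are already sorted, walking the segments sequentially avoids the repeated searches.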
