lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Retrieving values for a NumericDocValuesField [SEC=UNOFFICIAL]
Date Thu, 24 Oct 2013 07:19:01 GMT
Hi Stephen,

On Thu, Oct 24, 2013 at 1:18 AM, Stephen GRAY <stephen.gray@immi.gov.au> wrote:
> I actually need to loop through a large number of documents (50,000 - 100,000) calculating
> a number of statistics (min, max, sum) so I really need the most efficient/fastest solution
> available. It sounds like it would be best to just store the data in a stored field.

I see. For that many documents, doc values are actually the right
thing to use. Sorry if I put you on the wrong track; I was assuming you
were only going to collect values from a few documents.

In your case the best option would be to split your doc ids according
to the segment they belong to, and then for each segment, get a
per-segment NumericDocValues instance and aggregate your statistics.
It is better than using MultiDocValues because MultiDocValues needs to
binary-search for the appropriate segment for every document.
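
A minimal sketch of the per-segment aggregation described above, written
without any Lucene dependency so that it runs on its own. Everything in it
is a stand-in chosen for illustration: the class SegmentStats, the docBases
array (first global doc id of each segment) and the long[][] arrays that
play the role of per-segment NumericDocValues. In Lucene 4.x the
corresponding pieces would presumably be AtomicReaderContext.docBase and
AtomicReader.getNumericDocValues(field), but that mapping is an assumption
and should be checked against the version in use.

```java
import java.util.Arrays;

// Hypothetical stand-in for per-segment doc values aggregation; not Lucene code.
class SegmentStats {

    /** Running min, max, sum and count over long values. */
    static final class Stats {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        long sum = 0;
        long count = 0;

        void add(long v) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
            count++;
        }
    }

    /**
     * @param docBases         docBases[i] is the first global doc id of segment i, ascending
     * @param valuesPerSegment valuesPerSegment[i][localDoc] stands in for a per-segment value lookup
     * @param sortedDocIds     global doc ids, sorted ascending
     */
    static Stats aggregate(int[] docBases, long[][] valuesPerSegment, int[] sortedDocIds) {
        Stats stats = new Stats();
        int seg = 0;
        for (int docId : sortedDocIds) {
            // Doc ids are sorted, so the segment index only ever moves forward:
            // no binary search per document is needed.
            while (seg + 1 < docBases.length && docId >= docBases[seg + 1]) {
                seg++;
            }
            int localDoc = docId - docBases[seg]; // segment-relative doc id
            stats.add(valuesPerSegment[seg][localDoc]);
        }
        return stats;
    }

    public static void main(String[] args) {
        int[] docBases = {0, 3, 5};                              // three segments
        long[][] values = {{10, 20, 30}, {5, 7}, {100, 1, 50}};  // values per segment
        int[] docIds = {4, 0, 6, 2};                             // matching global doc ids
        Arrays.sort(docIds);                                     // group ids by segment
        Stats s = aggregate(docBases, values, docIds);
        System.out.println("min=" + s.min + " max=" + s.max + " sum=" + s.sum);
        // prints "min=1 max=30 sum=48"
    }
}
```

Sorting the doc ids once groups them by segment, so the segment lookup
becomes a single forward scan instead of a binary search for every
document.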

-- 
Adrien
