lucene-dev mailing list archives

From Michael McCandless
Subject Re: [jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation
Date Sat, 06 Dec 2008 20:41:51 GMT

Mark Miller wrote:

>> EG when sorting by field, we could pull say an IntData iterator from
>> the reader, and then access the int values in docID order as we visit
>> the docs.
> We need random access after collecting/visiting; do we put
> what we collect into a map? What if a lot of docs match?

I guess I was thinking we'd copy the values into the
FieldSortedHitQueue entry if the doc "made it" into the queue. Though
I'm no longer liking this idea as much (such slurping can easily
become very costly).

But we could still save the binary search during comparisons when
inserting into FieldSortedHitQueue if each entry maintained a pointer
to the SegmentReader's array and the docID within that array, i.e.
"cache" which sub-reader the binary search had chosen.  And even though
we'd have random access to the native array, since we are still
accessing the field values sequentially by docID, we should not have to
re-do the binary search every time.

If we can implement this approach (not sure exactly how just yet...)  
then I think that'd probably drop the additional cost of not having  
materialized the global field value array to near zero.
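
A rough sketch of the idea, to make it concrete. None of these names
are existing Lucene classes: Segment, Entry and Resolver are
hypothetical stand-ins for a SegmentReader's native int array, a
FieldSortedHitQueue entry, and whatever maps a global docID to its
sub-reader. It assumes non-empty segments sorted by docBase and docs
visited in increasing docID order.

```java
import java.util.Arrays;

/** Hypothetical sketch only; these are not Lucene APIs. */
final class SegmentValuesSketch {

    /** One sub-reader's native int array plus its global docID offset. */
    static final class Segment {
        final int docBase;   // first global docID covered by this segment
        final int[] values;  // field values indexed by local docID

        Segment(int docBase, int[] values) {
            this.docBase = docBase;
            this.values = values;
        }

        boolean contains(int globalDoc) {
            return globalDoc >= docBase && globalDoc < docBase + values.length;
        }
    }

    /** Queue entry that points at the segment array holding its value. */
    static final class Entry {
        final Segment segment;
        final int localDoc;

        Entry(Segment segment, int localDoc) {
            this.segment = segment;
            this.localDoc = localDoc;
        }

        int value() { return segment.values[localDoc]; }

        int globalDoc() { return segment.docBase + localDoc; }
    }

    /** Maps global docIDs to entries, caching the last segment chosen. */
    static final class Resolver {
        private final Segment[] segments;
        private final int[] starts;
        private Segment current;  // the cached sub-reader
        int binarySearches;       // illustration only: how often we searched

        Resolver(Segment[] segments) {
            this.segments = segments;
            this.starts = new int[segments.length];
            for (int i = 0; i < segments.length; i++) {
                starts[i] = segments[i].docBase;
            }
        }

        Entry resolve(int globalDoc) {
            // Only search when the doc falls outside the cached segment.
            if (current == null || !current.contains(globalDoc)) {
                int idx = Arrays.binarySearch(starts, globalDoc);
                if (idx < 0) {
                    idx = -idx - 2;  // last segment starting at or before globalDoc
                }
                if (idx < 0 || !segments[idx].contains(globalDoc)) {
                    throw new IllegalArgumentException("docID out of range: " + globalDoc);
                }
                current = segments[idx];
                binarySearches++;
            }
            return new Entry(current, globalDoc - current.docBase);
        }
    }

    /** Comparison during queue insertion: two array reads, no search. */
    static int compare(Entry a, Entry b) {
        return Integer.compare(a.value(), b.value());
    }
}
```

With docs visited in docID order, the search runs roughly once per
segment rather than once per doc, and comparisons inside the queue
never search at all since each entry already holds its array.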

>> For norms, which we should eventually switch to FieldCache +
>> column-stride fields, it would be the same story.
>> Accessing via iterator should go a long ways to reducing the overhead
>> of "using a method" instead of accessing the full array directly.
> I agree...dropping the binary search per id access on a multi reader  
> is a must (at least for the common sort use case of field cache).
> I'm still attempting to consolidate my ideas on this. I've come at  
> it from a rather lazy approach - Hoss laid out an API that was
> 'almost' complete, and I just completed things and pushed in a  
> couple directions (the ArrayObject). I've been thinking from the  
> inside out. Originally I was just playing around and trying to build  
> some interest. I have more interest in getting this finished now  
> though, and it sounds like others do as well. This discussion will  
> hopefully get things moving.

I hear you!

