lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Future projects
Date Thu, 02 Apr 2009 20:09:22 GMT
I'm not sure how big a win this'd be, since the OS will cache those in
RAM and the CPU cost there (to pull from OS's cache and reprocess) is
maybe not high.

Optimizing search is interesting, because it's the wicked slow queries
that you need to make faster even when it's at the expense of wicked
fast queries.  If you make a wicked fast query 3X slower (eg 1 ms -> 3
ms), it's almost harmless in nearly all apps.

So this makes things like PFOR (and LUCENE-1458, to enable pluggable
codecs for postings) important since it addresses the very large
queries.  In fact for very large postings we should do PFOR minus the
exceptions, ie, do a simple Nbit encode, even if it wastes some bits.
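
To make the "simple Nbit encode" idea concrete, here is a minimal sketch, not Lucene's code and not the LUCENE-1458 API. The class and method names (NBitPacker, bitsRequired, pack, unpack) are hypothetical, and it assumes the values are non-negative ints such as doc-ID deltas. Every value in a block is stored with the same bit width, chosen from the largest value, so there is no exception list to patch in at decode time; the price is some wasted bits whenever a few values are much larger than the rest.

```java
// Hypothetical illustration of fixed-width ("N-bit") block packing.
// Assumes all input values are non-negative.
final class NBitPacker {

    /** Smallest bit width (at least 1) that can hold every value in the block. */
    static int bitsRequired(int[] values) {
        int or = 0;
        for (int v : values) {
            or |= v; // the highest set bit of the OR is the highest set bit of the max
        }
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(or));
    }

    /** Packs each value into exactly {@code bits} bits, back to back, in a long[]. */
    static long[] pack(int[] values, int bits) {
        long mask = (1L << bits) - 1;
        long[] packed = new long[(int) (((long) values.length * bits + 63) / 64)];
        long bitPos = 0;
        for (int v : values) {
            int word = (int) (bitPos >>> 6);   // which long the value starts in
            int offset = (int) (bitPos & 63);  // bit offset inside that long
            long x = v & mask;
            packed[word] |= x << offset;
            if (offset + bits > 64) {
                // The value straddles two longs: spill the high bits into the next one.
                packed[word + 1] |= x >>> (64 - offset);
            }
            bitPos += bits;
        }
        return packed;
    }

    /** Inverse of {@link #pack}: reads {@code count} values of {@code bits} bits each. */
    static int[] unpack(long[] packed, int count, int bits) {
        long mask = (1L << bits) - 1;
        int[] out = new int[count];
        long bitPos = 0;
        for (int i = 0; i < count; i++) {
            int word = (int) (bitPos >>> 6);
            int offset = (int) (bitPos & 63);
            long x = packed[word] >>> offset;
            if (offset + bits > 64) {
                x |= packed[word + 1] << (64 - offset);
            }
            out[i] = (int) (x & mask);
            bitPos += bits;
        }
        return out;
    }
}
```

The decode loop has no per-value branch on an exception flag, which is the appeal for very large postings. A real decoder would likely unroll it for each fixed bit width.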

Mike

On Thu, Apr 2, 2009 at 1:52 PM, Jason Rutherglen
<jason.rutherglen@gmail.com> wrote:
> 4) An additional, possibly contrib, module is caching the results of
> TermQueries.  In looking at the TermQuery code would we need to cache the
> entire docs and freqs as arrays which would be a memory hog?
>
> On Wed, Apr 1, 2009 at 4:05 PM, Jason Rutherglen
> <jason.rutherglen@gmail.com> wrote:
>>
>> Now that LUCENE-1516 is close to being committed perhaps we can
>> figure out the priority of other issues:
>>
>> 1. Searchable IndexWriter RAM buffer
>>
>> 2. Finish up benchmarking and perhaps implement passing
>> filters to the SegmentReader level
>>
>> 3. Deleting by doc id using IndexWriter
>>
>> With 1) I'm interested in how we will lock a section of the
>> bytes for use by a given reader? We would not actually lock
>> them, but we need to set aside the bytes such that for example
>> if the postings grow, TermDocs iteration does not progress
>> beyond its limits. Are there any modifications that are needed
>> of the RAM buffer format? How would the term table be stored? We
>> would not be using the current hash method?
>
>


