lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Future projects
Date Wed, 08 Apr 2009 10:00:00 GMT
On Tue, Apr 7, 2009 at 7:05 PM, Jason Rutherglen
<jason.rutherglen@gmail.com> wrote:
>> I think we should keep it simple, unless we discover real perf problems
>> with the current approach.
>
> Simple is good, however the indexing performance will lag because we're back
> to the indexing speed of pre ram buffer? (i.e. merging segments using a
> ramdirectory).

Only if you open a new reader after each added document.

Then, your indexing throughput will be low.  But, that's expected (you
don't use NRT to maximize indexing throughput... you use it to
minimize index-to-search delay).

I wouldn't expect this to be a problem in practice, though.  If you
find the indexing throughput is too low for your app (i.e., you need
NRT and relatively fast indexing throughput), then don't open the NRT
reader after every single doc.  E.g., open it every 100 ms instead,
and you'll increase your indexing throughput.
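
To make the "every 100 ms" idea concrete, here is a rough sketch of the
throttling decision only.  It does not touch Lucene at all: the class
ReopenThrottle and its methods are hypothetical names made up for this
example, and the caller would do the actual reader reopen whenever
shouldReopen returns true.  Time is passed in explicitly so the logic
is easy to simulate.

```java
/**
 * Hypothetical helper (not a Lucene class): decides when an NRT reader
 * should be reopened so that reopens happen at most once per interval.
 */
final class ReopenThrottle {
    private final long intervalMillis;
    private long lastReopenMillis;
    private boolean pendingChanges;

    ReopenThrottle(long intervalMillis, long startMillis) {
        this.intervalMillis = intervalMillis;
        this.lastReopenMillis = startMillis;
    }

    /** Call after each added document. */
    void documentAdded() {
        pendingChanges = true;
    }

    /**
     * Returns true if there are unseen changes and at least
     * intervalMillis has passed since the last reopen.
     */
    boolean shouldReopen(long nowMillis) {
        if (!pendingChanges || nowMillis - lastReopenMillis < intervalMillis) {
            return false;
        }
        lastReopenMillis = nowMillis;
        pendingChanges = false;
        return true;
    }

    /** Simulates adding one doc per millisecond and counts the reopens. */
    static int countReopens(long intervalMillis, int docs) {
        ReopenThrottle throttle = new ReopenThrottle(intervalMillis, 0);
        int reopens = 0;
        for (long now = 1; now <= docs; now++) {
            throttle.documentAdded();
            if (throttle.shouldReopen(now)) {
                reopens++;
            }
        }
        return reopens;
    }
}
```

In this simulation, an interval of 0 reopens after every document,
while an interval of 100 reopens far less often for the same stream of
adds, which is the trade between index-to-search delay and indexing
throughput described above.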

>> need to do a merge sort (across the N thread states)
>
> I'm confused about why a merge sort is required?

The docIDs in each thread's ram buffer do not "concatenate" like
MultiReader.  Instead, they "interleave".  One thread has docIDs 0, 4,
6.  Another has 1, 5, 7.  Another has 2, 3, 9.  Etc.  So a TermDocs
must "zip" them back together when iterating.
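
A rough sketch of that "zip", using the docIDs from the example above.
This is only an illustration of a k-way merge over per-thread sorted
docID lists; the class name InterleavedDocIds and the int[][] layout
are assumptions for the example, not how the RAM buffer is actually
stored, and a real TermDocs would iterate lazily instead of building a
list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration: merges per-thread sorted docID lists. */
final class InterleavedDocIds {

    /**
     * Each row of perThread holds one thread's docIDs in increasing
     * order. Returns all docIDs in increasing order.
     */
    static List<Integer> merge(int[][] perThread) {
        // Each heap entry is a cursor {threadIndex, position}, ordered
        // by the docID that cursor currently points at.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            (a, b) -> Integer.compare(perThread[a[0]][a[1]],
                                      perThread[b[0]][b[1]]));
        for (int t = 0; t < perThread.length; t++) {
            if (perThread[t].length > 0) {
                heap.add(new int[] {t, 0});
            }
        }

        List<Integer> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] cursor = heap.poll();
            merged.add(perThread[cursor[0]][cursor[1]]);
            // Advance this thread's cursor and put it back if the
            // thread still has docIDs left.
            cursor[1]++;
            if (cursor[1] < perThread[cursor[0]].length) {
                heap.add(cursor);
            }
        }
        return merged;
    }
}
```

Calling InterleavedDocIds.merge(new int[][] {{0, 4, 6}, {1, 5, 7},
{2, 3, 9}}) yields the docIDs in a single increasing sequence, whereas
simple concatenation (as MultiReader does with its sub-readers) would
give them out of order.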

Mike
