lucene-dev mailing list archives

From "Jason Rutherglen (JIRA)" <>
Subject [jira] Commented: (LUCENE-2312) Search on IndexWriter's RAM Buffer
Date Tue, 16 Mar 2010 06:28:27 GMT


Jason Rutherglen commented on LUCENE-2312:

{quote}This makes the reference to the array volatile, not the
slots in the array{quote}

That's no good! :)
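
To make the quoted point concrete, here is a minimal Java sketch (not Lucene code; the class and method names are hypothetical and chosen only for illustration). Declaring an array field `volatile` covers reads and writes of the reference only. A store into an element is an ordinary write. If per-slot visibility is needed, `java.util.concurrent.atomic.AtomicIntegerArray` is one standard way to get it.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical demo class, not part of Lucene.
class VolatileArrayDemo {
    // 'volatile' applies to the reference: swapping in a new array is a
    // volatile write, but storing into an element is a plain write.
    static volatile int[] plain = new int[4];

    // Each slot of an AtomicIntegerArray has volatile read/write semantics.
    static final AtomicIntegerArray slots = new AtomicIntegerArray(4);

    static void writePlain(int i, int v) {
        plain[i] = v;       // volatile read of the reference, then a plain element store
    }

    static void writeSlot(int i, int v) {
        slots.set(i, v);    // volatile write to slot i
    }

    static int readSlot(int i) {
        return slots.get(i); // volatile read of slot i
    }
}
```

The per-slot variant pays for volatile access on every element, which is presumably one reason to look at coarser, block-level publication instead.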

{quote}If you use a RW lock then the writer thread will block
all reader threads while it's making changes{quote}

We probably need to implement more fine-grained locking, perhaps
using volatile booleans instead of RW locks. Fine-grained here
means at the byte array/block level. I think this would imply
that changes are not visible until a given byte block is more or
less "flushed"? This is different from the design that's been
implied so far, in which we'd read from byte arrays as they're
being written to. We probably don't need to read from and write
to the same byte array concurrently (that might not be feasible?).
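
A rough sketch of what block-level visibility could look like, assuming a single writer thread and using a volatile counter rather than one volatile boolean per block. `BlockBuffer` and its methods are hypothetical names for illustration, not an existing Lucene API. The writer fills a block with plain writes and then performs one volatile write when the block is full. A reader that sees the updated counter is guaranteed by the Java memory model to also see the bytes written before it.

```java
import java.util.Arrays;

// Hypothetical sketch, not Lucene code. Assumes exactly one writer thread.
class BlockBuffer {
    private final byte[][] blocks;
    private final int blockSize;
    private int writePos = 0;           // touched by the writer thread only
    private volatile int published = 0; // number of full blocks readers may see

    BlockBuffer(int numBlocks, int blockSize) {
        this.blocks = new byte[numBlocks][blockSize];
        this.blockSize = blockSize;
    }

    // Writer thread: plain writes into the current block. Appending past the
    // total capacity throws ArrayIndexOutOfBoundsException.
    void append(byte b) {
        int block = writePos / blockSize;
        blocks[block][writePos % blockSize] = b;
        writePos++;
        if (writePos % blockSize == 0) {
            // Volatile write: publishes every plain write made before it.
            published = block + 1;
        }
    }

    int publishedBlocks() {
        return published; // volatile read
    }

    // Reader threads: only completed ("flushed") blocks are readable.
    byte[] readBlock(int i) {
        if (i >= published) {
            throw new IllegalStateException("block " + i + " not yet published");
        }
        return Arrays.copyOf(blocks[i], blockSize);
    }
}
```

With this shape, readers never touch the block currently being written, so no reader is blocked by the writer. The trade-off is the one noted above: bytes in a partially filled block stay invisible until that block fills.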

The performance win here will probably come from avoiding
segment merges.

> Search on IndexWriter's RAM Buffer
> ----------------------------------
>                 Key: LUCENE-2312
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>    Affects Versions: 3.0.1
>            Reporter: Jason Rutherglen
>            Assignee: Michael Busch
>             Fix For: 3.1
> In order to offer users near realtime search, without incurring
> an indexing performance penalty, we can implement search on
> IndexWriter's RAM buffer. This is the buffer that is filled in
> RAM as documents are indexed. Currently the RAM buffer is
> flushed to the underlying directory (usually disk) before being
> made searchable. 
> Today's Lucene-based NRT systems must incur the cost of merging
> segments, which can slow indexing. 
> Michael Busch has good suggestions regarding how to handle deletes using max doc ids.
> The area that isn't fully fleshed out is the terms dictionary,
> which needs to be sorted prior to queries executing. Currently
> IW implements a specialized hash table. Michael B has a
> suggestion here: 

