lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2655) Get deletes working in the realtime branch
Date Fri, 01 Oct 2010 10:21:33 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916851#action_12916851 ]

Michael McCandless commented on LUCENE-2655:
--------------------------------------------

Stepping back here...

There are two different goals mixed into the RT branch effort, I
think:

# Make thread states fully independent so flushing is no longer sync'd
  (plus it's a nice simplification, eg no more *PerThread in the
  indexing chain)
# Enable direct searching on the thread states' RAM buffers, for awesome
  NRT performance

It seems to me like the first one is not so far off?  Ie we nearly
have it already (LUCENE-2324)... it's just that we don't have the
deletes working?

Whereas the 2nd one is a much bigger change, and still iterating under
LUCENE-2475.

Is it possible to decouple #1 from #2?  Ie, bring it to a committable
state and land it on trunk and let it bake some?

Eg, on deletes, what if we simply have each thread state buffer its own
map of delete term -> that thread's docID, privately?  We know this
approach will "work" (it does today), right?  It's just wasteful of RAM
(though cutover to BytesRefHash should help a lot here), and it makes
deletes somewhat slower, since you must now enroll the del term into
each thread state.

It wouldn't actually be that wasteful of RAM, since the BytesRef instance
would be shared across all the maps.  Also, if we wanted (separately)
we could make a more efficient buffer when the app deletes many terms
at once, or, many calls to delete-by-single-term with no adds in between
(much like how Amazon optimizes N one-click purchases in a row...).
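
To make the per-thread-state idea concrete, here is a rough sketch.  It
is not the actual DocumentsWriter code: ThreadStateSketch, WriterSketch
and BufferedDeletesDemo are made-up names, plain Strings stand in for
delete terms, and I'm assuming a buffered delete only applies to docs
that a thread state buffered before the delete arrived.  The same term
instance is put into every map, so only the per-entry map overhead is
multiplied by the number of thread states.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one indexing thread's private state.
final class ThreadStateSketch {
    // delete term -> number of docs this state had buffered when the delete arrived
    private final Map<String, Integer> bufferedDeleteTerms = new HashMap<>();
    private int numDocsInRam = 0;

    // Buffers a document and returns its docID, which is local to this state.
    synchronized int addDocument() {
        return numDocsInRam++;
    }

    // Records the delete privately; only this state's lock is taken.
    synchronized void bufferDeleteTerm(String term) {
        bufferedDeleteTerms.put(term, numDocsInRam);
    }

    // Assumes the doc contains the term: it is deleted only if it was
    // buffered before the delete was enrolled (docID < recorded limit).
    synchronized boolean isDeleted(String term, int docID) {
        Integer docIDUpto = bufferedDeleteTerms.get(term);
        return docIDUpto != null && docID < docIDUpto;
    }
}

// Hypothetical writer that fans each delete out to every thread state.
final class WriterSketch {
    private final List<ThreadStateSketch> states = new ArrayList<>();

    WriterSketch(int numThreadStates) {
        for (int i = 0; i < numThreadStates; i++) {
            states.add(new ThreadStateSketch());
        }
    }

    ThreadStateSketch state(int i) {
        return states.get(i);
    }

    // The extra cost mentioned above: one enrollment per thread state.
    // The same String instance is shared as the key in every map.
    void deleteDocuments(String term) {
        for (ThreadStateSketch s : states) {
            s.bufferDeleteTerm(term);
        }
    }
}

final class BufferedDeletesDemo {
    public static void main(String[] args) {
        WriterSketch writer = new WriterSketch(2);
        int before = writer.state(0).addDocument();  // docID 0 in state 0
        writer.deleteDocuments("id:7");
        int after = writer.state(0).addDocument();   // docID 1 in state 0
        System.out.println(writer.state(0).isDeleted("id:7", before)); // true
        System.out.println(writer.state(0).isDeleted("id:7", after));  // false
    }
}
```

Since each thread state locks only itself, a flush of one state never
has to wait on another, which is the whole point of goal #1.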

I really want to re-run my 6-thread indexing test on beast and see the
indexing throughput double!!

> Get deletes working in the realtime branch
> ------------------------------------------
>
>                 Key: LUCENE-2655
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2655
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: Realtime Branch
>            Reporter: Jason Rutherglen
>             Fix For: Realtime Branch
>
>         Attachments: LUCENE-2655.patch
>
>
> Deletes don't work anymore, a patch here will fix this.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

