lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Rutherglen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2047) IndexWriter should immediately resolve deleted docs to docID in near-real-time mode
Date Thu, 12 Nov 2009 07:01:40 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776877#action_12776877
] 

Jason Rutherglen commented on LUCENE-2047:
------------------------------------------

After thumbing a bit through the code, somehow, and this is a
bit too complex for me to venture a guess on, we'd need to
prevent the deletion of the SR's files we're deleting from, even
if that SR is no longer live. Which means possibly interacting
with the deletion policy, or some other method. It's a bit hairy
in the segment info transaction shuffling that goes on in IW. 

I believe asyncing the live deletes is a good thing, as that way
we're taking advantage of concurrency. The possible expense is in
deleting from segments that are no longer live while the
deleting is occurring.

> IndexWriter should immediately resolve deleted docs to docID in near-real-time mode
> -----------------------------------------------------------------------------------
>
>                 Key: LUCENE-2047
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2047
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: LUCENE-2047.patch, LUCENE-2047.patch
>
>
> Spinoff from LUCENE-1526.
> When deleteDocuments(Term) is called, we currently always buffer the
> Term and only later, when it's time to flush deletes, resolve to
> docIDs.  This is necessary because we don't in general hold
> SegmentReaders open.
> But, when IndexWriter is in NRT mode, we pool the readers, and so
> deleting in the foreground is possible.
> It's also beneficial, in that in can reduce the turnaround time when
> reopening a new NRT reader by taking this resolution off the reopen
> path.  And if multiple threads are used to do the deletion, then we
> gain concurrency, vs reopen which is not concurrent when flushing the
> deletes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message