lucene-dev mailing list archives

From "Jason Rutherglen (JIRA)" <>
Subject [jira] Commented: (LUCENE-2047) IndexWriter should immediately resolve deleted docs to docID in near-real-time mode
Date Tue, 17 Nov 2009 19:34:40 GMT


Jason Rutherglen commented on LUCENE-2047:

I'm browsing through the applyDeletes call path, and I'm tempted to
rework how we're doing this. For my own thinking I'd still like
to have a queue of deletes per SR and one for the RAM doc buffer. I
think this gives future flexibility and makes it really clear
when debugging what's happening underneath. I find the remapping
of doc ids confusing, and stepping through that code looks
difficult. If we're storing doc ids alongside the SR
the docs correspond to, there's a lot less to worry about and
it just seems clearer. This may make integrating LUCENE-1313
directly into IW more feasible, as then we're working directly at
the SR level and can tie the synchronization process together.
Also, this could make exposing SRs externally easier and aid in
making IW more modular in the future.
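
To make the per-SR queue idea concrete, here is a rough sketch of what
I have in mind. It is only an illustration: PerSegmentDeletes, its
methods, and the RAM_BUFFER key are hypothetical names, not existing
Lucene classes, and segments are identified by plain strings to keep
the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Lucene code: one queue of pending deletes
// per segment, holding doc ids that are local to that segment.
class PerSegmentDeletes {
    // Assumed reserved key for docs still in the RAM doc buffer.
    static final String RAM_BUFFER = "_ram";

    // Segment name -> segment-local doc ids waiting to be applied.
    private final Map<String, List<Integer>> pending = new HashMap<>();

    // Queue a delete against one segment; no global doc id is involved,
    // so nothing needs remapping when other segments change.
    synchronized void queueDelete(String segmentName, int localDocId) {
        pending.computeIfAbsent(segmentName, k -> new ArrayList<>())
               .add(localDocId);
    }

    // Number of deletes currently queued for a segment.
    synchronized int pendingCount(String segmentName) {
        List<Integer> ids = pending.get(segmentName);
        return ids == null ? 0 : ids.size();
    }

    // Remove and return the queued doc ids for one segment, for example
    // when that segment's deletes are applied.
    synchronized List<Integer> drain(String segmentName) {
        List<Integer> ids = pending.remove(segmentName);
        return ids == null ? new ArrayList<>() : ids;
    }
}
```

Because each doc id sits next to the segment it belongs to, inspecting
the map in a debugger shows exactly which deletes are pending where.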

I can't find the code that handles aborts. 

> IndexWriter should immediately resolve deleted docs to docID in near-real-time mode
> -----------------------------------------------------------------------------------
>                 Key: LUCENE-2047
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 3.1
>         Attachments: LUCENE-2047.patch, LUCENE-2047.patch
> Spinoff from LUCENE-1526.
> When deleteDocuments(Term) is called, we currently always buffer the
> Term and only later, when it's time to flush deletes, resolve to
> docIDs.  This is necessary because we don't in general hold
> SegmentReaders open.
> But, when IndexWriter is in NRT mode, we pool the readers, and so
> deleting in the foreground is possible.
> It's also beneficial, in that it can reduce the turnaround time when
> reopening a new NRT reader by taking this resolution off the reopen
> path.  And if multiple threads are used to do the deletion, then we
> gain concurrency, vs reopen which is not concurrent when flushing the
> deletes.
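
The difference between the two paths described above can be sketched
with a toy model. This is an assumption-laden illustration, not the
patch: ToySegment and ToyWriter are hypothetical stand-ins, terms are
plain strings, and the readersPooled flag stands in for NRT mode.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a segment with an open reader (not a Lucene class).
class ToySegment {
    final Map<String, int[]> postings = new HashMap<>(); // term -> doc ids
    final BitSet deleted = new BitSet();                 // deleted doc ids

    // Resolve a term to doc ids and mark them deleted.
    void markDeleted(String term) {
        for (int docId : postings.getOrDefault(term, new int[0])) {
            deleted.set(docId);
        }
    }
}

// Toy stand-in for the writer (not a Lucene class).
class ToyWriter {
    private final List<ToySegment> segments;
    private final List<String> bufferedTerms = new ArrayList<>();
    private final boolean readersPooled; // assumed proxy for NRT mode

    ToyWriter(List<ToySegment> segments, boolean readersPooled) {
        this.segments = segments;
        this.readersPooled = readersPooled;
    }

    void deleteDocuments(String term) {
        if (readersPooled) {
            // Readers are available: resolve to doc ids in the foreground.
            for (ToySegment s : segments) s.markDeleted(term);
        } else {
            // No readers held open: buffer the term for a later flush.
            bufferedTerms.add(term);
        }
    }

    // Resolve every buffered term; this is the work that otherwise
    // lands on the reopen path.
    void flushDeletes() {
        for (String term : bufferedTerms) {
            for (ToySegment s : segments) s.markDeleted(term);
        }
        bufferedTerms.clear();
    }

    int bufferedTermCount() {
        return bufferedTerms.size();
    }
}
```

With readersPooled set, nothing is left in the buffer after
deleteDocuments returns, so a later flush has no terms to resolve.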

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
