lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1516) Integrate IndexReader with IndexWriter
Date Wed, 04 Mar 2009 19:59:56 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678884#action_12678884 ]

Michael McCandless commented on LUCENE-1516:
--------------------------------------------


bq. I added a flushDeletesToDir flag that defaults to true except for IW.getReader.

I think that should not be necessary.  On releasing a reader, if it
has pending changes and we want to drop it from the pool, we should
move its deletes to the directory?

SegmentMerger should clear the hasChanges on the reader after it's done
merging, so we don't bother saving delete files for segments that were
merged away.
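
To make the release rule above concrete, here is a minimal sketch. It is
not code from the patch: PooledReader, ReaderPool and mergeFinished are
hypothetical names, and the directory is modeled as a plain list of file
names rather than a real Directory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a pooled segment reader (not Lucene's SegmentReader).
final class PooledReader {
    final String segment;
    boolean hasChanges; // true while deletes are buffered only in memory

    PooledReader(String segment) { this.segment = segment; }

    void deleteDocument(int doc) { hasChanges = true; }
}

// Hypothetical pool; the "directory" is just a list of written file names.
final class ReaderPool {
    private final Map<String, PooledReader> readers = new HashMap<>();
    final List<String> directory = new ArrayList<>();

    PooledReader get(String segment) {
        return readers.computeIfAbsent(segment, PooledReader::new);
    }

    // On release, a reader that is being dropped first moves its pending
    // deletes to the directory, so no separate flush flag is needed.
    void release(PooledReader r, boolean drop) {
        if (!drop) {
            return;
        }
        if (r.hasChanges) {
            directory.add(r.segment + ".del");
            r.hasChanges = false;
        }
        readers.remove(r.segment);
    }

    // Called for each source segment once a merge is done: clearing
    // hasChanges first means no delete file is saved for a segment
    // that was merged away.
    void mergeFinished(PooledReader r) {
        r.hasChanges = false;
        release(r, true);
    }
}
```

With this shape, dropping a reader that has buffered deletes writes one
delete file, while a reader handed to mergeFinished writes none.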

bq. In TestConcurrentMergeScheduler.testNoWaitClose I'm seeing a couple of exceptions.

I think the first is the root cause (and causes the 2nd one).

Which test case hits that?  The test cases that intentionally throw
exceptions at interesting times are especially fun to debug ;)

{quote}
Because the patch does not flush deletes to disk like the existing
code does, the SegmentInfo delGen etc isn't updated at the same
points as expected.
{quote}

IFD should be fine with this -- there must be something else at play,
causing us to over-decRef.
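
One way to chase an over-decRef is to make the counter fail loudly at
the first extra decRef. A minimal sketch follows; FileRefCounts is a
hypothetical class for illustration and is not IFD's actual bookkeeping.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-file reference counter (illustration only).
final class FileRefCounts {
    private final Map<String, Integer> counts = new HashMap<>();

    void incRef(String file) {
        counts.merge(file, 1, Integer::sum);
    }

    // Returns true when the count reaches zero, i.e. the file may be deleted.
    // Throws on a decRef without a matching incRef, which points at the
    // caller that is over-decRef'ing.
    boolean decRef(String file) {
        Integer c = counts.get(file);
        if (c == null) {
            throw new IllegalStateException("over-decRef: " + file);
        }
        if (c == 1) {
            counts.remove(file);
            return true;
        }
        counts.put(file, c - 1);
        return false;
    }
}
```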

Other notes:

  * You still have SegmentInfoKey (see above)

  * The new DirectoryIndexReader.open should just take IndexWriter
    (see above)?

  * Rather than the getSegmentsInUse "polling" approach, I think you
    can incrementally add & remove readers from the pool?  EG after a
    merge commits, drop the readers for the just-merged segments and
    add a reader for the newly merged segment (and use it to hold
    the deletes carried over in commitMergedDeletes).  I think for
    just-flushed segments you can wait for a getReader() call to
    happen again, to init their readers, unless something else does
    first (flushing deletes, merging)?

  * You commented out the last part of commitMergedDeletes, the part
    that actually saves the deletes.  You need to instead get the
    reader for the merged segment from the pool and hand it the new
    deletes.

  * We lost the cutover to using the already computed docMap in
    commitMergedDeletes?  We should put that back -- it saves having
    to read in the prior deletions from disk.
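
To illustrate the docMap point, here is a rough sketch of carrying
deletes over to the merged segment. It assumes a docMap that holds -1
for docs already deleted when the merge started and the new docID
otherwise; MergedDeletes, buildDocMap and carryOver are hypothetical
names, and java.util.BitSet stands in for the real deleted-docs
structure.

```java
import java.util.BitSet;

// Hypothetical helper (illustration only, not the patch's code).
final class MergedDeletes {

    // Live docs get consecutive new IDs; docs deleted at merge start get -1.
    static int[] buildDocMap(int maxDoc, BitSet deletedAtMergeStart) {
        int[] docMap = new int[maxDoc];
        int next = 0;
        for (int i = 0; i < maxDoc; i++) {
            docMap[i] = deletedAtMergeStart.get(i) ? -1 : next++;
        }
        return docMap;
    }

    // Carry over deletes that arrived while the merge was running.
    // docMap already says which docs were deleted at merge start, so the
    // prior deletions need not be read back from disk.
    static void carryOver(int[] docMap, BitSet currentDeletes,
                          int docBase, BitSet mergedDeletes) {
        for (int doc = currentDeletes.nextSetBit(0); doc >= 0;
             doc = currentDeletes.nextSetBit(doc + 1)) {
            int mapped = docMap[doc];
            if (mapped != -1) {
                // Deleted after the merge started: mark it in the merged segment.
                mergedDeletes.set(docBase + mapped);
            }
        }
    }
}
```

The resulting mergedDeletes would then be handed to the pooled reader
for the newly merged segment rather than written out directly.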


> Integrate IndexReader with IndexWriter 
> ---------------------------------------
>
>                 Key: LUCENE-1516
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1516
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>
>         Attachments: LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch,
>                      LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch,
>                      LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch,
>                      LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch, LUCENE-1516.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> The current problem is an IndexReader and IndexWriter cannot be open
> at the same time and perform updates as they both require a write
> lock to the index. While methods such as IW.deleteDocuments enable
> deleting from IW, methods such as IR.deleteDocument(int doc) and
> norms updating are not available from IW. This limits the
> capabilities of performing updates to the index dynamically or in
> realtime without closing the IW and opening an IR, deleting or
> updating norms, flushing, then opening the IW again, a process which
> can be detrimental to realtime updates. 
> This patch will expose an IndexWriter.getReader method that returns
> the currently flushed state of the index as a class that implements
> IndexReader. The new IR implementation will differ from existing IR
> implementations such as MultiSegmentReader in that flushing will
> synchronize updates with IW in part by sharing the write lock. All
> methods of IR will be usable including reopen and clone. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

