lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory
Date Sun, 11 Apr 2010 16:34:41 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855747#action_12855747 ]

Michael McCandless commented on LUCENE-2386:
--------------------------------------------

Actually I consider this a bug in IW's ACID transactional semantics...

The "I" in ACID is "isolation", meaning, a reader shouldn't ever see
uncommitted changes from a writer.

This bug breaks "isolation", ie, if you open with CREATE, today, IW
may or may not slip in a commit on you, depending (rather
unexpectedly, and likely the opposite of what you'd guess) on whether
a prior index is already there.

So IW sometimes will break ACID transactions, and sometimes not, if
you use CREATE.

Further, IW really shouldn't ever write a commit "automatically" -- it
used to do this (with autoCommit=true), but we stopped doing so
(except for this bug), now that autoCommit is false.

With this fix, IW will never sneak in a commit.  Your app fully
controls when the transaction (including CREATE, which in Lucene,
unlike most RDBMSs, is part of the transaction) becomes visible.
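
To make the isolation point concrete, here is a minimal toy sketch in plain Java. It is not Lucene code: ToyDirectory, ToyWriter and all of their methods are invented for illustration, and the model assumes every commit point is kept (a real deletion policy may drop older ones). It only shows the intended contract: constructing a writer with CREATE publishes nothing, and readers keep seeing the previous commit until commit() is called explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every class and method here is hypothetical, not Lucene's API.
final class IsolationSketch {

    /** Stand-in for a Directory: it records committed snapshots only. */
    static final class ToyDirectory {
        private final List<List<String>> commits = new ArrayList<>();

        /** Number of commit points a reader could list. */
        int commitCount() {
            return commits.size();
        }

        /** Latest committed snapshot, or null if nothing was ever committed. */
        List<String> latestCommit() {
            return commits.isEmpty() ? null : commits.get(commits.size() - 1);
        }
    }

    /** Stand-in for a writer opened with CREATE. */
    static final class ToyWriter {
        private final ToyDirectory dir;
        private final List<String> pending = new ArrayList<>();

        ToyWriter(ToyDirectory dir) {
            // CREATE starts from an empty pending state but publishes nothing:
            // the constructor never adds a commit point.
            this.dir = dir;
        }

        void addDocument(String doc) {
            pending.add(doc);
        }

        /** Only an explicit call publishes a snapshot of the pending state. */
        void commit() {
            dir.commits.add(new ArrayList<>(pending));
        }
    }

    public static void main(String[] args) {
        ToyDirectory dir = new ToyDirectory();
        ToyWriter first = new ToyWriter(dir);
        first.addDocument("old");
        System.out.println(dir.commitCount());   // 0: nothing visible yet
        first.commit();

        // Re-create the index: readers keep seeing "old" until commit().
        ToyWriter second = new ToyWriter(dir);
        second.addDocument("new");
        System.out.println(dir.latestCommit());  // [old]
        second.commit();
        System.out.println(dir.latestCommit());  // [new]
        System.out.println(dir.commitCount());   // 2
    }
}
```

In this model, N explicit commit() calls yield exactly N commit points, which is the count the test in the issue description expected.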

bq. That's not a very strong argument for a back compat break on a minor release though...

Hmmm... I think the back compat break is very minor.  Also, IW's
now-gone autoCommit never "promised" when commits would be made, so
this was really undocumented behavior.

Solr's test failures were not due to this break; they were actually due
to a sneaky bug that otherwise (had we not had Solr's tests) would
likely have remained undiscovered for quite some time.

And, much of the "noise" in the patch is from tests relying on exact
file names, commit counts, etc.  Plus some of the usual Shai-cleanups :)

I guess if we really care to, we can emulate the "not quite ACID" bug
when Version <= 3.1?

bq. The question is also, what happens if you call IndexWriter.getReader() without the initial
commit? Does this work with your patch?

This should be perfectly fine -- you'll get a reader searching 0 docs.

Shai can you add a test case confirming this?

Shai a couple other things:

  * Please shrink-wrap the try/catch in IFD

  * In Directory.listCommits, instead of catching
    IndexNotFoundException and returning empty list, I think we should
    throw it?  You get this exception today, right?
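
The two options for listing commits on a directory with no index can be sketched as a toy comparison in plain Java. Nothing here is Lucene's API: ListCommitsSketch, NoIndexException and both method names are hypothetical, and the exception is unchecked only to keep the sketch short. The lenient variant hides the "no index here" condition behind an empty list, while the strict variant surfaces it to the caller.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: names are hypothetical and do not mirror Lucene's signatures.
final class ListCommitsSketch {

    /** Hypothetical stand-in for IndexNotFoundException (unchecked here for brevity). */
    static final class NoIndexException extends RuntimeException {
        NoIndexException(String message) {
            super(message);
        }
    }

    /** Lenient variant: hides "no index here" behind an empty list. */
    static List<String> listCommitsLenient(List<String> commitPoints) {
        return new ArrayList<>(commitPoints);
    }

    /** Strict variant: a directory without any commit is reported as an error. */
    static List<String> listCommitsStrict(List<String> commitPoints) {
        if (commitPoints.isEmpty()) {
            throw new NoIndexException("no commit point found");
        }
        return new ArrayList<>(commitPoints);
    }

    public static void main(String[] args) {
        List<String> fresh = new ArrayList<>();
        System.out.println(listCommitsLenient(fresh)); // []
        try {
            listCommitsStrict(fresh);
        } catch (NoIndexException e) {
            System.out.println("strict: " + e.getMessage()); // strict: no commit point found
        }
    }
}
```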


> IndexWriter commits unnecessarily on fresh Directory
> ----------------------------------------------------
>
>                 Key: LUCENE-2386
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2386
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2386.patch, LUCENE-2386.patch, LUCENE-2386.patch, LUCENE-2386.patch
>
>
> I've noticed IndexWriter's ctor commits a first commit (empty one) if a fresh Directory
> is passed, w/ OpenMode.CREATE or CREATE_OR_APPEND. This seems unnecessary, and kind of brings
> back an autoCommit mode, in a strange way ... why do we need that commit? Do we really expect
> people to open an IndexReader on an empty Directory which they just passed to an IW w/ create=true?
> If they want, they can simply call commit() right away on the IW they created.
> I ran into this when writing a test which committed N times, then compared the number
> of commits (via IndexReader.listCommits) and was surprised to see N+1 commits.
> Tried to change doCommit to false in IW ctor, but it got IndexFileDeleter jumping on
> me .. so the change might not be that simple. But I think it's manageable, so I'll try to
> attack it (and IFD specifically !) back :).

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        


