lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory
Date Mon, 12 Apr 2010 07:21:40 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855870#action_12855870 ]

Shai Erera commented on LUCENE-2386:
------------------------------------

I'm not sure we're arguing about the same thing here ... why, when I open an IW on an empty
Directory, do I need an empty segment that is created and from then on never changed, populated
or even read? That just seems wrong to me ... when I fixed the tests to not rely on the buggy
behavior, I noticed several that count the commits (especially the IDP ones) w/ a
comment like "1 for opening + N for committing" ...
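
To make the "1 for opening + N for committing" count concrete, here is a toy model of it. ToyDirectory, ToyWriter and the commitOnOpen flag are hypothetical stand-ins invented for this sketch; they are not Lucene classes and do not reflect how IndexWriter is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical stand-ins, not Lucene's Directory/IndexWriter.
class ToyDirectory {
    // Each entry records the number of documents visible at that commit point.
    final List<Integer> commits = new ArrayList<>();
}

class ToyWriter {
    private final ToyDirectory dir;
    private int docs = 0;

    // commitOnOpen=true mimics the behavior under discussion: an empty commit
    // is written as soon as the writer is opened on a fresh directory.
    ToyWriter(ToyDirectory dir, boolean commitOnOpen) {
        this.dir = dir;
        if (commitOnOpen && dir.commits.isEmpty()) {
            commit();
        }
    }

    void addDocument() {
        docs++;
    }

    void commit() {
        dir.commits.add(docs);
    }
}

class CommitCountDemo {
    // Opens a writer on a fresh directory, commits n times, returns the commit count.
    static int countCommits(int n, boolean commitOnOpen) {
        ToyDirectory dir = new ToyDirectory();
        ToyWriter writer = new ToyWriter(dir, commitOnOpen);
        for (int i = 0; i < n; i++) {
            writer.addDocument();
            writer.commit();
        }
        return dir.commits.size();
    }

    public static void main(String[] args) {
        System.out.println(countCommits(3, true));  // prints 4 (1 for opening + 3)
        System.out.println(countCommits(3, false)); // prints 3
    }
}
```

In this model the extra commit holds zero documents and is never touched again, which is the N+1 a test sees when it lists commits after committing N times.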

It just looks weird that when you open an IW a commit happens and a set of empty files is created,
but from then on they are never modified until IDP kicks in after the second commit ... it's
nothing like initializing the Directory so that it can receive input ...

And I don't know what the benefit is of doing "new IW()" followed by "IR.open()" ... that
IR will always see 0 documents until you call reopen (if a commit happened in between). So
what's the convenience here? That your code can call IR.open once, and from that point forward
just 'reopen()'? That seems like a small advantage to me, really. Maybe what we should do is fix IR.open
to return a null IR in case the directory hasn't been populated w/ anything yet. Then you
can easily check whether you should call open() (==null) or reopen() (otherwise). Or create a blank
stub of IR which emulates an empty Dir and works well when reopen is called (if the Directory
is no longer empty) ...
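
A rough sketch of the first option, where open() returns null on an unpopulated directory and the caller picks open() or reopen() accordingly. StubDirectory, StubReader and refresh() are hypothetical names made up for this illustration; this is a toy model of the proposal, not Lucene's IndexReader API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical stand-ins, not Lucene's Directory/IndexReader.
class StubDirectory {
    // Each entry is the document count visible at that commit point.
    final List<Integer> commits = new ArrayList<>();
}

class StubReader {
    final int generation; // index of the commit this reader was opened on
    final int numDocs;

    private StubReader(int generation, int numDocs) {
        this.generation = generation;
        this.numDocs = numDocs;
    }

    // Proposed behavior: null when nothing has been committed yet.
    static StubReader open(StubDirectory dir) {
        if (dir.commits.isEmpty()) {
            return null;
        }
        int latest = dir.commits.size() - 1;
        return new StubReader(latest, dir.commits.get(latest));
    }

    // Returns this reader if no newer commit exists, otherwise a new reader.
    StubReader reopen(StubDirectory dir) {
        int latest = dir.commits.size() - 1;
        return latest == generation ? this : new StubReader(latest, dir.commits.get(latest));
    }
}

class OpenOrReopenDemo {
    // The caller-side check: open() while there is no reader yet, reopen() otherwise.
    static StubReader refresh(StubReader current, StubDirectory dir) {
        return current == null ? StubReader.open(dir) : current.reopen(dir);
    }

    public static void main(String[] args) {
        StubDirectory dir = new StubDirectory();
        StubReader reader = refresh(null, dir);
        System.out.println(reader == null); // prints true: nothing committed yet

        dir.commits.add(2);                 // simulate a first commit with 2 docs
        reader = refresh(reader, dir);
        System.out.println(reader.numDocs); // prints 2
    }
}
```

The caller keeps a single refresh path, and no empty commit has to exist just so that the first open() succeeds.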

BTW, FWIW, Solr's code did not break from this change at all ... it was the combination of
FSDir and NoLF/SingleInstanceLF that broke some tests that used it ... I don't know how many
apps out there are using that combination, but I'd bet it's a small number. I use that combination,
however in my case an IR is opened only after a commit signal/event is raised (so I don't
check isCurrent often or attempt to reopen()). What I'm trying to say is that this combination
is dangerous, and the application needs to ensure that only one IW is open at any given time,
and I'm sure such apps are more sophisticated than opening an IW and then an IR just for the convenience
of it.

> IndexWriter commits unnecessarily on fresh Directory
> ----------------------------------------------------
>
>                 Key: LUCENE-2386
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2386
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2386.patch, LUCENE-2386.patch, LUCENE-2386.patch, LUCENE-2386.patch, LUCENE-2386.patch
>
>
> I've noticed that IndexWriter's ctor performs a first (empty) commit if a fresh Directory
> is passed w/ OpenMode.CREATE or CREATE_OR_APPEND. This seems unnecessary, and kind of brings
> back an autoCommit mode, in a strange way ... why do we need that commit? Do we really expect
> people to open an IndexReader on an empty Directory which they just passed to an IW w/ create=true?
> If they want, they can simply call commit() right away on the IW they created.
> I ran into this when writing a test which committed N times, then compared the number
> of commits (via IndexReader.listCommits) and was surprised to see N+1 commits.
> Tried to change doCommit to false in IW ctor, but it got IndexFileDeleter jumping on
> me .. so the change might not be that simple. But I think it's manageable, so I'll try to
> attack it (and IFD specifically !) back :).


