lucene-dev mailing list archives

From "Earwin Burrfoot (JIRA)" <>
Subject [jira] Commented: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory
Date Mon, 12 Apr 2010 08:55:41 GMT


Earwin Burrfoot commented on LUCENE-2386:

Meh, that's all just a matter of perspective.

It doesn't look weird to me that an empty commit happens. Go create an SVN repository
- it has an initial commit #0 inside. I bet all version control systems and all databases have
the concept of a null initial commit.
This empty commit is what distinguishes an empty existing directory from an empty Lucene index. If you
create some docs, index them, then delete them and commit, you're going to get just the same picture
- a directory with a single segments* file (only the generation number will differ).

And references to "but from now on they are never modified" are just weird - it's a Lucene
index dammit, no files are ever modified here.

But what looks *really* weird to me are suggestions like
bq. Then you can check easily if you should call open() (==null) or reopen (otherwise). Or
create a blank stub of IR which emulates an empty Dir, and when reopen is called works well
(if the Directory is not empty now) ...
adding complexity and workarounds to plug a hole that didn't exist before the "fix" that does
essentially nothing of value.

My question is still unanswered - what is the proper way (after this fix) to open an IR over
a possibly empty directory? My app, for example, opens an IR and does scheduled reopens. The
time interval is small and a reopen over an unmodified index is a no-op, so this is all beautifully
simple and just as effective as waiting for a commit event (like I did in the past, just like
you). It opens an IndexWriter in CREATE_OR_APPEND mode before opening the first reader,
and is thus guaranteed to have a good index directory regardless of the situation.
The immediate fix I see is to check whether the directory is empty and do a first empty commit
myself - UGLY.
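
For illustration only, here is a minimal sketch of the "check whether the directory is empty" half of that workaround. It assumes, as described above, that a committed index always contains at least one segments* file. The class FreshDirectoryCheck and the method hasCommit are hypothetical names, not part of Lucene, and the code inspects plain file names on disk rather than going through any Lucene API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Lucene: it only looks at file names on disk.
final class FreshDirectoryCheck {

    // Returns true if the directory holds at least one entry whose name starts
    // with "segments", which this sketch takes as the sign of an existing commit.
    static boolean hasCommit(Path dir) throws IOException {
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "segments*")) {
            return files.iterator().hasNext();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("fresh-index");
        System.out.println(hasCommit(dir)); // false: plain empty directory

        // Stand-in for the first empty commit; a real index would get this
        // file from IndexWriter, not from application code.
        Files.createFile(dir.resolve("segments_1"));
        System.out.println(hasCommit(dir)); // true: looks like an (empty) index
    }
}
```

An application would run a check of this kind before opening its first reader and ask the writer for an initial commit only when it returns false, which is exactly the extra step being called ugly here.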

> IndexWriter commits unnecessarily on fresh Directory
> ----------------------------------------------------
>                 Key: LUCENE-2386
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>         Attachments: LUCENE-2386.patch, LUCENE-2386.patch, LUCENE-2386.patch, LUCENE-2386.patch,
> I've noticed IndexWriter's ctor commits a first commit (empty one) if a fresh Directory
> is passed, w/ OpenMode.CREATE or CREATE_OR_APPEND. This seems unnecessary, and kind of brings
> back an autoCommit mode, in a strange way ... why do we need that commit? Do we really expect
> people to open an IndexReader on an empty Directory which they just passed to an IW w/ create=true?
> If they want, they can simply call commit() right away on the IW they created.
> I ran into this when writing a test which committed N times, then compared the number
> of commits (via IndexReader.listCommits) and was surprised to see N+1 commits.
> Tried to change doCommit to false in IW ctor, but it got IndexFileDeleter jumping on
> me .. so the change might not be that simple. But I think it's manageable, so I'll try to
> attack it (and IFD specifically !) back :).
