lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] Reopened: (LUCENE-2386) IndexWriter commits unnecessarily on fresh Directory
Date Sun, 11 Apr 2010 14:26:40 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shai Erera reopened LUCENE-2386:
--------------------------------


As I indicated in an email, Solr tests failed (sorry for not running them before). After some
investigation (thanks Robert !), here is the problem: before this change, IW always committed
first on an empty directory. It called SegmentInfos.commit(dir), which by a chain of calls
ensured the directory exists (in FSDir) by calling file.mkdirs().

After this change, that chain of calls did not happen ... yet somehow tests were still passing
for Lucene. Some investigation shows that the Solr tests that failed used SingleInstanceLockFactory,
or NoLockFactory. By default, FSDir uses either SimpleFSLF or NativeFSLF. The IW.ctor calls
LF.makeLock and obtain, which for these two LFs meant calling file.mkdirs ... and thus
the problem was hidden. SingleInstanceLF and NoLF don't do that !
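
To illustrate how the lock factory can mask the missing mkdirs, here is a minimal sketch in plain Java. It does not use Lucene: SketchLockFactory, FileBasedLockFactory and InMemoryLockFactory are hypothetical stand-ins that only model the behaviour described above (a file-based lock creates the directory as a side effect, an in-memory one never touches the disk).

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class LockSideEffectSketch {
    /** Hypothetical, simplified lock factory; NOT Lucene's LockFactory API. */
    interface SketchLockFactory {
        void obtain(File indexDir) throws IOException;
    }

    /** Models SimpleFSLF/NativeFSLF: the lock file lives on disk, so the directory gets created. */
    static class FileBasedLockFactory implements SketchLockFactory {
        @Override
        public void obtain(File indexDir) throws IOException {
            // mkdirs() returns false when the directory already exists, so check isDirectory() too.
            if (!indexDir.mkdirs() && !indexDir.isDirectory()) {
                throw new IOException("Cannot create directory: " + indexDir);
            }
            File lock = new File(indexDir, "write.lock");
            if (!lock.createNewFile()) {
                throw new IOException("Lock already held: " + lock);
            }
        }
    }

    /** Models SingleInstanceLF/NoLF: nothing is written, so the directory is never created. */
    static class InMemoryLockFactory implements SketchLockFactory {
        @Override
        public void obtain(File indexDir) {
            // Intentionally does nothing on disk.
        }
    }

    public static void main(String[] args) throws IOException {
        File base = Files.createTempDirectory("lock-sketch").toFile();
        File withFileLock = new File(base, "index-a");
        File withMemoryLock = new File(base, "index-b");

        new FileBasedLockFactory().obtain(withFileLock);
        new InMemoryLockFactory().obtain(withMemoryLock);

        System.out.println("file-based lock created dir: " + withFileLock.isDirectory());
        System.out.println("in-memory lock created dir:  " + withMemoryLock.isDirectory());
    }
}
```

With the in-memory variant, whatever runs next (listing files, for example) sees a directory that does not exist yet, which is the situation the failing Solr tests hit.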

So first, a test which uses FSDir and one of these LFs needs to be created, so we catch that
problem in Lucene code (this is not related to Solr -- just a missing test in Lucene). Second,
we need to fix IW ctor, or Dir or whatever.

I've added the following code to IW.ctor, as a sanity check to make sure it works - and indeed all
Solr tests pass. So that's one option, even though a bit messy.
{code}
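// Sanity check: writing any file makes FSDir create the directory if it is missing.
try {
  directory.createOutput("temp").close();
} finally {
  // Clean up the probe file so it never shows up as an index file.
  directory.deleteFile("temp");
}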
{code}

Another option is to add to Directory a prepareForWrite() or simply prepare() which will be
called by IW. A default empty impl on Directory, and file.mkdirs on FSDirectory should be
enough.
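
A rough sketch of what this second option could look like, written against hypothetical stand-in classes rather than Lucene's real Directory/FSDirectory (SketchDirectory, SketchFSDirectory and the prepareForWrite() name are assumptions taken from the proposal above, not existing API):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class PrepareForWriteSketch {
    /** Hypothetical stand-in for Directory; NOT the real Lucene class. */
    static abstract class SketchDirectory {
        /** Called by the writer before its first write. Default: nothing to prepare. */
        public void prepareForWrite() throws IOException {
            // e.g. a RAM-backed directory needs no preparation
        }
    }

    /** Hypothetical stand-in for FSDirectory. */
    static class SketchFSDirectory extends SketchDirectory {
        private final File path;

        SketchFSDirectory(File path) {
            this.path = path;
        }

        @Override
        public void prepareForWrite() throws IOException {
            // mkdirs() returns false if the directory already exists, so also check isDirectory().
            if (!path.mkdirs() && !path.isDirectory()) {
                throw new IOException("Cannot create directory: " + path);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File base = Files.createTempDirectory("prepare-sketch").toFile();
        File indexDir = new File(base, "nested/index");

        SketchDirectory dir = new SketchFSDirectory(indexDir);
        System.out.println("exists before prepare: " + indexDir.isDirectory());
        dir.prepareForWrite(); // what the IW ctor would call under this proposal
        System.out.println("exists after prepare:  " + indexDir.isDirectory());
    }
}
```

The call is safe to repeat, so a writer could invoke it unconditionally regardless of which lock factory is in use.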

A third option is to define clear semantics for dir.listAll(): it throws a NoSuchDirectoryException
when the directory does not exist, and IndexFileDeleter ignores that exception if OpenMode of IW
is CREATE*. It kind of makes sense - if the directory does not exist yet, why bother looking for
any index files. Lucene code today already expects that exception to be thrown in
SegmentInfos.getCurrentSegmentGeneration -- so we kind of say 'either you use RAMDirectory, or a
sub-class of FSDirectory, and then that's what we expect'. So it's not so much of a backwards change ...

While dir.prepare() or prepareForWrite() is very explicit ... it's not protective enough -
one can still call listAll w/o calling prepareForWrite first (why would you call it if you just
want to list files?), so I'm not sure which is the best option ... Maybe the last option is
the best, since at least the caller cannot assume anything about the state of the directory:
it just has to be prepared to handle NoSuchDirectoryException, as distinct from the 'there is a
directory but it's empty' case.
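
The third option could be sketched roughly as follows. This is plain Java with hypothetical names: the nested NoSuchDirectoryException and OpenMode types only mirror the Lucene names mentioned above, and listForDeleter() is an invented helper standing in for whatever IndexFileDeleter would do.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;

public class ListAllSketch {
    /** Stand-in for the exception named above; defined locally so the sketch is self-contained. */
    static class NoSuchDirectoryException extends FileNotFoundException {
        NoSuchDirectoryException(String message) {
            super(message);
        }
    }

    /** Stand-in for the writer's open mode. */
    enum OpenMode { CREATE, APPEND, CREATE_OR_APPEND }

    /** Lists file names, keeping "directory is missing" distinct from "directory is empty". */
    static String[] listAll(File dir) throws IOException {
        if (!dir.isDirectory()) {
            throw new NoSuchDirectoryException("directory '" + dir + "' does not exist");
        }
        String[] names = dir.list();
        if (names == null) {
            throw new IOException("cannot list directory '" + dir + "'");
        }
        return names;
    }

    /** Hypothetical helper: what the deleter might do when it looks for existing index files. */
    static String[] listForDeleter(File dir, OpenMode mode) throws IOException {
        try {
            return listAll(dir);
        } catch (NoSuchDirectoryException e) {
            if (mode == OpenMode.APPEND) {
                throw e; // APPEND requires an existing index, so a missing directory is an error
            }
            return new String[0]; // CREATE*: no directory yet means no index files to consider
        }
    }

    public static void main(String[] args) throws IOException {
        File base = Files.createTempDirectory("listall-sketch").toFile();
        File missing = new File(base, "not-created-yet");
        System.out.println("files seen in CREATE mode: "
                + listForDeleter(missing, OpenMode.CREATE).length);
    }
}
```

Under this approach no caller has to remember a separate prepare step; the cost is that every caller of listAll must be ready for the exception.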

I'll revert my commit until this is resolved.

> IndexWriter commits unnecessarily on fresh Directory
> ----------------------------------------------------
>
>                 Key: LUCENE-2386
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2386
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>             Fix For: 3.1
>
>         Attachments: LUCENE-2386.patch, LUCENE-2386.patch, LUCENE-2386.patch
>
>
> I've noticed IndexWriter's ctor commits a first commit (empty one) if a fresh Directory
> is passed, w/ OpenMode.CREATE or CREATE_OR_APPEND. This seems unnecessary, and kind of brings
> back an autoCommit mode, in a strange way ... why do we need that commit? Do we really expect
> people to open an IndexReader on an empty Directory which they just passed to an IW w/ create=true?
> If they want, they can simply call commit() right away on the IW they created.
> I ran into this when writing a test which committed N times, then compared the number
> of commits (via IndexReader.listCommits) and was surprised to see N+1 commits.
> Tried to change doCommit to false in IW ctor, but it got IndexFileDeleter jumping on
> me .. so the change might not be that simple. But I think it's manageable, so I'll try to
> attack it (and IFD specifically !) back :).

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

