lucene-dev mailing list archives

From robert engels <>
Subject Re: [jira] Commented: (LUCENE-743) IndexReader.reopen()
Date Mon, 12 Nov 2007 23:30:59 GMT
Horse poo poo.  If you are working in a local environment, the files  
should be opened with exclusive access. This guarantees that the  
operations will succeed for the calling process.
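
As a rough illustration of the idea, here is a minimal Java sketch that
holds an exclusive lock for the duration of a write. The class and
method names are made up for this example and are not Lucene code. One
assumption to flag loudly: FileChannel.lock() is platform-dependent and
is only advisory on many Unix systems, so it excludes other processes
only if they also take the lock.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Lucene.
final class ExclusiveWriteSketch {

    /** Replaces the file's contents while holding an exclusive lock on the whole file. */
    static void writeExclusively(Path path, byte[] data) throws IOException {
        try (FileChannel channel = FileChannel.open(path,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             // Blocks until the exclusive lock is granted. Advisory on many
             // Unix systems: it only excludes cooperating lockers.
             FileLock lock = channel.lock()) {
            channel.truncate(0);
            ByteBuffer buffer = ByteBuffer.wrap(data);
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            channel.force(true); // flush data and metadata before the lock is released
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("exclusive", ".bin");
        writeExclusively(path, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("bytes written: " + Files.size(path));
        Files.delete(path);
    }
}
```

The try-with-resources block releases the lock before it closes the
channel, so no other cooperating locker can get in until the data has
been forced to disk.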

That NFS is a viable solution is highly debatable, and IMO shows a  
lack of understanding of NFS and the unix/linux filesystem design  
principles.  Read about why unix never offered file locking, and  
never really needed it...

Still, if exclusive access controls are used properly, Lucene (and
Java) have no problems working on NFS/shared filesystems.

Sorry, but the fact that some only recently became aware of FD.sync()
shows that they don't really know enough to be designing/testing
systems like this.
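
For reference, FD.sync() here is java.io.FileDescriptor.sync(). A
minimal sketch of calling it after a write might look like the
following; the class and method names are hypothetical and only serve
the illustration.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Lucene.
final class SyncSketch {

    /** Writes data and asks the OS to push it to the underlying device before closing. */
    static void writeAndSync(Path path, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(path.toFile())) {
            out.write(data);
            out.flush();        // no-op for a bare FileOutputStream; matters once it is wrapped in a buffer
            out.getFD().sync(); // throws SyncFailedException if the buffers cannot be synchronized
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("sync", ".bin");
        writeAndSync(path, new byte[] {1, 2, 3});
        System.out.println("bytes on disk: " + Files.size(path));
        Files.delete(path);
    }
}
```

Without the sync() call, close() only hands the bytes to the operating
system; they may still sit in its cache for some time.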

Sorry if the tone of this is harsh, but I hate seeing lots of complex  
code because the designers fail to understand the basic operating  
principles of what they are working with...

On Nov 12, 2007, at 5:18 PM, Michael McCandless wrote:

> Not just virus scanners: any program that uses the Microsoft API for
> being notified of file changes.  I think TortoiseSVN was one such
> example.
> People who embed Lucene can't control what their users install on
> their desktops.  Virus scanners are naturally very common on
> desktops.  I think we want Lucene to work in these cases.
> NFS (and other shared filesystems) is a convenient, if not performant,
> way to share an index.  I think Lucene should work in such cases
> as well.
> Mike
> "robert engels" <> wrote:
>> What are you basing "rename is not reliable on windows" on? That a
>> virus scanner has the file open? If that is the case, that should
>> either be treated as an incorrect setup, or the operation should be
>> retried until it completes.
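
A minimal sketch of the "retry until it completes" option, using
java.nio.file.Files.move. The helper name, the attempt limit and the
pause are assumptions made for illustration; nothing here is taken from
Lucene.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Lucene.
final class RenameRetrySketch {

    /** Tries an atomic rename up to maxAttempts times, pausing between failed attempts. */
    static void renameWithRetry(Path source, Path target, int maxAttempts, long pauseMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
                return;
            } catch (AtomicMoveNotSupportedException e) {
                throw e; // the filesystem cannot do this at all, so retrying will not help
            } catch (IOException e) {
                // For example, another process still has the file open.
                if (attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(pauseMillis);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Path source = Files.createTempFile("segments", ".tmp");
        Path target = source.resolveSibling(source.getFileName() + ".renamed");
        renameWithRetry(source, target, 5, 50L);
        System.out.println("renamed: " + Files.exists(target));
        Files.delete(target);
    }
}
```

The loop retries on any IOException, including ones a retry cannot fix
(a missing source file, say), so a real implementation would probably
want to be more selective.
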
>> Writing directly to a file that someone else can open for reading is
>> bound to be a problem. If the file is opened exclusive for write,
>> then the others will be prohibited from opening for read, so there
>> should not be a problem.
>> All of the "delete on last close" stuff is a poor design. The
>> database can be resynced on startup.
>> The basic design flaw is one I have pointed out many times - you
>> either use Lucene in a local environment, or a server environment.
>> Using NFS to "share" a Lucene database is a poor choice (normally due
>> to performance, but there are other problems - e.g. resource and user
>> monitoring, etc.).
>> People have written reliable database systems without very advanced
>> semantics for years. There is no reason for all of this esoteric code
>> in Lucene.
>> Those who claim Lucene had problems with NFS in the past did not
>> perform reliable testing, or their OS was out of date.  What if
>> Lucene was failing because an OS needed an update: would you change
>> Lucene, or fix/update the OS??? Obviously the latter.
>> Some very loud voices complained about the NFS problems without doing
>> the due diligence and test cases to prove the problem. Instead they
>> just mucked up the Lucene code.
>> On Nov 12, 2007, at 4:54 PM, Michael McCandless wrote:
>>> robert engels <> wrote:
>>>> Then how can the commit during reopen be an issue?
>>> This is what happens:
>>>   * Reader opens latest segments_N & reads all SegmentInfos
>>>     successfully.
>>>   * Writer writes new segments_N+1, and then deletes now
>>>     un-referenced files.
>>>   * Reader tries to open files referenced by segments_N and hits
>>>     FNFE when it tries to open a file writer just removed.
>>> Lucene handles this fine (it just retries on the new segments_N+1),
>>> but the patch in LUCENE-743 is now failing to decRef the Norm
>>> instances when this retry happens.
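
The retry sequence described above can be sketched roughly as follows.
This is not Lucene's implementation: the class name, the decimal
generation numbers and the plain-text segments_N format (one referenced
file name per line) are all stand-ins invented for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the reader side; not Lucene code.
final class RetryOnMissingFileSketch {

    /** Returns the highest N for which segments_N exists in dir, or -1 if there is none. */
    static long latestGeneration(Path dir) throws IOException {
        long max = -1;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "segments_*")) {
            for (Path p : stream) {
                String name = p.getFileName().toString();
                max = Math.max(max, Long.parseLong(name.substring("segments_".length())));
            }
        }
        return max;
    }

    /** Reads every file listed in the newest segments_N, starting over if one has vanished. */
    static List<byte[]> openLatest(Path dir, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        NoSuchFileException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            long generation = latestGeneration(dir);
            if (generation < 0) {
                throw new NoSuchFileException(dir.toString(), null, "no segments_N file found");
            }
            try {
                List<byte[]> contents = new ArrayList<>();
                for (String name : Files.readAllLines(dir.resolve("segments_" + generation))) {
                    contents.add(Files.readAllBytes(dir.resolve(name)));
                }
                return contents;
            } catch (NoSuchFileException e) {
                // A writer probably committed and deleted files in between.
                // Anything acquired during this attempt must be released here
                // before retrying against the newest generation.
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index");
        Files.write(dir.resolve("a.dat"), new byte[] {1, 2, 3});
        Files.write(dir.resolve("segments_1"), "a.dat\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("files opened: " + openLatest(dir, 3).size());
    }
}
```

The byte arrays in this sketch need no cleanup, but the comment in the
catch block marks the spot where a real reader would have to release
whatever the failed attempt had already acquired.
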
>>>> I am not very familiar with this new code, but it seems that you need
>>>> to write and then rename to segments.XXX.
>>> We don't rename anymore (it's not reliable on windows).  We write
>>> straight to segments_N.
>>>> As long as the files are sync'd, even on nfs the reopen should not
>>>> see segments.XXX until it is ready.
>>>> Although lockless commits are beneficial in their own right, I still
>>>> think that people's understanding of NFS limitations is flawed.
>>>> Read the section below on "close to open" consistency.  There
>>>> should be no problem using Lucene across NFS - even the old
>>>> version.
>>>> The write-once nature of Lucene makes this trivial.  The only
>>>> problem was the segments file, which, if Lucene had used the
>>>> read/write lock and close() correctly, never would have been a
>>>> problem.
>>> Yes, in an ideal world, NFS server+clients are supposed to implement
>>> close-to-open semantics but in my experience they do not always
>>> succeed.  Previous versions of Lucene do in fact have problems over
>>> NFS.  NFS also does not give you "delete on last close" which Lucene
>>> normally relies on (unless you create a custom deletion policy).
>>> Mike
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail:
>>> For additional commands, e-mail: