incubator-lucy-user mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: [lucy-user] Couldn't completely remove 'seg_N'
Date Fri, 18 Nov 2011 15:18:20 GMT
On Fri, Nov 18, 2011 at 12:22:09PM +0200, goran kent wrote:
> On Thu, Nov 17, 2011 at 9:08 AM, Marvin Humphrey <marvin@rectangular.com> wrote:
> > If you don't supply a hostname, machines will zap each other's lockfiles.
> 
> Another one of these has popped up this morning:
> 
> "Lucy::Index::Indexer->new failed (Couldn't completely remove 'seg_2'"
> 
> ...even though I'm using IndexManager:
> 
> my $manager = Lucy::Index::IndexManager->new(
>     host => $host,
> );
> $index = Lucy::Index::Indexer->new(
>     schema   => $schema,
>     index    => $target,
>     manager  => $manager,
>     create   => 1,
>     truncate => 0,
> );
> 
> The lockfile contains:
> {
>   "host": "host6",
>   "name": "write",
>   "pid": "24342"
> }
> 
> The hostname matches the current host, and the PID corresponds to the
> script that was trying to update the index at the time of the
> Lucy::Index::Indexer->new call above.

OK, all that looks correct.  Also, since the lockfile is still there and
definitely corresponds to the process that crashed, we can assume that no
other process has messed with the index directory since.

Question: is there a seg_2 folder in the index dir?  If so, is there anything
inside it?
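
One way to answer that quickly is a few lines of core Perl.  This is only a
rough sketch: the helper name list_segment_dirs is made up for illustration,
and the pattern used to recognize segment folders (seg_ followed by word
characters) is an assumption about the naming rather than something taken
from Lucy itself.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return a hashref mapping each seg_* directory under
# $index_dir to a sorted list of the entries inside it.  An empty list
# means the folder exists but contains nothing.
sub list_segment_dirs {
    my ($index_dir) = @_;
    opendir( my $dh, $index_dir )
        or die "Can't opendir '$index_dir': $!";
    # Assumed naming pattern for segment folders.
    my @segs = sort grep { /^seg_\w+$/ } readdir $dh;
    closedir $dh;

    my %contents;
    for my $seg (@segs) {
        my $path = File::Spec->catdir( $index_dir, $seg );
        next unless -d $path;    # skip plain files that happen to match
        opendir( my $sdh, $path )
            or die "Can't opendir '$path': $!";
        $contents{$seg}
            = [ sort grep { $_ ne '.' && $_ ne '..' } readdir $sdh ];
        closedir $sdh;
    }
    return \%contents;
}

# Usage: perl list_segs.pl /path/to/index
if (@ARGV) {
    my $contents = list_segment_dirs( $ARGV[0] );
    for my $seg ( sort keys %$contents ) {
        my @entries = @{ $contents->{$seg} };
        print "$seg: ", ( @entries ? "@entries" : "(empty)" ), "\n";
    }
}
```

Run it on the host that reported the error, since a different NFS client may
see a different (cached) view of the same directory.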

The other question is *why* seg_2 existed in that index, because even if it's
gone now, it was there before.  Either an Indexer crashed, or an Indexer was
created but commit() was never called.

> Is my code sample above correct in its usage of IndexManager()?  E.g.,
> do I need to specify anything else to ensure write exclusivity?  Is
> there something else going on here?

It could be NFS cache consistency: a deletion operation succeeds, and the item
is really gone from the NFS drive, but the local cache of the NFS client
doesn't get updated in time and a subsequent check on whether the item exists
returns an incorrect result.  

    http://nfs.sourceforge.net/#faq_a8

    Perfect cache coherency among disparate NFS clients is very expensive to
    achieve, so NFS settles for something weaker that satisfies the
    requirements of most everyday types of file sharing. 
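
To make that failure mode concrete, here is a minimal sketch in core Perl of
"delete, then re-check a few times before declaring failure".  It is not what
Lucy does internally; the helper name remove_and_confirm and its attempts and
delay options are made up for illustration, and a retry loop like this only
papers over a stale cache rather than guaranteeing coherency.

```perl
use strict;
use warnings;
use File::Path qw(remove_tree);
use Time::HiRes qw(sleep);

# Hypothetical helper: delete $path, then poll for its disappearance.
# Returns true once the path is no longer visible, false if it still
# appears to exist after all attempts.
sub remove_and_confirm {
    my ( $path, %opt ) = @_;
    my $attempts = $opt{attempts} // 5;      # illustrative default
    my $delay    = $opt{delay}    // 0.2;    # seconds between checks

    # Collect errors instead of letting remove_tree warn about them.
    remove_tree( $path, { error => \my $errors } );

    for my $try ( 1 .. $attempts ) {
        return 1 unless -e $path;    # gone, as far as this client can tell
        sleep $delay;                # give a stale cache entry time to expire
    }
    return -e $path ? 0 : 1;
}

# Usage: perl remove_seg.pl /path/to/index/seg_2
if (@ARGV) {
    print remove_and_confirm( $ARGV[0] )
        ? "removed\n"
        : "still visible after retries\n";
}
```

On a local file system the first check succeeds immediately; on NFS, how long
a stale entry lingers depends on the client's caching settings.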

A tremendous amount of energy has gone into making NFS mimic local file system
behaviors as closely as possible, both by the NFS devs and by us (see
<http://incubator.apache.org/lucy/docs/perl/Lucy/Docs/FileLocking.html>), but
it's a very hard problem and compromises are impossible to avoid.

Best practice would be to avoid writing to Lucy indexes on NFS drives if
possible.  Read performance is going to be lousy anyway unless you make the
NFS mount read-only.

Marvin Humphrey
