lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Lucene index on NFS
Date Tue, 02 Oct 2012 13:01:52 GMT
NFS poses no real risk to the safety of the data. The problem with NFS is the following
(maybe it is fixed in NFSv4, I have no idea):
Lucene deletes index files while they are still in use, which is perfectly fine on local file
systems (the inode stays alive even though the file no longer appears in the directory listing).
Unfortunately, with NFS those deletes do not show up in the directory listing immediately, and
newly created files do not always show up right away either. This causes problems in Lucene
such as file-not-found exceptions. Index directory locking does not work either: it times out,
because NativeFSLockFactory does not work with NFS (which is arguably a bug in NFS).
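
To illustrate the locking point: NativeFSLockFactory relies on native OS file locks, whereas a lock-file scheme only needs to create a file atomically. Below is a minimal plain-Java sketch of the lock-file idea. It is not Lucene code: the class LockFileSketch and its methods are hypothetical names made up for illustration, and the temporary directory only stands in for an index directory. How reliably file creation behaves on a given NFS mount depends on the NFS version and mount options, and the java.io.File documentation itself cautions against building lock protocols this way, so treat it as a sketch rather than a recommendation.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Lock-file sketch in plain Java; NOT Lucene code. All names are hypothetical.
class LockFileSketch {

    /** Tries to take the lock by creating the lock file; false if it already exists. */
    static boolean tryObtain(File lockFile) {
        try {
            return lockFile.createNewFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Releases the lock by deleting the lock file. */
    static boolean release(File lockFile) {
        return lockFile.delete();
    }

    public static void main(String[] args) {
        // A scratch directory stands in for the index directory.
        File dir = new File(System.getProperty("java.io.tmpdir"),
                "lock-sketch-" + System.nanoTime());
        if (!dir.mkdirs()) {
            throw new IllegalStateException("could not create " + dir);
        }
        File lock = new File(dir, "write.lock");

        System.out.println("first attempt:  " + tryObtain(lock));
        System.out.println("second attempt: " + tryObtain(lock));
        System.out.println("released:       " + release(lock));
        System.out.println("third attempt:  " + tryObtain(lock));

        // Clean up the scratch files.
        release(lock);
        dir.delete();
    }
}
```

One consequence of this design is that a crashed writer leaves the lock file behind, so it has to be removed by hand before another writer can start.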

To use Lucene with NFS, make sure to:
- Use a custom deletion policy on IndexWriter, so that unused files are not deleted immediately (https://lucene.apache.org/core/3_6_1/api/all/org/apache/lucene/index/IndexDeletionPolicy.html)
- Use SimpleFSLockFactory
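
The deletion-policy point can be sketched as follows. In Lucene 3.6 a policy implements IndexDeletionPolicy, whose onInit and onCommit callbacks receive the list of commit points (oldest first) and call delete() on the ones that may go. The code below is only a plain-Java model of that decision logic, not Lucene code: the Commit and GracePeriodPolicy classes, the 60-second grace period, and the explicit "now" parameter are all assumptions made for illustration. The idea is that older commits survive for a grace period, so readers on other machines that still have those files open get time to reopen.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a time-based deletion policy; NOT Lucene API.
class GracePeriodPolicyDemo {

    /** Hypothetical stand-in for a commit point (Lucene's own type is IndexCommit). */
    static final class Commit {
        final String segmentsFile;
        final long timestampMillis;

        Commit(String segmentsFile, long timestampMillis) {
            this.segmentsFile = segmentsFile;
            this.timestampMillis = timestampMillis;
        }
    }

    /** Hypothetical policy: a commit may be deleted only after a grace period. */
    static final class GracePeriodPolicy {
        private final long graceMillis;

        GracePeriodPolicy(long graceMillis) {
            this.graceMillis = graceMillis;
        }

        /**
         * Returns the commits that may be deleted. The list is ordered oldest
         * first; the newest commit is always kept, and older ones are kept
         * until they are more than graceMillis old.
         */
        List<String> deletable(List<Commit> commits, long nowMillis) {
            List<String> result = new ArrayList<>();
            for (int i = 0; i < commits.size() - 1; i++) {
                Commit c = commits.get(i);
                if (nowMillis - c.timestampMillis > graceMillis) {
                    result.add(c.segmentsFile);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        List<Commit> commits = new ArrayList<>();
        commits.add(new Commit("segments_1", 0L));
        commits.add(new Commit("segments_2", 50_000L));
        commits.add(new Commit("segments_3", 100_000L));

        // Assumed grace period of 60 seconds, evaluated at t = 100 s.
        GracePeriodPolicy policy = new GracePeriodPolicy(60_000L);
        System.out.println(policy.deletable(commits, 100_000L));
    }
}
```

With these made-up timestamps only segments_1 is past the grace period, so it is the only commit reported as deletable. The grace period would need to be longer than the longest interval at which readers reopen the index.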

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Paul Libbrecht [mailto:paul@hoplahup.net]
> Sent: Tuesday, October 02, 2012 2:45 PM
> To: java-user@lucene.apache.org
> Subject: Re: Lucene index on NFS
> 
> I doubt NFS is an unreliable file-system.
> Lucene uses normal random access to files and this has no reason to be
> unreliable unless bad things such as network drops happen (in which case you'd
> get direct failures or timeouts rather than corruption). I've seen fairly large
> infrastructures being based on NFS and corruption is something I've never
> heard about.
> 
> Note: no concurrent access to a lucene index, right?
> 
> Paul
> 
> 
> Le 2 oct. 2012 à 14:01, Jong Kim a écrit :
> 
> > Thank you all for the replies.
> >
> > So it sounds like it is a known fact that performance suffers
> > rather significantly when the index files are accessed over NFS. But
> > how about reliability and robustness (which seem even more
> > important)? Is there any increased possibility of intermittent
> > errors such as index file corruption (due to cache inconsistency,
> > differences in delete semantics, etc.) when using NFS? Has anyone
> > run into such trouble? Or is it strictly just a performance issue?
> >
> > /Jong
> > On Tue, Oct 2, 2012 at 5:17 AM, Paul Libbrecht <paul@hoplahup.net> wrote:
> >
> >> My experience in the Lucene 1.x days was a slowdown of at least a
> >> factor of four when writing to NFS and about two when reading from
> >> it. I'd discourage this as much as possible!
> >>
> >> (rsync is much more your friend for transporting indexes, and
> >> Solr-style replication should also be considered.)
> >>
> >> paul
> >>
> >>
> >> Le 2 oct. 2012 à 11:10, Ian Lea a écrit :
> >>
> >>> You'll certainly need to factor in the performance of NFS versus
> >>> local disks.
> >>>
> >>> My experience is that smallish low activity indexes work just fine
> >>> on NFS, but large high activity indexes are not so good,
> >>> particularly if you have a lot of modifications to the index.
> >>>
> >>> You may want to install a custom IndexDeletionPolicy.  See the
> >>> javadocs for details with specific reference to NFS.
> >>>
> >>>
> >>> --
> >>> Ian.
> >>>
> >>> On Tue, Oct 2, 2012 at 3:21 AM, Vitaly Funstein <vfunstein@gmail.com> wrote:
> >>>> How tolerant is your project of decreased search and indexing
> >>>> performance? You could probably write a simple test that compares
> >>>> search and write speeds of local and NFS-mounted indexes and make
> >>>> the decision based on the results.
> >>>>
> >>>> On Mon, Oct 1, 2012 at 3:06 PM, Jong Kim <jong.lucene@gmail.com> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> Section 2.11.2 of Lucene in Action (Second Edition), "Accessing an
> >>>>> index over a remote file system", explains that there are issues
> >>>>> with accessing a Lucene index across a remote file system,
> >>>>> including NFS.
> >>>>>
> >>>>> I'm particularly interested in NFS compatibility, and wondering
> >>>>> whether any work has been done to solve or mitigate this problem.
> >>>>> Has this issue been addressed? If not, are there reliable
> >>>>> workarounds that make this possible at the expense of some
> >>>>> sacrifice in other areas?
> >>>>>
> >>>>> Any information would be greatly appreciated, since my project
> >>>>> heavily depends on the feasibility of this.
> >>>>>
> >>>>> Thanks
> >>>>> /Jong
> >>>>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>>