Mailing-List: contact java-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: java-dev@lucene.apache.org
Received-SPF: error (idunn.apache.osuosl.org: domain mikemccandless.com from
 66.111.4.25 cause and error)
Message-ID: <4513E3D4.7050906@mikemccandless.com>
Date: Fri, 22 Sep 2006 09:23:32 -0400
From: Michael McCandless <lucene@mikemccandless.com>
User-Agent: Thunderbird 1.5.0.7 (Windows/20060909)
MIME-Version: 1.0
To: java-dev@lucene.apache.org
Subject: Re: Distributed Indexes, Searches and HDFS
References: <34cc3b0a0609210751w66bc8835y99c1747976186ef7@mail.gmail.com>
	 <34cc3b0a0609211544v465b04fx88d46dc61351bec5@mail.gmail.com>
 <c68e39170609212047ida90d08n8176ef095d3ae3b1@mail.gmail.com>
In-Reply-To: <c68e39170609212047ida90d08n8176ef095d3ae3b1@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit


I think this is a great question ("what's the best way to really scale
up Lucene?").  I don't have alot of experience in that area so I'll
defer to others (and I'm eager to learn myself!).

I think understanding Solr's overall approach (whose design I believe
came out of the thread you've referenced) is also a good step here.
Even if you can't re-use the hard links trick, you might be able to
reuse its snapshotting & index distribution protocol.

However, I have been working on some "bottoms up" improvements to
Lucene (getting native OS locking working and [separate but related]
"lock-less commits") that I think could be related to some of the
issues you're seeing with HDFS -- see below:

 > > The cronjob/link solution which is quite clean, doesn't work well in
 > > a windows environment. While it's my favorite, no dice... Rats.
 >
 > There may be hope yet for that on Windows.  Hard links work on
 > Windows, but the only problem is that you can't rename/delete any
 > links when the file is open. Michael McCandless is working on a
 > patch that would eliminate all renames (and deletes can be handled
 > by deferring them).

Right, with "lock-less commits" patch we never rename a file and also
never re-use a file name (ie, making Lucene's use of the filesystem
"write once").

 > 1) Indexing and Searching Directly from HDFS
 >
 > Indexing to HDFS is possible with a patch if we don't use CFS. While
 > not ideal performance-wise, it's reliable and takes care of data
 > redundancy, component failure and means that I can have cheap small
 > drives instead of a large expensive NAS. It's also quite simple to
 > implement (see Nutch's indexer.FsDirectory for the Directory
 > implmentation)

This is very interesting!  I don't know enough about HDFS (yet!).  On
very quick read, I like that it's a "write once" filesystem because
it's a good match to lock-less commits.

 > So I would have several indexes (ie 16) and the same number of
 > indexers, and a searcher for each index (possibly in the same
 > process) that searches each one directly from HDFS. One problem I'm
 > having is an occasional filenotfound exception. (Probably locking
 > related)
 >
 > It comes out of the Searcher when I try to do a search while things
 > are being indexed. I'd be interested to know what exactly is
 > happening when this exception is thrown, maybe I can design around
 > it. (Do synchronization at the appropriate times or similar)

That exception looks disturbingly similar to the ones Lucene hits on
NFS.  See here for gory details:

     http://issues.apache.org/jira/browse/LUCENE-673

The summary of that [long] issue is that these exceptions seem to be
due to cache staleness of Lucene's "segments" file (due to how the NFS
client does caching, even on NFS V4 client/server) and not in fact due
to locking (as had been previously assumed/expected).  The good news
is the lock-less commits fixes resolve this at least in my testing so
far (ie, make it possible to share a single index over NFS).

I wonder if in HDFS a similar cause is at work?  HDFS is "write once"
but the current Lucene isn't (not until we can get lock-less commits
in).  For example, it re-uses the "segments" file.

I think even if lock-less commits ("write once") enables sharing of a
single copy of index over remote filesystems like HDFS or NFS or
SMB/CIFS, whether or not that's performant enough (vs replicating
copies to local filesystems that are presumably quite a bit faster at
IO, at the expense of local storage consumed) would still be a big
open question.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org