hadoop-common-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Hadoop + Lucene integration: possible? how?
Date Mon, 15 Jan 2007 21:18:15 GMT
Andrzej Bialecki wrote:
> It's possible to use Hadoop DFS to host a read-only Lucene index and use 
> it for searching (Nutch has an implementation of FSDirectory for this 
> purpose), but the performance is not stellar ...

Right, the "best practice" is to copy Lucene indexes to local drives in 
order to search them.  Solr uses rsync to efficiently replicate an 
index.  If, however, you have lots of small indexes, it can make sense to 
keep them in HDFS and copy them to local drives as they're deployed. 
Then, when a box fails, one can quickly re-deploy its index to its 
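
As a rough illustration of the copy-to-local step described above, here is a minimal Python sketch that shells out to the standard `hadoop fs -get` command. The function names (`build_fetch_command`, `deploy_index`) and the example paths are hypothetical and not from the original message. It also assumes the `hadoop` launcher is on the PATH and that `-get` places the source directory inside an existing local destination directory.

```python
import subprocess
from pathlib import Path


def build_fetch_command(hdfs_index_dir, local_parent):
    """Build the `hadoop fs -get` command that copies an index out of HDFS."""
    return ["hadoop", "fs", "-get", hdfs_index_dir, str(local_parent)]


def deploy_index(hdfs_index_dir, local_parent, run=subprocess.check_call):
    """Copy one index directory from HDFS into a local parent directory.

    `run` is injectable so the command can be inspected without a cluster;
    by default it executes the command and raises on a non-zero exit code.
    Returns the local path where the index is assumed to land.
    """
    local_parent = Path(local_parent)
    # Make sure the local destination directory exists before copying.
    local_parent.mkdir(parents=True, exist_ok=True)
    run(build_fetch_command(hdfs_index_dir, local_parent))
    # Assumption: -get into an existing directory keeps the source's base name.
    return local_parent / Path(hdfs_index_dir).name
```

A search node would then call something like `deploy_index("/indexes/shard-07", "/data/search")` (both paths made up) and open the returned local directory with Lucene. Re-running the same call on a replacement box is the re-deploy case mentioned above.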

