lucene-general mailing list archives

From Ian Holsman <>
Subject Re: Lucene-based Distributed Index Leveraging Hadoop
Date Thu, 07 Feb 2008 01:31:20 GMT
Ning Li wrote:
> One main focus is to provide fault-tolerance in this distributed index
> system. Correct me if I'm wrong, I think SOLR-303 is focusing on merging
> results from multiple shards right now. We'd like to start an open source
> project for a fault-tolerant distributed index system (or join if one
> already exists) if there is enough interest. Making Solr work on top of such
> a system could be an important goal and SOLR-303 is a big part of it in that
> case.

I guess it depends on how you set up your shards in SOLR-303.
We plan to have a master/slave relationship on each shard, so that
each shard would sync the same way Solr does currently.


> I should have made it clear that disjoint data sets are not a requirement of
> the system.
> On Feb 6, 2008 12:57 PM, Ian Holsman <> wrote:
>> Hi.
>> AOL has a couple of projects going on in the Lucene/Hadoop/Solr space,
>> and we will be pushing more stuff out as we can. We don't have anything
>> going with Solr over Hadoop at the moment.
>> I'm not sure if this would be better than what SOLR-303 does, but you
>> should have a look at the work being done there.
>> One of the things you mentioned is that the data sets are disjoint.
>> SOLR-303 doesn't require this, and allows us to have a document stored
>> in multiple shards (with different caching/update characteristics).
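
To make the point about overlapping shards concrete: when the same document can live in more than one shard, whatever merges the per-shard results has to collapse duplicates before ranking. The following is only a rough sketch of that idea, not how SOLR-303 does it. The function name, the (id, score) hit format, and the assumption that scores from different shards are directly comparable are all made up for illustration.

```python
import heapq
from typing import Dict, Iterable, List, Tuple

# Hypothetical hit format: (document id, relevance score).
Hit = Tuple[str, float]


def merge_shard_hits(shard_hits: Iterable[List[Hit]], k: int) -> List[Hit]:
    """Merge per-shard hit lists into one top-k list.

    A document may be returned by several shards; only its highest
    score is kept. Assumes scores are comparable across shards.
    """
    best: Dict[str, float] = {}
    for hits in shard_hits:
        for doc_id, score in hits:
            # Keep the best score seen so far for this document id.
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    # Rank by score, breaking ties on document id for determinism.
    return heapq.nlargest(k, best.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    shard_a = [("doc1", 0.9), ("doc2", 0.5)]
    shard_b = [("doc2", 0.7), ("doc3", 0.6)]  # doc2 is stored in both shards
    print(merge_shard_hits([shard_a, shard_b], k=2))
```

Keeping the highest score per document is just one possible policy; with different caching/update characteristics per shard, one might instead prefer the copy from the most recently updated shard.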
