lucene-general mailing list archives

From Ian Holsman <li...@holsman.net>
Subject Re: Infrastructure for large Lucene index
Date Wed, 11 Oct 2006 08:09:21 GMT


Would it be possible to design your solution so that you could have
multiple replicas at the shard level?

Obviously, memory limits how many shards a single machine can serve,
but if you make your shards small enough, a single machine might be
able to serve 3-4 shards at any one time. With 10 machines you would
then have room to serve up to 40 shard copies. You would distribute
the shards so that the popular ones are on 2-3x as many machines as
the less popular ones.
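
To make the arithmetic concrete, here is a rough sketch of one way to
plan such a layout. The shard names, popularity weights, and the greedy
strategy are all assumptions of mine for illustration, not something
Lucene provides: every shard starts with 2 copies, spare slots go to the
most heavily loaded shards, and each copy lands on the least-loaded
machine that does not already hold that shard.

```python
def plan_replicas(popularity, total_slots, max_replicas, min_replicas=2):
    """Give every shard min_replicas copies, then hand out spare slots
    one at a time to the shard with the most load per existing copy."""
    counts = {shard: min_replicas for shard in popularity}
    spare = total_slots - sum(counts.values())
    if spare < 0:
        raise ValueError("not enough slots for the minimum replica count")
    for _ in range(spare):
        # A shard cannot usefully have more copies than there are machines.
        growable = [s for s in counts if counts[s] < max_replicas]
        if not growable:
            break
        busiest = max(growable, key=lambda s: popularity[s] / counts[s])
        counts[busiest] += 1
    return counts


def place(counts, machines, slots_per_machine):
    """Greedy placement: each copy goes to the least-loaded machine that
    still has a free slot and does not already host that shard."""
    hosted = {m: set() for m in machines}
    # Place the most-replicated shards first; they are the hardest to fit.
    for shard in sorted(counts, key=counts.get, reverse=True):
        for _ in range(counts[shard]):
            candidates = [m for m in machines
                          if shard not in hosted[m]
                          and len(hosted[m]) < slots_per_machine]
            if not candidates:
                raise ValueError(f"no room for another copy of {shard}")
            target = min(candidates, key=lambda m: len(hosted[m]))
            hosted[target].add(shard)
    return hosted


# Hypothetical cluster: 10 machines x 4 slots = 40 shard copies.
machines = [f"m{i}" for i in range(10)]
# Hypothetical workload: 4 hot shards, 11 cold ones (weights made up).
popularity = {f"hot{i}": 5.0 for i in range(4)}
popularity.update({f"cold{i}": 1.0 for i in range(11)})

counts = plan_replicas(popularity, total_slots=40, max_replicas=len(machines))
hosted = place(counts, machines, slots_per_machine=4)
```

With these made-up weights the hot shards end up with 4-5 copies each
and the cold ones with 2. The greedy placement is not guaranteed to find
a layout in every case; it raises an error rather than overfilling a
machine.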

Loss of a machine could be handled in a non-disruptive manner as
well, as long as you ensure that ALL shards are served by at least 2
machines in your cluster.
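
That condition is easy to check mechanically. A minimal sketch, assuming
the layout is kept as a mapping from machine name to the set of shards
it serves (the names and layouts below are invented for illustration):

```python
def shards_lost(hosted, failed):
    """Shards that no surviving machine serves once `failed` goes down."""
    all_shards = set().union(*hosted.values())
    still_served = set()
    for machine, shards in hosted.items():
        if machine != failed:
            still_served |= shards
    return all_shards - still_served


def tolerates_single_failure(hosted):
    """True if every shard survives the loss of any one machine."""
    return all(not shards_lost(hosted, machine) for machine in hosted)


# Hypothetical layouts: in `fragile`, shard "c" lives only on m1.
fragile = {"m0": {"a", "b"}, "m1": {"a", "c"}, "m2": {"b"}}
safe = {"m0": {"a", "b"}, "m1": {"a", "c"}, "m2": {"b", "c"}}

print(shards_lost(fragile, "m1"))
print(tolerates_single_failure(fragile))
print(tolerates_single_failure(safe))
```

Running the check after every layout change would catch a shard that
has dropped to a single copy before a machine failure exposes it.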

Would this work or be possible in your situation?

regards
Ian


On 11/10/2006, at 5:04 PM, Otis Gospodnetic wrote:

> It sounds like the 11th node would have to have a large disk with  
> all indices.  Or perhaps you'd keep copies of all your indices  
> elsewhere, and would pull the right one in when you see which node  
> you need to replace.
>
> Otis
>
> ----- Original Message ----
> From: Slava Imeshev <imeshev@yahoo.com>
> To: general@lucene.apache.org
> Sent: Tuesday, October 10, 2006 4:34:07 PM
> Subject: Re: Infrastructure for large Lucene index
>
> Doug,
>
> --- Doug Cutting <cutting@apache.org> wrote:
>
>>> the availability of this approach doesn't scale very cleanly  
>>> though ... if
>>> any one box in either cluster goes down, the entire cluster becomes
>>> unusable.
>>
>> A cost-effective variation works as follows: if you have 10  
>> indexes and
>> 11 nodes, then you keep one node as a spare.  When any of the 10  
>> active
>> nodes fail, the 11th resumes its duties.  While the 11th node is
>> launching you search only 9 out of the 10 indexes, so failover is not
>> entirely seamless, but it's a lot cheaper than mirroring all nodes.
>
> How does the 11th know what index it has to bring up? In other words,
> where would it get the lost index?
>
> Slava
>
>
>
>

--
Ian Holsman
Ian@Zilbo.com
http://personalinjuryfocus.com/



