incubator-blur-user mailing list archives

From Ravikumar Govindarajan <ravikumar.govindara...@gmail.com>
Subject Re: Shard takeover behavior
Date Thu, 06 Mar 2014 11:30:56 GMT
I just learned about the zk.session.timeout setting while reading more
about this problem.

With it, the dead-node notification fires only after the configured timeout
has elapsed. Setting it to 3-4 minutes should be fine for OOMs and rolling
restarts.

The only extra thing I am looking for is a way to divert search calls to a
read-only shard instance during those 3-4 minutes, to avoid mini-outages.
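
To make the read-diversion idea concrete, here is a minimal Python sketch. It is
not Blur code: ShardRoute, mark_unreachable, search_target, and grace_expired are
hypothetical names, and the 240-second window simply mirrors the 3-4 minute
timeout mentioned above. It only shows searches falling back to a read-only
replica while the primary is unreachable.

```python
import time


class ShardRoute:
    """Hypothetical routing entry for one shard (illustration only, not Blur's API)."""

    def __init__(self, primary, read_only_replica, grace_seconds=240.0,
                 clock=time.monotonic):
        self.primary = primary
        self.replica = read_only_replica
        self.grace_seconds = grace_seconds  # assumed to match the session timeout
        self.clock = clock                  # injectable so the logic can be tested
        self.suspect_since = None           # None means the primary looks healthy

    def mark_unreachable(self):
        # Remember only the first failure so the grace window is not extended.
        if self.suspect_since is None:
            self.suspect_since = self.clock()

    def mark_healthy(self):
        self.suspect_since = None

    def search_target(self):
        # Searches go to the read-only replica while the primary is unreachable.
        return self.primary if self.suspect_since is None else self.replica

    def grace_expired(self):
        # True once the primary has been unreachable for the whole grace window,
        # i.e. the point at which a dead-node notification would be expected.
        if self.suspect_since is None:
            return False
        return self.clock() - self.suspect_since >= self.grace_seconds
```

The clock is passed in only so the window can be exercised without waiting;
in a real deployment the expiry would come from ZooKeeper, not a local timer.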

--
Ravi



On Thu, Mar 6, 2014 at 3:34 PM, Ravikumar Govindarajan <
ravikumar.govindarajan@gmail.com> wrote:

> What do you think of giving some extra leeway in shard-server failover
> cases?
>
> Ex: Whenever a shard-server process gets killed, the controller-node does
> not immediately update the layout, but rather marks the server as a suspect.
>
> When we have a read-only back-up of the shard, searches can continue
> unhindered. Indexing during this time can be diverted to a queue, which
> will store the ops and retry them when the shard-server comes online again.
>
> If the shard-server does not come up within a configured number of attempts
> or amount of time, then one controller-server can authoritatively mark it as
> down and update the layout.
>
> --
> Ravi
>
>
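
The suspect-then-down flow in the quoted proposal can be sketched roughly as
below. This is only an illustration under stated assumptions, not how Blur's
controllers work: ShardServerMonitor, on_probe, index, and max_attempts are
hypothetical names, and the "apply" callback stands in for whatever actually
writes to the shard.

```python
from collections import deque
from enum import Enum


class State(Enum):
    UP = "up"
    SUSPECT = "suspect"
    DOWN = "down"


class ShardServerMonitor:
    """Hypothetical controller-side view of one shard-server (illustration only)."""

    def __init__(self, max_attempts=5):
        self.max_attempts = max_attempts  # failed probes tolerated before DOWN
        self.state = State.UP
        self.failed_attempts = 0
        self.pending = deque()            # indexing ops held while not UP

    def index(self, op, apply):
        # Apply directly when the server is UP; otherwise queue the op.
        # Ops queued after DOWN are left for whoever takes over the shard.
        if self.state is State.UP:
            apply(op)
        else:
            self.pending.append(op)

    def on_probe(self, alive, apply):
        """Record one health probe; return True when the layout should be updated."""
        if self.state is State.DOWN:
            return False                  # already handed over to a layout update
        if alive:
            self.state = State.UP
            self.failed_attempts = 0
            while self.pending:           # replay queued ops in arrival order
                apply(self.pending.popleft())
            return False
        self.failed_attempts += 1
        if self.failed_attempts >= self.max_attempts:
            self.state = State.DOWN       # authoritative: trigger layout update
            return True
        self.state = State.SUSPECT
        return False
```

An attempt counter is used here for simplicity; a time-based limit would work
the same way with a deadline in place of failed_attempts.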
