incubator-blur-user mailing list archives

From Ravikumar Govindarajan <>
Subject Re: Shard takeover behavior
Date Thu, 06 Mar 2014 11:30:56 GMT
I just learned about the zk.session.timeout variable while reading more
about this problem.

The dead-node notification is triggered only after the configured timeout
has elapsed. Setting it to 3-4 minutes should be fine for OOMs and rolling
restarts.

The only extra thing I am looking for is a way to divert search calls to a
read-only shard instance during these 3-4 minutes, to avoid mini-outages.


On Thu, Mar 6, 2014 at 3:34 PM, Ravikumar Govindarajan <> wrote:

> What do you think of giving some extra leeway for shard-server failover
> cases?
> Ex: Whenever a shard-server process gets killed, the controller-node does
> not immediately update the layout, but rather marks the shard-server as a
> suspect.
> When we have a read-only back-up of the shard, searches can continue
> unhindered. Indexing during this time can be diverted to a queue, which
> will store the ops and retry them when the shard-server comes online again.
> If the shard-server does not come up within a configured number of
> attempts or amount of time, then one controller-server can authoritatively
> mark it as down and update the layout.
> --
> Ravi
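
The suspect/grace-period idea in the quoted proposal could be sketched roughly as below. This is only an illustration of the state transitions, not Blur code: the class name, method names, routing labels and the 180-second default are all hypothetical, and what should happen to writes still queued when the shard is finally marked down is left open, as it is in the proposal.

```python
import time
from collections import deque
from enum import Enum


class ShardState(Enum):
    ONLINE = "online"
    SUSPECT = "suspect"   # process lost, layout not yet updated
    DOWN = "down"         # grace period expired, layout should be updated


class ShardFailoverTracker:
    """Hypothetical controller-side bookkeeping for a single shard-server."""

    def __init__(self, grace_seconds=180.0, clock=time.monotonic):
        self.grace_seconds = grace_seconds  # assumed 3-minute leeway
        self.clock = clock                  # injectable so it can be tested
        self.state = ShardState.ONLINE
        self.suspect_since = None
        self.pending_writes = deque()       # indexing ops held while suspect

    def on_process_lost(self):
        """Mark the shard-server as a suspect instead of updating the layout."""
        if self.state is ShardState.ONLINE:
            self.state = ShardState.SUSPECT
            self.suspect_since = self.clock()

    def on_process_back(self, apply_write):
        """Replay queued ops in arrival order; return how many were replayed."""
        if self.state is ShardState.DOWN:
            return 0  # layout already changed; rejoining is out of scope here
        replayed = 0
        while self.pending_writes:
            apply_write(self.pending_writes.popleft())
            replayed += 1
        self.state = ShardState.ONLINE
        self.suspect_since = None
        return replayed

    def check_timeout(self):
        """Return True once, when the grace period expires for a suspect."""
        if (self.state is ShardState.SUSPECT
                and self.clock() - self.suspect_since >= self.grace_seconds):
            self.state = ShardState.DOWN
            return True  # caller would now update the layout
        return False

    def route_search(self):
        """Pick where a search call should go for the current state."""
        if self.state is ShardState.ONLINE:
            return "primary"
        if self.state is ShardState.SUSPECT:
            return "read_only_backup"
        return "new_layout"

    def submit_write(self, op, apply_write):
        """Apply, queue, or reject an indexing op depending on the state."""
        if self.state is ShardState.ONLINE:
            apply_write(op)
            return "applied"
        if self.state is ShardState.SUSPECT:
            self.pending_writes.append(op)
            return "queued"
        return "rejected"
```

A controller would call on_process_lost() when the shard-server process disappears, poll check_timeout() periodically, and call on_process_back() if the process returns before the grace period runs out. Searches keep being served from the read-only back-up in the meantime.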
