geode-issues mailing list archives

From "Diane Hardman (JIRA)" <>
Subject [jira] [Commented] (GEODE-4250) Users would like a command to re-establish redundancy without rebalancing
Date Tue, 30 Jan 2018 22:54:00 GMT


Diane Hardman commented on GEODE-4250:

For a related request, see GEODE-4434.

> Users would like a command to re-establish redundancy without rebalancing
> -------------------------------------------------------------------------
>                 Key: GEODE-4250
>                 URL:
>             Project: Geode
>          Issue Type: Improvement
>          Components: docs, regions
>            Reporter: Fred Krone
>            Priority: Major
> Acceptance criteria:
> -- There is a way for a user to detect that redundancy is restored
> -- There is a way to check current redundancy
> -- Can set moveBuckets and movePrimary to false and run rebalance
> The command would only succeed when the system is fully redundant.
> Re-establishing redundancy after the loss of a peer node is typically far more urgent
and important than achieving better balance. The operational impact of rebalancing is also
much higher, forcing impacted buckets' updates to be distributed to _redundancy-copies + 1_
peer processes and potentially spiking p2p connections/threads (and thus load) far beyond
normal operations. If the system is already close to exhausting available capacity for some
hardware component, this can be enough to push it over the edge (and may force the original
fault to recur). This problem is exacerbated when the cluster's overall capacity has been
reduced due to the loss of a physical server. Without the ability to separate the operational
tasks of re-establishing full data redundancy and rebalancing bucket partitions (that are
already safely redundant), system administrators may be forced to provision replacement capacity
_before_ they can restore full service, thus increasing downtime unnecessarily.
> For these reasons, we must add the option to execute these operational tasks separately.
> It still makes sense for _rebalancing_ ops to first re-establish redundancy, so we can
keep the existing GFSH command/behavior (it would still be useful to clearly log completion
of one step before the next one begins). We need a new GFSH command/ResourceManager API to
execute re-establishment of redundancy _without_ rebalancing.
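
The separation requested above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Geode code: the function names `redundancy_status` and `restore_redundancy`, the bucket-to-members dictionary, and the member names are all hypothetical. Here `copies_required` stands for redundant copies + 1. The sketch shows a way to check current redundancy, a way to detect that it is restored, and a restore step that only creates missing copies and never moves existing ones.

```python
from collections import Counter


def redundancy_status(buckets, copies_required):
    """Return {bucket_id: missing_copies} for every under-redundant bucket."""
    return {
        bucket_id: copies_required - len(hosts)
        for bucket_id, hosts in buckets.items()
        if len(hosts) < copies_required
    }


def restore_redundancy(buckets, members, copies_required):
    """Create missing copies only; existing copies are never moved.

    Hypothetical model, not a Geode API. Returns True only when every
    bucket ends up with the required number of copies.
    """
    # Count how many bucket copies each surviving member already hosts.
    load = Counter({member: 0 for member in members})
    for hosts in buckets.values():
        for member in hosts:
            load[member] += 1

    for bucket_id in sorted(buckets):
        hosts = buckets[bucket_id]
        while len(hosts) < copies_required:
            candidates = [m for m in members if m not in hosts]
            if not candidates:
                break  # not enough members to hold another copy
            # Place the new copy on the least-loaded eligible member.
            target = min(candidates, key=lambda m: (load[m], m))
            hosts.add(target)
            load[target] += 1

    return not redundancy_status(buckets, copies_required)


# Example: member "m3" was lost, so buckets 1 and 2 each miss one copy.
survivors = ["m1", "m2"]
buckets = {0: {"m1", "m2"}, 1: {"m2"}, 2: {"m1"}}

before = redundancy_status(buckets, copies_required=2)
restored = restore_redundancy(buckets, survivors, copies_required=2)
```

In this model the already-redundant bucket 0 is left untouched, which is the point of the request: only the missing copies generate extra traffic, and the boolean result mirrors the criterion that the command succeeds only when the system is fully redundant.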

This message was sent by Atlassian JIRA
