hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back
Date Sat, 08 Nov 2014 02:37:34 GMT

     [ https://issues.apache.org/jira/browse/HBASE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-12450:
-----------------------------------
      Resolution: Fixed
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Pushed to 0.98+

> Unbalance chaos monkey might kill all region servers without starting them back
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-12450
>                 URL: https://issues.apache.org/jira/browse/HBASE-12450
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Virag Kothari
>            Assignee: Virag Kothari
>            Priority: Minor
>             Fix For: 2.0.0, 0.98.8, 0.99.2
>
>         Attachments: HBASE-12450-0.98.patch, HBASE-12450.patch, HBASE-12450.patch
>
>
> UnbalanceKillAndRebalanceAction kills region servers, runs the balancer, and then
> restarts the killed region servers. If the balance fails, however, an exception is
> thrown and the restart step is skipped, so the killed region servers stay down. For
> me, the balance kept failing with a socket timeout (default 1 min) because the master
> runs one iteration of the balancer for 5 mins (default config). Eventually all servers
> are killed and never started back.
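
The failure mode described above (kill, then a balance call that throws, then a restart that never runs) can be illustrated with a minimal sketch. This is not the committed patch and does not use the real HBase classes: `UnbalanceRestartSketch`, `Cluster`, `RecordingCluster` and `killBalanceRestart` are hypothetical names invented for illustration. It only shows one common way to guarantee the restart step, by placing it in a `finally` block; the attached patches may take a different approach.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class UnbalanceRestartSketch {

    /** Hypothetical stand-in for the cluster operations used by the action. */
    public interface Cluster {
        void kill(String server);
        void balance() throws IOException;
        void start(String server);
    }

    /** In-memory fake that tracks which servers are running (illustration only). */
    public static class RecordingCluster implements Cluster {
        public final Set<String> running = new TreeSet<>();
        private final boolean failBalance;

        public RecordingCluster(List<String> servers, boolean failBalance) {
            this.running.addAll(servers);
            this.failBalance = failBalance;
        }

        @Override public void kill(String server) { running.remove(server); }
        @Override public void start(String server) { running.add(server); }

        @Override public void balance() throws IOException {
            // Simulates the reported socket timeout during the balance call.
            if (failBalance) {
                throw new IOException("simulated socket timeout during balance");
            }
        }
    }

    /**
     * Kills the given servers, runs the balancer, and restarts every server
     * that was killed, even if kill or balance throws.
     */
    public static void killBalanceRestart(Cluster cluster, List<String> victims)
            throws IOException {
        List<String> killed = new ArrayList<>();
        try {
            for (String server : victims) {
                cluster.kill(server);
                killed.add(server);
            }
            cluster.balance();
        } finally {
            // Runs on both the success path and the failure path.
            for (String server : killed) {
                cluster.start(server);
            }
        }
    }

    public static void main(String[] args) {
        List<String> all = List.of("rs1", "rs2", "rs3");
        RecordingCluster cluster = new RecordingCluster(all, true);
        try {
            killBalanceRestart(cluster, List.of("rs1", "rs2"));
        } catch (IOException e) {
            System.out.println("balance failed: " + e.getMessage());
        }
        System.out.println("running after action: " + cluster.running);
    }
}
```

In this sketch the balance exception still propagates to the caller, so the failure stays visible, but the killed servers are started again first. Without the `finally`, repeated runs of the action would shrink the set of running servers each time, which matches the behaviour reported in the description.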



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
