hbase-issues mailing list archives

From "Virag Kothari (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back
Date Sat, 08 Nov 2014 00:18:33 GMT

     [ https://issues.apache.org/jira/browse/HBASE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Virag Kothari updated HBASE-12450:
----------------------------------
    Attachment: HBASE-12450-0.98.patch

Thanks for the quick review, Andrew.
Attached is the patch for 0.98. The patch for master applies cleanly to branch-1.


> Unbalance chaos monkey might kill all region servers without starting them back
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-12450
>                 URL: https://issues.apache.org/jira/browse/HBASE-12450
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Virag Kothari
>            Assignee: Virag Kothari
>            Priority: Minor
>             Fix For: 2.0.0, 0.98.8, 0.99.2
>
>         Attachments: HBASE-12450-0.98.patch, HBASE-12450.patch
>
>
> UnbalanceKillAndRebalanceAction kills region servers, runs a balance, and then starts the
> killed region servers again. But if the balance fails, an exception is thrown and the region
> servers are never started. For me, the balance kept failing with a socket timeout (default
> 1 min) because the master runs one iteration of balance for 5 mins (default config).
> Eventually all servers are killed but never started back.
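
To illustrate the failure mode described in the issue, here is a minimal sketch of a kill/balance/start sequence that still restarts the killed servers when the balance step throws. It is not taken from the attached patches and does not use the HBase API: the Cluster interface, the killBalanceRestart method, the FailingBalanceCluster test double and the server names are all hypothetical, chosen only to show the try/finally shape.

```java
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class KillBalanceRestartSketch {

    /** Hypothetical stand-in for cluster operations; this is not the HBase API. */
    public interface Cluster {
        void kill(String server) throws Exception;
        void balance() throws Exception;
        void start(String server) throws Exception;
    }

    /** Kills the given servers, attempts a balance, and always tries to restart what it killed. */
    public static void killBalanceRestart(Cluster cluster, List<String> victims) throws Exception {
        List<String> killed = new ArrayList<>();
        try {
            for (String server : victims) {
                cluster.kill(server);
                killed.add(server); // remember only servers that were actually killed
            }
            cluster.balance(); // may throw, e.g. on a client-side socket timeout
        } finally {
            // Runs whether or not kill/balance threw, so killed servers are not left down.
            for (String server : killed) {
                try {
                    cluster.start(server);
                } catch (Exception e) {
                    System.err.println("Failed to restart " + server + ": " + e);
                }
            }
        }
    }

    /** Hypothetical test double whose balance() always fails, mimicking the reported timeout. */
    public static class FailingBalanceCluster implements Cluster {
        public final List<String> started = new ArrayList<>();

        @Override
        public void kill(String server) {
            // no-op: nothing to kill in the sketch
        }

        @Override
        public void balance() throws Exception {
            throw new SocketTimeoutException("simulated balance timeout");
        }

        @Override
        public void start(String server) {
            started.add(server);
        }
    }

    public static void main(String[] args) {
        FailingBalanceCluster cluster = new FailingBalanceCluster();
        try {
            killBalanceRestart(cluster, Arrays.asList("rs1", "rs2"));
        } catch (Exception e) {
            System.out.println("balance failed: " + e.getMessage());
        }
        System.out.println("restarted: " + cluster.started);
    }
}
```

In this sketch the balance exception still propagates to the caller, but only after the finally block has attempted to start every server that was killed. Whether to propagate or merely log the balance failure is a design choice the sketch leaves open.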



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
