hbase-issues mailing list archives

From "Sean Busbey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11405) Multiple invocations of hbck in parallel disables balancer permanently
Date Sat, 20 Sep 2014 11:26:34 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141925#comment-14141925
] 

Sean Busbey commented on HBASE-11405:
-------------------------------------

Okay, the test failure happens because, on Hadoop 1, the [DFS client transparently retries the
file create for up to 5 minutes|https://github.com/apache/hadoop/blob/release-1.2.1/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java#L163].

Looking at the logs of the hadoop 1 run, I can see the second hbck instance fail to create
the file. Then later after the first one cleans up, the second picks up again and keeps going.
Thus both succeed and the assert fails. Same failure on 0.94.
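
To make the sequence above concrete, here is a minimal, single-threaded sketch of how a create that is retried by the client can turn an expected failure into a success. It is not the DFSClient code: the class, the `lockFileExists` flag, and both methods are hypothetical stand-ins, and the retry is modeled as a fixed number of attempts rather than a 5 minute window.

```java
public class RetryingCreateSketch {
    // Hypothetical stand-in for the hbck lock file: true while an instance holds it.
    static boolean lockFileExists = false;

    // A single create attempt with no retry: fails if the file already exists.
    static boolean tryCreateOnce() {
        if (lockFileExists) {
            return false;
        }
        lockFileExists = true;
        return true;
    }

    // A create that keeps retrying, as a client-side retry loop would.
    // releaseBeforeAttempt simulates the first hbck deleting its lock file
    // just before that attempt number (assumption made for illustration only).
    static boolean createWithRetry(int maxAttempts, int releaseBeforeAttempt) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (attempt == releaseBeforeAttempt) {
                lockFileExists = false; // first hbck finishes and cleans up
            }
            if (tryCreateOnce()) {
                return true; // second hbck "picks up again and keeps going"
            }
        }
        return false;
    }

    public static void main(String[] args) {
        lockFileExists = true; // the first hbck instance holds the lock
        System.out.println("no retry:   " + tryCreateOnce());
        System.out.println("with retry: " + createWithRetry(5, 3));
    }
}
```

Without the retry the second create fails immediately, which is what the test asserts. With the retry, the second create succeeds as soon as the first holder cleans up inside the retry window.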

I'd rather not insert a 5 minute sleep in the hbck to ensure the second instance fails. Anyone
have some idea of a different way we can test for the branches that support hadoop 1?

> Multiple invocations of hbck in parallel disables balancer permanently 
> -----------------------------------------------------------------------
>
>                 Key: HBASE-11405
>                 URL: https://issues.apache.org/jira/browse/HBASE-11405
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer, hbck
>    Affects Versions: 0.99.0
>            Reporter: bharath v
>            Assignee: bharath v
>             Fix For: 2.0.0, 0.98.7, 0.94.24, 0.99.1
>
>         Attachments: 11405-1.0.txt, 11405-v3.txt, 11405-v5.patch, 11405-v6.patch, 11405v7.patch,
>                      HBASE-11405-trunk-rebased.patch, HBASE-11405-trunk.patch, HBASE-11405-trunk.patch.1, hbase-11405.rebase.140911.patch
>
>
> This is because of the following piece of code in hbck
> {code:borderStyle=solid}
>   boolean oldBalancer = admin.setBalancerRunning(false, true);
>   try {
>     onlineConsistencyRepair();
>   } finally {
>     admin.setBalancerRunning(oldBalancer, false);
>   }
> {code}
> Newer invocations set oldBalancer to false because the balancer was already disabled by a previous
> invocation, and this disables the balancer permanently unless it's manually turned back on by the user. Easy to
> reproduce: just run hbck 100 times in a loop in 2 different sessions and you can see that
> the balancer is set to false in the HMaster logs.
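
The interleaving described in the issue can be sketched without a cluster. The snippet below is an illustration only: `balancerOn` and the one-argument `setBalancerRunning` are hypothetical stand-ins for the master's balancer switch and for the admin call quoted above, assuming (as the quoted code implies) that the call returns the previous state.

```java
public class BalancerRaceSketch {
    // Hypothetical stand-in for the balancer switch held by the master.
    static boolean balancerOn = true;

    // Sets the new state and returns the previous one (assumed semantics).
    static boolean setBalancerRunning(boolean on) {
        boolean previous = balancerOn;
        balancerOn = on;
        return previous;
    }

    // Two hbck runs, A and B, whose save/restore steps overlap.
    static boolean runInterleaved() {
        balancerOn = true;
        boolean oldA = setBalancerRunning(false); // A starts, saves true
        boolean oldB = setBalancerRunning(false); // B starts, saves false
        setBalancerRunning(oldA);                 // A finishes, restores true
        setBalancerRunning(oldB);                 // B finishes, "restores" false
        return balancerOn;
    }

    // The same two runs executed one after the other.
    static boolean runSequential() {
        balancerOn = true;
        boolean oldA = setBalancerRunning(false);
        setBalancerRunning(oldA);
        boolean oldB = setBalancerRunning(false);
        setBalancerRunning(oldB);
        return balancerOn;
    }

    public static void main(String[] args) {
        System.out.println("sequential:  balancer on = " + runSequential());
        System.out.println("interleaved: balancer on = " + runInterleaved());
    }
}
```

Run back to back, both invocations restore the balancer correctly. When they overlap, the later invocation saves the already-disabled state and writes it back last, leaving the balancer off.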



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
