hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13732) TestHBaseFsck#testParallelWithRetriesHbck fails intermittently
Date Thu, 21 May 2015 20:38:17 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555000#comment-14555000 ]

Hadoop QA commented on HBASE-13732:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12734285/HBASE-13732.patch
  against master branch at commit e1e8434340f02907976f20566c3e55d8d627d4c4.
  ATTACHMENT ID: 12734285

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new or modified
tests.

    {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions
(2.4.1 2.5.2 2.6.0)

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 protoc{color}.  The applied patch does not increase the total number of
protoc compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 checkstyle{color}.  The applied patch does not increase the total number
of checkstyle errors

    {color:green}+1 findbugs{color}.  The patch does not introduce any  new Findbugs (version
2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

     {color:red}-1 core tests{color}.  The patch failed these unit tests:
     

     {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/14149//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/14149//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/14149//artifact/patchprocess/checkstyle-aggregate.html

  Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/14149//console

This message is automatically generated.

> TestHBaseFsck#testParallelWithRetriesHbck fails intermittently
> --------------------------------------------------------------
>
>                 Key: HBASE-13732
>                 URL: https://issues.apache.org/jira/browse/HBASE-13732
>             Project: HBase
>          Issue Type: Bug
>          Components: hbck, test
>    Affects Versions: 2.0.0, 1.1.0, 1.2.0
>            Reporter: Stephen Yuan Jiang
>            Assignee: Stephen Yuan Jiang
>            Priority: Minor
>             Fix For: 2.0.0, 1.2.0, 1.1.1
>
>         Attachments: HBASE-13732.patch
>
>
> TestHBaseFsck#testParallelWithRetriesHbck fails intermittently (especially in the Windows
> environment) with "java.io.IOException: Duplicate hbck - Abort":
> {noformat}
> java.util.concurrent.ExecutionException: java.io.IOException: Duplicate hbck - Abort
> 	at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252)
> 	at java.util.concurrent.FutureTask.get(FutureTask.java:111)
> 	at org.apache.hadoop.hbase.util.TestHBaseFsck.testParallelWithRetriesHbck(TestHBaseFsck.java:644)
> Caused by: java.io.IOException: Duplicate hbck - Abort
> 	at org.apache.hadoop.hbase.util.HBaseFsck.connect(HBaseFsck.java:484)
> 	at org.apache.hadoop.hbase.util.hbck.HbckTestingUtil.doFsck(HbckTestingUtil.java:53)
> 	at org.apache.hadoop.hbase.util.hbck.HbckTestingUtil.doFsck(HbckTestingUtil.java:43)
> 	at org.apache.hadoop.hbase.util.hbck.HbckTestingUtil.doFsck(HbckTestingUtil.java:38)
> 	at org.apache.hadoop.hbase.util.TestHBaseFsck$2RunHbck.call(TestHBaseFsck.java:635)
> 	at org.apache.hadoop.hbase.util.TestHBaseFsck$2RunHbck.call(TestHBaseFsck.java:628)
> 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> 	at java.lang.Thread.run(Thread.java:722)
> {noformat}
> HBASE-13591 tried to address this issue. It did improve the pass rate in the Linux environment
> (after the fix, I could not reproduce the failure on my machine); however, the test still failed
> intermittently in the Windows environment during testing of the 1.1 release.
> Looking at the code, it uses the ExponentialBackoffPolicy between retries: a 200 ms sleep
> after the first failed attempt to acquire the lock in ZK, then 400 ms, then 800 ms, etc.
> Therefore, even if the first hbck run completes, the second hbck run can still fail
> because of the long sleep time.
> The proposed fix is to use ExponentialBackoffPolicyWithLimit and cap the maximum sleep
> time at some small value (e.g. 5 seconds; it should be configurable). This would
> make the test more robust.
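
The two sleep schedules in the description can be compared with a minimal sketch. It assumes the sleep time simply doubles from 200 ms after each failed attempt, as the description states. The class and method names below are hypothetical and do not reproduce the HBase retry code, and the 5-second cap is only the example value from the proposal.

```java
public class BackoffComparison {
    // Base sleep taken from the issue description (200 ms after the first failed attempt).
    static final long BASE_SLEEP_MS = 200L;

    // Uncapped schedule: 200, 400, 800, ... (doubles after every failed attempt).
    // Only meaningful for small attempt counts; large values overflow the shift.
    static long uncappedSleepMs(int failedAttempts) {
        return BASE_SLEEP_MS << (failedAttempts - 1);
    }

    // Capped schedule: same doubling, but never longer than maxSleepMs.
    // The loop stops doubling once the cap is reached, so it cannot overflow
    // for reasonable cap values.
    static long cappedSleepMs(int failedAttempts, long maxSleepMs) {
        long sleep = BASE_SLEEP_MS;
        for (int i = 1; i < failedAttempts && sleep < maxSleepMs; i++) {
            sleep *= 2;
        }
        return Math.min(sleep, maxSleepMs);
    }

    public static void main(String[] args) {
        long capMs = 5000L; // example cap from the proposal; assumed configurable
        for (int attempt = 1; attempt <= 8; attempt++) {
            System.out.printf("failed attempt %d: uncapped=%d ms, capped=%d ms%n",
                    attempt, uncappedSleepMs(attempt), cappedSleepMs(attempt, capMs));
        }
    }
}
```

With the cap in place, a waiting hbck run checks the lock again at most every 5 seconds, however many attempts have already failed, whereas the uncapped schedule keeps doubling the sleep time.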



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
