hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10142) TestLogRolling#testLogRollOnDatanodeDeath test failure
Date Thu, 19 Dec 2013 20:41:07 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853279#comment-13853279 ]

Ted Yu commented on HBASE-10142:
--------------------------------

There used to be a comment about the low-replication check in FSHLog:
{code}
      // TODO: preserving the old behavior for now, but this check is strange. It's not
      //       protected by any locks here, so for all we know rolling locks might start
      //       as soon as we enter the "if". Is this best-effort optimization check?
      if (!this.logRollRunning) {
        checkLowReplication();
{code}
Because the logRollRunning test is not protected by any lock, checkLowReplication() may be
running while FSHLog#rollWriter() is also running, hence the race.
That is why checkLowReplication() is now put under a reentrant lock, so that the race cannot
happen.
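
For illustration only, here is a minimal sketch of the locking idea, assuming a single reentrant lock shared by the roll and the replication check. LogRollSketch and all of its fields and methods are hypothetical names made up for this example; it is not the FSHLog code and not the attached patch.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a write-ahead log; NOT the HBase FSHLog class. */
public class LogRollSketch {
    // Assumption: one reentrant lock guards both rolling and the replication check.
    private final ReentrantLock rollLock = new ReentrantLock();
    private final int minReplication;
    private int replication;   // guarded by rollLock
    private int rollCount;     // guarded by rollLock
    private int rollRequests;  // guarded by rollLock

    public LogRollSketch(int replication, int minReplication) {
        this.replication = replication;
        this.minReplication = minReplication;
    }

    /** Simulates a roll: the new writer comes up with the given replication. */
    public void rollWriter(int newReplication) {
        rollLock.lock();
        try {
            replication = newReplication;
            rollCount++;
        } finally {
            rollLock.unlock();
        }
    }

    /** Returns true if replication is below the minimum and a roll was requested. */
    public boolean checkLowReplication() {
        rollLock.lock();
        try {
            if (replication < minReplication) {
                rollRequests++;
                return true;
            }
            return false;
        } finally {
            rollLock.unlock();
        }
    }

    /** Checks and rolls as one step; the nested lock() calls rely on reentrancy. */
    public boolean rollIfLowReplication(int newReplication) {
        rollLock.lock();
        try {
            if (!checkLowReplication()) {
                return false;
            }
            rollWriter(newReplication);
            return true;
        } finally {
            rollLock.unlock();
        }
    }

    public int getRollCount() {
        rollLock.lock();
        try {
            return rollCount;
        } finally {
            rollLock.unlock();
        }
    }

    public int getRollRequests() {
        rollLock.lock();
        try {
            return rollRequests;
        } finally {
            rollLock.unlock();
        }
    }
}
```

With both methods taking the same lock, the check can no longer observe the log in the middle of a roll. The lock has to be reentrant in this sketch because rollIfLowReplication() calls the other two methods while already holding it.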


> TestLogRolling#testLogRollOnDatanodeDeath test failure
> ------------------------------------------------------
>
>                 Key: HBASE-10142
>                 URL: https://issues.apache.org/jira/browse/HBASE-10142
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0, 0.99.0
>            Reporter: Andrew Purtell
>            Assignee: Ted Yu
>             Fix For: 0.98.0, 0.99.0
>
>         Attachments: 10142-v1.txt
>
>
> This is a demanding unit test, which fails fairly often as software versions (JVM, Hadoop)
> and system load change. Currently when testing 0.98 branch I see this failure:
> {noformat}
> Failed tests:   testLogRollOnDatanodeDeath(org.apache.hadoop.hbase.regionserver.wal.TestLogRolling): LowReplication Roller should've been disabled, current replication=1
> {noformat}
> Could be a timing issue after the recent switch to Hadoop 2 as default build/test profile.
> Let's see if more leniency makes sense and if it can stabilize the test before disabling it.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
