hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8349) TestLogRolling#TestLogRollOnDatanodeDeath hangs under hadoop2 profile
Date Wed, 17 Apr 2013 03:57:19 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633731#comment-13633731
] 

Hudson commented on HBASE-8349:
-------------------------------

Integrated in hbase-0.95 #150 (See [https://builds.apache.org/job/hbase-0.95/150/])
    HBASE-8349 TestLogRolling#testLogRollOnDatanodeDeath hangs under hadoop2 profile (Revision
1468563)

     Result = SUCCESS
jmhsieh : 
Files : 
* /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java

                
> TestLogRolling#TestLogRollOnDatanodeDeath hangs under hadoop2 profile
> ---------------------------------------------------------------------
>
>                 Key: HBASE-8349
>                 URL: https://issues.apache.org/jira/browse/HBASE-8349
>             Project: HBase
>          Issue Type: Sub-task
>          Components: hadoop2
>    Affects Versions: 0.98.0, 0.95.0
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>             Fix For: 0.98.0, 0.95.1
>
>         Attachments: hbase-8349.patch
>
>
> TestLogRolling has been hanging -- after a data node is killed the client attempts to
recover a lease and fails forever.  (This example ran for a while and shows recovery attempt
541888).  
> {code}
> 2013-04-15 16:37:49,074 INFO  [SplitLogWorker-localhost,39898,1366065830907] util.FSHDFSUtils(72):
Attempt 541888 to recoverLease on file hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
returned false, trying for 2642865ms
> 2013-04-15 16:37:49,075 ERROR [SplitLogWorker-localhost,39898,1366065830907] util.FSHDFSUtils(86):
Can't recoverLease after 541888 attempts and 2642866ms  for hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
- continuing without the lease, but we could have a data loss.
> 2013-04-15 16:37:49,075 INFO  [IPC Server handler 9 on 41333] namenode.FSNamesystem(1957):
recoverLease: recover lease [Lease.  Holder: DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091,
pendingcreates: 1], src=/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
from client DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091
> 2013-04-15 16:37:49,075 INFO  [IPC Server handler 9 on 41333] namenode.FSNamesystem(2981):
Recovering lease=[Lease.  Holder: DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091,
pendingcreates: 1], src=/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
> 2013-04-15 16:37:49,078 WARN  [IPC Server handler 9 on 41333] namenode.FSNamesystem(3096):
DIR* NameSystem.internalReleaseLease: File /user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
has not been closed. Lease recovery is in progress. RecoveryId = 543317 for block blk_7636447875270454121_1019{blockUCState=UNDER_RECOVERY,
primaryNodeIndex=1, replicas=[ReplicaUnderConstruction[127.0.0.1:38288|RBW], ReplicaUnderConstruction[127.0.0.1:35956|RWR]]}
> 2013-04-15 16:37:49,078 INFO  [SplitLogWorker-localhost,39898,1366065830907] util.FSHDFSUtils(72):
Attempt 541889 to recoverLease on file hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
returned false, trying for 2642869ms
> 2013-04-15 16:37:49,079 ERROR [SplitLogWorker-localhost,39898,1366065830907] util.FSHDFSUtils(86):
Can't recoverLease after 541889 attempts and 2642870ms  for hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
- continuing without the lease, but we could have a data loss.
> 2013-04-15 16:37:49,079 INFO  [IPC Server handler 4 on 41333] namenode.FSNamesystem(1957):
recoverLease: recover lease [Lease.  Holder: DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091,
pendingcreates: 1], src=/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
from client DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091
> 2013-04-15 16:37:49,079 INFO  [IPC Server handler 4 on 41333] namenode.FSNamesystem(2981):
Recovering lease=[Lease.  Holder: DFSClient_hb_rs_localhost,39898,1366065830907_1890639591_1091,
pendingcreates: 1], src=/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
> 2013-04-15 16:37:49,082 WARN  [IPC Server handler 4 on 41333] namenode.FSNamesystem(3096):
DIR* NameSystem.internalReleaseLease: File /user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
has not been closed. Lease recovery is in progress. RecoveryId = 543318 for block blk_7636447875270454121_1019{blockUCState=UNDER_RECOVERY,
primaryNodeIndex=1, replicas=[ReplicaUnderConstruction[127.0.0.1:38288|RBW], ReplicaUnderConstruction[127.0.0.1:35956|RWR]]}
> 2013-04-15 16:37:49,083 INFO  [SplitLogWorker-localhost,39898,1366065830907] util.FSHDFSUtils(72):
Attempt 541890 to recoverLease on file hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
returned false, trying for 2642874ms
> 2013-04-15 16:37:49,083 ERROR [SplitLogWorker-localhost,39898,1366065830907] util.FSHDFSUtils(86):
Can't recoverLease after 541890 attempts and 2642874ms  for hdfs://localhost:41333/user/jon/hbase/.logs/localhost,41341,1366065830879-splitting/localhost%2C41341%2C1366065830879.1366065836654.meta
- continuing without the lease, but we could have a data loss.
> {code}
> It initially starts with permissions errors similar to HBASE-7636.  For now we disable
in this test and will address with a fix in HBASE-8337 with an assist from HDFS folks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message