hadoop-hdfs-issues mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-668) TestFileAppend3#TC7 sometimes hangs
Date Thu, 15 Oct 2009 21:07:31 GMT

    [ https://issues.apache.org/jira/browse/HDFS-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12766255#action_12766255 ]

Hairong Kuang commented on HDFS-668:
------------------------------------

Even with Konstantin's quick fix, TC7 still occasionally hangs because it is not able to close the file. There is yet another bug. :-( The logs below helped me pinpoint the problem:

    [junit] 2009-10-14 23:15:07,359 DEBUG hdfs.StateChange (NameNode.java:blockReceived(948)) - *BLOCK* NameNode.blockReceived: from 127.0.0.1:57020 1 blocks.
    [junit] 2009-10-14 23:15:07,364 DEBUG hdfs.StateChange (FSNamesystem.java:blockReceived(2657)) - BLOCK* NameSystem.blockReceived: blk_-3647824922726121286_1008 is received from 127.0.0.1:57020
    [junit] 2009-10-14 23:15:07,364 DEBUG namenode.FSNamesystem (DatanodeDescriptor.java:processReportedBlock(466)) - Reported block blk_-3647824922726121286_1008 on 127.0.0.1:57020 size 65536 replicaState = FINALIZED
    [junit] 2009-10-14 23:15:07,364 DEBUG namenode.FSNamesystem (DatanodeDescriptor.java:processReportedBlock(479)) - In memory blockUCState = UNDER_CONSTRUCTION
    [junit] 2009-10-14 23:15:07,365 WARN  namenode.FSNamesystem (BlockManager.java:addStoredBlock(1029)) - Inconsistent size for block blk_-3647824922726121286_1008 reported from 127.0.0.1:57020 current size is 32768 reported size is 65536
    [junit] 2009-10-14 23:15:07,365 INFO  hdfs.StateChange (BlockManager.java:addStoredBlock(1098)) - BLOCK* NameSystem.addStoredBlock: blockMap updated: 127.0.0.1:57020 is added to BlockInfoUnderConstruction{
    [junit]   blockUCState=UNDER_CONSTRUCTION
    [junit]   replicas=[blk_-3647824922726121286_1007, blk_-3647824922726121286_1007]
    [junit]   primaryNodeIndex=-1} size 65536
    [junit] 2009-10-14 23:15:07,365 INFO  namenode.FSNamesystem (FSNamesystem.java:updatePipeline(3951)) - updatePipeline(block=blk_-3647824922726121286_1007, newGenerationStamp=1008, newLength=65536, newNodes=[127.0.0.1:57020], clientName=DFSClient_-1021096837)
    [junit] 2009-10-14 23:15:07,366 DEBUG hdfs.StateChange (FSDirectory.java:persistBlocks(321)) - DIR* FSDirectory.persistBlocks: /TC7/foo with 1 blocks is persisted to the file system
    [junit] 2009-10-14 23:15:07,369 INFO  namenode.FSNamesystem (FSNamesystem.java:updatePipeline(4000)) - updatePipeline(blk_-3647824922726121286_1007) successfully to blk_-3647824922726121286_1008
    [junit] 2009-10-14 23:15:07,370 DEBUG hdfs.DFSClient (DFSClient.java:endBlock(2613)) - Closing old block blk_-3647824922726121286_1008
...
    [junit] 2009-10-14 23:15:12,317 DEBUG hdfs.StateChange (NameNode.java:complete(642)) - *DIR* NameNode.complete: /TC7/foo for DFSClient_-1021096837
    [junit] 2009-10-14 23:15:12,317 DEBUG hdfs.StateChange (FSNamesystem.java:completeFileInternal(1318)) - DIR* NameSystem.completeFile: /TC7/foo for DFSClient_-1021096837
    [junit] 2009-10-14 23:15:12,318 INFO  hdfs.StateChange (FSNamesystem.java:checkFileProgress(1402)) - BLOCK* NameSystem.checkFileProgress: block BlockInfoUnderConstruction{
    [junit]   blockUCState=COMMITTED
    [junit]   replicas=[blk_-3647824922726121286_1008]
    [junit]   primaryNodeIndex=-1} has not reached minimal replication 1
    [junit] 2009-10-14 23:15:12,720 INFO  hdfs.DFSClient (DFSClient.java:completeFile(3634)) - Could not complete file /TC7/foo retrying...
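
To make the failure concrete, here is a minimal, self-contained Java sketch of what the last few log lines show: checkFileProgress keeps reporting the committed last block as below minimal replication, so completeFile is refused and the client keeps retrying, which is why close() never returns. All class and method names below are made up for illustration; this is not the real DFSClient or FSNamesystem code.

    public class CompleteFileRetrySketch {

        // Toy stand-in for the NameNode-side checkFileProgress() decision.
        static class ToyNameNode {
            int countedReplicasOfLastBlock = 0; // the broken state in the log: no replica counted
            int minReplication = 1;

            boolean completeFile(String path) {
                // The file can only be closed once the committed last block
                // has reached minimal replication.
                return countedReplicasOfLastBlock >= minReplication;
            }
        }

        public static void main(String[] args) throws InterruptedException {
            ToyNameNode nn = new ToyNameNode();
            // Bounded toy retry loop standing in for the client's retries around
            // completeFile(); here every attempt fails the same way, which is why
            // close() never returns and TC7 looks hung.
            for (int attempt = 1; attempt <= 5; attempt++) {
                if (nn.completeFile("/TC7/foo")) {
                    System.out.println("file closed");
                    return;
                }
                System.out.println("Could not complete file /TC7/foo retrying... (attempt " + attempt + ")");
                Thread.sleep(400); // the log shows a short pause between attempts
            }
            System.out.println("last block never reaches minimal replication -> close() blocks, TC7 hangs");
        }
    }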
 

> TestFileAppend3#TC7 sometimes hangs
> -----------------------------------
>
>                 Key: HDFS-668
>                 URL: https://issues.apache.org/jira/browse/HDFS-668
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 0.21.0
>            Reporter: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: hdfs-668.patch
>
>
> TestFileAppend3 hangs because it fails to close the file. The following log snippet shows the cause of the problem:
>     [junit] 2009-10-01 07:00:00,719 WARN  hdfs.DFSClient (DFSClient.java:setupPipelineForAppendOrRecovery(3004)) - Error Recovery for block blk_-4098350497078465335_1007 in pipeline 127.0.0.1:58375, 127.0.0.1:36982: bad datanode 127.0.0.1:36982
>     [junit] 2009-10-01 07:00:00,721 INFO  datanode.DataNode (DataXceiver.java:opWriteBlock(224)) - Receiving block blk_-4098350497078465335_1007 src: /127.0.0.1:40252 dest: /127.0.0.1:58375
>     [junit] 2009-10-01 07:00:00,721 INFO  datanode.DataNode (FSDataset.java:recoverClose(1248)) - Recover failed close blk_-4098350497078465335_1007
>     [junit] 2009-10-01 07:00:00,723 INFO  datanode.DataNode (DataXceiver.java:opWriteBlock(369)) - Received block blk_-4098350497078465335_1008 src: /127.0.0.1:40252 dest: /127.0.0.1:58375 of size 65536
>     [junit] 2009-10-01 07:00:00,724 INFO  hdfs.StateChange (BlockManager.java:addStoredBlock(1006)) - BLOCK* NameSystem.addStoredBlock: addStoredBlock request received for blk_-4098350497078465335_1008 on 127.0.0.1:58375 size 65536 But it does not belong to any file.
>     [junit] 2009-10-01 07:00:00,724 INFO  namenode.FSNamesystem (FSNamesystem.java:updatePipeline(3946)) - updatePipeline(block=blk_-4098350497078465335_1007, newGenerationStamp=1008, newLength=65536, newNodes=[127.0.0.1:58375], clientName=DFSClient_995688145)
>  
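
The quoted log shows the ordering problem: the datanode reports blk_-4098350497078465335_1008 before the client's updatePipeline call has updated the NameNode's record from generation stamp 1007 to 1008, so the report matches no block the NameNode knows about. A minimal Java sketch of that ordering (made-up names, not the real BlockManager/FSNamesystem code):

    import java.util.HashMap;
    import java.util.Map;

    public class GenStampRaceSketch {

        // Toy block map keyed by generation stamp; the real NameNode keys on the
        // block itself, this only makes the ordering visible.
        static final Map<Long, String> blockToFile = new HashMap<>();

        public static void main(String[] args) {
            long blockId = -4098350497078465335L;
            blockToFile.put(1007L, "/TC7/foo"); // NameNode still knows the block as genstamp 1007

            // 1. The datanode finishes the recovered write and reports the block
            //    under its NEW generation stamp first...
            reportReceivedBlock(blockId, 1008L);

            // 2. ...and only afterwards does the client's updatePipeline(1007 -> 1008)
            //    reach the NameNode, too late for the report above.
            updatePipeline(1007L, 1008L);
        }

        static void reportReceivedBlock(long blockId, long genStamp) {
            if (blockToFile.get(genStamp) == null) {
                // Matches the quoted log line: "... But it does not belong to any file."
                System.out.println("addStoredBlock: blk_" + blockId + "_" + genStamp
                    + " does not belong to any file.");
            }
        }

        static void updatePipeline(long oldGenStamp, long newGenStamp) {
            blockToFile.put(newGenStamp, blockToFile.remove(oldGenStamp));
            System.out.println("updatePipeline: genstamp " + oldGenStamp + " -> " + newGenStamp);
        }
    }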

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

