hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2976) Blocks staying underreplicated (for unclosed file)
Date Tue, 11 Mar 2008 07:52:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12577325#action_12577325 ]

dhruba borthakur commented on HADOOP-2976:
------------------------------------------

Block-report processing computes the under-replication and over-replication state of a block only if the block is not already in the blocksMap.

In our case, the first datanode confirmed the block, and the block was inserted into the blocksMap. When the next block report arrives from that datanode, the namenode notices that the blocksMap already contains this entry, so it does not check whether the block is over-replicated or under-replicated. I guess it would be expensive to compute under-replication and over-replication for every block in a block report.
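The skip-if-already-known behavior described above can be sketched roughly as follows. This is a toy model of the logic, not the actual NameNode code; the class and method names (BlockReportSketch, processReportedBlock, checkReplication) are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy sketch (hypothetical names): replication is only checked for blocks
// that are NOT already present in the blocksMap, mirroring the behavior
// described in the comment above.
public class BlockReportSketch {
    // blockId -> set of datanodes known to hold a replica of the block
    private final Map<Long, Set<String>> blocksMap = new HashMap<>();
    private final int targetReplication;

    public BlockReportSketch(int targetReplication) {
        this.targetReplication = targetReplication;
    }

    /** Returns true if a replication check was performed for this block. */
    public boolean processReportedBlock(long blockId, String datanode) {
        Set<String> holders = blocksMap.get(blockId);
        if (holders != null) {
            // Block already known: record the holder, but skip the
            // (expensive) under/over-replication computation.
            holders.add(datanode);
            return false;
        }
        // New block: insert it into the map and check its replication state.
        holders = new HashSet<>();
        holders.add(datanode);
        blocksMap.put(blockId, holders);
        checkReplication(blockId, holders);
        return true;
    }

    private void checkReplication(long blockId, Set<String> holders) {
        if (holders.size() < targetReplication) {
            System.out.println("blk_" + blockId + " under-replicated: "
                + holders.size() + "/" + targetReplication);
        }
    }
}
```

With this structure, a block whose first confirmation already put it in the map is never re-examined by later reports from the same datanode, which is the gap the issue describes.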

> Blocks staying underreplicated (for unclosed file)
> --------------------------------------------------
>
>                 Key: HADOOP-2976
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2976
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.15.3
>            Reporter: Koji Noguchi
>            Assignee: dhruba borthakur
>            Priority: Minor
>             Fix For: 0.17.0
>
>         Attachments: leaseExpiryReplication.patch
>
>
> We had two files staying underreplicated for over a day.
> I checked that these under-replicated blocks are not corrupted.
> (They were both task tmp files and most likely didn't get closed.)
> Taking one file, /aaa/_task_200803040823_0001_r_000421_0/part-00421
> Namenode log showed
> namenode.log.2008-03-04 2008-03-04 16:19:21,478 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /aaa/_task_200803040823_0001_r_000421_0/part-00421.  blk_-7848645760735416126
> 2008-03-04 16:19:24,357 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 11.1.111.111:22222 is added to blk_-7848645760735416126
> On the datanode 11.1.111.111, it showed 
> 2008-03-04 16:19:24,358 INFO org.apache.hadoop.dfs.DataNode: Received block blk_-7848645760735416126 from /55.55.55.55 and operation failed at /22.2.222.22
> On the second datanode 22.2.222.22, it showed 
> 2008-03-04 16:19:21,578 INFO org.apache.hadoop.dfs.DataNode: Exception writing to mirror 33.3.33.33
> java.net.SocketException: Connection reset
>   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
>   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>   at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>   at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
>   at java.io.DataOutputStream.write(DataOutputStream.java:90)
>   at org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveChunk(DataNode.java:1333)
>   at org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(DataNode.java:1386)
>   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:938)
>   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:804)
>   at java.lang.Thread.run(Thread.java:619)
> 2008-03-04 16:19:24,358 ERROR org.apache.hadoop.dfs.DataNode: DataXceiver: java.net.SocketException: Broken pipe
>   at java.net.SocketOutputStream.socketWrite0(Native Method)
>   at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>   at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>   at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>   at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>   at java.io.DataOutputStream.flush(DataOutputStream.java:106)
>   at org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(DataNode.java:1394)
>   at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:938)
>   at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:804)
>   at java.lang.Thread.run(Thread.java:619)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

