hadoop-hdfs-issues mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4562) Many excess replicas getting created
Date Thu, 07 Mar 2013 07:22:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595631#comment-13595631 ]

Amareshwari Sriramadasu commented on HDFS-4562:
-----------------------------------------------

Here is the namenode log for /user/mapred/system/job_201303040902_85451/jobToken:

{noformat}
grep '/user/mapred/system/job_201303040902_85451/jobToken' namenode.log
2013-03-06 00:02:21,787 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/mapred/system/job_201303040902_85451/jobToken. blk_6341538266279390032_231558736
2013-03-06 00:02:21,795 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on  file /user/mapred/system/job_201303040902_85451/jobToken from client DFSClient_201543923
2013-03-06 00:02:21,795 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /user/mapred/system/job_201303040902_85451/jobToken is closed by DFSClient_201543923
{noformat}
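
For anyone who wants to check the timing of these entries, here is a minimal sketch that parses the three lines quoted above and measures the gap between block allocation and file close. It assumes each entry sits on a single line in namenode.log; the regex and the helper names (parse_events, elapsed_ms) are illustrative and not part of Hadoop.

```python
import re
from datetime import datetime

# The three namenode.log entries quoted above, one entry per line.
LOG = """\
2013-03-06 00:02:21,787 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/mapred/system/job_201303040902_85451/jobToken. blk_6341538266279390032_231558736
2013-03-06 00:02:21,795 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on  file /user/mapred/system/job_201303040902_85451/jobToken from client DFSClient_201543923
2013-03-06 00:02:21,795 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /user/mapred/system/job_201303040902_85451/jobToken is closed by DFSClient_201543923
"""

# timestamp, level, logger name, message (layout assumed from the lines above)
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+) (\S+): (.*)$"
)


def parse_events(text):
    """Return a list of (timestamp, message) tuples for lines that match."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip anything that does not look like a log entry
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
        events.append((ts, m.group(4)))
    return events


def elapsed_ms(events, start_marker, end_marker):
    """Milliseconds between the first message containing each marker."""
    start = next(ts for ts, msg in events if start_marker in msg)
    end = next(ts for ts, msg in events if end_marker in msg)
    return round((end - start).total_seconds() * 1000)


events = parse_events(LOG)
print(elapsed_ms(events, "NameSystem.allocateBlock", "NameSystem.completeFile"))
```

For the entries above this reports 8 ms between allocateBlock and completeFile.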

Here is the configuration related to replication:
{noformat}
      <name>dfs.replication.max</name>
      <value>50</value>

      <name>dfs.replication</name>
      <value>3</value>

      <name>mapred.submit.replication</name>
      <value>3</value>
{noformat}

bq. If I were to guess, one of the datanode in the pipeline reports a replica for a block
late. The replication monitor is too aggressive and creates an additional replica meanwhile.
However this should not happen for every block that is created.

I agree that this does not happen for every block, but the number we are seeing in our cluster
is huge. We also have many short-lived files in our cluster, which results in two consecutive
delete requests to a datanode, which then hits HDFS-4544.
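
To make the guessed sequence concrete, here is a toy model of it. This is not HDFS code, and it only illustrates the hypothesis quoted above, not a confirmed root cause. ToyNameNode, its methods, and the datanode names are all hypothetical; the target of 3 is the dfs.replication value from the configuration above.

```python
# Toy model of the "late replica report" hypothesis; NOT actual HDFS logic.
TARGET_REPLICATION = 3  # dfs.replication from the configuration above


class ToyNameNode:
    """Hypothetical namenode that only knows about replicas reported so far."""

    def __init__(self, target):
        self.target = target
        self.reported = set()  # datanodes known to hold the block
        self.scheduled = []    # extra copies ordered by the monitor

    def block_received(self, datanode):
        """A datanode reports that it holds a replica of the block."""
        self.reported.add(datanode)

    def replication_monitor_pass(self, spare_datanodes):
        """Order one extra copy per replica that appears to be missing."""
        missing = self.target - len(self.reported)
        for _ in range(max(missing, 0)):
            self.scheduled.append(spare_datanodes.pop(0))

    def excess_replicas(self):
        """Number of replicas beyond the target, as seen by the namenode."""
        return max(len(self.reported) - self.target, 0)


def simulate_late_report():
    nn = ToyNameNode(TARGET_REPLICATION)
    # Pipeline of three datanodes; dn3 has the replica but reports it late.
    nn.block_received("dn1")
    nn.block_received("dn2")
    # The monitor runs before dn3's report and sees only two replicas.
    nn.replication_monitor_pass(spare_datanodes=["dn4"])
    # dn3's late report arrives, then the extra copy on dn4 completes.
    nn.block_received("dn3")
    for dn in nn.scheduled:
        nn.block_received(dn)
    return nn.excess_replicas()


print(simulate_late_report())
```

In this model a single late report leaves four replicas against a target of three, so one replica is excess and has to be deleted from some datanode.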
                
> Many excess replicas getting created
> ------------------------------------
>
>                 Key: HDFS-4562
>                 URL: https://issues.apache.org/jira/browse/HDFS-4562
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>            Reporter: Amareshwari Sriramadasu
>             Fix For: 1.2.0
>
>
> We are seeing too many excess replicas getting created in our cluster. The number of excess
> replicas per day comes out to more than 1 lakh (100,000).

