hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2094) DFS should not use round robin policy in determining on which volume (file system partition) to allocate for the next block
Date Fri, 25 Apr 2008 18:01:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592452#action_12592452
] 

Runping Qi commented on HADOOP-2094:
------------------------------------


By analyzing disk utilization data, we found that the four disks on each node were not
evenly utilized.
The first disk was the most heavily utilized, which is consistent with the potential
impact of the current policy for selecting a volume for a new block on data nodes.
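To make the contrast concrete, here is a minimal sketch of the two policies. The class and method names are illustrative only (they are not the actual FSDataset API): a strict round-robin chooser advances a shared cursor on every allocation, so concurrent writers whose allocations interleave in lock-step can each land repeatedly on the same partition, while a random chooser breaks that pattern.

```java
import java.util.Random;

// Hypothetical sketch, not the real DataNode code: contrasts the
// current strict round-robin volume chooser with the random policy
// proposed in this issue. Volumes are identified by index 0..n-1.
public class VolumeChoosers {
    private int nextVolume = 0;              // round-robin cursor
    private final Random random = new Random();

    // Current behavior: each new block goes to the next volume in
    // sequence. With 4 writers and 4 volumes interleaving their
    // allocations, each writer can keep hitting one partition.
    public int chooseRoundRobin(int numVolumes) {
        int v = nextVolume;
        nextVolume = (nextVolume + 1) % numVolumes;
        return v;
    }

    // Proposed behavior: pick a volume uniformly at random, so no
    // writer's blocks are pinned to a single partition.
    public int chooseRandom(int numVolumes) {
        return random.nextInt(numVolumes);
    }
}
```

With 4 volumes, four interleaved writers under round robin receive indices 0, 1, 2, 3, 0, 1, ... in turn, so writer k sees only volume k; under the random policy each block lands on an independently chosen volume.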


> DFS should not use round robin policy in determining on which volume (file system partition)
 to allocate for the next block
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2094
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2094
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: Runping Qi
>            Assignee: dhruba borthakur
>         Attachments: randomDatanodePartition.patch
>
>
> When multiple file system partitions are configured for the data storage of a data node,
> it uses a strict round robin policy to decide which partition to use for writing the
next block.
> This may result in anomalous cases in which the blocks of a file are not evenly distributed
across
> the partitions. For example, when we use distcp to copy files with 4 mappers running
concurrently on each node,
> those 4 mappers write to DFS at about the same rate. Thus, it is possible that
the 4 mappers write out
> their blocks in an interleaved fashion. If there are 4 file system partitions configured
for the local data node, it is possible that each mapper will
> continue to write its blocks onto the same file system partition.
> A simple random placement policy would avoid such anomalous cases and does not have any
obvious drawbacks.
>  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

