hadoop-hdfs-issues mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1338) Improve TestDFSIO
Date Tue, 10 Aug 2010 16:43:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896935#action_12896935 ]

Arun C Murthy commented on HDFS-1338:

Some possible enhancements we could aim for, under certain assumptions, are:

# Assume that the cluster is empty and TestDFSIO is the *only* application running
# Get all the nodes in the cluster
# Implement a custom input-split to write only 1 replica to a subset of the nodes in the cluster
# Read the replicas in a manner that ensures an equal number of node-local, rack-local and off-switch reads.
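
The steps above can be sketched roughly as follows. This is only an illustration of the balancing idea, not TestDFSIO code and not a Hadoop API: the names classify, plan_reads, rack_of and holders are hypothetical, the topology is made up, and a real implementation would have to express the chosen reader through the locations of a custom input-split and rely on the scheduler honouring them. It assumes each file has exactly one replica on a known node.

```python
from collections import Counter
from itertools import cycle

LOCALITIES = ("node-local", "rack-local", "off-switch")


def classify(reader, holder, rack_of):
    """Locality of a read when the task runs on `reader` and the single replica is on `holder`."""
    if reader == holder:
        return "node-local"
    if rack_of[reader] == rack_of[holder]:
        return "rack-local"
    return "off-switch"


def plan_reads(holders, rack_of):
    """Assign one reader node per file, cycling through the three locality classes.

    holders: file name -> node holding its only replica (hypothetical input)
    rack_of: node -> rack id for every node in the cluster (hypothetical input)
    Returns: file name -> (reader node, locality class)
    """
    load = Counter()          # reads assigned to each node so far
    plan = {}
    targets = cycle(LOCALITIES)
    for name in sorted(holders):
        holder = holders[name]
        target = next(targets)
        candidates = [n for n in sorted(rack_of)
                      if classify(n, holder, rack_of) == target]
        if not candidates:
            raise ValueError(f"no node can read {name} as {target}")
        # Prefer the least-loaded candidate so reads spread across nodes.
        reader = min(candidates, key=lambda n: load[n])
        load[reader] += 1
        plan[name] = (reader, target)
    return plan


# Made-up cluster: two racks with two nodes each; files were written to a
# subset of the nodes (n1 and n3), one replica per file.
rack_of = {"n1": "/r1", "n2": "/r1", "n3": "/r2", "n4": "/r2"}
holders = {"f0": "n1", "f1": "n3", "f2": "n1",
           "f3": "n3", "f4": "n1", "f5": "n3"}

plan = plan_reads(holders, rack_of)
for name, (reader, locality) in sorted(plan.items()):
    print(name, "read on", reader, "->", locality)
print(dict(Counter(locality for _, locality in plan.values())))
```

The three classes come out exactly equal only when the number of files is a multiple of three; otherwise they differ by at most one. With the reads planned this way, per-class throughput could be reported separately, which is what makes a node-local versus rack-local versus off-switch comparison possible.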


> Improve TestDFSIO
> -----------------
>                 Key: HDFS-1338
>                 URL: https://issues.apache.org/jira/browse/HDFS-1338
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Arun C Murthy
> Currently the read test in the TestDFSIO benchmark just opens a large side file and measures
> the read performance. The MR scheduler has no opportunity to do *any* optimization for the
> TestDFSIO MR application. The side effect of this is that it is *very* hard to do any meaningful
> analysis of the results of the benchmark, i.e. to check whether node-local, rack-local or off-switch
> read performance improved or degraded.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
