hadoop-hdfs-issues mailing list archives

From "sri (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1338) Improve TestDFSIO
Date Mon, 24 Oct 2011 18:36:32 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13134338#comment-13134338 ]

sri commented on HDFS-1338:
---------------------------

Hey, when I run the DFSIO benchmark, I find that a single node runs all the mappers. That node
stores one copy of the data, and the second and third copies are distributed across the other
nodes in the cluster. Is this how the benchmark usually behaves, or am I missing something?
I do not know why all my mappers run on one node. Can someone help me understand
this better?

                
> Improve TestDFSIO
> -----------------
>
>                 Key: HDFS-1338
>                 URL: https://issues.apache.org/jira/browse/HDFS-1338
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: test
>            Reporter: Arun C Murthy
>         Attachments: TestDFSIOparser_y0.20.patch
>
>
> Currently the read test in the TestDFSIO benchmark just opens a large side file and measures
> the read performance. The MR scheduler has no opportunity to do *any* optimization for the
> TestDFSIO MR application. The side effect of this is that it is *very* hard to do any meaningful
> analysis of the benchmark results, i.e., to check whether node-local, rack-local, or off-switch
> read performance improved or degraded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
