hadoop-common-dev mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3285) map tasks with node local splits do not always read from local nodes
Date Wed, 23 Apr 2008 01:43:22 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591508#action_12591508 ]

Hadoop QA commented on HADOOP-3285:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
http://issues.apache.org/jira/secure/attachment/12380724/3285-4.patch
against trunk revision 645773.

    @author +1.  The patch does not contain any @author tags.

    tests included +1.  The patch appears to include 3 new or modified tests.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new javac compiler warnings.

    release audit +1.  The applied patch does not generate any new release audit warnings.

    findbugs +1.  The patch does not introduce any new Findbugs warnings.

    core tests -1.  The patch failed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2298/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2298/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2298/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2298/console

This message is automatically generated.

> map tasks with node local splits do not always read from local nodes
> --------------------------------------------------------------------
>
>                 Key: HADOOP-3285
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3285
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Runping Qi
>            Assignee: Owen O'Malley
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: 3285-3.patch, 3285-4.patch, 3285.patch, 3285.patch
>
>
> I ran a simple map/reduce job counting the number of records in the input data.
> The number of reducers was set to 1.
> I did not set the number of mappers. Thus by default, all splits except the last split of a file contain one dfs block (128MB in my case).
> The web gui indicated that 99% of map tasks had local splits.
> Thus I expected that most of the dfs reads should have come from the local data nodes.
> However, when I examined the traffic on the network interfaces,
> I found that about 50% of each node's traffic went through the loopback interface and the other 50% went through the ethernet card!
> Also, the switch monitoring indicated that a lot of traffic went through the links and across racks!
> This indicated that the data locality feature does not work as expected.
> To confirm that, I set the number of map tasks to a very high number so that it forced the split size down to about 27MB.
> The web gui indicated that 99% of map tasks had local splits, as expected.
> The ethernet interface monitor showed that almost 100% traffic went through the loopback interface, as it should be.
> Also, the switch monitoring indicated that there was very little traffic through the links and across racks.
> This implies that some corner cases are not handled properly.
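
A rough sketch of why asking for more map tasks shrinks the splits, as in the second run described above. It assumes the split size is chosen as max(minSize, min(goalSize, blockSize)) with goalSize = totalSize / requested maps; that formula is an assumption here, not something stated in this thread. The 1000-block input, the minimum split size, and both map counts are hypothetical values picked only to reproduce the 128MB and roughly 27MB cases.

```python
MB = 1024 * 1024


def goal_size(total_bytes, requested_maps):
    """Bytes per split if the input were divided evenly among the requested maps."""
    return total_bytes // max(requested_maps, 1)


def split_size(goal, min_size, block_size):
    """Clamp the goal: never above one block, never below the configured minimum."""
    return max(min_size, min(goal, block_size))


BLOCK = 128 * MB        # dfs block size from the report
TOTAL = 1000 * BLOCK    # hypothetical input: 1000 full blocks
MIN_SPLIT = 1           # hypothetical minimum split size, in bytes

# Few requested maps: the goal exceeds a block, so each split is one block.
default_split = split_size(goal_size(TOTAL, 2), MIN_SPLIT, BLOCK)

# Many requested maps: the goal drops below a block, so splits shrink with it.
small_split = split_size(goal_size(TOTAL, 4741), MIN_SPLIT, BLOCK)

print("default split (MB):", default_split / MB)
print("small split (MB):", small_split / MB)
```

Under these assumptions the first case yields one-block splits and the second yields splits of about 27MB, matching the two runs in the report. The sketch only covers split sizing; it says nothing about which replica a task actually reads from.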

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

