hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4396) sort on 400 nodes is now slower than in 18
Date Mon, 13 Oct 2008 19:30:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639177#action_12639177 ]

Devaraj Das commented on HADOOP-4396:

Jothi, I took a quick pass through the code, and it looks like one FSDataOutputStream layer is redundant.
Specifically, in the second IFile.Writer constructor, could you replace the line
{code}
this.checksumOut = new IFileOutputStream(out);
{code}
with
{code}
this.checksumOut = new IFileOutputStream(out.getWrappedStream());
{code}
and see whether it helps? Of course, please verify that the functionality of the HADOOP-3514 patch
remains intact with this change.
Also, please ensure that every rfs.open/create call that takes a buffer size as an argument
is passed the same value as in the pre-HADOOP-3514 case (lfs.open/create).
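
To illustrate the idea of skipping a redundant wrapper, here is a minimal, self-contained sketch that uses only java.io. It does not use the Hadoop classes: TrackingStream and ChecksumStream are hypothetical stand-ins for FSDataOutputStream and IFileOutputStream, and the record contents are made up. It only shows that writing through the unwrapped stream produces the same bytes while bypassing one layer of calls; whether that accounts for the slowdown in this issue still has to be measured.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;

class WrappedStreamSketch {

    /** Hypothetical stand-in for FSDataOutputStream: tracks position and exposes the inner stream. */
    static class TrackingStream extends FilterOutputStream {
        long pos = 0;
        int calls = 0; // number of write calls that passed through this layer

        TrackingStream(OutputStream out) { super(out); }

        @Override
        public void write(int b) throws IOException {
            calls++;
            pos++;
            out.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            calls++;
            pos += len;
            out.write(b, off, len);
        }

        OutputStream getWrappedStream() { return out; }
    }

    /** Hypothetical stand-in for IFileOutputStream: checksums the data and appends a CRC32 trailer. */
    static class ChecksumStream extends FilterOutputStream {
        private final CRC32 crc = new CRC32();

        ChecksumStream(OutputStream out) { super(out); }

        @Override
        public void write(int b) throws IOException {
            crc.update(b);
            out.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            crc.update(b, off, len);
            out.write(b, off, len);
        }

        /** Writes the 4-byte checksum trailer directly to the underlying stream. */
        void finish() throws IOException {
            long v = crc.getValue();
            byte[] trailer = {(byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v};
            out.write(trailer, 0, trailer.length);
            out.flush();
        }
    }

    /** What one run produced: calls seen by the tracking layer and the bytes that reached the sink. */
    static class Result {
        final int trackerCalls;
        final byte[] bytes;

        Result(int trackerCalls, byte[] bytes) {
            this.trackerCalls = trackerCalls;
            this.bytes = bytes;
        }
    }

    /** Writes three small records, either through the tracking layer or through its wrapped stream. */
    static Result writeRecords(boolean unwrap) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            TrackingStream tracker = new TrackingStream(sink);
            OutputStream target = unwrap ? tracker.getWrappedStream() : tracker;
            ChecksumStream checksumOut = new ChecksumStream(target);
            for (int i = 0; i < 3; i++) {
                byte[] record = {(byte) i, 1, 2, 3}; // made-up record payload
                checksumOut.write(record, 0, record.length);
            }
            checksumOut.finish();
            return new Result(tracker.calls, sink.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Result wrapped = writeRecords(false);
        Result direct = writeRecords(true);
        System.out.println("calls through extra layer (wrapped): " + wrapped.trackerCalls);
        System.out.println("calls through extra layer (direct):  " + direct.trackerCalls);
        System.out.println("same bytes: " + java.util.Arrays.equals(wrapped.bytes, direct.bytes));
    }
}
```

In this sketch the per-call cost of the extra layer is tiny; the point is only that every write is forwarded once more than necessary, which can add up when a sort writes a very large number of small records.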

> sort on 400 nodes is now slower than in 18
> ------------------------------------------
>                 Key: HADOOP-4396
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4396
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Jothi Padmanabhan
>            Assignee: Jothi Padmanabhan
>            Priority: Blocker
>             Fix For: 0.19.0
> Sort on 400 nodes with Hadoop release 18 takes about 29 minutes, but with the 19 branch
> it takes about 32 minutes. This behavior is consistent.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
