hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4396) sort on 400 nodes is now slower than in 18
Date Mon, 13 Oct 2008 19:30:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639177#action_12639177 ]

Devaraj Das commented on HADOOP-4396:
-------------------------------------

Jothi, I took a pass through the code, and it looks like one FSDataOutputStream layer is redundant.
Specifically, in the second IFile.Writer constructor, could you replace the line
{code}
this.checksumOut = new IFileOutputStream(out);
{code}
with
{code}
this.checksumOut = new IFileOutputStream(out.getWrappedStream());
{code}
and see whether it helps? Of course, please verify that the functionality of the HADOOP-3514 patch
remains intact with this change. Also, please ensure that every rfs.open/create call that takes a
buffer size as an argument is passed the same value as in the pre-HADOOP-3514 case (lfs.open/create).
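
To illustrate the redundant-layer point above, here is a minimal, self-contained sketch. It does not
use the Hadoop classes: TrackingStream and ChecksumStream are hypothetical stand-ins for
FSDataOutputStream and IFileOutputStream, the one-byte sum is not the real checksum, and the call
counter is only an assumed proxy for whatever per-write work the extra layer does. It shows that
handing the checksum stream the wrapped stream produces the same bytes while bypassing the outer layer.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class StreamLayerSketch {

    /** Hypothetical stand-in for FSDataOutputStream: counts write calls routed through it. */
    static final class TrackingStream extends FilterOutputStream {
        long calls = 0;

        TrackingStream(OutputStream out) { super(out); }

        @Override public void write(int b) throws IOException {
            calls++;
            out.write(b);
        }

        @Override public void write(byte[] b, int off, int len) throws IOException {
            calls++;
            out.write(b, off, len);
        }

        /** Exposes the stream underneath, in the spirit of out.getWrappedStream(). */
        OutputStream getWrappedStream() { return out; }
    }

    /** Hypothetical stand-in for IFileOutputStream: forwards data, then appends a 1-byte sum. */
    static final class ChecksumStream extends FilterOutputStream {
        private int sum = 0;

        ChecksumStream(OutputStream out) { super(out); }

        @Override public void write(int b) throws IOException {
            sum += b & 0xff;
            out.write(b);
        }

        @Override public void write(byte[] b, int off, int len) throws IOException {
            for (int i = off; i < off + len; i++) sum += b[i] & 0xff;
            out.write(b, off, len);
        }

        /** Appends the toy checksum byte and flushes the underlying stream. */
        void finish() throws IOException {
            out.write(sum & 0xff);
            out.flush();
        }
    }

    /** Writes payload plus checksum into sink; returns how many writes hit the outer wrapper. */
    static long callsThroughWrapper(boolean unwrap, OutputStream sink, byte[] payload) {
        TrackingStream wrapper = new TrackingStream(sink);
        // unwrap == true corresponds to passing out.getWrappedStream() instead of out.
        OutputStream target = unwrap ? wrapper.getWrappedStream() : wrapper;
        ChecksumStream checksumOut = new ChecksumStream(target);
        try {
            checksumOut.write(payload, 0, payload.length);
            checksumOut.finish();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return wrapper.calls;
    }

    public static void main(String[] args) {
        byte[] payload = {1, 2, 3};
        ByteArrayOutputStream kept = new ByteArrayOutputStream();
        ByteArrayOutputStream bypassed = new ByteArrayOutputStream();
        System.out.println("calls with wrapper kept: " + callsThroughWrapper(false, kept, payload));
        System.out.println("calls with wrapper bypassed: " + callsThroughWrapper(true, bypassed, payload));
        System.out.println("same bytes written: "
                + java.util.Arrays.equals(kept.toByteArray(), bypassed.toByteArray()));
    }
}
```

The sketch only demonstrates the mechanics of bypassing a layer. Whether the real wrapper does work
that must be preserved is exactly what the HADOOP-3514 verification requested above would need to confirm.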

> sort on 400 nodes is now slower than in 18
> ------------------------------------------
>
>                 Key: HADOOP-4396
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4396
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Jothi Padmanabhan
>            Assignee: Jothi Padmanabhan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>
> Sort on 400 nodes on Hadoop release 18 takes about 29 minutes, but on the 19 branch it takes
> about 32 minutes. This behavior is consistent.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

