hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3514) Improve the map output handling at the tasktracker for shuffle
Date Sun, 08 Jun 2008 14:17:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603381#action_12603381 ]

Devaraj Das commented on HADOOP-3514:
-------------------------------------

Some thoughts:
(1) Keep the index file in memory. This won't be too costly in most cases, since the space
required is on the order of the number of reducers.
(2) Store the checksum inline with the map output.
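
To make the two ideas concrete, here is a minimal sketch in Python. It is not Hadoop's actual implementation or on-disk format. The index record layout (a start offset and a length per reducer, each a 64-bit integer), the CRC32 trailer after each segment, and the names `write_map_output`, `IndexCache`, and `read_segment` are all assumptions made for illustration. With this layout the cached index costs 16 bytes per reducer, and serving a segment needs a single contiguous read of the output file because the checksum travels with the data.

```python
import struct
import zlib
from collections import OrderedDict

# Hypothetical index record: (start offset, segment length) stored as two
# big-endian 64-bit integers per reducer. An assumption, not Hadoop's layout.
INDEX_RECORD = struct.Struct(">qq")
CRC = struct.Struct(">I")


def write_map_output(partitions):
    """Build (data, index) bytes; each segment is followed by its CRC32."""
    data = bytearray()
    index = bytearray()
    for payload in partitions:
        start = len(data)
        data += payload
        # Checksum is written inline, directly after the payload it covers.
        data += CRC.pack(zlib.crc32(payload) & 0xFFFFFFFF)
        index += INDEX_RECORD.pack(start, len(data) - start)
    return bytes(data), bytes(index)


class IndexCache:
    """Keeps parsed index files in memory, evicting the least recently used."""

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._cache = OrderedDict()
        self.loads = 0  # how many times an index had to be read and parsed

    def get(self, map_id, load_index_bytes):
        if map_id in self._cache:
            self._cache.move_to_end(map_id)
            return self._cache[map_id]
        raw = load_index_bytes()  # the only point that touches the index file
        self.loads += 1
        records = list(INDEX_RECORD.iter_unpack(raw))
        self._cache[map_id] = records
        if len(self._cache) > self.max_entries:
            self._cache.popitem(last=False)
        return records


def read_segment(data, index_records, reducer):
    """Return one reducer's payload after verifying its inline checksum."""
    start, length = index_records[reducer]
    segment = data[start:start + length]
    payload, stored = segment[:-CRC.size], segment[-CRC.size:]
    if CRC.unpack(stored)[0] != zlib.crc32(payload) & 0xFFFFFFFF:
        raise IOError("checksum mismatch for reducer %d" % reducer)
    return payload


if __name__ == "__main__":
    data, index = write_map_output([b"alpha", b"", b"gamma"])
    cache = IndexCache()
    records = cache.get("map_0", lambda: index)
    print(read_segment(data, records, 2))
```

In this sketch, repeated requests for the same map output hit the cache and skip the index file entirely, so the only disk access per request is the read of the segment itself.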

> Improve the map output handling at the tasktracker for shuffle
> --------------------------------------------------------------
>
>                 Key: HADOOP-3514
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3514
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.18.0
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>             Fix For: 0.19.0
>
>
> A couple of improvements can be made to reduce the number of seeks in the files related to the map outputs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

