hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3366) Shuffle/Merge improvements
Date Tue, 13 May 2008 19:11:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596502#action_12596502 ]

Devaraj Das commented on HADOOP-3366:
-------------------------------------

Sameer, today's ramfs serves both as a memory manager and as a filesystem. So if we were to
implement a new memory manager, I am guessing that it'd be close to what we already have in
the ramfs (e.g. it already does byte array allocations, keeps track of memory usage, etc.).
We can get to an optimal memory manager by reducing the complexity (if any) in the ramfs memory
manager.
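
To make the memory manager role concrete, here is a minimal sketch of a component that hands out
byte arrays and keeps track of memory usage against a fixed budget. The class name RamBudget and
all of its methods are hypothetical and are not taken from the ramfs code; the real ramfs may
track usage quite differently.

```java
/** Hypothetical byte-array allocator with a fixed budget (an assumption, not Hadoop's ramfs). */
final class RamBudget {
    private final long limitBytes;
    private long usedBytes;

    RamBudget(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Returns a new array of the requested size, or null if it would exceed the budget. */
    synchronized byte[] allocate(int size) {
        if (size < 0 || usedBytes + size > limitBytes) {
            return null;
        }
        usedBytes += size;
        return new byte[size];
    }

    /** Gives the bytes of a previously allocated array back to the budget. */
    synchronized void release(byte[] buf) {
        usedBytes -= buf.length;
    }

    synchronized long usedBytes() {
        return usedBytes;
    }

    public static void main(String[] args) {
        RamBudget budget = new RamBudget(10);
        byte[] first = budget.allocate(6);
        System.out.println("used after first allocation: " + budget.usedBytes());
        System.out.println("second allocation fits: " + (budget.allocate(5) != null));
        budget.release(first);
        System.out.println("used after release: " + budget.usedBytes());
    }
}
```

A caller that gets null back would presumably fall back to disk or wait for memory to be released;
that policy is outside this sketch.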

Regarding using the ramfs as a FileSystem, I think if we remove the ChecksumFS layer, we'd
remove a good amount of complexity. Beyond that, if we ensure that the APIs that
read from the ramfs do not allocate buffers but instead reset internal pointers into the byte arrays
holding the keys and values, we should be good. The two classes that are used as the destination
of data read from files are DataOutputBuffer and ValueBytes. Both of these internally
allocate byte arrays. I am suggesting that we implement these two classes specially for
ramfs files, so that we'd just update the pointers/offsets/lengths in these classes instead
of copying from the files.
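
A rough sketch of the pointer-resetting idea, under stated assumptions: ByteArrayView is a
hypothetical stand-in for a ramfs-specific DataOutputBuffer/ValueBytes, and the record layout
(4-byte big-endian length followed by the bytes, key first and then value) is invented for
illustration and is not the actual on-disk or ramfs format. Reading a record only updates the
array reference, offset, and length of the two views; no key or value bytes are copied.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Demo driver; the record layout used here is an assumption made for this sketch. */
final class ZeroCopyReadSketch {
    /** Points key and value at the record starting at pos; returns the next record's position. */
    static int readRecord(byte[] file, int pos, ByteArrayView key, ByteArrayView value) {
        int keyLen = readInt(file, pos);
        key.reset(file, pos + 4, keyLen);
        int valPos = pos + 4 + keyLen;
        int valLen = readInt(file, valPos);
        value.reset(file, valPos + 4, valLen);
        return valPos + 4 + valLen;
    }

    /** Decodes a 4-byte big-endian int without allocating anything. */
    static int readInt(byte[] b, int p) {
        return ((b[p] & 0xff) << 24) | ((b[p + 1] & 0xff) << 16)
             | ((b[p + 2] & 0xff) << 8) | (b[p + 3] & 0xff);
    }

    /** Builds an in-memory "file" of {key, value} pairs in the assumed layout. */
    static byte[] encode(String[][] records) {
        int size = 0;
        for (String[] r : records) {
            size += 8 + r[0].getBytes(StandardCharsets.UTF_8).length
                      + r[1].getBytes(StandardCharsets.UTF_8).length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (String[] r : records) {
            byte[] k = r[0].getBytes(StandardCharsets.UTF_8);
            byte[] v = r[1].getBytes(StandardCharsets.UTF_8);
            buf.putInt(k.length).put(k).putInt(v.length).put(v);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] file = encode(new String[][] {{"k1", "v1"}, {"key2", "value2"}});
        ByteArrayView key = new ByteArrayView();
        ByteArrayView value = new ByteArrayView();
        int pos = 0;
        while (pos < file.length) {
            pos = readRecord(file, pos, key, value);   // the same two views are reused
            System.out.println(key.asString() + " -> " + value.asString());
        }
    }
}

/** Hypothetical view over a region of someone else's byte array; it never copies. */
final class ByteArrayView {
    private byte[] data;
    private int offset;
    private int length;

    /** Re-points this view at data[offset .. offset + length). */
    void reset(byte[] data, int offset, int length) {
        if (offset < 0 || length < 0 || offset > data.length - length) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        }
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    byte[] getData() { return data; }
    int getOffset() { return offset; }
    int getLength() { return length; }

    /** Debugging helper only; this one does copy, in order to build the String. */
    String asString() {
        return new String(data, offset, length, StandardCharsets.UTF_8);
    }
}
```

The trade-off is that a view is only valid while the backing array is left untouched, so the
ramfs file would have to stay allocated for as long as the key/value views are in use.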

> Shuffle/Merge improvements
> --------------------------
>
>                 Key: HADOOP-3366
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3366
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.18.0
>
>
> This is intended to be a meta-issue to track various improvements to shuffle/merge in the reducer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

