hadoop-mapreduce-issues mailing list archives

From "eric baldeschwieler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-326) The lowest level map-reduce APIs should be byte oriented
Date Fri, 05 Feb 2010 08:06:28 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829993#action_12829993 ]

eric baldeschwieler commented on MAPREDUCE-326:

heh! I still like this idea, but I don't see it as the first thing we should think about.
If we do it, we'll want to be sure we do it in a way that is neutral or positive for both
performance and the API.

I think a long-term goal should be to refactor the whole MR framework into a resource manager
that is map-reduce-unaware, plus "user" or library code that executes MR.

If we got there, then we could have competing implementations of MR, some byte oriented and
some not, and let folks compare and contrast them.  I'd suggest that it might make more sense
to invest in moving the MR framework into "user space" rather than replumbing the current
implementation to be binary first.

We have a V2 MR framework bug kicking about somewhere.  Anyone know it?

> The lowest level map-reduce APIs should be byte oriented
> --------------------------------------------------------
>                 Key: MAPREDUCE-326
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-326
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: eric baldeschwieler
> As discussed here:
> https://issues.apache.org/jira/browse/HADOOP-1986#action_12551237
> The templates, serializers and other complexities that allow map-reduce to use arbitrary
> types complicate the design and lead to lots of object creation and other overhead that a byte
> oriented design would not suffer.  I believe the lowest level implementation of hadoop map-reduce
> should have byte string oriented APIs (for keys and values).  This API would be more performant,
> simpler and easier to use across languages.
> The existing API could be maintained as a thin layer on top of the leaner API.
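
For illustration only, here is a minimal Java sketch of what "a thin layer on top of the leaner API" might look like. None of these names come from Hadoop: RawMapper, RawCollector, TypedMapper, TypedCollector, Codec and adapt are all hypothetical, and the sketch assumes keys and values are passed as plain byte arrays. It shows a typed mapper being wrapped so that it runs against a byte-string mapper interface, with serialization confined to the adapter.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ByteOrientedSketch {

    /** Hypothetical lowest-level collector: receives raw key/value bytes. */
    interface RawCollector {
        void collect(byte[] key, byte[] value);
    }

    /** Hypothetical lowest-level mapper: keys and values are byte strings. */
    interface RawMapper {
        void map(byte[] key, byte[] value, RawCollector out);
    }

    /** Hypothetical typed API, kept as a layer above the raw one. */
    interface TypedCollector<K, V> {
        void collect(K key, V value);
    }

    interface TypedMapper<KI, VI, KO, VO> {
        void map(KI key, VI value, TypedCollector<KO, VO> out);
    }

    /** Hypothetical codec that converts between a type and its bytes. */
    interface Codec<T> {
        byte[] encode(T value);
        T decode(byte[] bytes);
    }

    /** UTF-8 string codec, used here only to keep the demo small. */
    static final Codec<String> UTF8 = new Codec<String>() {
        public byte[] encode(String s) { return s.getBytes(StandardCharsets.UTF_8); }
        public String decode(byte[] b) { return new String(b, StandardCharsets.UTF_8); }
    };

    /** The thin layer: decode inputs, call the typed mapper, encode outputs. */
    static <KI, VI, KO, VO> RawMapper adapt(
            TypedMapper<KI, VI, KO, VO> typed,
            Codec<KI> keyIn, Codec<VI> valueIn,
            Codec<KO> keyOut, Codec<VO> valueOut) {
        return (key, value, out) -> typed.map(
                keyIn.decode(key),
                valueIn.decode(value),
                (k, v) -> out.collect(keyOut.encode(k), valueOut.encode(v)));
    }

    /** Runs a typed word-emitting mapper through the raw interface. */
    static List<String> runWordCountMap(String line) {
        TypedMapper<String, String, String, String> words = (offset, text, out) -> {
            for (String w : text.split("\\s+")) {
                if (!w.isEmpty()) {
                    out.collect(w, "1");
                }
            }
        };
        RawMapper raw = adapt(words, UTF8, UTF8, UTF8, UTF8);

        // The caller of the raw API only ever sees byte arrays.
        List<String> result = new ArrayList<>();
        raw.map(UTF8.encode("0"), UTF8.encode(line),
                (k, v) -> result.add(UTF8.decode(k) + "=" + UTF8.decode(v)));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(runWordCountMap("to be or not"));
    }
}
```

In this sketch a mapper written directly against RawMapper would skip the codecs entirely, which is where the issue expects the savings in object creation to come from; whether that holds in practice would need measuring.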

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
