hadoop-mapreduce-issues mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-326) The lowest level map-reduce APIs should be byte oriented
Date Mon, 08 Feb 2010 17:56:28 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831017#action_12831017 ]

Doug Cutting commented on MAPREDUCE-326:

> hadoop's users are demanding we behave as if hadoop is 1.0

The consensus plan of record is that one may make incompatible API changes only by deprecating
existing APIs through at least one release cycle.  Prior to 1.0, deprecated APIs might be removed
in the very next release cycle, but we've certainly made exceptions.  After 1.0, deprecated
functionality won't be removed until 2.0.

If we're not going to release 0.21 then we should perhaps remove that branch.  Doing so would
lengthen the time before any deprecated APIs can be removed, since 0.22 would then need to be
API-compatible with the non-deprecated APIs of 0.20.

Back to a low-level binary API: the proposal here isn't to deprecate any higher level APIs,
but rather to add a new lower-level API that we can implement both the current APIs and new
APIs atop.  This should in fact help us to preserve high-level API compatibility longer, since
the mapreduce kernel will be independent of the high-level API.
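
To make the layering concrete, here is a minimal Java sketch of a byte-oriented kernel
interface with a typed API adapted on top of it.  Every name in it (RawMapper, RawCollector,
TypedMapper, TypedCollector, Codec, adapt, run) is hypothetical and invented for illustration;
none of them is an existing or proposed Hadoop API, and a real design would differ in many
details.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ByteKernelSketch {

    /** Hypothetical lowest-level output sink: keys and values are raw bytes. */
    interface RawCollector {
        void collect(byte[] key, byte[] value);
    }

    /** Hypothetical lowest-level map API: no generics and no serializers. */
    interface RawMapper {
        void map(byte[] key, byte[] value, RawCollector out);
    }

    /** Hypothetical typed output sink used by the high-level API. */
    interface TypedCollector<K, V> {
        void collect(K key, V value);
    }

    /** Hypothetical typed, user-facing map API layered above the raw one. */
    interface TypedMapper<K1, V1, K2, V2> {
        void map(K1 key, V1 value, TypedCollector<K2, V2> out);
    }

    /** Hypothetical codec that converts one type to and from bytes. */
    interface Codec<T> {
        byte[] encode(T value);
        T decode(byte[] bytes);
    }

    /** A UTF-8 string codec, used here only to keep the example small. */
    static final Codec<String> UTF8 = new Codec<String>() {
        public byte[] encode(String s) { return s.getBytes(StandardCharsets.UTF_8); }
        public String decode(byte[] b) { return new String(b, StandardCharsets.UTF_8); }
    };

    /** Wraps a typed mapper so that the kernel only ever sees byte arrays. */
    static <K1, V1, K2, V2> RawMapper adapt(TypedMapper<K1, V1, K2, V2> mapper,
                                            Codec<K1> k1, Codec<V1> v1,
                                            Codec<K2> k2, Codec<V2> v2) {
        return (key, value, out) ->
            mapper.map(k1.decode(key), v1.decode(value),
                       (k, v) -> out.collect(k2.encode(k), v2.encode(v)));
    }

    /** A typed mapper that emits (word, "1") for each space-separated word. */
    static final TypedMapper<String, String, String, String> TOKENIZER =
        (key, line, out) -> {
            for (String word : line.split(" ")) {
                out.collect(word, "1");
            }
        };

    /** Stand-in for the kernel: feeds one raw record to a raw mapper. */
    static List<String> run(RawMapper mapper, String key, String value) {
        List<String> pairs = new ArrayList<>();
        mapper.map(UTF8.encode(key), UTF8.encode(value),
                   (k, v) -> pairs.add(UTF8.decode(k) + "=" + UTF8.decode(v)));
        return pairs;
    }

    public static void main(String[] args) {
        RawMapper raw = adapt(TOKENIZER, UTF8, UTF8, UTF8, UTF8);
        System.out.println(run(raw, "0", "a b a")); // prints [a=1, b=1, a=1]
    }
}
```

In this sketch the stand-in kernel (run) depends only on RawMapper and RawCollector, so a
second typed API could be added through another adapter without touching it, which is the
compatibility benefit described above.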

> The lowest level map-reduce APIs should be byte oriented
> --------------------------------------------------------
>                 Key: MAPREDUCE-326
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-326
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: eric baldeschwieler
> As discussed here:
> https://issues.apache.org/jira/browse/HADOOP-1986#action_12551237
> The templates, serializers and other complexities that allow map-reduce to use arbitrary
> types complicate the design and lead to lots of object creates and other overhead that a byte
> oriented design would not suffer.  I believe the lowest level implementation of hadoop map-reduce
> should have byte string oriented APIs (for keys and values).  This API would be more performant,
> simpler and more easily cross language.
> The existing API could be maintained as a thin layer on top of the leaner API.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
