hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2399) Input key and value to combiner and reducer should be reused
Date Tue, 11 Dec 2007 19:55:43 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12550692 ]

Doug Cutting commented on HADOOP-2399:
--------------------------------------

+1

As a general rule, I think applications should not expect to be able to hold on to pointers
to objects passed to them, but should expect to be able to hold on to pointers returned to
them.  Lots of exceptions of course, but, in this case, I don't think applications should
be expecting to be able to hold on to these objects, and so any that break if we reuse them
were not well written.

These were originally reused.  Reuse was removed when the combiner was added, since the original
combiner kept pointers to the objects.
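
To illustrate the rule above, here is a minimal, self-contained sketch of what goes wrong when code holds on to objects that the framework reuses. It does not use the Hadoop API: MutableText, reusingIterator, holdReferences and copyValues are hypothetical names invented for this example, with MutableText standing in for a mutable key or value object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReuseDemo {
    /** Hypothetical mutable value, standing in for a reusable key/value object. */
    static final class MutableText {
        private String value = "";
        void set(String v) { value = v; }
        String get() { return value; }
        MutableText copy() {
            MutableText t = new MutableText();
            t.set(value);
            return t;
        }
    }

    /** Iterator that refills and returns the SAME instance on every next(). */
    static Iterator<MutableText> reusingIterator(List<String> raw) {
        final MutableText shared = new MutableText();
        final Iterator<String> it = raw.iterator();
        return new Iterator<MutableText>() {
            @Override public boolean hasNext() { return it.hasNext(); }
            @Override public MutableText next() {
                shared.set(it.next());
                return shared;
            }
        };
    }

    /** Broken under reuse: stores the passed-in objects themselves. */
    public static List<String> holdReferences(List<String> raw) {
        List<MutableText> held = new ArrayList<>();
        Iterator<MutableText> values = reusingIterator(raw);
        while (values.hasNext()) {
            held.add(values.next()); // every element is the same object
        }
        return contents(held);
    }

    /** Safe under reuse: copies each value before keeping it. */
    public static List<String> copyValues(List<String> raw) {
        List<MutableText> held = new ArrayList<>();
        Iterator<MutableText> values = reusingIterator(raw);
        while (values.hasNext()) {
            held.add(values.next().copy());
        }
        return contents(held);
    }

    private static List<String> contents(List<MutableText> held) {
        List<String> out = new ArrayList<>();
        for (MutableText t : held) {
            out.add(t.get());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a", "b", "c");
        System.out.println(holdReferences(raw)); // [c, c, c]
        System.out.println(copyValues(raw));     // [a, b, c]
    }
}
```

Code that needs a value beyond the current iteration should copy it, as copyValues does; code that only reads each value as it arrives is unaffected by reuse.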



> Input key and value to combiner and reducer should be reused
> ------------------------------------------------------------
>
>                 Key: HADOOP-2399
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2399
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.15.1
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.16.0
>
>
> Currently, the input key and value are recreated on every iteration for input to the
> combiner and reducer. It would speed up the system substantially if we reused the keys and
> values. The down side of doing it, is that it may break applications that count on holding
> references to previous keys and values, but I think it is worth doing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

