hadoop-common-dev mailing list archives

From "Klaas Bosteels (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4842) Streaming combiner should allow command, not just JavaClass
Date Mon, 16 Feb 2009 16:39:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673926#action_12673926 ]

Klaas Bosteels commented on HADOOP-4842:

Actually, shell-only programmers can already combine by adding something like "| sort | sh
combiner.sh" to their mapper script. More generally, I think it makes more sense to combine
locally in the streaming application process itself, instead of running an additional application
process and requiring another round trip to the Java process and back. Both [Pipes|http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/pipes/package-summary.html]
and [Dumbo|http://wiki.github.com/klbostee/dumbo] use this approach for combining.
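A minimal sketch of what combining inside the streaming process could look like, assuming a word-count style job written in Python. The names map_words, combining_mapper and main are hypothetical and are not taken from Pipes, Dumbo, or the streaming code; the sketch only illustrates summing values per key in the mapper process before anything is written back to the Java side.

```python
import sys
from collections import defaultdict


def map_words(line):
    """Hypothetical map function: emit (word, 1) for every word in the line."""
    for word in line.split():
        yield word, 1


def combining_mapper(lines, map_fn=map_words):
    """Apply map_fn to every input line and sum the values per key in memory,
    so each key is emitted once per mapper process instead of once per record."""
    totals = defaultdict(int)
    for line in lines:
        for key, value in map_fn(line):
            totals[key] += value
    # Emit in sorted key order so the output is deterministic.
    for key in sorted(totals):
        yield key, totals[key]


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Write tab-separated key/value pairs, one per line."""
    for key, value in combining_mapper(stdin):
        stdout.write("%s\t%d\n" % (key, value))

# Assumption: when the file is used as the -mapper command, the script
# ends with a call to main().
```

This sketch keeps all partial sums in memory until the input is exhausted, so it assumes the set of distinct keys per mapper fits in memory; a real implementation would probably flush the table once it grows past some size.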

> Streaming combiner should allow command, not just JavaClass
> -----------------------------------------------------------
>                 Key: HADOOP-4842
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4842
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/streaming
>            Reporter: Marco Nicosia
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.21.0
> Streaming jobs are way slower than Java jobs for many reasons, but certainly stopping
> the shell-only programmer from using the combiner feature won't help. Right now, the streaming
> usage says:
> {quote}
>   -mapper   <cmd|JavaClassName>      The streaming command to run
>   -combiner <JavaClassName>          Combiner has to be a Java class
>   -reducer  <cmd|JavaClassName>      The streaming command to run
> {quote}

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
