hadoop-common-dev mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4842) Streaming combiner should allow command, not just JavaClass
Date Tue, 17 Mar 2009 10:00:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682612#action_12682612 ]

Amareshwari Sriramadasu commented on HADOOP-4842:

On a 10-node cluster, I ran a job (finding unique words in the input) with and without the combiner;
the runtimes were 1 min 40 sec and 5 min 58 sec, respectively.
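
For illustration only, here is a minimal sketch of what a command-style combiner for a unique-words job might look like. The actual scripts used in the test above are not shown in this message, so everything here is an assumption: the function name unique_keys is hypothetical, and the input is assumed to be tab-separated key/value lines on stdin, one record per line.

```python
import sys


def unique_keys(lines):
    """Yield each distinct key once, in first-seen order.

    Assumes each line looks like "key<TAB>value"; a line without a tab
    is treated as a bare key. Blank lines are skipped.
    """
    seen = set()
    for line in lines:
        key = line.rstrip("\n").split("\t", 1)[0]
        if key and key not in seen:
            seen.add(key)
            yield key


if __name__ == "__main__":
    # Read map output from stdin and emit one line per distinct word.
    for key in unique_keys(sys.stdin):
        sys.stdout.write(key + "\n")
```

A script like this only drops duplicate words within the records it sees, so a reducer is still needed to remove duplicates across map tasks. Fewer records leaving each map task is one plausible reason for the runtime difference reported above.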

> Streaming combiner should allow command, not just JavaClass
> -----------------------------------------------------------
>                 Key: HADOOP-4842
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4842
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/streaming
>            Reporter: Marco Nicosia
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.21.0
>         Attachments: patch-4842-1.txt, patch-4842-2.txt, patch-4842-3.txt, patch-4842.txt
> Streaming jobs are way slower than Java jobs for many reasons, but certainly stopping
> the shell-only programmer from using the combiner feature won't help. Right now, the streaming
> usage says:
> {quote}
>   -mapper   <cmd|JavaClassName>      The streaming command to run
>   -combiner <JavaClassName> Combiner has to be a Java class
>   -reducer  <cmd|JavaClassName>      The streaming command to run
> {quote}
