crunch-dev mailing list archives

From "Josh Wills (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-400) Materialized jobs should have stage in PipelineResult
Date Tue, 27 May 2014 21:23:03 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010293#comment-14010293 ]

Josh Wills commented on CRUNCH-400:
-----------------------------------

Just a little bit. It's not clear to me that this would run two MR jobs rather than one: since
there's only one groupByKey operation in the pipeline, both the materialized outputs and the HFile
outputs could be written by the same MapReduce job. Are you saying that when you call pipeline.run()
after this, you only get one StageResult where you expect two, or that you're getting zero
StageResults where you expect one?

> Materialized jobs should have stage in PipelineResult
> -----------------------------------------------------
>
>                 Key: CRUNCH-400
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-400
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.9.0, 0.8.2
>            Reporter: Micah Whitacre
>
> Brought up as part of the proposed fix for CRUNCH-272 and on the mailing list[1], a set
> of jobs kicked off due to a materialize() call will not be tracked as part of the Pipeline's
> stage results returned by the PipelineResult.
> [1] - http://mail-archives.apache.org/mod_mbox/crunch-dev/201405.mbox/%3CCANFazTUAffvTctK5%3DWvW4KyBLSqLCNcke7ZMWwgASu%2BEtkDmyQ%40mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v6.2#6252)
