crunch-dev mailing list archives

From "Rafal Wojdyla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-470) Add hdfs/yarn minicluster crunch pipeline
Date Thu, 11 Sep 2014 21:03:33 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130709#comment-14130709 ]

Rafal Wojdyla commented on CRUNCH-470:
--------------------------------------

It's closer to what you mention about HFileTargetIT, but the idea is to make the minicluster
pipeline a first-class citizen: make it easy for users to create a minicluster so that they
can run pipelines in a mode pretty close to an actual distributed cluster. Such a pipeline
would extend MRPipeline and take care of managing the minicluster, etc.
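
A minimal sketch of the lifecycle part of that idea, using only the Java standard library.
Every type below (MiniCluster, MiniClusterPipelineSketch, RecordingCluster) is a hypothetical
stand-in invented for illustration; none of them is a Crunch or Hadoop class, and a real
implementation would extend MRPipeline and wrap actual HDFS/YARN miniclusters instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for an HDFS or YARN minicluster; not a Hadoop type. */
interface MiniCluster {
    void start();
    void shutdown();
}

/** Hypothetical wrapper that owns the clusters: starts them up front, stops them on close. */
final class MiniClusterPipelineSketch implements AutoCloseable {
    private final List<MiniCluster> started = new ArrayList<>();

    MiniClusterPipelineSketch(List<MiniCluster> clusters) {
        try {
            for (MiniCluster cluster : clusters) {
                cluster.start();
                started.add(cluster);
            }
        } catch (RuntimeException e) {
            close(); // stop whatever did start before propagating the failure
            throw e;
        }
    }

    /** Runs user work while the clusters are up; a real pipeline would run its jobs here. */
    <T> T run(Supplier<T> body) {
        return body.get();
    }

    @Override
    public void close() {
        // Shut down in reverse start order, e.g. YARN before HDFS.
        for (int i = started.size() - 1; i >= 0; i--) {
            started.get(i).shutdown();
        }
        started.clear();
    }
}

/** Demo-only cluster that records lifecycle calls instead of starting any daemons. */
final class RecordingCluster implements MiniCluster {
    private final String name;
    private final List<String> log;

    RecordingCluster(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    @Override public void start() { log.add("start " + name); }
    @Override public void shutdown() { log.add("stop " + name); }
}
```

Used in a try-with-resources block, the user never starts or stops a cluster by hand, which is
the "first-class citizen" convenience described above; the wrapper does both around the run.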

> Add hdfs/yarn minicluster crunch pipeline
> -----------------------------------------
>
>                 Key: CRUNCH-470
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-470
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: Rafal Wojdyla
>            Assignee: Josh Wills
>            Priority: Minor
>
> Crunch currently has two pipelines:
> * MemPipeline
> * MRPipeline
> MemPipeline is an in-memory pipeline based on a local in-memory MapReduce mode.
> MRPipeline is a distributed pipeline based on distributed MapReduce.
> Using an HDFS/YARN minicluster, it's possible to better emulate a Hadoop cluster, and it
> could be a 'final test' before running on the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
