crunch-dev mailing list archives

From "Charles Pritchard (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-474) Reduce dependencies on MapReduce library in standard MemPipeline
Date Mon, 06 Oct 2014 23:52:34 GMT


Charles Pritchard commented on CRUNCH-474:

At present I'm looking into running Crunch from plain Java over stdin (or any
InputStream), which is more or less agnostic about where the data comes from.

Ideally the same jar could later be picked back up and run on MapReduce. This
fits a prototyping workflow, but I'd like to drop the Hadoop dependency from
the standalone jar.

This may be too much of a stretch, but I really do like the Crunch decorators
and the pattern for processing methods.
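
A minimal sketch of the stdin idea, using only the JDK so that nothing from
Hadoop or Crunch is on the classpath. The names StdinPipelineSketch, Emitter,
LineFn, run and SPLIT_WORDS are hypothetical stand-ins chosen for illustration.
They only imitate the emit-style processing pattern and are not Crunch's API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class StdinPipelineSketch {

    // Hypothetical stand-in for an emitter: receives each output record.
    interface Emitter<T> {
        void emit(T value);
    }

    // Hypothetical stand-in for an emit-style processing function.
    interface LineFn<S, T> {
        void process(S input, Emitter<T> emitter);
    }

    // Example function: split a line on whitespace and emit each word.
    static final LineFn<String, String> SPLIT_WORDS = (line, emitter) -> {
        for (String word : line.split("\\s+")) {
            if (!word.isEmpty()) {
                emitter.emit(word);
            }
        }
    };

    // Reads the stream line by line, applies fn, and collects the output in memory.
    static <T> List<T> run(InputStream in, LineFn<String, T> fn) throws IOException {
        List<T> out = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                fn.process(line, out::add);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Any InputStream works here; System.in is just one choice.
        for (String word : run(System.in, SPLIT_WORDS)) {
            System.out.println(word);
        }
    }
}
```

Because run takes any InputStream, the same function could be fed from a file
or a socket during prototyping. Whether such functions could later be handed
to an MR-backed pipeline unchanged is exactly the open question.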


> Reduce dependencies on MapReduce library in standard MemPipeline
> ----------------------------------------------------------------
>                 Key: CRUNCH-474
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Charles Pritchard
> There are currently dependencies on the MapReduce library that could be removed or otherwise
> re-wired in the MemPipeline method.
> Currently MemPipeline relies on setting up tasks to match Hadoop libraries without using
> any of their functionality, beyond the counters. Crunch may be useful in areas where data
> is processed without Hadoop.
> As an aside, the Avro writables have completely unused references to Hadoop in their
> import statements.

This message was sent by Atlassian JIRA