crunch-dev mailing list archives

From "Josh Wills (JIRA)" <>
Subject [jira] [Updated] (CRUNCH-449) Add sequentialDo function for injecting arbitrary non-parallel code
Date Wed, 30 Jul 2014 00:14:39 GMT


Josh Wills updated CRUNCH-449:

    Attachment: CRUNCH-449c.patch

Lots of fixes based on the feedback. Lots of javadoc for starters.

I changed the name of SeqDo to PipelineCallable and made it implement the Callable<>
interface, since I was effectively doing that anyway inside of the CrunchJobControl class.
I also added a getConfiguration() method, as per Micah's request, and allowed for multithreaded
execution, as per Gabriel's.
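
For readers without the patch at hand, here is a minimal sketch of the general shape described above: a unit of non-parallel code that implements Callable<>, exposes its configuration, and can be run on a worker thread. Every name in it (SequentialStepSketch, ConfiguredCallable, runOnWorker, the "table" key) is hypothetical and chosen for illustration. It is not the PipelineCallable class from CRUNCH-449c.patch, and a plain Map stands in for a Hadoop Configuration so the example has no dependencies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: these names are hypothetical, not the classes in the patch. */
public class SequentialStepSketch {

    /** A unit of non-parallel code that also exposes its configuration. */
    abstract static class ConfiguredCallable<T> implements Callable<T> {
        private final Map<String, String> conf = new HashMap<>();

        ConfiguredCallable<T> set(String key, String value) {
            conf.put(key, value);
            return this;
        }

        // Stand-in for a getConfiguration() accessor; a plain map is an assumption of this sketch.
        Map<String, String> getConfiguration() {
            return conf;
        }
    }

    /** Submits the step to a worker thread, as a job controller might, and waits for its result. */
    static <T> T runOnWorker(ConfiguredCallable<T> step) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<T> result = pool.submit(step);
            return result.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        ConfiguredCallable<String> bulkLoad = new ConfiguredCallable<String>() {
            @Override
            public String call() {
                return "loaded into " + getConfiguration().get("table");
            }
        };
        bulkLoad.set("table", "events");
        System.out.println(runOnWorker(bulkLoad)); // prints "loaded into events"
    }
}
```

Because the step is an ordinary Callable<>, the same object can be handed to a single-threaded or a pooled executor without changes, which is one reason implementing the standard interface is convenient here.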

> Add sequentialDo function for injecting arbitrary non-parallel code
> -------------------------------------------------------------------
>                 Key: CRUNCH-449
>                 URL:
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>         Attachments: CRUNCH-449.patch, CRUNCH-449b.patch, CRUNCH-449c.patch
> I've been noodling on this one for a while: how to add the ability to execute some code
> if and only if one or more targets are created, and have that executed code (optionally) return
> one or more new PCollections as a result. I was thinking that this functionality could be
> wired into libraries to do things like bulk loading HBase tables or running Sqoop jobs as
> part of Crunch pipelines automatically.

This message was sent by Atlassian JIRA
