crunch-dev mailing list archives

From "Josh Wills (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CRUNCH-449) Add sequentialDo function for injecting arbitrary non-parallel code
Date Thu, 24 Jul 2014 00:40:39 GMT

     [ https://issues.apache.org/jira/browse/CRUNCH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Wills updated CRUNCH-449:
------------------------------

    Attachment: CRUNCH-449.patch

This is my first, minimally functional crack at this. I would love for [~gabriel.reid] and
[~mkwhitacre] to look it over when they have some downtime. I still have more testing
and docs to do, and I need to wire up the Spark implementation as well.

> Add sequentialDo function for injecting arbitrary non-parallel code
> -------------------------------------------------------------------
>
>                 Key: CRUNCH-449
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-449
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>            Reporter: Josh Wills
>            Assignee: Josh Wills
>         Attachments: CRUNCH-449.patch
>
>
> I've been noodling on this one for a while: how to add the ability to execute some code
> if and only if one or more targets are created, and have that executed code (optionally) return
> one or more new PCollections as a result. I was thinking that this functionality could be
> wired into libraries to do things like bulk loading HBase tables or running Sqoop jobs
> automatically as part of Crunch pipelines.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
