incubator-crunch-dev mailing list archives

From "Josh Wills (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CRUNCH-127) Allow multiple HBaseTargets in a single pipeline
Date Wed, 12 Dec 2012 08:27:21 GMT

     [ https://issues.apache.org/jira/browse/CRUNCH-127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Josh Wills updated CRUNCH-127:
------------------------------

    Attachment: CRUNCH-127.patch

First cut at this. I banged my head against making HBaseTarget work with MultipleOutputs, to
no avail. In the process, I rewrote most of the MultipleOutputs code to make it work more
like CrunchInputs, which has some advantages (and some disadvantages) that might be worth
exploring later.

In the meantime, here's a simple patch that adds support for HBase's MultiTableOutputFormat.
For this change, the key is the name of the table to write to, and the value is either a Put
or a Delete, so the target needs to be given a PTable<ImmutableBytesWritable, Put|Delete> in
order to work. I still need to write an integration test, but let me know if you get a chance
to bang on it.
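
As a rough illustration of the contract described above (key = destination table name, value = a Put or a Delete), here is a minimal plain-Java sketch. It does not use the HBase or Crunch APIs and is not the code in the patch: Mutation, Record, and route are hypothetical stand-ins, with String taking the place of ImmutableBytesWritable, and the grouping step only models how a multi-table output could dispatch each record by its key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in model only: none of these types are HBase or Crunch classes.
final class MultiTableRoutingSketch {

    // Hypothetical stand-in for the two mutation types (Put or Delete).
    enum Kind { PUT, DELETE }

    // Hypothetical stand-in for a single HBase mutation on one row.
    static final class Mutation {
        final Kind kind;
        final String rowKey;

        Mutation(Kind kind, String rowKey) {
            this.kind = kind;
            this.rowKey = rowKey;
        }
    }

    // One output record: the key names the table, the value is the mutation.
    static final class Record {
        final String table;
        final Mutation mutation;

        Record(String table, Mutation mutation) {
            this.table = table;
            this.mutation = mutation;
        }
    }

    // Groups mutations by destination table, preserving first-seen table order.
    static Map<String, List<Mutation>> route(List<Record> records) {
        Map<String, List<Mutation>> byTable = new LinkedHashMap<>();
        for (Record r : records) {
            byTable.computeIfAbsent(r.table, t -> new ArrayList<>()).add(r.mutation);
        }
        return byTable;
    }
}
```

The point of the sketch is that a single stream of records can carry writes for several tables, because each record names its own destination instead of relying on one table configured for the whole job.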
                
> Allow multiple HBaseTargets in a single pipeline
> ------------------------------------------------
>
>                 Key: CRUNCH-127
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-127
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Micah Whitacre
>            Assignee: Josh Wills
>         Attachments: CRUNCH-127.patch
>
>
> Currently when a pipeline contains writes to multiple HBaseTargets, all puts are being
> sent to the first configured HBaseTarget ignoring the second one and causing issues if the
> columns are not the same.

