crunch-dev mailing list archives

From "Chao Shi (JIRA)"
Subject [jira] [Updated] (CRUNCH-212) Need target wrapper for HFileOutputFormat
Date Wed, 31 Jul 2013 06:47:52 GMT


Chao Shi updated CRUNCH-212:

    Attachment: crunch-212-v0.patch

Attached the final version.

This does not include code to bulk load automatically, but one can do that by calling
o.a.h.hbase.mapreduce.LoadIncrementalHFiles (as I did in the test). This patch uses its own
version of HFileOutputFormat and sorts in the MR shuffle phase rather than in the reducer phase.
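
To illustrate the difference between sorting in the reducer and sorting in the shuffle, here is a minimal sketch in plain Java. It is a simplified model and not code from the patch: the `Cell` class, `CELL_ORDER` comparator, and both method names are hypothetical stand-ins, and no HBase or Hadoop classes are used. It assumes that the reducer-side approach buffers the cells of each row and sorts them before writing, while the shuffle-side approach makes the whole cell the sort key so that records already arrive in final order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class ShuffleSortSketch {

    /** Hypothetical stand-in for an HBase KeyValue: only a row and a column qualifier. */
    static final class Cell {
        final String row;
        final String qualifier;

        Cell(String row, String qualifier) {
            this.row = row;
            this.qualifier = qualifier;
        }

        @Override
        public String toString() {
            return row + "/" + qualifier;
        }
    }

    /** Total order over cells: by row first, then by qualifier. */
    static final Comparator<Cell> CELL_ORDER =
        Comparator.comparing((Cell c) -> c.row).thenComparing(c -> c.qualifier);

    /**
     * Reducer-side sort: the shuffle groups and orders by row only, and each
     * reduce call buffers the cells of one row in memory to sort them.
     * Note that the TreeSet drops cells that compare as equal.
     */
    static List<String> sortInReducer(List<Cell> mapOutput) {
        Map<String, List<Cell>> byRow = new TreeMap<>();
        for (Cell c : mapOutput) {
            byRow.computeIfAbsent(c.row, r -> new ArrayList<>()).add(c);
        }
        List<String> written = new ArrayList<>();
        for (List<Cell> cells : byRow.values()) {
            TreeSet<Cell> buffer = new TreeSet<>(CELL_ORDER); // per-row in-memory buffer
            buffer.addAll(cells);
            for (Cell c : buffer) {
                written.add(c.toString());
            }
        }
        return written;
    }

    /**
     * Shuffle-side sort: the whole cell is the sort key, so the framework's
     * sort yields the final order and the reducer only has to write.
     */
    static List<String> sortInShuffle(List<Cell> mapOutput) {
        List<Cell> shuffled = new ArrayList<>(mapOutput);
        shuffled.sort(CELL_ORDER); // models the sort done by the MR shuffle
        List<String> written = new ArrayList<>();
        for (Cell c : shuffled) {
            written.add(c.toString());
        }
        return written;
    }

    public static void main(String[] args) {
        List<Cell> mapOutput = new ArrayList<>();
        mapOutput.add(new Cell("row2", "b"));
        mapOutput.add(new Cell("row1", "b"));
        mapOutput.add(new Cell("row2", "a"));
        mapOutput.add(new Cell("row1", "a"));
        System.out.println(sortInReducer(mapOutput));
        System.out.println(sortInShuffle(mapOutput));
    }
}
```

For distinct cells both methods produce the same order. The practical difference in this model is that the shuffle-side variant does not need to hold all cells of a row in reducer memory.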

This was tested manually on a real cluster (HBase 0.94.0, multiple regions).
> Need target wrapper for HFileOutputFormat
> -----------------------------------------
>                 Key: CRUNCH-212
>                 URL:
>             Project: Crunch
>          Issue Type: Improvement
>          Components: IO
>            Reporter: Chao Shi
>         Attachments: crunch-212-draft2.patch, crunch-212-draft.patch, crunch-212-v0.patch
> I need to import data into HBase from MR. I found that HFileOutputFormat is ~5x more efficient
> than HTableOutputFormat, so maybe we need a target wrapper for it.
> Furthermore, is it possible to have HBase load the HFiles automatically after they are generated?
