crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <>
Subject [jira] [Commented] (CRUNCH-212) Need target wrapper for HFileOuptutFormat
Date Fri, 19 Jul 2013 17:54:56 GMT


Gabriel Reid commented on CRUNCH-212:

Cool, just took a look.

Chao, do you have any thoughts about how the bulk load could be performed?

Two things that I know the traditional HFileOutputFormat does (you touched on
them above already) are:
- setting up the partitioning to match regions on an existing HBase table
- handling multiple column families

Just wondering if you've already got ideas on how to tackle these.
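
To make the first point concrete, here is a minimal sketch of the lookup that region-aligned partitioning boils down to: given the sorted start keys of a table's regions, send each row key to the partition of the last region whose start key is less than or equal to it. As far as I know, HFileOutputFormat.configureIncrementalLoad gets this effect by wiring up a TotalOrderPartitioner from the table's region start keys. The class and method names below (RegionPartitionSketch, compareBytes, regionFor) are made up for illustration and are not part of HBase or Crunch, and the sketch assumes the first start key is the empty byte array, as it is for the first region of an HBase table.

```java
// Hypothetical illustration only: not HBase or Crunch API.
class RegionPartitionSketch {

    // Compares two byte arrays lexicographically, treating bytes as unsigned,
    // which is the ordering HBase uses for row keys.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Returns the index of the region that should receive rowKey, i.e. the
    // index of the last start key that is <= rowKey. Assumes sortedStartKeys
    // is sorted ascending and its first entry is the empty byte array.
    static int regionFor(byte[] rowKey, byte[][] sortedStartKeys) {
        int lo = 0;
        int hi = sortedStartKeys.length - 1;
        int result = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (compareBytes(sortedStartKeys[mid], rowKey) <= 0) {
                result = mid;   // candidate region; look for a later one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return result;
    }
}
```

With start keys "", "g" and "p", a row key of "grape" would land in partition 1, so each reducer ends up writing HFiles that fall within a single region. Multiple column families are a separate concern and are not covered by this sketch.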
> Need target wrapper for HFileOuptutFormat
> -----------------------------------------
>                 Key: CRUNCH-212
>                 URL:
>             Project: Crunch
>          Issue Type: Improvement
>          Components: IO
>            Reporter: Chao Shi
>         Attachments: crunch-212-draft.patch
> I need to import data into HBase from MR. I found that HFileOutputFormat is ~5x more efficient
> than HTableOutputFormat, so maybe we need a target wrapper for it.
> Furthermore, is it possible to call HBase to load the HFiles automatically after they are generated?
