crunch-dev mailing list archives

From "Ryan Brush (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-246) HFileSource
Date Fri, 09 Aug 2013 14:20:49 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13734828#comment-13734828 ]

Ryan Brush commented on CRUNCH-246:
-----------------------------------

Alright, file attached, with a few disclaimers. ;)  

It also doesn't handle timestamps and deletions as Chao mentions; our use case was limited
to HFiles generated in such a way that there were no deletions and no multiple versions of a
given cell. So this code simply provides an InputFormat that produces KeyValues, and requires
the caller to deal with these things. But Chao is correct that it is possible to address these
needs before exposing the KeyValues to user code...
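
To illustrate what "dealing with these things" could look like on the caller's side, here is a minimal sketch, not the attached HFileInputFormat.java. The `Cell` class is a hypothetical stand-in for an HBase KeyValue, and the delete handling is a simplifying assumption: a delete marker hides every version of the same row and column whose timestamp is at or below the marker's. Real HBase distinguishes several delete types, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LatestVersionSketch {

    /** Hypothetical stand-in for an HBase KeyValue; not the real class. */
    static final class Cell {
        final String row;
        final String column;
        final long timestamp;
        final boolean deleteMarker;
        final String value;

        Cell(String row, String column, long timestamp, boolean deleteMarker, String value) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
            this.deleteMarker = deleteMarker;
            this.value = value;
        }
    }

    /**
     * Returns the newest live version for each (row, column) pair.
     * Assumption: a delete marker masks all versions of that column
     * with a timestamp at or below the marker's timestamp.
     */
    static List<Cell> resolve(List<Cell> cells) {
        Map<String, Cell> newestPut = new TreeMap<>();
        Map<String, Long> newestDelete = new TreeMap<>();

        for (Cell c : cells) {
            String key = c.row + "\u0000" + c.column;
            if (c.deleteMarker) {
                // Track the most recent delete marker per row/column.
                newestDelete.merge(key, c.timestamp, Long::max);
            } else {
                // Track the most recent put per row/column.
                Cell prev = newestPut.get(key);
                if (prev == null || c.timestamp > prev.timestamp) {
                    newestPut.put(key, c);
                }
            }
        }

        List<Cell> live = new ArrayList<>();
        for (Map.Entry<String, Cell> e : newestPut.entrySet()) {
            Long deletedAt = newestDelete.get(e.getKey());
            // Keep the put only if no delete marker covers its timestamp.
            if (deletedAt == null || e.getValue().timestamp > deletedAt) {
                live.add(e.getValue());
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("r1", "cf:a", 1L, false, "old"));
        cells.add(new Cell("r1", "cf:a", 2L, false, "new"));
        cells.add(new Cell("r2", "cf:a", 5L, false, "gone"));
        cells.add(new Cell("r2", "cf:a", 7L, true, null));
        for (Cell c : resolve(cells)) {
            System.out.println(c.row + " " + c.column + " = " + c.value);
        }
    }
}
```

The sketch buffers all cells in memory for simplicity; a version applied inside an InputFormat would more likely work over the sorted order in which an HFile stores its cells.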


                
> HFileSource
> -----------
>
>                 Key: CRUNCH-246
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-246
>             Project: Crunch
>          Issue Type: Improvement
>          Components: IO
>            Reporter: Chao Shi
>            Assignee: Chao Shi
>         Attachments: HFileInputFormat.java
>
>
> I found this useful when running MapReduce directly on HFiles. I used it yesterday when
> copying a bunch of HFiles to another cluster (where the region layout is different).
> HBase does not provide an HFileInputFormat, but I found the following via Google:
> https://gist.github.com/leifwickland/1120311
> http://blog.csdn.net/kirayuan/article/details/7794402 (Java version of the above. The
> webpage is in Chinese, but you can see the code.)
> I'm not sure whether we can copy their code directly (copyright issue?). Does anyone know?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira