hbase-issues mailing list archives

From "Shawn Quinn (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs
Date Mon, 03 Dec 2012 19:43:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508974#comment-13508974 ]

Shawn Quinn commented on HBASE-3996:

So, this is something we'd really like to start using here as well, as we're trying to stay
within the released HBase APIs (we're currently using a custom TableInputFormatBase extension,
which hasn't been ideal).  Based on the comments here and the references to this ticket on
the mailing list, there appears to be a good amount of interest in this enhancement.  I've
monkeyed with a few things within the HBase code locally here, but haven't yet tried to submit
a patch.

Lars/Stack, if you let me know you wouldn't mind another person's contribution being added
to the mix here, I'd be glad to give this one a go and submit an updated patch.  I don't want
to cause you guys any headaches, though, if adding another person into the mix is just going
to complicate or slow this one down.
> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> ------------------------------------------------------------------------------
>                 Key: HBASE-3996
>                 URL: https://issues.apache.org/jira/browse/HBASE-3996
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>            Reporter: Eran Kutner
>            Assignee: Lars Hofhansl
>             Fix For: 0.96.0, 0.94.4
>         Attachments: 3996-v2.txt, 3996-v3.txt, 3996-v4.txt, 3996-v5.txt, 3996-v6.txt,
>                      3996-v7.txt, HBase-3996.patch
> It seems that in many cases feeding data from multiple tables or multiple scanners on
> a single table can save a lot of time when running map/reduce jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.
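
For readers unfamiliar with the idea in the issue description, the following is a minimal sketch in plain Java of how several scans, over different tables or over different key ranges of one table, could be flattened into a single list of mapper inputs. It does not use or reproduce any HBase class or any of the attached patches: MultiScanSketch, ScanSpec, Region, Split and planSplits are all hypothetical names invented for illustration, and row keys are modelled as plain strings rather than byte arrays.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MultiScanSketch {

    /** Hypothetical scan: a table plus a [startRow, stopRow) range; "" means unbounded. */
    static final class ScanSpec {
        final String table, startRow, stopRow;
        ScanSpec(String table, String startRow, String stopRow) {
            this.table = table; this.startRow = startRow; this.stopRow = stopRow;
        }
    }

    /** Hypothetical region: a [startKey, endKey) slice of one table; "" means unbounded. */
    static final class Region {
        final String startKey, endKey;
        Region(String startKey, String endKey) {
            this.startKey = startKey; this.endKey = endKey;
        }
    }

    /** Hypothetical unit of work handed to a single mapper. */
    static final class Split {
        final String table, startRow, stopRow;
        Split(String table, String startRow, String stopRow) {
            this.table = table; this.startRow = startRow; this.stopRow = stopRow;
        }
        @Override public String toString() {
            return table + "[" + startRow + "," + stopRow + ")";
        }
    }

    // "" sorts before every other string, so a plain comparison picks the later start.
    static String laterStart(String a, String b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    // For stop keys "" means "no upper bound", so it must lose against any real key.
    static String earlierStop(String a, String b) {
        if (a.isEmpty()) return b;
        if (b.isEmpty()) return a;
        return a.compareTo(b) <= 0 ? a : b;
    }

    /** Emits one split per (scan, region) pair whose key ranges overlap. */
    static List<Split> planSplits(List<ScanSpec> scans, Map<String, List<Region>> regionsByTable) {
        List<Split> splits = new ArrayList<>();
        for (ScanSpec scan : scans) {
            List<Region> regions = regionsByTable.get(scan.table);
            if (regions == null) {
                throw new IllegalArgumentException("Unknown table: " + scan.table);
            }
            for (Region region : regions) {
                String start = laterStart(scan.startRow, region.startKey);
                String stop = earlierStop(scan.stopRow, region.endKey);
                if (stop.isEmpty() || start.compareTo(stop) < 0) {
                    splits.add(new Split(scan.table, start, stop));
                }
            }
        }
        return splits;
    }

    public static void main(String[] args) {
        // Made-up table layout: "users" has two regions, "orders" has one.
        Map<String, List<Region>> regions = new HashMap<>();
        regions.put("users", Arrays.asList(new Region("", "m"), new Region("m", "")));
        regions.put("orders", Arrays.asList(new Region("", "")));

        // Two scans on "users" and one on "orders" feed the same job.
        List<ScanSpec> scans = Arrays.asList(
                new ScanSpec("users", "a", "z"),
                new ScanSpec("orders", "", ""),
                new ScanSpec("users", "n", "p"));

        for (Split split : planSplits(scans, regions)) {
            System.out.println(split);
        }
    }
}
```

In this model each split carries its table name, so a single mapper class could branch on the source table when it needs to treat rows from different inputs differently.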

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
