hbase-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs
Date Wed, 28 Mar 2012 18:59:31 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240632#comment-13240632 ]

Todd Lipcon commented on HBASE-3996:
------------------------------------

bq. hbase jar on job tracker is updated to include the versioning mechanism but the job client
has pre-versioning hbase jar.

The jar on the JT doesn't matter. Split computation and interpretation happen only in
user code -- i.e., on the client machine and inside the tasks themselves. So you don't need
HBase installed on the JT at all. As for the TTs, it's possible to configure them to put
an hbase jar on the classpath, but I usually recommend against it for exactly the reason you
mention: if the jars differ in version and aren't 100% API compatible, you can get
nasty errors. The recommended deployment is to _not_ put hbase on the TT classpath, and instead
ship the HBase dependencies as part of the MR job, using the provided utility function in
TableMapReduceUtil.
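
To make "ship the dependencies as part of the job" concrete, here is a minimal sketch of the
general idea, using only the JDK. It is not HBase's implementation. The utility function meant
above is presumably TableMapReduceUtil.addDependencyJars(job), which is not shown here. The
class DependencyJarSketch and its methods are hypothetical names made up for illustration: the
sketch looks up where each class was loaded from and collects those locations into a
comma-separated list, the sort of value a job-submission helper could hand to the framework so
the tasks get the same jars the client used.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; not part of HBase or Hadoop.
public class DependencyJarSketch {

    /** Returns the jar or directory a class was loaded from, or null if unknown (e.g. JDK core classes). */
    public static String locationOf(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source == null) {
            return null; // bootstrap classes such as java.lang.String have no code source
        }
        URL location = source.getLocation();
        return location == null ? null : location.toString();
    }

    /** Builds a de-duplicated, comma-separated list of the locations providing the given classes. */
    public static String dependencyList(Class<?>... classes) {
        Set<String> locations = new LinkedHashSet<>(); // keeps first-seen order, drops repeats
        for (Class<?> clazz : classes) {
            String location = locationOf(clazz);
            if (location != null) {
                locations.add(location);
            }
        }
        return String.join(",", locations);
    }

    public static void main(String[] args) {
        // In a real job the arguments would be the mapper, input format, key/value classes, etc.
        System.out.println(dependencyList(DependencyJarSketch.class, String.class));
    }
}
```

Because the list is computed on the client from the classes the job actually uses, the tasks
see the same version the client compiled against, which avoids the client/TT version mismatch
described above.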
                
> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-3996
>                 URL: https://issues.apache.org/jira/browse/HBASE-3996
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>            Reporter: Eran Kutner
>            Assignee: Eran Kutner
>             Fix For: 0.96.0
>
>         Attachments: 3996-v2.txt, 3996-v3.txt, 3996-v4.txt, 3996-v5.txt, 3996-v6.txt, 3996-v7.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple scanners on a single table can save a lot of time when running map/reduce jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.
