hbase-issues mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4587) HBase MR support for multiple tables as input
Date Thu, 28 Feb 2013 22:03:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589972#comment-13589972 ]

Nick Dimiduk commented on HBASE-4587:
-------------------------------------

bq. "Being able to have multiple tables as your input path"
bq. "Being able to filter on specific columns/column families".

{{MultiTableInputFormat}} satisfies both of these requests.

bq. Providing the source location (table/row/column) to the results

The {{Result}} instances provided to the mapper already identify the row and column. The table
name would be an addition. Perhaps {{map.input.file}} can be used to deliver it?
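
As a rough illustration of the points so far, the sketch below builds one {{Scan}} per input
table, restricts each to a column family or a single column, and passes the list to
{{TableMapReduceUtil.initTableMapperJob}}, which sets up {{MultiTableInputFormat}}. The mapper
reads the table name from the {{TableSplit}} instead of {{map.input.file}}; that is a
swapped-in approach for illustration, not something decided on this issue. The table names,
families, qualifier, job name, and class names are all hypothetical, and the code assumes an
HBase release that ships {{MultiTableInputFormat}}, with the HBase and Hadoop jars on the
classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.mapreduce.TableSplit;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Job;

// Sketch only: table, family, qualifier, and class names are hypothetical.
class MultiTableJobSketch {

  /** Emits (source table, row key) so downstream steps know where a row came from. */
  static class TableAwareMapper extends TableMapper<Text, Text> {
    private String tableName = "unknown";

    @Override
    protected void setup(Context context) {
      // With MultiTableInputFormat each split is a TableSplit that knows its table.
      InputSplit split = context.getInputSplit();
      if (split instanceof TableSplit) {
        tableName = Bytes.toString(((TableSplit) split).getTableName());
      }
    }

    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context context)
        throws IOException, InterruptedException {
      context.write(new Text(tableName), new Text(Bytes.toString(value.getRow())));
    }
  }

  /** Builds one Scan per input table, each limited to the columns of interest. */
  static List<Scan> buildScans() {
    List<Scan> scans = new ArrayList<Scan>();

    Scan first = new Scan();
    // The table a scan targets is carried as a scan attribute.
    first.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, Bytes.toBytes("table1"));
    first.addFamily(Bytes.toBytes("cf1")); // whole column family
    scans.add(first);

    Scan second = new Scan();
    second.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, Bytes.toBytes("table2"));
    second.addColumn(Bytes.toBytes("cf2"), Bytes.toBytes("q1")); // single column
    scans.add(second);

    return scans;
  }

  /** Wires the scans and the mapper into a job; submission is left to the caller. */
  static Job buildJob() throws IOException {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "multi-table-scan");
    job.setJarByClass(MultiTableJobSketch.class);
    TableMapReduceUtil.initTableMapperJob(
        buildScans(), TableAwareMapper.class, Text.class, Text.class, job);
    return job;
  }
}
```

A caller would typically follow {{buildJob()}} with its own output configuration and
{{job.waitForCompletion(true)}}.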

bq. Multiple clusters

This is tricky as it amounts to setting multiple {{Configuration}} objects for the job. From the
client's perspective, they could be passed in as a {{List<Configuration>}}, in the same way
{{TableMapReduceUtil.initTableMapperJob}} already accepts a {{List<Scan>}}. Do you have any
ideas on implementation?
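
To make the client-side shape concrete, here is a minimal sketch of pairing a list of
per-cluster settings with a list of scan targets by index. Nothing in it is an HBase API:
{{MultiClusterInputSketch}}, {{ClusterScan}}, {{conf}}, and {{pair}} are hypothetical names, a
plain {{Map}} stands in for {{Configuration}}, and a table name stands in for a {{Scan}}. The
only real identifier is the {{hbase.zookeeper.quorum}} property key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types exist in HBase.
class MultiClusterInputSketch {

  /** One input: the settings of a cluster plus the table to scan on it. */
  static final class ClusterScan {
    final Map<String, String> clusterConf; // stand-in for a per-cluster Configuration
    final String tableName;                // stand-in for the Scan's target table

    ClusterScan(Map<String, String> clusterConf, String tableName) {
      this.clusterConf = clusterConf;
      this.tableName = tableName;
    }
  }

  /** Builds a one-entry settings map that identifies a cluster by its ZooKeeper quorum. */
  static Map<String, String> conf(String zookeeperQuorum) {
    return Collections.singletonMap("hbase.zookeeper.quorum", zookeeperQuorum);
  }

  /** Pairs the i-th configuration with the i-th table; the two lists must align. */
  static List<ClusterScan> pair(List<Map<String, String>> confs, List<String> tables) {
    if (confs.size() != tables.size()) {
      throw new IllegalArgumentException(
          "expected one configuration per scan, got " + confs.size()
              + " configurations and " + tables.size() + " scans");
    }
    List<ClusterScan> inputs = new ArrayList<ClusterScan>();
    for (int i = 0; i < tables.size(); i++) {
      inputs.add(new ClusterScan(confs.get(i), tables.get(i)));
    }
    return inputs;
  }
}
```

The index alignment is the fragile part of this shape: a caller who reorders one list without
the other silently scans the wrong cluster, which may argue for passing pairs rather than two
parallel lists.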

bq. Different schemas

This request doesn't make sense to me. What do you mean by different schemas?
                
> HBase MR support for multiple tables as input
> ---------------------------------------------
>
>                 Key: HBASE-4587
>                 URL: https://issues.apache.org/jira/browse/HBASE-4587
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>    Affects Versions: 0.90.3
>            Reporter: Rajeev Rao
>
> Some requirements:
>  - Being able to have multiple tables as your input path
>  - Being able to filter on specific columns/column families
>  - Providing the source location (table/row/column) to the results
>  - Multiple clusters
>  - Different schemas.
> Overall this seems difficult for now so I am going to punt on it. On the other hand it
> would be easy enough to write all of the MR values into an intermediate table and then work
> from there.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
