hive-dev mailing list archives

From "Teng Yutong (JIRA)" <>
Subject [jira] [Updated] (HIVE-6584) Add HiveHBaseTableSnapshotInputFormat
Date Thu, 19 Jun 2014 05:21:25 GMT


Teng Yutong updated HIVE-6584:

    Attachment: HIVE-6584.5.patch


This patch is my current workaround for dealing with HBase snapshots.

For the patch to work, some changes are still needed on the HBase side: the visibility of
mapreduce.TableMapReduceUtil.convertStringToScan and mapreduce.TableSnapshotInputFormat.TableSnapshotRegionSplit
must be changed to public. Since there is no HBase JIRA issue covering this yet, I haven't created
a patch for those changes.
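
To illustrate why the visibility change matters: code outside the declaring package cannot
call a non-public method directly, so the only alternative is a reflection stopgap. Below is a
minimal sketch of that stopgap. FakeTableUtil is a hypothetical stand-in written for this
example, not the real HBase class, and its method body is invented; only the method name
mirrors the one mentioned above.

```java
import java.lang.reflect.Method;

public class VisibilitySketch {

    // Hypothetical stand-in for a library utility class; NOT the real HBase API.
    static class FakeTableUtil {
        // Non-public on purpose, to mimic a method that callers cannot reach directly.
        private static String convertStringToScan(String encoded) {
            return "scan:" + encoded;
        }
    }

    // Reflection stopgap: look the method up by name and force it accessible.
    // Making the method public in the library would remove the need for this.
    static String callHidden(String encoded) {
        try {
            Method m = FakeTableUtil.class.getDeclaredMethod("convertStringToScan", String.class);
            m.setAccessible(true);
            return (String) m.invoke(null, encoded); // null receiver: the method is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("hidden method not reachable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callHidden("abc"));
    }
}
```

Reflection of this kind is brittle (it breaks silently if the method is renamed), which is one
reason to prefer changing the modifier on the HBase side.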

> Add HiveHBaseTableSnapshotInputFormat
> -------------------------------------
>                 Key: HIVE-6584
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: HBase Handler
>            Reporter: Nick Dimiduk
>            Assignee: Nick Dimiduk
>             Fix For: 0.14.0
>         Attachments: HIVE-6584.0.patch, HIVE-6584.1.patch, HIVE-6584.2.patch, HIVE-6584.3.patch, HIVE-6584.4.patch, HIVE-6584.5.patch
>
> HBASE-8369 provided mapreduce support for reading from HBase table snapshots. This allows
> a MR job to consume a stable, read-only view of an HBase table directly off of HDFS. Bypassing
> the online region server API provides a nice performance boost for the full scan. HBASE-10642
> is backporting that feature to 0.94/0.96 and also adding a {{mapred}} implementation. Once
> that's available, we should add an input format. A follow-on patch could work out how to integrate
> this functionality into the StorageHandler, similar to how HIVE-6473 integrates the HFileOutputFormat
> into existing table definitions.

This message was sent by Atlassian JIRA
