hbase-user mailing list archives

From Laxman <lakshman...@huawei.com>
Subject Faster Bulkload from Oracle to HBase
Date Mon, 30 Jan 2012 13:48:22 GMT

We have the following use-case.

We have data in a relational database (Oracle).
We need to export this data to HBase and perform analysis on it.
We need to perform this export-import of 500 GB periodically, say every month.

Following are the different approaches I can see, to the best of my knowledge.
Before testing and finding out the best way by myself, I wanted to hear
from the experts here.

Approach #1
1) Export from Oracle to a raw text file (using the Oracle export utility:
faster, with no transactional overhead)

2) Upload text file to HDFS

3) Run the bulk load job (HFileOutputFormat.configureIncrementalLoad())
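
As a rough illustration of what the map side of step 3 has to do with each line of the exported file, here is a minimal sketch in plain Java. It assumes a '|' delimited export whose first field is the row key; the class name, the delimiter and the column qualifiers are all made up for the example. In a real job the parsed values would be wrapped in an HBase Put and emitted by the mapper before HFileOutputFormat writes the HFiles.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of HBase): parses one exported text line
// into a row key and its column values.
final class ExportLineParser {
    // Assumed layout: rowkey|name|amount. These qualifiers are illustrative only.
    static final String[] QUALIFIERS = {"name", "amount"};
    // Assumed field delimiter, escaped because String.split takes a regex.
    static final String DELIMITER = "\\|";

    // The first field of the line is used as the HBase row key.
    static String rowKey(String line) {
        return split(line)[0];
    }

    // The remaining fields are mapped to the column qualifiers in order.
    static Map<String, String> columns(String line) {
        String[] fields = split(line);
        Map<String, String> cols = new LinkedHashMap<String, String>();
        for (int i = 0; i < QUALIFIERS.length; i++) {
            cols.put(QUALIFIERS[i], fields[i + 1]);
        }
        return cols;
    }

    private static String[] split(String line) {
        // A limit of -1 keeps trailing empty fields, so empty columns are preserved.
        String[] fields = line.split(DELIMITER, -1);
        if (fields.length != QUALIFIERS.length + 1) {
            throw new IllegalArgumentException("Unexpected field count: " + line);
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "1001|Alice|42.50";
        System.out.println(rowKey(line) + " -> " + columns(line));
    }
}
```

Rejecting lines with an unexpected field count up front is a deliberate choice here: a malformed line in a 500 GB export is easier to find at parse time than after the HFiles are loaded.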

Approach #2
1) Write a custom job using DBInputFormat to read directly from the database.
	- Just a thought to avoid the multiple hops (Oracle to local FS, local FS
to HDFS, HDFS to HBase) involved in approach #1.

2) Use the HBase bulk load tool to load this data into HBase.
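
For step 1 of approach #2, the read from Oracle has to be divided among the map tasks. The sketch below shows one simple way to cut a table of known row count into contiguous offset ranges, one per mapper. As far as I understand, DBInputFormat divides its input along similar lines, but this is only an illustration with made-up names, not the Hadoop source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop): divides rowCount rows into
// contiguous [start, end) offset ranges, one range per mapper.
final class RowRangeSplitter {
    static List<long[]> split(long rowCount, int mappers) {
        if (rowCount < 0 || mappers < 1) {
            throw new IllegalArgumentException("rowCount must be >= 0 and mappers >= 1");
        }
        List<long[]> ranges = new ArrayList<long[]>();
        long chunk = rowCount / mappers;  // rows per mapper, rounded down
        for (int i = 0; i < mappers; i++) {
            long start = i * chunk;
            // The last mapper also takes the remainder left by the integer division.
            long end = (i == mappers - 1) ? rowCount : start + chunk;
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] r : split(10, 3)) {
            System.out.println("[" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

Offset-based ranges like these are only stable if the query has a fixed ordering and the table does not change during the export, which is worth keeping in mind when comparing this approach with a one-shot export to a text file.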

Approach #3
1) Use Apache Sqoop (currently in incubation) to meet this requirement.
	- I'm not sure about its stability.

Also, please suggest a better approach than the above if there is one.
