hbase-user mailing list archives

From Chetan Khatri <chetan.opensou...@gmail.com>
Subject Approach: Incremental data load from HBASE
Date Wed, 21 Dec 2016 10:28:59 GMT
Hello Guys,

I would like to understand the different approaches to distributed incremental
loads from HBase. Is there any *tool / incubator tool* that satisfies this
requirement?

*Approach 1:*

Write a Kafka producer, manually maintain a flag column for events, and
ingest the data into HDFS / S3 with LinkedIn Gobblin.

*Approach 2:*

Run a scheduled Spark job: read from HBase, apply transformations, and
maintain the flag column at the HBase level.

In both approaches above, I need to maintain column-level flags, such as 0
(default), 1 (sent), and 2 (sent and acknowledged), so that next time the
producer takes another batch of 1000 rows where the flag is 0 or 1.
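
To make the flag scheme concrete, here is a rough sketch of the batch selection I have in mind. It is only an illustration: a plain Python dict stands in for the HBase table, and the names `next_batch`, `mark_sent` and `mark_acked` are hypothetical helpers, not part of any HBase, Kafka, Gobblin or Spark API.

```python
from typing import Dict, Iterable, List

# Flag values from the scheme above.
PENDING, SENT, ACKED = 0, 1, 2


def next_batch(flags: Dict[str, int], batch_size: int = 1000) -> List[str]:
    """Return up to batch_size row keys whose flag is 0 or 1.

    Keys are visited in sorted order, mimicking a table scan by row key.
    """
    picked: List[str] = []
    for key in sorted(flags):
        if flags[key] in (PENDING, SENT):
            picked.append(key)
            if len(picked) == batch_size:
                break
    return picked


def mark_sent(flags: Dict[str, int], keys: Iterable[str]) -> None:
    """Set flag 1 (sent) on rows that were handed to the producer."""
    for key in keys:
        flags[key] = SENT


def mark_acked(flags: Dict[str, int], keys: Iterable[str]) -> None:
    """Set flag 2 (sent and acknowledged), only for rows currently marked sent."""
    for key in keys:
        if flags[key] == SENT:
            flags[key] = ACKED
```

Rows that were sent but never acknowledged keep flag 1, so they are picked up again in the next batch. That means the downstream side would have to tolerate duplicates.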

I am looking for a best-practice approach using any distributed tool.


- Chetan Khatri
