flink-issues mailing list archives

From "Shimin Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-10245) Add DataStream HBase Sink
Date Mon, 10 Sep 2018 02:30:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16608642#comment-16608642 ]

Shimin Yang commented on FLINK-10245:
-------------------------------------

Hi [~hequn8128],

Regarding the comments you raised last time: I looked into the HBase client implementation,
and I think I can add a scheduler that flushes the data periodically, at an interval set by the user.
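
As a rough illustration of the scheduler idea, here is a minimal sketch using only the JDK. The class
PeriodicFlushBuffer, its methods, and the flushTarget callback are hypothetical names made up for this
example; the callback merely stands in for whatever actually writes to HBase, and this is not the code
from the pull request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical buffer that is flushed on a user-configured interval. */
class PeriodicFlushBuffer<T> implements AutoCloseable {
    private final List<T> buffer = new ArrayList<>();
    // Assumption: stand-in for the real write path to HBase.
    private final Consumer<List<T>> flushTarget;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    PeriodicFlushBuffer(long intervalMillis, Consumer<List<T>> flushTarget) {
        this.flushTarget = flushTarget;
        // Note: if flush() throws inside the scheduler, later runs are suppressed,
        // so a real sink would need to catch and report the failure.
        scheduler.scheduleWithFixedDelay(
                this::flush, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Buffers one record; synchronized because the timer thread also touches the buffer. */
    synchronized void add(T record) {
        buffer.add(record);
    }

    /** Hands the buffered records to the target and empties the buffer. */
    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = new ArrayList<>(buffer);
        buffer.clear();
        flushTarget.accept(batch);
    }

    /** Stops the timer and flushes whatever is still buffered. */
    @Override
    public void close() {
        scheduler.shutdown();
        flush();
    }
}
```

The interval would come from the user's sink configuration, and close() performs a final flush so that
records still in the buffer are not dropped when the sink shuts down.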

I am not sure whether I should replace the API with the HBase batch API, since the current one
already provides buffering and flush functionality.

And if I stick with this API, I think it will be hard to deduplicate data by rowkey, because the
data is buffered inside the BufferedMutator in the HBase client, and it provides no function for
removing a mutation that has already been buffered.
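
For comparison, deduplication by rowkey would be straightforward if the sink kept its own buffer in
front of the client. The sketch below is only an assumption about how that could look: RowKeyDedupBuffer
is a hypothetical class, the rowkey is modelled as a String (HBase rowkeys are byte arrays, which cannot
be used directly as map keys), and it assumes that a later mutation for a rowkey fully replaces an earlier one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sink-side buffer that keeps only the latest mutation per rowkey. */
class RowKeyDedupBuffer<V> {
    // LinkedHashMap keeps the order in which rowkeys were first seen.
    private final Map<String, V> pending = new LinkedHashMap<>();

    /** Buffers a mutation; an earlier mutation for the same rowkey is overwritten. */
    void put(String rowKey, V mutation) {
        pending.put(rowKey, mutation);
    }

    /** Returns the deduplicated mutations and empties the buffer. */
    List<V> drain() {
        List<V> batch = new ArrayList<>(pending.values());
        pending.clear();
        return batch;
    }

    int size() {
        return pending.size();
    }
}
```

The drained batch could then be passed to the client in one go. Whether overwriting is safe depends on
the mutations: two writes to different columns of the same row would have to be merged rather than replaced.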

What do you think?

Best

Shimin

> Add DataStream HBase Sink
> -------------------------
>
>                 Key: FLINK-10245
>                 URL: https://issues.apache.org/jira/browse/FLINK-10245
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Streaming Connectors
>            Reporter: Shimin Yang
>            Assignee: Shimin Yang
>            Priority: Major
>              Labels: pull-request-available
>
> Design documentation: [https://docs.google.com/document/d/1of0cYd73CtKGPt-UL3WVFTTBsVEre-TNRzoAt5u2PdQ/edit?usp=sharing]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
