flink-issues mailing list archives

From "Benchao Li (Jira)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-16497) Improve default flush strategy for JDBC sink to make it work out-of-box
Date Mon, 01 Jun 2020 02:06:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-16497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120696#comment-17120696 ]

Benchao Li commented on FLINK-16497:

[~sunjincheng121] Thanks for addressing this issue. I like the idea of improving the
out-of-the-box user experience. However, I'm a little hesitant to change the default flush
size to 1 row. I would prefer to change the default flush interval.

The reason is that if we set the flush size to 1 row, throughput will be very low for
larger datasets. If we instead set a default flush interval, such as 1s or 2s, performance
will be good for both very small and larger datasets.
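
To make the trade-off concrete, here is a minimal sketch of a buffering sink that flushes on a row-count threshold and, optionally, on a time interval. It is a toy model under stated assumptions, not the JDBC sink's actual code: the class `ToyBufferedSink`, the helper `run`, the arrival rates, and the 0.5s timer tick are all hypothetical and chosen only for illustration. Each flush stands in for one round trip to the database.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ToyBufferedSink:
    """Hypothetical model of a buffering sink; not Flink's actual implementation."""
    max_rows: int                        # flush once this many rows are buffered
    interval_s: Optional[float] = None   # flush after this many seconds; None = no timer
    buffer: List[int] = field(default_factory=list)
    last_flush: float = 0.0
    flushes: int = 0                     # number of simulated database round trips
    written: int = 0                     # number of rows that reached the "database"

    def _flush(self, now: float) -> None:
        # Only a non-empty buffer counts as a round trip.
        if self.buffer:
            self.flushes += 1
            self.written += len(self.buffer)
            self.buffer.clear()
        self.last_flush = now

    def on_timer(self, now: float) -> None:
        # Interval-based flush; does nothing when no interval is configured.
        if self.interval_s is not None and now - self.last_flush >= self.interval_s:
            self._flush(now)

    def write(self, row: int, now: float) -> None:
        self.on_timer(now)
        self.buffer.append(row)
        # Size-based flush.
        if len(self.buffer) >= self.max_rows:
            self._flush(now)


def run(max_rows: int, interval_s: Optional[float],
        n_rows: int, rows_per_second: int) -> Tuple[int, int, int]:
    """Feed n_rows at a fixed rate, then idle for 5 seconds with timer ticks.

    Returns (flushes, rows written, rows still stuck in the buffer).
    """
    sink = ToyBufferedSink(max_rows=max_rows, interval_s=interval_s)
    for i in range(n_rows):
        sink.write(i, now=i / rows_per_second)
    end = n_rows / rows_per_second
    for k in range(11):  # idle period: a tick every 0.5s, no new rows
        sink.on_timer(end + 0.5 * k)
    return sink.flushes, sink.written, len(sink.buffer)


if __name__ == "__main__":
    configs = [
        ("max-rows=5000, no interval", 5000, None),
        ("max-rows=1, no interval", 1, None),
        ("max-rows=5000, interval=1s", 5000, 1.0),
    ]
    for label, max_rows, interval in configs:
        small = run(max_rows, interval, n_rows=10, rows_per_second=10)
        large = run(max_rows, interval, n_rows=100_000, rows_per_second=10_000)
        print(f"{label}: small={small} large={large}")
```

In this toy model, 10 rows with max-rows=5000 and no interval are never written. With max-rows=1, they are written, but 100,000 rows cost 100,000 flushes. With max-rows=5000 and a 1s interval, the 10 rows go out in a single flush and the 100,000 rows in 20 flushes. Real throughput depends on the database and driver, which this sketch does not model.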


> Improve default flush strategy for JDBC sink to make it work out-of-box
> -----------------------------------------------------------------------
>                 Key: FLINK-16497
>                 URL: https://issues.apache.org/jira/browse/FLINK-16497
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / JDBC, Table SQL / Ecosystem
>            Reporter: Jark Wu
>            Priority: Major
>             Fix For: 1.11.0
> Currently, JDBC sink provides 2 flush options:
> {code}
> 'connector.write.flush.max-rows' = '5000', -- default is 5000
> 'connector.write.flush.interval' = '2s', -- no default value
> {code}
> That means if the flush interval is not set, the buffered output rows may not be flushed
> to the database for a long time. That is surprising behavior because no results are output
> by default.
> So I propose a default flush interval of '1s' for the JDBC sink, or a default flush size
> of 1 row.

This message was sent by Atlassian Jira
