flink-user mailing list archives

From wangsan <wamg...@163.com>
Subject Re: Confusions About JDBCOutputFormat
Date Tue, 10 Jul 2018 16:12:15 GMT
Hi Hequn,

Establishing a connection for each batch write can still run into the idle connection problem,
since we cannot be sure when the connection will be closed. We call the flush() method when a
batch is full or when state is snapshotted, but what if snapshotting is not enabled and the
batch size is not reached before the connection is closed?
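
One way to read the per-batch suggestion is sketched below. This is only a rough sketch under
assumptions, not what JDBCOutputFormat does: rows are buffered in memory and the connection is
opened only inside flush(), so no connection sits idle between batches. BufferedJdbcWriter and
ConnectionFactory are hypothetical names, and flush() is synchronized so that a batch-size
flush and a snapshot flush cannot interleave.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration; this is NOT Flink's JDBCOutputFormat.
public class BufferedJdbcWriter {

    /** Hypothetical hook, e.g. backed by a connection pool's getConnection(). */
    public interface ConnectionFactory {
        Connection open() throws SQLException;
    }

    private final ConnectionFactory connectionFactory;
    private final String insertSql;
    private final int batchInterval;
    private final List<Object[]> buffer = new ArrayList<>();

    public BufferedJdbcWriter(ConnectionFactory connectionFactory, String insertSql, int batchInterval) {
        this.connectionFactory = connectionFactory;
        this.insertSql = insertSql;
        this.batchInterval = batchInterval;
    }

    /** Buffers one row and flushes once the batch size is reached. */
    public synchronized void writeRecord(Object... row) throws SQLException {
        buffer.add(row);
        if (buffer.size() >= batchInterval) {
            flush(); // synchronized is reentrant, so this nested call is safe
        }
    }

    /** Writes all buffered rows using a connection that lives only for this call. */
    public synchronized void flush() throws SQLException {
        if (buffer.isEmpty()) {
            return;
        }
        try (Connection conn = connectionFactory.open();
             PreparedStatement stmt = conn.prepareStatement(insertSql)) {
            for (Object[] row : buffer) {
                for (int i = 0; i < row.length; i++) {
                    stmt.setObject(i + 1, row[i]); // JDBC parameters are 1-based
                }
                stmt.addBatch();
            }
            stmt.executeBatch();
        }
        // Only reached when executeBatch() did not throw, so a failed flush keeps its rows.
        buffer.clear();
    }

    /** Number of rows waiting for the next flush. */
    public synchronized int pending() {
        return buffer.size();
    }
}
```

Rows below the batch size still wait in the buffer until something calls flush(), so this
removes the idle connection but not the need for a periodic or snapshot-driven flush.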

Maybe we could use a Timer to test the connection periodically and keep it alive. What do
you think?
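
A minimal sketch of that keep-alive idea, under some assumptions: it uses a
ScheduledExecutorService rather than java.util.Timer, relies on Connection.isValid() causing
enough traffic to reset the database's idle timeout (which depends on the driver), and shares
a lock with the writer so the ping never overlaps a flush. ConnectionKeepAlive is a
hypothetical helper, not an existing Flink class.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of Flink.
public class ConnectionKeepAlive implements AutoCloseable {

    private final Connection connection;
    // Assumed to be the same lock the writer holds while it flushes a batch.
    private final Object lock;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "jdbc-keep-alive");
                t.setDaemon(true); // do not keep the JVM alive just for pings
                return t;
            });
    private volatile boolean lastPingOk = true;

    public ConnectionKeepAlive(Connection connection, Object lock) {
        this.connection = connection;
        this.lock = lock;
    }

    /** Starts pinging the connection every periodSeconds seconds. */
    public void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(this::pingOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    /** Runs one validity check; returns false if the connection looks dead. */
    public boolean pingOnce() {
        synchronized (lock) {
            try {
                lastPingOk = connection.isValid(5); // wait at most 5 seconds for the check
            } catch (SQLException e) {
                lastPingOk = false;
            }
        }
        return lastPingOk;
    }

    /** Result of the most recent ping, so the writer can decide to reconnect. */
    public boolean isLastPingOk() {
        return lastPingOk;
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The writer would call start() after opening the connection and close() when it shuts down;
what to do when a ping fails (reconnect, fail the task) is left open here.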

I will open a JIRA issue and try to work on it.

Best, 
wangsan



> On Jul 10, 2018, at 8:38 PM, Hequn Cheng <chenghequn@gmail.com> wrote:
> 
> Hi wangsan,
> 
> I agree with you. It would be kind of you to open a JIRA issue to track the problem.
> 
> For the first problem, I think we need to establish a connection each time we execute a
> batch write, and it is better to get the connection from a connection pool.
> For the second problem, to avoid multithreading problems, I think we should synchronize on
> the batch object in the flush() method.
> 
> What do you think?
> 
> Best, Hequn
> 
> 
> 
> On Tue, Jul 10, 2018 at 2:36 PM, wangsan <wamgsam@163.com> wrote:
> Hi all,
> 
> I'm going to use JDBCAppendTableSink and JDBCOutputFormat in my Flink application, but
> I am confused by the implementation of JDBCOutputFormat.
> 
> 1. The Connection is established when JDBCOutputFormat is opened and is reused from then
> on. But if this connection lies idle for a long time, the database will force-close it,
> so errors may occur.
> 2. The flush() method is called when batchCount exceeds the threshold, but it is also
> called while snapshotting state. So two threads may modify upload and batchCount without
> synchronization.
> 
> Please correct me if I am wrong.
> 
> ——
> wangsan
> 

