beam-issues mailing list archives

From "Eugene Kirpichov (Jira)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-12044) JdbcIO should explicitly setAutoCommit to false
Date Thu, 25 Mar 2021 18:06:00 GMT

    [ https://issues.apache.org/jira/browse/BEAM-12044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308870#comment-17308870 ]

Eugene Kirpichov commented on BEAM-12044:
-----------------------------------------

[~aromanenko] my take: it is possible, but I think Sylvain's PR improves the default behavior
for all users, so we should do it by default. Because this is in JdbcIO.read(), there are
no commits happening anyway, so this should be purely a performance improvement with no semantic
changes.

Actually there's one possible caveat that I'll mention on the PR in a moment.

> JdbcIO should explicitly setAutoCommit to false
> -----------------------------------------------
>
>                 Key: BEAM-12044
>                 URL: https://issues.apache.org/jira/browse/BEAM-12044
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>    Affects Versions: 2.28.0
>            Reporter: Sylvain Veyrié
>            Priority: P2
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Hello,
> Per the [PostgreSQL JDBC documentation|https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor], autocommit must be explicitly disabled on the connection to allow cursor streaming.
> [~jkff] mentioned it [on the mailing list|https://www.mail-archive.com/dev@beam.apache.org/msg16808.html]; however, even though there is:
> {code:java}
> poolableConnectionFactory.setDefaultAutoCommit(false);
> {code}
> in JdbcIO:1555, currently, at least with the PostgreSQL JDBC driver 42.2.16, any read with JdbcIO loads the whole dataset into memory (which leads to OOM), since
> {code:java}
> connection.getAutoCommit()
> {code}
> returns true in JdbcIO#ReadFn#processElement.
> I can provide a PR — the patch is pretty simple (and solves the problem for us in 2.28.0):
> {code:java}
> if (connection == null) {
>   connection = dataSource.getConnection();
> }
> connection.setAutoCommit(false); // line added
> {code}
> Thanks!
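
For readers following the thread, here is a minimal sketch of the read path the issue describes. It is an illustration only, not Beam's actual JdbcIO code: the class `StreamingReadSketch`, the methods `ensureAutoCommitDisabled` and `countRows`, and the `fetchSize` parameter are hypothetical names. It assumes the conditions listed in the PostgreSQL JDBC documentation linked above for cursor-based fetching: autocommit off, a forward-only result set, and a positive fetch size.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper, not part of Beam: sketches the fix proposed in the issue. */
final class StreamingReadSketch {

  private StreamingReadSketch() {}

  /**
   * Disables autocommit on the connection if it is currently enabled.
   * Returns true if the setting was changed, false if it was already off.
   */
  static boolean ensureAutoCommitDisabled(Connection connection) throws SQLException {
    if (connection.getAutoCommit()) {
      connection.setAutoCommit(false);
      return true;
    }
    return false;
  }

  /**
   * Runs a read-only query and counts its rows. With autocommit off, a
   * forward-only result set, and a positive fetch size, the PostgreSQL
   * driver can fetch rows in batches instead of loading them all at once.
   */
  static long countRows(Connection connection, String query, int fetchSize)
      throws SQLException {
    ensureAutoCommitDisabled(connection);
    try (PreparedStatement statement =
        connection.prepareStatement(
            query, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
      statement.setFetchSize(fetchSize); // hypothetical parameter; a hint to the driver
      long rows = 0;
      try (ResultSet resultSet = statement.executeQuery()) {
        while (resultSet.next()) {
          rows++;
        }
      }
      return rows;
    }
  }
}
```

Checking `getAutoCommit()` before calling `setAutoCommit(false)` is a design choice of this sketch; the patch quoted above calls `setAutoCommit(false)` unconditionally, which has the same effect on the connection.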



--
This message was sent by Atlassian Jira
(v8.3.4#803005)