beam-user mailing list archives

From Aleksandr <>
Subject Re: Reducing database connection with JdbcIO
Date Wed, 14 Mar 2018 11:01:45 GMT
We wrote our own JdbcIO with a shared pool per JVM (lazily initialized in
@Setup). In processElement we take a connection from the pool and release it
when done.
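
A minimal sketch of that pattern, in plain Java so it runs without Beam on the
classpath. Everything here is hypothetical and not part of Beam or JdbcIO: the
class name JvmConnectionPool, the methods getOrCreate/acquire/release, and the
pool size are assumptions for illustration. The "connection" is a generic type
so the demo can use strings; in a real DoFn the factory would open a
java.sql.Connection, getOrCreate would be called from the @Setup method, and
acquire/release would wrap the write in processElement.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: one bounded pool shared by everything in the JVM.
final class JvmConnectionPool<C> {
    // Static, so every DoFn instance in the same worker JVM sees the same pool.
    private static JvmConnectionPool<?> instance;

    private final BlockingQueue<C> idle;

    private JvmConnectionPool(int size, Supplier<C> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    // Lazy initialization: only the first caller creates the pool and its
    // connections; later callers get the existing pool and their arguments
    // are ignored.
    @SuppressWarnings("unchecked")
    static synchronized <C> JvmConnectionPool<C> getOrCreate(int size, Supplier<C> factory) {
        if (instance == null) {
            instance = new JvmConnectionPool<C>(size, factory);
        }
        return (JvmConnectionPool<C>) instance;
    }

    // Waits up to timeoutMillis for an idle connection, then fails.
    C acquire(long timeoutMillis) {
        try {
            C c = idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (c == null) {
                throw new IllegalStateException("no idle connection within timeout");
            }
            return c;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
    }

    // Returns a connection to the pool; throws if the pool is already full.
    void release(C c) {
        idle.add(c);
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        AtomicInteger opened = new AtomicInteger();
        // Two callers stand in for two DoFn instances running @Setup.
        JvmConnectionPool<String> a = getOrCreate(2, () -> "conn-" + opened.incrementAndGet());
        JvmConnectionPool<String> b = getOrCreate(2, () -> "conn-" + opened.incrementAndGet());

        // What processElement would do: acquire, write, always release.
        String conn = a.acquire(1000);
        try {
            System.out.println("writing with " + conn);
        } finally {
            a.release(conn);
        }
        System.out.println("same pool: " + (a == b) + ", opened: " + opened.get());
    }
}
```

With this shape the number of open connections is bounded by the pool size
times the number of worker JVMs, rather than by the number of DoFn instances
or threads.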

Best Regards,
Aleksandr Gortujev.

On 14 March 2018 at 12:49 PM, "Derek Chan" wrote:

We are new to Beam and need some help.

We are working on a flow that ingests events and writes the aggregated counts
to a database. The input rate is rather low (~2000 messages per second), but
the processing is relatively heavy, so we need to scale out to 5~6 nodes.
The output (via JDBC) is aggregated, so its volume is also low. But because
of the number of workers, the job holds 3000 connections to the database and
keeps hitting the database connection limit.

Is there a way that we can reduce the concurrency only at the output stage?
(In Spark we would have done a repartition/coalesce).

And, if it matters, we are using Apache Beam 2.2 via Scio, on Google

Thank you in advance!
