flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephan Ewen <se...@apache.org>
Subject Re: JDBCInputFormat GC overhead limit exceeded error
Date Wed, 20 Jan 2016 18:28:13 GMT
Super, thanks for finding this. Makes a lot of sense to have a result set
that does hold onto data.

Would be great if you could open a pull request with this fix, as other
users will benefit from that as well!

Cheers,
Stephan


On Wed, Jan 20, 2016 at 6:03 PM, Maximilian Bode <
maximilian.bode@tngtech.com> wrote:

> Hi Stephan,
>
> thanks for your fast answer. Just setting the Flink-managed memory to a
> low value would not have worked for us, as we need joins etc. in the same
> job.
>
> After investigating the JDBCInputFormat, we found the line
> statement = dbConn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
> ResultSet.CONCUR_READ_ONLY);
> to be the culprit; to be more exact, the scrollable result set. When
> replaced with TYPE_FORWARD_ONLY, some changes have to be made to
> nextRecord() and reachedEnd(), but this does the job without memory leak.
>
> Another change that might be useful (as far as performance is concerned)
> is disabling autocommits and letting users decide the fetchSize (somewhat
> in parallel to batchInterval in JDBCOutputFormat).
>
> Cheers,
> Max
> —
> Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com * 0176
> 1000 75 50
> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>
> Am 19.01.2016 um 21:26 schrieb Stephan Ewen <sewen@apache.org>:
>
> Hi!
>
> This kind of error (GC overhead exceeded) usually means that the system is
> reaching a state where it has very many still living objects and frees
> little memory during each collection. As a consequence, it is basically
> busy with only garbage collection.
>
> Your job probably has about 500-600 MB or free memory, the rest is at that
> memory size reserved for JVM overhead and Flink's worker memory.
> Now, since your job actually does not keep any objects or rows around,
> this should be plenty. I can only suspect that the Oracle JDBC driver is
> very memory hungry, thus pushing the system to the limit. (I found this,
> for example
>
> What you can do:
>  For this kind of job, you can simply tell Flink to not reserve as much
> memory, by using the option "taskmanager.memory.size=1". If the JDBC driver
> has no leak, but is simply super hungry, this should solve it.
>
> Greetings,
> Stephan
>
>
> I also found these resources concerning Oracle JDBC memory:
>
>  -
> http://stackoverflow.com/questions/2876895/oracle-t4cpreparedstatement-memory-leaks
> (bottom answer)
>  - https://community.oracle.com/thread/2220078?tstart=0
>
>
> On Tue, Jan 19, 2016 at 5:44 PM, Maximilian Bode <
> maximilian.bode@tngtech.com> wrote:
>
>> Hi Robert,
>>
>> I am using 0.10.1.
>>
>>
>> Am 19.01.2016 um 17:42 schrieb Robert Metzger <rmetzger@apache.org>:
>>
>> Hi Max,
>>
>> which version of Flink are you using?
>>
>> On Tue, Jan 19, 2016 at 5:35 PM, Maximilian Bode <
>> maximilian.bode@tngtech.com> wrote:
>>
>>> Hi everyone,
>>>
>>> I am facing a problem using the JDBCInputFormat which occurred in a
>>> larger Flink job. As a minimal example I can reproduce it when just writing
>>> data into a csv after having read it from a database, i.e.
>>>
>>> DataSet<Tuple1<String>> existingData = env.createInput(
>>> JDBCInputFormat.buildJDBCInputFormat()
>>> .setDrivername("oracle.jdbc.driver.OracleDriver")
>>> .setUsername(…)
>>> .setPassword(…)
>>> .setDBUrl(…)
>>> .setQuery("select DATA from TABLENAME")
>>> .finish(),
>>> new TupleTypeInfo<>(Tuple1.class, BasicTypeInfo.STRING_TYPE_INFO));
>>> existingData.writeAsCsv(…);
>>>
>>> where DATA is a column containing strings of length ~25 characters and
>>> TABLENAME contains 20 million rows.
>>>
>>> After starting the job on a YARN cluster (using -tm 3072 and leaving the
>>> other memory settings at default values), Flink happily goes along at first
>>> but then fails after something like three million records have been sent by
>>> the JDBCInputFormat. The Exception reads "The slot in which the task was
>>> executed has been released. Probably loss of TaskManager …". The local
>>> taskmanager.log in the affected container reads
>>> "java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>         at
>>> java.util.Collections$UnmodifiableCollection.iterator(Collections.java:1063)
>>>
>>> at org.jboss.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:119)
>>>         at
>>> org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
>>>         at
>>> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>>>         at
>>> org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>         at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>         at java.lang.Thread.run(Thread.java:744)"
>>>
>>> Any ideas what is going wrong here?
>>>
>>> Cheers,
>>> Max
>>>
>>> —
>>> Maximilian Bode * Junior Consultant * maximilian.bode@tngtech.com
>>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>>> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>>> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>>>
>>>
>>
>>
>
>

Mime
View raw message