kafka-jira mailing list archives

From "Ewen Cheslack-Postava (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-6551) Unbounded queues in WorkerSourceTask cause OutOfMemoryError
Date Fri, 23 Feb 2018 04:19:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-6551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373906#comment-16373906 ]

Ewen Cheslack-Postava commented on KAFKA-6551:

Seems reasonable. This should only be an issue if producing to the topic is failing and
we build up a large backlog, but it is a very good point that the backlog should be bounded,
at least roughly, and that we should pause poll()ing until it is resolved. It is a bit hard
to say what the right metric is, since this holds onto the entire record. Maybe the number
of records will work in practice, simply because you can set it to a reasonable default and
never think about it again while still not hitting any OOMs. But any large messages could
make that assumption fail.
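
To make the trade-off between the two metrics concrete, here is a minimal sketch of a
backlog budget that tracks both a record count and an approximate byte total. The class
name BacklogBudget, its methods, and the idea that the caller can estimate a record's size
are all assumptions made for illustration; none of this is existing Kafka Connect API.

```java
/**
 * Hypothetical illustration only: bounds a backlog by record count and by
 * approximate size in bytes. Not part of Kafka Connect.
 */
class BacklogBudget {
    private final int maxRecords;
    private final long maxBytes;
    private int records;
    private long bytes;

    BacklogBudget(int maxRecords, long maxBytes) {
        this.maxRecords = maxRecords;
        this.maxBytes = maxBytes;
    }

    /** Call when a record joins the backlog; approxBytes is the caller's size estimate. */
    synchronized void added(long approxBytes) {
        records++;
        bytes += approxBytes;
    }

    /** Call when a record leaves the backlog, with the same size estimate. */
    synchronized void completed(long approxBytes) {
        records--;
        bytes -= approxBytes;
    }

    /** True while both limits still have headroom, i.e. polling may continue. */
    synchronized boolean shouldPoll() {
        return records < maxRecords && bytes < maxBytes;
    }
}
```

With limits of 1,000 records and 1,000,000 bytes, three records of roughly 400,000 bytes
each already exhaust the byte limit, which is the case a count-only bound would miss.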

> Unbounded queues in WorkerSourceTask cause OutOfMemoryError
> -----------------------------------------------------------
>                 Key: KAFKA-6551
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6551
>             Project: Kafka
>          Issue Type: Bug
>          Components: KafkaConnect
>            Reporter: Gunnar Morling
>            Priority: Major
> A Debezium user reported an {{OutOfMemoryError}} to us, with over 50,000 messages in
> the {{WorkerSourceTask#outstandingMessages}} map.
> This map is unbounded, and I can't see any way of "rate limiting" that would control
> how many records are added to it. Growth can only be limited indirectly, by reducing the
> offset flush interval, but as connectors can return large numbers of messages in a single
> {{poll()}} call, that's not sufficient in all cases. Note that the user reported this issue
> while snapshotting a database, i.e. a high number of records arrived in a very short
> period of time.
> To solve the problem, I'd suggest making this map backpressure-aware and thus preventing
> its indefinite growth, so that no further records will be polled from the connector until
> messages have been taken out of the map again.
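
The backpressure idea proposed in the description can be sketched roughly as follows. This
is a minimal illustration under stated assumptions, not the actual WorkerSourceTask code:
the class BoundedOutstandingMessages and its methods tryAdd and acknowledge are
hypothetical names, and a java.util.concurrent.Semaphore stands in for whatever mechanism
a real fix would use.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/**
 * Hypothetical illustration only: a map of in-flight records with a fixed
 * capacity. Not the actual WorkerSourceTask implementation.
 */
class BoundedOutstandingMessages<K, V> {
    private final Map<K, V> outstanding = new ConcurrentHashMap<>();
    private final Semaphore permits;

    BoundedOutstandingMessages(int maxOutstanding) {
        this.permits = new Semaphore(maxOutstanding);
    }

    /**
     * Tracks a record if a slot is free. Returns false when the map is full,
     * which signals the caller to stop polling the connector for now.
     */
    boolean tryAdd(K key, V value) {
        if (!permits.tryAcquire()) {
            return false;
        }
        V previous = outstanding.put(key, value);
        if (previous != null) {
            permits.release(); // key was already tracked; do not count it twice
        }
        return true;
    }

    /** Removes a record once it has been handled, freeing its slot. */
    void acknowledge(K key) {
        if (outstanding.remove(key) != null) {
            permits.release();
        }
    }

    int size() {
        return outstanding.size();
    }
}
```

A polling loop would call tryAdd for each record and skip further poll() calls while it
returns false; each acknowledge call frees one slot so polling can resume.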
