spark-issues mailing list archives

From "Sean Owen (JIRA)" <>
Subject [jira] [Resolved] (SPARK-22991) High read latency with spark streaming 2.2.1 and kafka
Date Tue, 16 Jan 2018 13:35:00 GMT


Sean Owen resolved SPARK-22991.
    Resolution: Not A Problem

> High read latency with spark streaming 2.2.1 and kafka
> ---------------------------------------------------------------
>                 Key: SPARK-22991
>                 URL:
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 2.2.1
>            Reporter: Kiran Shivappa Japannavar
>            Priority: Critical
> Spark 2.2.1 + Kafka 0.10 + Spark Streaming.
> Batch duration is 1 s, max rate per partition is 500, poll interval is 120 seconds, max
> poll records is 500, and the number of partitions in Kafka is 500. The cached consumer is
> enabled.
> While reading data from Kafka, we intermittently observe very high read latencies. The
> high latencies result in Kafka consumer session expiration, and hence the Kafka brokers
> remove the consumer from the group. The consumer keeps retrying and finally fails with the
> following log messages:
> [org.apache.kafka.clients.NetworkClient] - Disconnecting from node 12 due to request
> [org.apache.kafka.clients.NetworkClient] - Cancelled request ClientRequest
> [org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient] - Cancelled FETCH request ClientRequest.**
> As a result, many batches remain in the queued state.
> The high read latencies occur whenever multiple clients read data from the same Kafka
> cluster in parallel. The Kafka cluster has a large number of brokers and can support high
> network bandwidth.
> When running Spark 1.5 with the Kafka 0.8 consumer client against the same Kafka cluster,
> we do not see these read latencies.
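
For reference, the settings quoted in the report imply an upper bound on how much each batch may read. The sketch below is only an illustration: the property names are an assumption about which settings the reporter meant (the report gives values, not keys), and "poll interval" in particular could refer to a different property than the one guessed here. The arithmetic relies only on the max rate per partition being a per-second, per-partition record limit.

```python
# Values are taken from the report; the property names they are attached to
# are ASSUMPTIONS, since the report does not name the keys.
assumed_settings = {
    "spark.streaming.kafka.maxRatePerPartition": 500,      # records/second/partition
    "spark.streaming.kafka.consumer.poll.ms": 120 * 1000,  # guess for "poll interval is 120 seconds"
    "spark.streaming.kafka.consumer.cache.enabled": True,  # guess for "enabled cache consumer"
    "max.poll.records": 500,                               # Kafka consumer property
}

BATCH_DURATION_S = 1   # batch duration from the report
NUM_PARTITIONS = 500   # Kafka partitions from the report


def per_partition_cap(batch_s, max_rate_per_partition):
    """Maximum records one partition may contribute to a single batch."""
    return batch_s * max_rate_per_partition


def total_batch_cap(batch_s, max_rate_per_partition, partitions):
    """Maximum records a single batch may contain across all partitions."""
    return per_partition_cap(batch_s, max_rate_per_partition) * partitions


rate = assumed_settings["spark.streaming.kafka.maxRatePerPartition"]
partition_cap = per_partition_cap(BATCH_DURATION_S, rate)
batch_cap = total_batch_cap(BATCH_DURATION_S, rate, NUM_PARTITIONS)

print(f"per-partition cap per batch: {partition_cap}")  # 500
print(f"total cap per batch: {batch_cap}")              # 250000
```

Under these assumptions each 1 s batch asks every partition for at most 500 records, which matches the max poll records value, so one poll per partition could in principle satisfy a batch.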
