spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yuval Itzchakov (JIRA)" <>
Subject [jira] [Created] (SPARK-21873) CachedKafkaConsumer throws NonLocalReturnControl during fetching from Kafka
Date Wed, 30 Aug 2017 08:42:00 GMT
Yuval Itzchakov created SPARK-21873:

             Summary: CachedKafkaConsumer throws NonLocalReturnControl during fetching from
                 Key: SPARK-21873
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 2.2.0, 2.1.1, 2.1.0
            Reporter: Yuval Itzchakov
            Priority: Minor

In Scala, using `return` inside a function causes a `NonLocalReturnControl` exception to be
thrown and caught in order to escape the current scope.

While profiling Structured Streaming in production, it clearly shows:


This happens during a 1 minute profiling session on a single executor. The code is:

while (toFetchOffset != UNKNOWN_OFFSET) {
      try {
        return fetchData(toFetchOffset, untilOffset, pollTimeoutMs, failOnDataLoss)
      } catch {
        case e: OffsetOutOfRangeException =>
          // When there is some error thrown, it's better to use a new consumer to drop all
          // states in the old consumer. We don't need to worry about the performance because
          // is not a common path.
          reportDataLoss(failOnDataLoss, s"Cannot fetch offset $toFetchOffset", e)
          toFetchOffset = getEarliestAvailableOffsetBetween(toFetchOffset, untilOffset)

This happens because this method is converted to a function which is ran inside:

private def runUninterruptiblyIfPossible[T](body: => T): T

We should avoid using `return` in general, and here specifically as it is a hot path for applications
using Kafka.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message